This attempts to fix a special case of the problem described in #12385,
where trying to do `clientv3.Watch` with an expired token would result
in `ErrGRPCPermissionDenied`, due to the failing authorization check in
`isWatchPermitted`. Furthermore, the client can't auto recover, since
`shouldRefreshToken` rightly returns false for the permission denied
error.
In this case, we would like to have a runbook to dynamically disable
auth, without causing any disruption. Doing so would immediately expire
all existing tokens, which would then cause the behavior described
above. This means existing watchers would still work for a period of
time after disabling auth, until they have to reconnect, e.g. due to a
rolling restart of server nodes.
This commit adds a client-side fix and a server-side fix, either of
which is sufficient to get the added test case to pass. Note that it is
an e2e test case instead of an integration one, as the reconnect only
happens if the server node is stopped via SIGINT or SIGTERM.
A generic fix for the problem described in #12385 would be better, as
that shall also fix this special case. However, the fix would likely be
a lot more involved, as some untangling of authn/authz is required.
In the sample code demonstrating how to specify a client request
timeout, the `cancel()` is called immediately after the Put, but it
should be deferred instead, giving the Put enough time to complete.
In the canonical Golang context
[docs](https://pkg.go.dev/context#WithTimeout), the sample code sets a
`defer cancel()` immediately after context creation, and with this
commit we adhere to that convention.
fixes:
```json
{
"level": "warn",
"ts": "2021-12-29T09:56:42.439-0800",
"logger": "etcd-client",
"caller": "v3@v3.5.1/retry_interceptor.go:62",
"msg": "retrying of unary invoker failed",
"target": "etcd-endpoints://0xc000213340/localhost:2379",
"attempt": 0,
"error": "rpc error: code = Canceled desc = context canceled"
}
```
Upgrading from v1.0.1.
Upgrading related dependencies
------------------------------
The following dependencies also had to be upgraded:
- go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.26.1
From v0.25.0. This gets rid of a transitive dependency on go.opentelemetry.io/otel@v1.0.1.
- google.golang.org/genproto@v0.0.0-20211118181313-81c1377c94b1
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
To fix a bug in the retry logic caused when the auth token is cleared after receiving `ErrInvalidAuthToken` from the server and the subsequent call to `getToken` also fails due to some reason (eg. context deadline exceeded).
This leaves the client without a token and the retry will continue to fail with `ErrUserEmpty` unless the token is refreshed.
A client-side optimization was made in #6100 to filter ascending key sorts to avoid an unnecessary re-sort since this is the order already returned by the back-end logic.
It seems to me that this really belongs on the server side since it's tied to the server implementation and should apply for any caller of the kv api (for example non-go clients).
Related, the client/v3 syncer depends on this default sorting which isn't explicit in the kv api contract. So I'm proposing the required sort parameters be included explicitly; it will take the fast path either way.
`Client.dial` can be called multiple times. For example, from regular
`NewClient` and [from the `Maintenance`
wrapper](6d451ab61d/client/v3/maintenance.go (L81-L84)).
This ends up creating two (if `Client.dial` is called twice) independent
`credentials.Bundle` instances:
- the first one is wired into `PerRPCCredentials` callback in gRPC
client.
- the second one is in `Client.authTokenBundle` and receives the latest
auth token updates.
When using username/password authentication and the server is configured
to issue JWTs, token rotation logic breaks because of the above
`credentials.Bundle` confusion.
- ErrGRPCNotCapable("etcdserver: not capable") -> codes.FailedPrecondition (it will not autofix, it requires new version of server)
- ErrGPRCNotSupportedForLearner("etcdserver: rpc not supported for learner") -> codes.FailedPrecondition (as long as its learner, the call will not work)
- ErrGRPCClusterVersionUnavailable("etcdserver: cluster version not found during downgrade") -> codes.FailedPrecondition (backend does not contain the version (old etcd?) so retry will not help)
https://github.com/etcd-io/etcd/runs/2599598633?check_suite_focus=true
```
{"level":"warn","ts":"2021-05-17T09:55:30.246Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc000539880/#initially=[unix://localhost:m30]","attempt":0,"error":"rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner"}
{"level":"warn","ts":"2021-05-17T09:55:30.270Z","logger":"etcd-client","caller":"v3/retry_interceptor.go:62","msg":"retrying
of unary invoker
failed","target":"etcd-endpoints://0xc000539880/#initially=[unix://localhost:m30]","attempt":1,"error":"rpc
error: code = Unavailable desc = etcdserver: rpc not supported for
learner"}`
```
clientv3 logs (especially tests) were poluted with unattributed to testing.T log lines:
```
{"level":"warn","ts":"2021-04-29T12:42:11.055+0200","logger":"etcd-client","caller":"v3/retry_interceptor.go:64","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000fafc0/#initially=[unix://localhost:m10]","attempt":0,"error":"rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded"}
```
The reasons were 2 fold:
- Interceptors were copying logger before "WithLogger" could modify it.
- We were not propagating the loggers in a few testing contexts.