Once chk(ai) fails with auth.ErrAuthOldRevision, it will keep failing
no matter how many times it is retried. The error should therefore be
returned to fail the pending request and make the client re-authenticate.
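
A minimal sketch of the intended behavior, not the actual patch:
retryCheck, errTransient, and the local errAuthOldRevision variable are
hypothetical stand-ins for the real retry loop and auth.ErrAuthOldRevision.
It shows the loop returning immediately on the old-revision error while
still retrying other failures.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins: errAuthOldRevision plays the role of
// auth.ErrAuthOldRevision, errTransient is any failure worth retrying.
var (
	errAuthOldRevision = errors.New("auth: revision in header is old")
	errTransient       = errors.New("transient failure")
)

// retryCheck calls chk up to maxAttempts times. It stops early on success
// or on errAuthOldRevision, because retrying cannot fix a stale revision.
// It returns the number of attempts made and the last error.
func retryCheck(chk func() error, maxAttempts int) (attempts int, err error) {
	for attempts < maxAttempts {
		attempts++
		err = chk()
		if err == nil || errors.Is(err, errAuthOldRevision) {
			return attempts, err
		}
	}
	return attempts, err
}

func main() {
	// The stale-revision error is returned after a single attempt.
	attempts, err := retryCheck(func() error { return errAuthOldRevision }, 10)
	fmt.Println(attempts, err)

	// A transient error is retried until the attempts run out.
	attempts, err = retryCheck(func() error { return errTransient }, 3)
	fmt.Println(attempts, err)
}
```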
This PR adds another probing routine to monitor the connection
for Raft message transports. Previously, we only monitored
snapshot transports.
In our production cluster, we found one TCP connection had >8-sec
latencies to a remote peer, but the "etcd_network_peer_round_trip_time_seconds"
metric showed a <1-sec latency distribution, which means the etcd server
was not sampling enough when such latency spikes happened
outside of the snapshot pipeline connection.
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
The "read index" message alone doesn't say much about the root cause.
Most likely, the local follower node has a slow network
and is timing out while waiting to receive the read
index response from the leader.
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
Distribution would be:
0.1 second or more
...
25.6 seconds or more
51.2 seconds or more
etcd_network_snapshot_send_success
etcd_network_snapshot_send_failures
etcd_network_snapshot_send_total_duration_seconds
etcd_network_snapshot_receive_success
etcd_network_snapshot_receive_failures
etcd_network_snapshot_receive_total_duration_seconds
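
The listed bucket values are consistent with boundaries that start at
0.1 second and double at each step (0.1 * 2^8 = 25.6, 0.1 * 2^9 = 51.2).
The sketch below assumes exactly that, with ten buckets in total; the
elided middle values and the helper name exponentialBuckets are
assumptions, not taken from the patch.

```go
package main

import "fmt"

// exponentialBuckets returns count boundaries, beginning at start and
// multiplying by factor at each step. Hypothetical helper for illustration.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, 0, count)
	for v := start; len(buckets) < count; v *= factor {
		buckets = append(buckets, v)
	}
	return buckets
}

func main() {
	// Assumed parameters: start at 0.1 second, double each step, ten buckets.
	for _, b := range exponentialBuckets(0.1, 2, 10) {
		fmt.Printf("%.1f seconds\n", b)
	}
}
```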
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
```
etcd_server_id{server_id="8e9e05c52164694d"} 1
```
Useful for automating membership change operations: there is
no need to run the "endpoint status" or "member list"
command to get member IDs.
Also useful for "etcd_network" metrics with "To/From" labels.
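
As a rough illustration of the automation use case, the sketch below
pulls the member ID out of metrics text shaped like the sample above.
parseServerID and its regular expression are hypothetical; how the
metrics text is fetched is left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// serverIDPattern matches a line such as:
//   etcd_server_id{server_id="8e9e05c52164694d"} 1
// and captures the hexadecimal ID. The pattern is an assumption based on
// the sample output only.
var serverIDPattern = regexp.MustCompile(`(?m)^etcd_server_id\{server_id="([0-9a-f]+)"\} 1$`)

// parseServerID returns the server_id label value found in metrics text,
// and false if no etcd_server_id line is present.
func parseServerID(metrics string) (string, bool) {
	m := serverIDPattern.FindStringSubmatch(metrics)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	id, ok := parseServerID("etcd_server_id{server_id=\"8e9e05c52164694d\"} 1\n")
	fmt.Println(id, ok)
}
```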
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
etcd server
To improve debuggability of etcd v3, add a gRPC interceptor that logs
info on incoming requests to the etcd server. The log output includes
remote client info, request content (with the value field redacted), request
handling latency, response size, etc. It uses the zap logger if available,
and otherwise falls back to capnslog.
Also clean up the chaining of gRPC interceptors on the server
side.
Fix
=== RUN TestEmbedEtcd
==================
WARNING: DATA RACE
Write at 0x000001df86d0 by goroutine 711:
github.com/coreos/etcd/embed.(*Config).setupLogging.func1()
/go/src/github.com/coreos/etcd/vendor/google.golang.org/grpc/grpclog/loggerv2.go:68 +0x16c
sync.(*Once).Do()
/usr/local/go/src/sync/once.go:44 +0xe1
github.com/coreos/etcd/embed.(*Config).setupLogging()
in gRPC proxy tests.
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>