7 Commits

Author SHA1 Message Date
Gyuho Lee
abdb7ca17b etcdserver/api: add "etcd_network_snapshot_send_inflights_total", "etcd_network_snapshot_receive_inflights_total"
Useful for deciding when to terminate an unhealthy follower.
If the follower is receiving a leader snapshot, the operator may wait.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:01:45 -07:00
Gyuho Lee
7b1ef37054 etcdserver/api/rafthttp: probe all Raft messages' RTT
This PR adds another probing routine to monitor the connection
for Raft message transports. Previously, we only monitored
snapshot transports.

In our production cluster, we found that one TCP connection had >8-sec
latencies to a remote peer, but the "etcd_network_peer_round_trip_time_seconds"
metric showed a <1-sec latency distribution, which means the etcd server
was not sampling enough when such latency spikes happened
outside of the snapshot pipeline connection.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-07 03:28:54 -07:00
Gyuho Lee
6f4c509ad8 etcdserver/api/rafthttp: add v3 snapshot send/receive metrics
Distribution would be:
0.1 second or more
...
25.6 seconds or more
51.2 seconds or more

etcd_network_snapshot_send_success
etcd_network_snapshot_send_failures
etcd_network_snapshot_send_total_duration_seconds
etcd_network_snapshot_receive_success
etcd_network_snapshot_receive_failures
etcd_network_snapshot_receive_total_duration_seconds
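
The distribution listed above (0.1, ..., 25.6, 51.2 seconds) looks like a doubling series. A minimal sketch of how such bucket bounds could be generated, assuming a start of 0.1 s, a factor of 2, and 10 buckets (the count is inferred from the endpoints, not stated in the commit). The helper name exponentialBuckets is hypothetical; it is written to behave like prometheus.ExponentialBuckets from client_golang for these inputs, and whether the commit uses that call is not confirmed here.

```go
package main

import "fmt"

// exponentialBuckets returns count bucket bounds, beginning at start and
// multiplying by factor at each step. Hypothetical helper for illustration;
// it is not taken from the etcd source.
func exponentialBuckets(start, factor float64, count int) []float64 {
	if count < 1 {
		return nil
	}
	buckets := make([]float64, 0, count)
	bound := start
	for i := 0; i < count; i++ {
		buckets = append(buckets, bound)
		bound *= factor
	}
	return buckets
}

func main() {
	// Assumed parameters: 0.1 s start, factor 2, 10 buckets.
	for _, b := range exponentialBuckets(0.1, 2, 10) {
		fmt.Printf("%.1f seconds\n", b)
	}
}
```

With these assumed parameters the last bound is 0.1 * 2^9 = 51.2 seconds, which matches the final entry in the list above.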

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-15 12:56:50 -07:00
Gyuho Lee
3821f3364d etcdserver/api/rafthttp: add "etcd_network_active_peers/disconnected_peers_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:23:45 -07:00
Gyuho Lee
640f5e64a9 etcdserver/api/rafthttp: document round-trip metrics, clean up
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:03:28 -07:00
Gyuho Lee
5a9e48be30 etcdserver/api/rafthttp: increase bucket upper bound up to 3 sec
From 0.8 sec to 3.2 sec, for more detailed latency analysis.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:03:28 -07:00
Gyuho Lee
7940113906 *: move internal "etcdserver/api/rafthttp"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-21 10:31:16 -07:00