Similar counts are exposed via Prometheus.
This adds the one that are perceived by etcd server.
e.g.
os_fd_limit 120000
os_fd_used 14
process_cpu_seconds_total 0.31
process_max_fds 120000
process_open_fds 17
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
Previously, only display the one that has been requested at least once.
Now it shows all metrics, as we do in v3.3 and v3.4+.
grpc_server_started_total{grpc_method="Alarm",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="AuthDisable",grpc_service="etcdserverpb.Auth",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="AuthEnable",grpc_service="etcdserverpb.Auth",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Authenticate",grpc_service="etcdserverpb.Auth",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Compact",grpc_service="etcdserverpb.KV",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Defragment",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="DeleteRange",grpc_service="etcdserverpb.KV",grpc_type="unary"} 0
Should help document metrics.
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
etcdserver.
To improve debuggability of etcd v3. Added a grpc interceptor to log
info on incoming requests to etcd server. The log output includes remote
client info, request content (with value field redacted), request
handling latency, response size, etc.
Dependency on zap logger and grpc_middleware is removed during
backporting.
Added checking in logging interceptor. If debug level is disabled, skip
logUnaryRequestStats() to avoid potential performance degradation. (PR #10021)
{"level":"warn","ts":1527101858.6985068,"caller":"etcdserver/util.go:115","msg":"apply request took too long","took":0.114101529,"expected-duration":0.1,"prefix":"","request":"header:<ID:1029181977902852337> put:<key:\"\\000\\000...
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
{"level":"warn","ts":1527101858.4149103,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.025771662}
{"level":"warn","ts":1527101858.4149644,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.034015766}
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
We enable "raft.Config.CheckQuorum" by default in other
Raft initial starts. So should start with "ForceNewCluster".
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
The context error with cancel code is typically for user cancellation which
should be at debug level. For other error codes we should display a warning.
Fixes#9085