39 Commits

Author SHA1 Message Date
Gyuho Lee
008074187c etcdserver: add OS level FD metrics
Similar counts are exposed via Prometheus.
This adds the ones that are perceived by the etcd server.

e.g.

os_fd_limit 120000
os_fd_used 14
process_cpu_seconds_total 0.31
process_max_fds 120000
process_open_fds 17

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2020-08-12 18:38:35 -07:00
cfc4n
ee963470f4 etcdserver: change the FDUsage ticker interval from 5 seconds to 10 minutes
This ticker checks the file descriptor requirements by counting all FDs in use, and logs a message when the count in use >= limit/5*4. It only logs a message. With more than 10K FDs, the FDUsage() call itself degrades performance, so the interval needs to be increased.
See https://github.com/etcd-io/etcd/issues/11969 for more detail.
2020-06-24 13:28:40 +08:00
zhangjianweibj
d5f79adc9c etcdserver: remove dup percentage sign in log 2019-09-04 22:03:49 -07:00
Gyuho Lee
f786b6ba16 etcdserver: add "etcd_server_snapshot_apply_in_progress_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-08-08 14:02:13 -07:00
宇慕
0b8727b3f3 etcdserver: add learner metrics 2019-06-05 10:51:21 +08:00
Gyuho Lee
34bd797e67 *: revert module import paths
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2019-05-28 15:39:35 -07:00
shivaramr
9150bf52d6 go modules: Fix module path version to include version number 2019-04-26 15:29:50 -07:00
nolouch
4de27039cb server: drop read request if found leader changed 2018-09-14 15:58:35 +08:00
Gyuho Lee
1399bc69ce etcdserver: update import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:55 -07:00
Gyuho Lee
eb6738053b etcdserver: add "etcd_server_id" metric
```
etcd_server_id{server_id="8e9e05c52164694d"} 1
```

Useful for automating membership change operations,
no need to run "endpoint status" or "member list"
command to get member IDs.

Also, useful for "etcd_network" metrics with "To/From" labels.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-13 00:39:18 -07:00
Gyuho Lee
643d791a11 etcdserver: add "etcd_server_go_version" metric
Currently, one has to look at server logs manually,
to see what Go version was used to build etcd server.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 09:15:22 -07:00
Gyuho Lee
57ec2226cc etcdserver: support zap.Logger in FD monitoring routine
Keep replacing capnslog

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 14:59:03 -07:00
Gyuho Lee
37000cc4b8 etcdserver: add "etcd_server_slow_read_indexes_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-02 12:58:35 -07:00
Gyuho Lee
7dd7018835 etcdserver: add "etcd_server_quota_backend_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-07 10:44:51 -07:00
Gyuho Lee
a1aade8c1b etcdserver: rename to "heartbeat_send_failures_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:11:08 -07:00
Gyuho Lee
dd1baf6e96 etcdserver: add "etcd_server_slow_apply_total"
{"level":"warn","ts":1527101858.6985068,"caller":"etcdserver/util.go:115","msg":"apply request took too long","took":0.114101529,"expected-duration":0.1,"prefix":"","request":"header:<ID:1029181977902852337> put:<key:\"\\000\\000...

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee
896a5e4a2b etcdserver: add "etcd_server_heartbeat_failures_total"
{"level":"warn","ts":1527101858.4149103,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.025771662}
{"level":"warn","ts":1527101858.4149644,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.034015766}

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Maciej Borsz
46bc966aa7 etcdserver: add is_leader prometheus metric that is 1 on the leader.
Before this change, we had no way to find the leader using the /metrics
endpoint. This commit adds a metric to do that.
2018-04-19 11:47:40 +02:00
Gyuho Lee
0850ccbf45 *: revert "internal/version" change
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-26 17:11:40 -08:00
Gyuho Lee
37546f74ab *: move "version" to "internal/version"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-01-29 10:00:20 -08:00
Joe Betz
4cacbf19dd metrics: Add server_version metric 2017-12-01 15:25:46 -08:00
Gyu-Ho Lee
45fd8279f0 etcdserver: add leaseExpired debugging metrics
Fix https://github.com/coreos/etcd/issues/8050.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-06-08 10:36:25 -07:00
Gyu-Ho Lee
d219e96359 etcdserver: use Counter for proposals_failed_total
It only ever goes up.
2016-08-10 09:27:51 -07:00
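A minimal sketch of why a value that "only ever goes up" fits a counter rather than a gauge. The `counter` type below is hypothetical and uses only the standard library; it is not the Prometheus client type etcd uses. It offers `Inc` and `Value` and deliberately has no way to decrease or reset.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// counter is a hypothetical monotonic counter (illustration only).
// It can be incremented and read, but never decreased or set.
type counter struct {
	n uint64
}

// Inc adds one to the counter; safe for concurrent use.
func (c *counter) Inc() { atomic.AddUint64(&c.n, 1) }

// Value returns the current count.
func (c *counter) Value() uint64 { return atomic.LoadUint64(&c.n) }

func main() {
	var proposalsFailed counter // assumed name, mirrors proposals_failed_total
	proposalsFailed.Inc()
	proposalsFailed.Inc()
	fmt.Println(proposalsFailed.Value())
}
```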
Xiang Li
598fa7a10e *: add pending/failed proposal metrics 2016-06-17 13:09:38 -07:00
Xiang Li
57474697af etcdserver: add applied metrics 2016-06-17 11:52:50 -07:00
Gyu-Ho Lee
abb4cd5646 etcdserver: update LICENSE header 2016-05-12 20:49:40 -07:00
Xiang Li
ab11415d25 *: add proposalsCommitted metrics 2016-05-10 10:56:25 -07:00
Xiang Li
824478be5f *: add has leader metrics 2016-05-06 13:59:19 -07:00
Xiang Li
76d073a2b5 *: add leader changes to metrics 2016-05-06 13:12:20 -07:00
Xiang Li
063307ec0a *: add metrics for grpc api 2016-05-05 13:45:52 -07:00
Brian Brazil
ea1d0f3e0d etcdserver: Improve some debug metrics.
The _total suffix is by convention for counters,
don't use it on a gauge. Clarify help string.
Tweak metric name so it'll sort with related metrics,
and be a little more understandable.

Remove open file descriptor metric, as Prometheus client_golang
provides that out of the box as process_open_fds which is also
more up to date. Both only support Linux, so there's no loss of
platform support.

Fixes #5229
2016-04-30 01:29:13 +01:00
Xiang Li
67645095e9 *: add debugging metrics 2016-04-26 09:52:56 -07:00
Anthony Romano
bd832e5b0a *: migrate Godeps to vendor/ 2016-03-22 17:10:28 -07:00
Xiang Li
d90a47656e etcdserver: use Histogram for proposal_durations 2015-10-17 12:48:25 -07:00
Xiang Li
52c2a5731f etcdserver: fix typo in metrics.go 2015-06-24 12:42:40 -07:00
Xiang Li
e0f9796653 etcdserver: use leveled logging
Leveled logging for etcdserver pkg.
2015-06-09 13:53:07 -07:00
Xiang Li
34ac145b38 *: use namespace and subsystem in metrics
Fix #2841.

From Prometheus developer:
```
the recommended way for etcd as an open source project and under
consideration of its size would be etcd_<subsystem>_<name>.
```

We made the naming change accordingly.
2015-05-26 14:39:04 -07:00
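A rough illustration of the etcd_<subsystem>_<name> scheme quoted above. The helper `fqName` is a hypothetical stand-in written with the standard library only; it is not the code from this commit. It joins the non-empty parts with underscores.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName (hypothetical helper) joins the non-empty parts with underscores,
// producing <namespace>_<subsystem>_<name>.
func fqName(namespace, subsystem, name string) string {
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Rebuilds a metric name that appears elsewhere in this log.
	fmt.Println(fqName("etcd", "server", "quota_backend_bytes"))
}
```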
Yicheng Qin
7a7e1f7a7c etcdserver: metrics and monitor number of file descriptor
It exposes metrics for the file descriptor limit and the number of file descriptors in use.
Moreover, it prints a warning when more than 80% of the fd limit has been used.

```
2015/04/08 01:26:19 etcdserver: 80% of the file descriptor limit is open
[open = 969, limit = 1024]
```
2015-04-08 11:17:48 -07:00
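A small sketch of the 80% check described in this commit. It assumes the integer form limit/5*4 that the later FDUsage commit in this log quotes; the function name `fdUsageHigh` is hypothetical and the actual monitoring code may differ.

```go
package main

import "fmt"

// fdUsageHigh (hypothetical name) reports whether the number of used file
// descriptors has reached 80% of the limit. The threshold uses integer
// arithmetic: limit/5*4.
func fdUsageHigh(used, limit uint64) bool {
	return used >= limit/5*4
}

func main() {
	// Values from the sample warning above: open = 969, limit = 1024.
	fmt.Println(fdUsageHigh(969, 1024))
	// Values from the os_fd_used / os_fd_limit sample at the top of this log.
	fmt.Println(fdUsageHigh(14, 120000))
}
```

Because the division happens first, the threshold for a limit of 1024 is 1024/5*4 = 816, slightly below an exact 80% (819.2).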
Xiang Li
95bba154d6 etcdserver: add propose summary 2015-02-28 11:16:42 -08:00