11788 Commits

Author SHA1 Message Date
Gyuho Lee
06cec40911 version: 3.2.26
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
v3.2.26
2019-01-11 10:04:58 -08:00
Gyuho Lee
ab4693d97f
Merge pull request #10386 from hexfusion/release-3.2
[Cherry-pick 3.2] auth: disable CommonName auth for gRPC-gateway
2019-01-11 10:01:12 -08:00
Sam Batschelet
a2b420c364 auth: disable CommonName auth for gRPC-gateway
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2019-01-08 21:09:07 +00:00
Gyuho Lee
dfd8fe97c5
Merge pull request #10334 from gyuho/patch-grpc-proxy
[Cherry-pick 3.2] grpcproxy: fix memory leak
2018-12-17 20:35:23 -08:00
Igor German
ada4af3b2a grpcproxy: fix memory leak
use set instead of slice as interval value

fixes #10326
2018-12-17 18:58:04 -08:00
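The change in ada4af3b2a (a set instead of a slice as the interval value) can be illustrated with a minimal sketch. It assumes the leak came from the same key being appended again and again to a slice; the commit message does not spell that out. The `sliceCache` and `setCache` types and the string interval label are hypothetical stand-ins, not the grpcproxy types.

```go
package main

import "fmt"

// sliceCache (hypothetical) stores the keys for an interval in a slice,
// so adding the same key repeatedly keeps growing the stored value.
type sliceCache map[string][]string

func (c sliceCache) add(interval, key string) {
	c[interval] = append(c[interval], key)
}

// setCache (hypothetical) stores the keys in a set, so repeated adds of
// the same key are idempotent and memory use stays bounded.
type setCache map[string]map[string]struct{}

func (c setCache) add(interval, key string) {
	if c[interval] == nil {
		c[interval] = make(map[string]struct{})
	}
	c[interval][key] = struct{}{}
}

func main() {
	s, m := sliceCache{}, setCache{}
	for i := 0; i < 1000; i++ {
		s.add("[a,b)", "a")
		m.add("[a,b)", "a")
	}
	// The slice holds one entry per add; the set holds one entry per key.
	fmt.Println(len(s["[a,b)"]), len(m["[a,b)"]))
}
```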
Joe Betz
2e27fef277
version: bump up to 3.2.25+git
2018-10-10 11:13:42 -07:00
Joe Betz
182de1a9e1 version: bump up to 3.2.25
v3.2.25
2018-10-10 10:53:11 -07:00
Gyuho Lee
6e15f11fd9 etcdserver: add "etcd_server_read_indexes_failed_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:21:41 -07:00
Gyuho Lee
b6d11019e0 rafthttp: probe all raft transports
This PR adds another probing routine to monitor the connection
for Raft message transports. Previously, we only monitored
snapshot transports.

In our production cluster, we found that one TCP connection had >8-sec
latencies to a remote peer, but the "etcd_network_peer_round_trip_time_seconds"
metric showed a <1-sec latency distribution, which means the etcd server
was not sampling enough when such latency spikes happened
outside of the snapshot pipeline connection.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:17:16 -07:00
Gyuho Lee
86fdbdc7f9 etcdserver: add "etcd_server_health_success/failures"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:04:40 -07:00
Jingyi Hu
afa5beda46
Merge pull request #10162 from jingyih/automated-cherry-pick-of-#10153-origin-release-3.2
clientv3: automated cherry pick of #10153 to release-3.2
2018-10-08 18:38:02 -07:00
yura
26cce20022 clientv3: concurrency.Mutex.Lock() - preserve invariant
Convenient invariant:
- if werr == nil, then the lock is supposed to be held at that moment.

While we cannot be confident in a stronger invariant ('is exactly locked'),
it was inconvenient that the previous code could return `werr == nil` after
Mutex.Unlock.

This could happen when ctx is canceled or times out right after waitDeletes
successfully returns werr == nil and before `<-ctx.Done()` is checked.
While such a situation is very rare, it is still possible.

fixes #10111
2018-10-08 16:46:22 -07:00
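The race described in 26cce20022 can be sketched as follows. This is an illustration under stated assumptions, not the clientv3 code: `lockState`, `lockOld`, `lockNew`, and the `wait` callback are hypothetical stand-ins for the mutex, the old and new tail of `Lock()`, and `waitDeletes`.

```go
package main

import (
	"context"
	"fmt"
)

// lockState is a hypothetical stand-in for the mutex; held reports
// whether the lock key still exists.
type lockState struct{ held bool }

func (l *lockState) unlock() { l.held = false }

// lockOld mirrors the behavior described as problematic: the wait
// succeeds, but a context canceled right afterwards triggers an unlock
// while a nil error is still returned.
func lockOld(ctx context.Context, l *lockState, wait func() error) error {
	l.held = true
	werr := wait()
	select {
	case <-ctx.Done():
		l.unlock()
	default:
	}
	return werr
}

// lockNew releases the lock only when the wait failed, so a nil error
// implies that the lock was not released by this call.
func lockNew(ctx context.Context, l *lockState, wait func() error) error {
	l.held = true
	werr := wait()
	if werr != nil {
		l.unlock()
	}
	return werr
}

// demo runs both variants with a cancellation that lands right after
// the wait has succeeded, and reports whether each lock is still held.
func demo() (oldHeld, newHeld bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	wait := func() error { cancel(); return nil }

	a, b := &lockState{}, &lockState{}
	_ = lockOld(ctx, a, wait)
	_ = lockNew(ctx, b, wait)
	return a.held, b.held
}

func main() {
	oldHeld, newHeld := demo()
	fmt.Println("old variant still held:", oldHeld)
	fmt.Println("new variant still held:", newHeld)
}
```

Both variants return a nil error here; only the second one keeps the invariant that a nil error means the lock is held.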
Gyuho Lee
a4a8d0752e
Merge pull request #10123 from jingyih/cherry-pick-of-#10109-origin-release-3.2
etcdctl: cherry pick of #10109 to release-3.2
2018-09-25 19:55:07 -07:00
Jingyi Hu
affd468424 etcdctl: cherry pick of #10109 to release-3.2
Add snapshot file integrity verification when querying snapshot status.
2018-09-25 17:07:44 -07:00
Wenjia
9452e5c1e5
Merge pull request #10042 from wenjiaswe/automated-cherry-pick-of-#9997-upstream-release-3.2
Automated cherry pick of #9997
2018-09-04 12:54:58 -07:00
Gyuho Lee
e2dfe0f5d9 etcdserver/api/rafthttp: add v3 snapshot send/receive metrics
Distribution would be:
0.1 second or more
...
25.6 seconds or more
51.2 seconds or more

etcd_network_snapshot_send_success
etcd_network_snapshot_send_failures
etcd_network_snapshot_send_total_duration_seconds
etcd_network_snapshot_receive_success
etcd_network_snapshot_receive_failures
etcd_network_snapshot_receive_total_duration_seconds

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:51:31 -07:00
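The bucket boundaries listed in e2dfe0f5d9 (0.1, ..., 25.6, 51.2 seconds) are consistent with an exponential series that starts at 0.1 and doubles ten times. The sketch below computes such a series; the start, factor, and count values are inferred from the listed endpoints, and `exponentialBuckets` is a local helper written for illustration rather than a library call.

```go
package main

import "fmt"

// exponentialBuckets returns count bucket boundaries, beginning at start
// and multiplying by factor at each step.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

func main() {
	// Assumed parameters: start 0.1 s, factor 2, 10 buckets.
	for _, b := range exponentialBuckets(0.1, 2, 10) {
		fmt.Printf("%g seconds or more\n", b)
	}
}
```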
Gyuho Lee
9d7242e271 etcdserver: add "etcd_server_id"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:49:26 -07:00
Gyuho Lee
95afd1fb24 etcdserver: clarify read index wait timeout warnings
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:38:43 -07:00
Gyuho Lee
7f337ef13a rafthttp: clarify "became inactive" warning
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:33:46 -07:00
Gyuho Lee
8f1d366c0e etcdserver/api/snap: add v3 snapshot fsync metrics
etcd_snap_db_fsync_duration_seconds_count
etcd_snap_db_save_total_duration_seconds_bucket

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 14:03:12 -07:00
Xiang Li
b3fa36eb7f
Merge pull request #10032 from gyuho/init-metrics-3.2
etcdserver/api/v3rpc: display all registered gRPC metrics at start (v3.2)
2018-08-24 18:52:34 -07:00
Gyuho Lee
4928558bc9 etcdserver/api/v3rpc: display all registered gRPC metrics at start
Previously, only the metrics that had been requested at least once were displayed.
Now all metrics are shown, as in v3.3 and v3.4+.

grpc_server_started_total{grpc_method="Alarm",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="AuthDisable",grpc_service="etcdserverpb.Auth",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="AuthEnable",grpc_service="etcdserverpb.Auth",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Authenticate",grpc_service="etcdserverpb.Auth",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Compact",grpc_service="etcdserverpb.KV",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="Defragment",grpc_service="etcdserverpb.Maintenance",grpc_type="unary"} 0
grpc_server_started_total{grpc_method="DeleteRange",grpc_service="etcdserverpb.KV",grpc_type="unary"} 0

This should help document the metrics.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-22 19:12:58 -07:00
Joe Betz
73b1a2b8db
Merge pull request #10025 from jingyih/automated-cherry-pick-of-#9990-origin-release-3.2-1534373481
etcdserver: cherry pick of #9990 to release-3.2
2018-08-20 12:48:14 -07:00
Jingyi Hu
ae0f433761 etcdserver: add grpc interceptor to log info on incoming request to
etcdserver.

To improve the debuggability of etcd v3, add a gRPC interceptor that logs
info on incoming requests to the etcd server. The log output includes remote
client info, request content (with the value field redacted), request
handling latency, response size, etc.

The dependency on the zap logger and grpc_middleware was removed during
backporting.

Added a check in the logging interceptor: if the debug level is disabled, skip
logUnaryRequestStats() to avoid potential performance degradation. (PR #10021)
2018-08-17 17:06:13 -07:00
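The interceptor idea in ae0f433761 can be sketched with the standard library alone. This is a rough illustration, not the etcd implementation: the `handler` type, `withRequestLog`, and the `logf` callback are hypothetical, and the sketch leaves out gRPC itself, remote client info, and value redaction. It shows only the wrapping pattern and the debug-level gate.

```go
package main

import (
	"fmt"
	"time"
)

// handler is a hypothetical stand-in for a unary RPC handler.
type handler func(req string) (resp string, err error)

// withRequestLog wraps h so that each call is logged with its latency and
// response size, but only when debug logging is enabled. When debug is
// false, the formatting work is skipped entirely.
func withRequestLog(h handler, debug bool, logf func(format string, args ...interface{})) handler {
	return func(req string) (string, error) {
		start := time.Now()
		resp, err := h(req)
		if debug {
			logf("request=%q took=%v response_size=%d err=%v",
				req, time.Since(start), len(resp), err)
		}
		return resp, err
	}
}

// demo counts how many log lines are emitted with and without debug.
func demo() (withDebug, withoutDebug int) {
	echo := func(req string) (string, error) { return "ok:" + req, nil }
	count := func(n *int) func(string, ...interface{}) {
		return func(string, ...interface{}) { *n++ }
	}
	withRequestLog(echo, true, count(&withDebug))("put")
	withRequestLog(echo, false, count(&withoutDebug))("put")
	return withDebug, withoutDebug
}

func main() {
	on, off := demo()
	fmt.Println("log lines with debug:", on, "without debug:", off)
}
```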
Joe Betz
5a3cbe4cf7 version: bump up to 3.2.24+git
2018-07-24 10:29:31 -07:00
Joe Betz
420a452267 version: bump up to 3.2.24
v3.2.24
2018-07-24 10:24:31 -07:00
Gyuho Lee
348edfeae6 etcdserver: add "etcd_server_go_version" metric
Currently, one has to look at the server logs manually
to see which Go version was used to build the etcd server.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 16:38:52 -07:00
Gyuho Lee
0d5497a107 clientv3: fix keepalive send interval when response queue is full
The client should update the next keepalive send time
even when the lease keepalive response queue becomes full.

Otherwise, the client sends a keepalive request every 500ms
regardless of TTL, when sends are only expected to happen
at an interval of at least TTL / 3.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 08:50:44 -07:00
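The scheduling rule in 0d5497a107 can be sketched as below. The function names, the channel-based response queue, and the `demo` helper are hypothetical; only the TTL / 3 interval and the "update even when the queue is full" behavior come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// nextKeepAlive returns when the next keepalive request should be sent
// for a lease with the given TTL in seconds: one third of the TTL from now.
func nextKeepAlive(now time.Time, ttlSeconds int64) time.Time {
	return now.Add(time.Duration(ttlSeconds) * time.Second / 3)
}

// onResponse is a hypothetical response handler. It always advances the
// send schedule first, then tries to queue the response without blocking.
func onResponse(now time.Time, ttlSeconds int64, queue chan<- int64) (next time.Time, delivered bool) {
	next = nextKeepAlive(now, ttlSeconds)
	select {
	case queue <- ttlSeconds:
		delivered = true
	default:
		// Queue full: drop the response but keep the updated schedule.
	}
	return next, delivered
}

// demo handles one response for a 30-second lease while the queue cannot
// accept anything, and reports the resulting wait and delivery status.
func demo() (wait time.Duration, delivered bool) {
	now := time.Unix(0, 0)
	full := make(chan int64) // unbuffered, no receiver: a send would block
	next, delivered := onResponse(now, 30, full)
	return next.Sub(now), delivered
}

func main() {
	wait, delivered := demo()
	fmt.Println("next send in:", wait, "delivered:", delivered)
}
```

With a TTL of 30 seconds, the next send is scheduled 10 seconds out even though the response was dropped, instead of falling back to the 500ms retry cadence mentioned in the commit.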
Gyuho Lee
87418c3432
Merge pull request #9942 from wenjiaswe/automated-cherry-pick-of-#9761-upstream-release-3.2
Automated cherry pick of #9761
2018-07-20 14:26:13 -07:00
Wenjia
8c9fd1b5e6
remove hashRevDurations
2018-07-20 13:48:35 -07:00
Wenjia
a3c0a99067
remove hashRevDurations
2018-07-20 13:45:33 -07:00
Wenjia
b3ab14ca9a
remove HashByRev
2018-07-20 13:44:15 -07:00
Gyuho Lee
8798c5cd43 etcdserver: rename to "heartbeat_send_failures_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:58:32 -07:00
Gyuho Lee
4e08898571 mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds"
etcd_mvcc_hash_duration_seconds
etcd_mvcc_hash_rev_duration_seconds

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:57:47 -07:00
Gyuho Lee
8ac6c888cd mvcc/backend: fix defrag duration scale
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:52:46 -07:00
Gyuho Lee
aca5c8f4b6 mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:52:46 -07:00
Gyuho Lee
3535f7a61f mvcc/backend: document metrics ExponentialBuckets
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:44:15 -07:00
Gyuho Lee
fae9b6f667 mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:44:15 -07:00
Gyuho Lee
66d8194e4d etcdserver: add "etcd_server_slow_apply_total"
{"level":"warn","ts":1527101858.6985068,"caller":"etcdserver/util.go:115","msg":"apply request took too long","took":0.114101529,"expected-duration":0.1,"prefix":"","request":"header:<ID:1029181977902852337> put:<key:\"\\000\\000...

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:42:52 -07:00
Gyuho Lee
2f0e3fd2df etcdserver: add "etcd_server_heartbeat_failures_total"
{"level":"warn","ts":1527101858.4149103,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.025771662}
{"level":"warn","ts":1527101858.4149644,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.034015766}

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:37:04 -07:00
Gyuho Lee
cad3cf7b11 mvcc/backend: avoid unnecessary metrics update
https://github.com/coreos/etcd/pull/9300

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:52:16 -07:00
Gyuho Lee
bedba66c69 mvcc: add "etcd_mvcc_db_total_size_in_use_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:32:56 -07:00
Gyuho Lee
9bc1e15386 mvcc: add "etcd_mvcc_db_total_size_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:24:56 -07:00
Gyuho Lee
6e0131e83b etcdserver: add "etcd_server_quota_backend_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 13:27:15 -07:00
Gyuho Lee
c0e9e14248 etcdserver: add "etcd_server_slow_read_indexes_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 12:59:53 -07:00
Gyuho Lee
b763b506ab etcdserver: clarify read index warnings
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 12:54:42 -07:00
Gyuho Lee
d22ee8423d
Merge pull request #9894 from xmudrii/3.2-grpcproxy-tls
etcdmain: backport support for different certs for etcd-gRPC proxy
2018-07-02 10:57:39 -07:00
Gyu-Ho Lee
e5531a4d54
etcdmain/grpc-proxy: add 'metrics-addr' option
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2018-07-02 12:06:25 +02:00
Anthony Romano
8dabfe12ca
etcdmain: cleanup grpcproxy; support different certs for proxy/etcd
Enables TLS termination in grpcproxy.
2018-07-02 11:20:14 +02:00
Gyuho Lee
360484a3f0 tests: update test scripts
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-18 14:14:15 -07:00