12980 Commits

Author SHA1 Message Date
Gyuho Lee
6f250f9a47 version: 3.3.10+git
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-10 13:30:14 -07:00
Gyuho Lee
27fc7e2296 version: 3.3.10
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
v3.3.10
2018-10-10 10:17:54 -07:00
Gyuho Lee
eb932c2083 travis.yml: use Go 1.10.4
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-10 10:17:36 -07:00
Gyuho Lee
957700f444 etcdserver: add "etcd_server_read_indexes_failed_total"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:22:02 -07:00
Gyuho Lee
b45f5306dc rafthttp: probe all raft transports
This PR adds another probing routine to monitor the connections
used for Raft message transports. Previously, we only monitored
snapshot transports.

In our production cluster, we found one TCP connection had >8-sec
latencies to a remote peer, but the "etcd_network_peer_round_trip_time_seconds"
metric showed a <1-sec latency distribution. This means the etcd server
was not sampling enough when such latency spikes happened
outside of the snapshot pipeline connection.

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 18:18:27 -07:00
Gyuho Lee
8491137b55 etcdserver: add "etcd_server_health_success/failures"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-09 17:54:30 -07:00
Jingyi Hu
ebe950fc1c
Merge pull request #10161 from jingyih/automated-cherry-pick-of-#10153-origin-release-3.3
clientv3: automated cherry pick of #10153 to release-3.3
2018-10-08 18:37:52 -07:00
yura
20d83e405f clientv3: concurrency.Mutex.Lock() - preserve invariant
Convenient invariant:
- if werr == nil, then the lock is supposed to be held at that moment.

While we cannot be confident in a stronger invariant ('is exactly locked'),
it was inconvenient that the previous code could return `werr == nil` after
Mutex.Unlock.

This could happen when ctx is canceled or times out right after waitDeletes
successfully returns werr == nil and before `<-ctx.Done()` is checked.
While such a situation is very rare, it is still possible.

fixes #10111
2018-10-08 16:42:26 -07:00
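The invariant described in commit 20d83e405f can be illustrated with a small sketch. This is not the code from the commit: `finishLock`, `finishLockRacy`, and `errWaitFailed` are hypothetical names, and the lock is modeled as a plain boolean. The sketch assumes one way of preserving the invariant, namely deciding whether to release the lock from `werr` alone rather than from the state of the context.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errWaitFailed is a hypothetical error used only for this illustration.
var errWaitFailed = errors.New("wait failed")

// finishLock is a hypothetical stand-in for the tail of a Lock call.
// werr is the result of waiting for earlier lock holders to go away,
// and unlock gives the lock back. The lock is released only when werr
// is non-nil, so a nil return always means the caller still holds it.
func finishLock(werr error, unlock func()) error {
	if werr != nil {
		unlock()
	}
	return werr
}

// finishLockRacy models the problem described in the commit message:
// it consults ctx instead of werr, so it can unlock and still return nil.
func finishLockRacy(ctx context.Context, werr error, unlock func()) error {
	select {
	case <-ctx.Done():
		unlock()
	default:
	}
	return werr
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // the context is canceled right after the wait succeeded

	held := true
	err := finishLockRacy(ctx, nil, func() { held = false })
	fmt.Println("racy:     err == nil:", err == nil, "held:", held)

	held = true
	err = finishLock(nil, func() { held = false })
	fmt.Println("checked:  err == nil:", err == nil, "held:", held)

	held = true
	err = finishLock(errWaitFailed, func() { held = false })
	fmt.Println("wait err: err == nil:", err == nil, "held:", held)
}
```

In the racy variant the caller sees a nil error although the lock has already been released. In the other variant a nil error and a held lock always go together.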
Wenjia
cb57901e03
Merge pull request #10041 from wenjiaswe/automated-cherry-pick-of-#9997-upstream-release-3.3
Automated cherry pick of #9997
2018-10-03 13:52:02 -07:00
Gyuho Lee
d838e24f80 etcdserver/api/rafthttp: add v3 snapshot send/receive metrics
Distribution would be:
0.1 second or more
...
25.6 seconds or more
51.2 seconds or more

etcd_network_snapshot_send_success
etcd_network_snapshot_send_failures
etcd_network_snapshot_send_total_duration_seconds
etcd_network_snapshot_receive_success
etcd_network_snapshot_receive_failures
etcd_network_snapshot_receive_total_duration_seconds

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-03 11:12:42 -07:00
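The bucket distribution listed in commit d838e24f80 (0.1 seconds up to 51.2 seconds) is consistent with a doubling series. The sketch below reproduces such a series with the standard library only. The parameters (start 0.1, factor 2, count 10) are an assumption inferred from the endpoints in the message, and `exponentialBuckets` is an illustrative helper, not the function used in etcd.

```go
package main

import "fmt"

// exponentialBuckets returns count bucket boundaries, starting at start
// and multiplying by factor at each step. The name and signature are
// illustrative and only mirror the general idea of exponential buckets.
func exponentialBuckets(start, factor float64, count int) []float64 {
	if count <= 0 {
		return nil
	}
	buckets := make([]float64, 0, count)
	for i := 0; i < count; i++ {
		buckets = append(buckets, start)
		start *= factor
	}
	return buckets
}

func main() {
	// Assumed parameters: start at 0.1s, double each time, 10 buckets,
	// chosen so the series runs from 0.1 to 51.2 seconds.
	for _, b := range exponentialBuckets(0.1, 2, 10) {
		fmt.Printf("%.1f seconds\n", b)
	}
}
```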
Gyuho Lee
7ec9ff62b5 etcdserver/api/snap: add v3 snapshot fsync metrics
etcd_snap_db_fsync_duration_seconds_count
etcd_snap_db_save_total_duration_seconds_bucket

Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-03 11:12:41 -07:00
Gyuho Lee
dc02dc2ede tests/Dockerfile: update, fix GOPATH
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-01 01:30:23 -07:00
Gyuho Lee
40ed18a457
Merge pull request #10122 from jingyih/cherry-pick-of-#10109-origin-release-3.3
etcdctl: cherry pick of #10109 to release-3.3
2018-09-25 17:30:01 -07:00
Jingyi Hu
60d546e309 etcdctl: cherry pick of #10109 to release-3.3
Add snapshot file integrity verification in snapshot status.
2018-09-25 16:50:47 -07:00
Gyuho Lee
e774f7309c
Merge pull request #10093 from jingyih/remove_duplicated_import
etcdserver: remove duplicated imports
2018-09-13 20:57:09 -07:00
Jingyi Hu
9eee0b078e etcdserver: remove duplicated imports
Removed duplicated imports of package 'context' in server.go
2018-09-13 20:44:03 -07:00
Gyuho Lee
d1acb5a5c8 etcdserver: add "etcd_server_id"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:50:17 -07:00
Gyuho Lee
73c1100b04 etcdserver: clarify read index wait timeout warnings
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:38:59 -07:00
Gyuho Lee
c577335a64 rafthttp: clarify "became inactive" warning
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-29 14:34:15 -07:00
Gyuho Lee
f69413e9ee
Merge pull request #10027 from hexfusion/cherry-pick-a205cfe
etcdserver: cherry-pick #9861 to release-3.3
2018-08-20 12:54:43 -07:00
Gyuho Lee
0dc4632e28 Merge pull request #9861 from gyuho/race
etcdserver/api/v3rpc: remove duplicate gRPC logger set
2018-08-17 22:32:10 -04:00
Gyuho Lee
f8fc923fc0
Merge pull request #10004 from jingyih/automated-cherry-pick-of-#9990-origin-release-3.3
Automated cherry pick of #9990
2018-08-15 06:37:33 -07:00
Jingyi Hu
264bb51a9a etcdserver: code clean up
Code clean up in interceptor.go
2018-08-14 17:08:45 -07:00
Jingyi Hu
c6c0d03522 vendor: add go-grpc-middleware
Rebased to master PR #9994.  Fixed a Go format issue in
v3rpc/interceptor.go.  Updated vendor to include go-grpc-middleware.
2018-08-14 17:08:45 -07:00
Jingyi Hu
94f81368ae etcdserver: add grpc interceptor to log info on incoming requests to etcd server
To improve debuggability of etcd v3, added a gRPC interceptor to log
info on incoming requests to the etcd server. The log output includes
remote client info, request content (with value field redacted), request
handling latency, response size, etc. Uses the zap logger if available,
otherwise uses capnslog.

Also cleaned up the chaining of gRPC interceptors on the server
side.
2018-08-14 16:20:13 -07:00
Gyuho Lee
051587f56f version: bump up to 3.3.9+git
2018-07-24 10:17:06 -07:00
Gyuho Lee
fca8add78a version: 3.3.9
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
v3.3.9
2018-07-24 09:48:32 -07:00
Gyuho Lee
ea40e9f059 etcdserver: add "etcd_server_go_version" metric
Currently, one has to look at the server logs manually
to see which Go version was used to build the etcd server.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 16:39:24 -07:00
Gyuho Lee
fbc0510a4e clientv3: fix keepalive send interval when response queue is full
The client should update the next keepalive send time
even when the lease keepalive response queue becomes full.

Otherwise, the client sends a keepalive request every 500ms
regardless of TTL, when sends are only expected to happen
at an interval of at least TTL / 3.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 08:51:18 -07:00
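The scheduling rule in commit fbc0510a4e can be sketched as follows. This is a simplified model, not the clientv3 implementation: the `keepAlive` type, its fields, and `onResponse` are hypothetical names, and the response queue is modeled as a buffered channel of TTL values. The point it illustrates is that the next send time is advanced to now + TTL / 3 before the enqueue attempt, so a full queue drops the response without resetting the schedule.

```go
package main

import (
	"fmt"
	"time"
)

// keepAlive is a hypothetical, simplified view of per-lease keepalive state.
type keepAlive struct {
	nextSend  time.Time  // earliest time the next keepalive request should go out
	responses chan int64 // bounded queue of TTLs (in seconds) handed to the caller
}

// onResponse records a keepalive response carrying the given TTL.
// It advances nextSend first, so the update happens whether or not
// the response fits into the queue. It reports whether the response
// was queued.
func (ka *keepAlive) onResponse(now time.Time, ttlSeconds int64) bool {
	ka.nextSend = now.Add(time.Duration(ttlSeconds) * time.Second / 3)
	select {
	case ka.responses <- ttlSeconds:
		return true
	default:
		// Queue is full: the response is dropped, the schedule is kept.
		return false
	}
}

func main() {
	ka := &keepAlive{responses: make(chan int64, 1)}
	now := time.Now()

	ka.onResponse(now, 30)           // fills the one-slot queue
	queued := ka.onResponse(now, 30) // queue is full this time
	fmt.Println("queued:", queued, "next send in:", ka.nextSend.Sub(now))
}
```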
Gyuho Lee
267a62199c
Merge pull request #9940 from wenjiaswe/automated-cherry-pick-of-#9761-upstream-release-3.3
Automated cherry pick of #9761
2018-07-19 18:27:15 -07:00
Wenjia
143fc4ce79
added "now := time.Now()"
2018-07-19 17:27:40 -07:00
Wenjia
7f421efe48
remove "github.com/gogo/protobuf/plugin/stringer"
2018-07-19 17:15:32 -07:00
Gyuho Lee
d509620793 etcdserver: rename to "heartbeat_send_failures_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:58:14 -07:00
Gyuho Lee
d5654ba459 mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds"
etcd_mvcc_hash_duration_seconds
etcd_mvcc_hash_rev_duration_seconds

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:57:04 -07:00
Gyuho Lee
da304d7aae mvcc/backend: fix defrag duration scale
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:54:26 -07:00
Gyuho Lee
978727a963 mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:54:26 -07:00
Gyuho Lee
4ad350482e mvcc/backend: document metrics ExponentialBuckets
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:53:31 -07:00
Gyuho Lee
f7367d94ff mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:53:31 -07:00
Gyuho Lee
e43224c3b6 etcdserver: add "etcd_server_slow_apply_total"
{"level":"warn","ts":1527101858.6985068,"caller":"etcdserver/util.go:115","msg":"apply request took too long","took":0.114101529,"expected-duration":0.1,"prefix":"","request":"header:<ID:1029181977902852337> put:<key:\"\\000\\000...

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:52:37 -07:00
Gyuho Lee
4c7bf51030 etcdserver: add "etcd_server_heartbeat_failures_total"
{"level":"warn","ts":1527101858.4149103,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.025771662}
{"level":"warn","ts":1527101858.4149644,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.034015766}

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:51:08 -07:00
Gyuho Lee
ffe52f74c0 e2e: log errors TestV3CurlCipherSuitesMismatch for now
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 10:11:10 -07:00
Gyuho Lee
1da638c4dc Makefile: use Go 1.10.3 by default
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 10:01:27 -07:00
Gyuho Lee
82ce873987 *: use Go 1.10.3 for testing
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 09:56:59 -07:00
Gyuho Lee
adfd0d3fe7 mvcc: avoid unnecessary metrics update
https://github.com/coreos/etcd/pull/9300

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:51:08 -07:00
Gyuho Lee
a410463a0b mvcc: add "etcd_mvcc_db_total_size_in_use_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:36:18 -07:00
Gyuho Lee
1da3603e31 mvcc: add "etcd_mvcc_db_total_size_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:35:48 -07:00
Gyuho Lee
72c51d3e12 etcdserver: add "etcd_server_quota_backend_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 13:26:49 -07:00
Gyuho Lee
4481238224 etcdserver: add "etcd_server_slow_read_indexes_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 13:00:08 -07:00
Gyuho Lee
82e670766a etcdserver: clarify read index warnings
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 12:53:21 -07:00
Gyuho Lee
09addbdaa0 tests: update test scripts
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-18 14:08:36 -07:00