10666 Commits

Author SHA1 Message Date
Joe Betz
bb205caa68 version: bump up to 3.1.19+git 2018-07-24 10:07:31 -07:00
Joe Betz
a1d6802da2 version: bump up to 3.1.19 v3.1.19 2018-07-24 10:04:37 -07:00
Gyuho Lee
79d80bd259 etcdserver: add "etcd_server_go_version" metric
Currently, one has to check the server logs manually
to see which Go version was used to build the etcd server.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 16:38:10 -07:00
Gyuho Lee
081519c323 clientv3: fix keepalive send interval when response queue is full
The client should update the next keepalive send time
even when the lease keepalive response queue is full.

Otherwise, the client sends a keepalive request every 500ms
regardless of TTL, when sends are only expected to happen
at an interval of at least TTL / 3.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-23 08:50:07 -07:00
Gyuho Lee
e0d5a028d5
Merge pull request #9944 from wenjiaswe/automated-cherry-pick-of-#9761-upstream-release-3.1
Automated cherry pick of #9761
2018-07-20 14:51:20 -07:00
Wenjia
a421a604d6
remove hashRevDurations 2018-07-20 13:49:58 -07:00
Gyuho Lee
0fbf49df11 etcdserver: rename to "heartbeat_send_failures_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 11:40:37 -07:00
Gyuho Lee
fb5080b306 mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds"
etcd_mvcc_hash_duration_seconds
etcd_mvcc_hash_rev_duration_seconds

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 11:37:06 -07:00
Gyuho Lee
cac6ce756d mvcc/backend: fix defrag duration scale
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 10:53:26 -07:00
Gyuho Lee
9f58e57a3c mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 10:53:26 -07:00
Gyuho Lee
22c25dd4e7 mvcc/backend: document metrics ExponentialBuckets
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 10:44:52 -07:00
Gyuho Lee
92a7b5df80 mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 10:35:39 -07:00
Gyuho Lee
3f1fe618ad etcdserver: add "etcd_server_slow_apply_total"
{"level":"warn","ts":1527101858.6985068,"caller":"etcdserver/util.go:115","msg":"apply request took too long","took":0.114101529,"expected-duration":0.1,"prefix":"","request":"header:<ID:1029181977902852337> put:<key:\"\\000\\000...

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 10:25:16 -07:00
Gyuho Lee
b8547734ae etcdserver: add "etcd_server_heartbeat_failures_total"
{"level":"warn","ts":1527101858.4149103,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.025771662}
{"level":"warn","ts":1527101858.4149644,"caller":"etcdserver/raft.go:370","msg":"failed to send out heartbeat; took too long, server is overloaded likely from slow disk","heartbeat-interval":0.1,"expected-duration":0.2,"exceeded-duration":0.034015766}

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 10:24:40 -07:00
Gyuho Lee
78a13e67a0 mvcc/backend: avoid unnecessary metrics update
https://github.com/coreos/etcd/pull/9300

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:53:20 -07:00
Gyuho Lee
84d11a51c1 mvcc: use "t.tx.DB()" to fetch DB
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:34:20 -07:00
Gyuho Lee
a9c4b98756 mvcc: add "etcd_mvcc_db_total_size_in_use_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:21:11 -07:00
Gyuho Lee
5531e3b0f5 mvcc: add "etcd_mvcc_db_total_size_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 13:51:06 -07:00
Gyuho Lee
c2623bb840 etcdserver: add "etcd_server_quota_backend_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 13:30:10 -07:00
Gyuho Lee
f46b4677c0 etcdserver: add "etcd_server_slow_read_indexes_total"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 12:58:29 -07:00
Gyuho Lee
09843d5d90 etcdserver: clarify read index warnings
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 12:55:31 -07:00
Gyuho Lee
be3e6f6ed5 tests: update test scripts
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-18 14:15:52 -07:00
Joe Betz
d84dd18637 version: bump up to 3.1.18+git 2018-06-15 09:51:30 -07:00
Joe Betz
b7ff47f9d5 version: bump up to 3.1.18 v3.1.18 2018-06-15 09:47:04 -07:00
Gyuho Lee
fab24fbdab
Merge pull request #9848 from wenjiaswe/automated-cherry-pick-of-#8960-upstream-release-3.1
Automated cherry pick of #8960
2018-06-13 16:49:48 -07:00
Joe Betz
b3ee996629 metrics: Add server_version metric 2018-06-13 16:31:18 -07:00
Gyuho Lee
06da6cf983 tests/semaphore.test.bash: update
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-13 14:42:45 -07:00
Gyuho Lee
9c00100550 Makefile: update
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-13 14:42:10 -07:00
Gyuho Lee
1d7a2ca520
Merge pull request #9838 from jpbetz/automated-cherry-pick-of-#9821-origin-release-3.1-1528833932
etcdserver: Automated cherry pick of detailed "took too long" warnings to release-3.1
2018-06-12 13:54:40 -07:00
Joe Betz
e90934ec71 etcdserver: Fix txn request 'took too long' warnings to use loggable request stringer 2018-06-12 13:22:45 -07:00
Joe Betz
23c5c71426 etcdserver: Add response byte size and range response count to took too long warning 2018-06-12 13:22:45 -07:00
Joe Betz
72a2483d42 etcdserver: Replace value contents with value_size in request took too long warning 2018-06-12 13:15:19 -07:00
Hitoshi Mitake
53eae781fe etcdserver: not print password in the warning message of expensive request
Fix https://github.com/coreos/etcd/issues/9635
2018-06-12 13:15:18 -07:00
Joe Betz
7b1b7def84 etcdserver: Fix to backport of #9288 for pre-RequestV2 code 2018-06-12 13:13:46 -07:00
Xiang
df000fd776 etcdserver: improve request took too long warning 2018-06-12 13:13:46 -07:00
Joe Betz
fd61be4814 version: bump up to 3.1.17+git 2018-06-06 10:36:22 -07:00
Joe Betz
781cc0be83 version: bump up to 3.1.17 v3.1.17 2018-06-06 09:54:59 -07:00
Joe Betz
ebe351e3b4
Merge pull request #9808 from jpbetz/snapshot-recover-3.1
etcdserver: Backport snapshot recovery from #7917 to 3.1 branch
2018-06-05 16:13:40 -07:00
Joe Betz
e31510975a
etcdserver: Backport snapshot recovery from #7917 to 3.1 branch 2018-06-04 21:52:26 -07:00
Joe Betz
43b0cafcb6 version: bump up to 3.1.16+git 2018-05-31 12:54:30 -07:00
Joe Betz
169af4470e version: bump up to 3.1.16 v3.1.16 2018-05-31 12:51:28 -07:00
Gyuho Lee
c4c487eaca mvcc: fix panic by allowing future revision watcher from restore operation
This also happens without gRPC proxy.

Fix panic when gRPC proxy leader watcher is restored:

```
go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync

=== RUN   TestV3WatchRestoreSnapshotUnsync
panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16

goroutine 156 [running]:
github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5
github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280
created by github.com/coreos/etcd/mvcc.newWatchableStore
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477
exit status 2
FAIL	github.com/coreos/etcd/integration	2.551s
```

gRPC proxy spawns a watcher with the key "proxy-namespace__lostleader"
and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss.
However, when the partitioned node restores, this watcher triggers
a panic with "watcher minimum revision ... should not exceed current ...".

This check was added a long time ago, by my PR, when there was no gRPC proxy:

https://github.com/coreos/etcd/pull/4043#discussion_r48457145

> we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here.

However, it is now possible for an unsynced watcher to have a future
revision, when it was moved from a synced watcher group through
a restore operation.

This PR adds a "restore" flag to indicate that a watcher was moved
from the synced watcher group by a restore operation. Without the flag,
a watcher with a future revision in an unsynced watcher group
would still panic.

Example logs with future revision watcher from restore operation:

```
{"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
{"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
```

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-31 11:42:25 -07:00
Joe Betz
6bb88b9617 version: bump up to 3.1.15+git 2018-05-09 10:26:18 -07:00
Joe Betz
380b833c18 version: bump up to 3.1.15 v3.1.15 2018-05-09 10:21:07 -07:00
Gyuho Lee
01d9b368a4
Merge pull request #9693 from mohitsoni/release-3.1
Cherry-picking PR 7967 to release-3.1
2018-05-04 12:16:27 -07:00
Anthony Romano
ea82927473 etcdserver: purge old snap.db files
Lots of garbage db files in #7957. Should purge.
2018-05-04 10:29:26 -07:00
Joe Betz
d2f60651b7 version: bump up to 3.1.14+git 2018-04-24 13:45:40 -07:00
Joe Betz
2373ddb445 version: bump up to 3.1.14 v3.1.14 2018-04-24 13:23:45 -07:00
Gyuho Lee
5da3a7232f
Merge pull request #9606 from jpbetz/automated-cherry-pick-of-#9587-release-3.1
Automated cherry pick of #9587
2018-04-23 12:48:23 -07:00
Maciej Borsz
3865d69db3
etcdserver: add is_leader prometheus metric that is 1 on the leader.
Before this change, we had no way to find the leader using the /metrics
endpoint. This commit adds a metric to do that.
2018-04-23 11:10:12 -07:00
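A gauge of this kind can be sketched with the Go standard library alone. This is an illustration, not the etcdserver code: the real change uses a Prometheus client gauge, while `setLeader` and `metricsLine` here are hypothetical helpers, and the full metric name `etcd_server_is_leader` is an assumption, since the commit message only says "is_leader".

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// isLeader is a hypothetical gauge value: 1 on the leader, 0 otherwise.
var isLeader int64

// setLeader records whether this member currently believes it is the leader.
func setLeader(leader bool) {
	var v int64
	if leader {
		v = 1
	}
	atomic.StoreInt64(&isLeader, v)
}

// metricsLine renders the gauge as one sample line in the Prometheus
// text exposition format. The metric name is an assumption.
func metricsLine() string {
	return fmt.Sprintf("etcd_server_is_leader %d", atomic.LoadInt64(&isLeader))
}

func main() {
	setLeader(true)
	fmt.Println(metricsLine())
	setLeader(false)
	fmt.Println(metricsLine())
}
```

Scraping each member and selecting the one that reports 1 then identifies the leader without reading server logs.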