111 Commits

Author SHA1 Message Date
Joe Betz
a0d497b54a
mvcc/backend: Delete orphaned db.tmp files before defrag 2020-02-13 12:44:59 -08:00
Jingyi Hu
948c284235 mvcc: add store revision metrics
Add experimental metrics etcd_debugging_mvcc_current_revision and
etcd_debugging_mvcc_compact_revision.
2019-09-06 17:22:45 -07:00
Yingnan Zhang
bcbe6dbc29
mvcc: fix db_compaction_total_duration_milliseconds 2019-04-17 16:32:46 -07:00
Wenjia
8c9fd1b5e6
remove hashRevDurations 2018-07-20 13:48:35 -07:00
Wenjia
a3c0a99067
remove hashRevDurations 2018-07-20 13:45:33 -07:00
Wenjia
b3ab14ca9a
remove HashByRev 2018-07-20 13:44:15 -07:00
Gyuho Lee
4e08898571 mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds"
etcd_mvcc_hash_duration_seconds
etcd_mvcc_hash_rev_duration_seconds

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:57:47 -07:00
Gyuho Lee
8ac6c888cd mvcc/backend: fix defrag duration scale
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:52:46 -07:00
Gyuho Lee
aca5c8f4b6 mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:52:46 -07:00
Gyuho Lee
3535f7a61f mvcc/backend: document metrics ExponentialBuckets
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:44:15 -07:00
Gyuho Lee
fae9b6f667 mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-20 09:44:15 -07:00
Gyuho Lee
cad3cf7b11 mvcc/backend: avoid unnecessary metrics update
https://github.com/coreos/etcd/pull/9300

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:52:16 -07:00
Gyuho Lee
bedba66c69 mvcc: add "etcd_mvcc_db_total_size_in_use_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:32:56 -07:00
Gyuho Lee
9bc1e15386 mvcc: add "etcd_mvcc_db_total_size_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:24:56 -07:00
Gyuho Lee
4ace7c7d77 mvcc: fix panic by allowing future revision watcher from restore operation
This also happens without gRPC proxy.

Fix panic when gRPC proxy leader watcher is restored:

```
go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync

=== RUN   TestV3WatchRestoreSnapshotUnsync
panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16

goroutine 156 [running]:
github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5
github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280
created by github.com/coreos/etcd/mvcc.newWatchableStore
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477
exit status 2
FAIL	github.com/coreos/etcd/integration	2.551s
```

gRPC proxy spawns a watcher with a key "proxy-namespace__lostleader"
and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss.
But, when the partitioned node restores, this watcher triggers
panic with "watcher minimum revision ... should not exceed current ...".

This check was added a long time ago, by my PR, when there was no gRPC proxy:

https://github.com/coreos/etcd/pull/4043#discussion_r48457145

> we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here.

However, now it's possible that a unsynced watcher has a future
revision, when it was moved from a synced watcher group through
restore operation.

This PR adds "restore" flag to indicate that a watcher was moved
from the synced watcher group with restore operation. Otherwise,
the watcher with future revision in an unsynced watcher group
would still panic.

Example logs with future revision watcher from restore operation:

```
{"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
{"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
```

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-31 11:42:50 -07:00
Joe Betz
33633da64c
mvcc: fix watchable store test for 3.2 cherrypick of #9281 2018-02-07 15:57:34 -08:00
Iwasaki Yudai
e08abbeae4
mvcc: restore unsynced watchers
In case syncWatchersLoop() starts before Restore() is called,
watchers already added by that moment are moved to s.synced by the loop.
However, there is a broken logic that moves watchers from s.synced
to s.uncyned without setting keyWatchers of the watcherGroup.
Eventually syncWatchers() fails to pickup those watchers from s.unsynced
and no events are sent to the watchers, because newWatcherBatch() called
in the function uses wg.watcherSetByKey() internally that requires
a proper keyWatchers value.
2018-02-07 15:34:21 -08:00
Iwasaki Yudai
6999bbb47b mvcc: check null before set FillPercent not to panic
Since CreateBucketIfNotExists() can return nil when it gets an error,
accessing FillPercent must be done after a nil check, not to cause
a panic.
2018-01-08 17:46:06 -08:00
Joe Betz
8de0c0419a vendor: Switch from boltdb v1.3.0 to coreos/bbolt v1.3.1-coreos.3 2017-11-16 12:43:17 -08:00
fengshaobao 00231050
78d68226e6 mvcc: sending events after restore
Fixes: #8411
2017-08-21 10:39:46 -07:00
Anthony Romano
e197c14847 mvcc: test keys gauge is reloaded correctly on restore 2017-08-10 12:59:24 -07:00
Anthony Romano
e7bf5477de mvcc: reset keys gauge on restore
Fixes #8388
2017-08-10 12:59:19 -07:00
Anthony Romano
71d2008385 mvcc: use GaugeFunc metric to load db size when requested
Relying on mvcc to set the db size metric can cause it to
miss size changes when a txn commits after the last write
completes before a quiescent period. Instead, load the
db size on demand.

Fixes #8146
2017-06-22 09:47:01 -07:00
Anthony Romano
4526284326 mvcc: restore into tree index with one key index
Clobbering the mvcc kvindex with new keyIndexes for each restore
chunk would cause index corruption by dropping historical information.
2017-06-20 10:58:42 -07:00
Anthony Romano
0b0b1992b8 mvcc: test restore and deletes with small chunk sizes 2017-06-20 10:58:35 -07:00
Anthony Romano
ed7ef5be8b mvcc: set db size metric on restore
Fixes #8080
2017-06-20 10:58:16 -07:00
Anthony Romano
e72ad5dd2a mvcc: create TxnWrites from TxnRead with NewReadOnlyTxnWrite
Already used internally by mvcc, but needed by etcdserver txns.
2017-06-09 09:50:37 -07:00
Anthony Romano
c85f736522 mvcc: time restore in restore benchmark
This never worked.
2017-06-01 14:59:31 -07:00
Anthony Romano
a375ff172e mvcc: chunk reads for restoring
Loading all keys at once would cause etcd to use twice as much
memory than it would need to serve the keys, causing RSS to spike on
boot. Instead, load the keys into the mvcc by chunk. Uses pipelining
for some concurrency.

Fixes #7822
2017-06-01 14:59:27 -07:00
Anthony Romano
8516d8ccc5 backend: force initial mmap size to 0 for windows
boltdb on windows allocates a file with the full mmap size even if the
db is empty. Force the initial mmap size to 0 so there's no huge initial
db file on windows.

Fixes #7910
2017-05-12 14:34:07 -07:00
fanmin shi
8468b38631 backend: dynamically set snapshotWarningTimeout based on db size 2017-05-11 15:25:35 -07:00
fanmin shi
230106dd3c backend: add prometheus metric for large snapshot duration.
FIXES #7878
2017-05-05 17:27:33 -07:00
fanmin shi
f7f30f2361 backend: print snapshotting duration warning every 30s
FIXES #7870
2017-05-04 16:41:03 -07:00
Anthony Romano
14d6ed9e5f *: clear redundant return statement warnings (S1027) 2017-04-21 14:01:00 -07:00
Gyu-Ho Lee
5000d29b4a mvcc: remove stopc select case in Hash
Revert change in 33acbb694b.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:19:48 -07:00
Gyu-Ho Lee
8ffd58fb3b mvcc/backend: remove t.tx.DB()==nil checks with GracefulStop
Revert https://github.com/coreos/etcd/pull/6662.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-04-17 14:17:00 -07:00
Gyu-Ho Lee
cd470f9ccd Revert "mvcc: test inflight Hash to trigger Size on nil db"
This reverts commit 994e8e4f40397aca54640af52d763d250693919e.

Since now etcdserver gracefully shuts down the gRPC server
2017-04-17 14:15:43 -07:00
Anthony Romano
78a5eb79b5 *: add swagger and grpc-gateway assets for v3lock and v3election 2017-04-10 15:21:07 -07:00
Anthony Romano
f67bdc2eed *: support checking that an interval tree's keys cover an entire interval 2017-04-03 15:38:07 -07:00
Gyu-Ho Lee
161c7f6bdf Merge pull request #7579 from gyuho/fix-defrage
*: fix panic during defrag operation
2017-03-23 10:08:33 -07:00
Anthony Romano
7ef75e373a Merge pull request #7525 from heyitsanthony/big-backend
etcdserver, backend: configure mmap size based on quota
2017-03-23 10:06:00 -07:00
Gyu-Ho Lee
26abd25cd3 mvcc/backend: hold 'readTx.Lock' until completing bolt.Tx reset
Fix https://github.com/coreos/etcd/issues/7526.

When resetting `bolt.Tx` in `defrag` and `batchTxBuffered.commit`
operation, we do not hold `readTx` lock, so the inflight range
requests can trigger panic in `mvcc.Range` paths. This fixes by
moving mutexes out and hold it while resetting the `readTx`.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-23 09:47:43 -07:00
Xiang Li
7698a2a546 Merge pull request #7553 from xiang90/fix_defrag
backend: add FillPercent option
2017-03-21 11:16:17 -07:00
Xiang
95870a21eb backend: add FillPercent option 2017-03-21 08:06:03 -07:00
Anthony Romano
8a3fee15a3 etcdserver, backend: only warn if exceeding max quota 2017-03-17 15:38:57 -07:00
Anthony Romano
5e4b008106 *: base initial mmap size on quota size 2017-03-17 15:38:49 -07:00
Anthony Romano
2f1542c06d *: use filepath.Join for files 2017-03-16 07:46:06 -07:00
Anthony Romano
33acbb694b mvcc: txns and r/w views
Clean-up of the mvcc interfaces to use txn interfaces instead of an id.

Adds support for concurrent read-only mvcc transactions.

Fixes #7083
2017-03-08 20:52:59 -08:00
Anthony Romano
8d438c2939 backend: readtx
ReadTxs are designed for read-only accesses to the backend using a
read-only boltDB transaction. Since BatchTx's are long-running
transactions, all writes to BatchTx will writeback to ReadTx, overlaying
the base read-only transaction.
2017-03-08 20:52:59 -08:00
Gyu-Ho Lee
3d75395875 *: remove never-unused vars, minor lint fix
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:59:12 -08:00