207 Commits

Author SHA1 Message Date
Gyuho Lee
bc18474029 mvcc: remove unnecessary type conversion
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-05 10:09:53 -07:00
Xiang Li
2f1730fcae backend: more metrics for bboltdb transaction 2018-06-11 14:05:04 -07:00
Gyuho Lee
f2db05a869 mvcc: serve db size with "etcd_debugging" namespace for backward compatibility
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-07 10:23:12 -07:00
Gyuho Lee
21130d5fb6 mvcc: promote db size metrics to "etcd"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-07 10:20:45 -07:00
Gyuho Lee
e239cc276a mvcc: separate synced/unsynced benchmarks
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-01 10:00:18 -07:00
Gyuho Lee
0398ec7dcb mvcc: fix panic by allowing future revision watcher from restore operation
This also happens without gRPC proxy.

Fix panic when gRPC proxy leader watcher is restored:

```
go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync

=== RUN   TestV3WatchRestoreSnapshotUnsync
panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16

goroutine 156 [running]:
github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5
github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280
created by github.com/coreos/etcd/mvcc.newWatchableStore
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477
exit status 2
FAIL	github.com/coreos/etcd/integration	2.551s
```

gRPC proxy spawns a watcher with the key "proxy-namespace__lostleader"
and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss.
But when the partitioned node restores, this watcher triggers a
panic with "watcher minimum revision ... should not exceed current ...".

This check was added a long time ago, by my PR, when there was no gRPC proxy:

https://github.com/coreos/etcd/pull/4043#discussion_r48457145

> we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here.

However, it is now possible for an unsynced watcher to have a future
revision, when it was moved from the synced watcher group through
a restore operation.

This PR adds a "restore" flag to indicate that a watcher was moved
from the synced watcher group by a restore operation. Without the
flag set, a watcher with a future revision in an unsynced watcher
group would still panic.
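
A minimal sketch of the check described above, not the code from this commit. The `watcher` struct, its fields, and `chooseMinRev` are hypothetical stand-ins, and the sketch returns an error where etcd panics:

```go
package main

import (
	"fmt"
	"math"
)

// watcher is a simplified, hypothetical stand-in for an mvcc watcher.
type watcher struct {
	key     string
	minRev  int64
	restore bool // set when the watcher was moved out of the synced group by a restore
}

// chooseMinRev returns the smallest minimum revision among the watchers that
// can be synced at curRev. A watcher with a future revision is tolerated only
// when its restore flag is set; otherwise an error is returned (etcd panics).
func chooseMinRev(ws []*watcher, curRev int64) (int64, error) {
	minRev := int64(math.MaxInt64)
	for _, w := range ws {
		if w.minRev > curRev {
			if !w.restore {
				return 0, fmt.Errorf("watcher minimum revision %d should not exceed current revision %d", w.minRev, curRev)
			}
			// Future-revision watcher from a restore: skip it in this sketch.
			continue
		}
		if w.minRev < minRev {
			minRev = w.minRev
		}
	}
	return minRev, nil
}

func main() {
	ws := []*watcher{
		{key: "foo", minRev: 10},
		{key: "proxy-namespace__lostleader", minRev: math.MaxInt64 - 2, restore: true},
	}
	fmt.Println(chooseMinRev(ws, 16))
}
```

Clearing the `restore` field on the second watcher makes the same call return the error instead.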

Example logs with future revision watcher from restore operation:

```
{"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
{"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
```

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-25 12:40:02 -07:00
Gyuho Lee
210c842345 mvcc: improve watcherGroup panic message
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 15:38:40 -07:00
Gyuho Lee
1d91698268 mvcc: document, clean up histogram variables
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:03:28 -07:00
Gyuho Lee
e6a113cdcd mvcc/backend: clean up histogram variables
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 14:03:28 -07:00
Gyuho Lee
bc59f7b42f mvcc: add "etcd_mvcc_hash_(rev)_duration_seconds"
etcd_mvcc_hash_duration_seconds
etcd_mvcc_hash_rev_duration_seconds

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee
966ee9323c mvcc/backend: fix defrag duration scale
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee
d326b2933c mvcc/backend: add "etcd_disk_backend_defrag_duration_seconds"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee
60a9ec8a15 mvcc/backend: document metrics ExponentialBuckets
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee
58e3ead219 mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-23 13:09:42 -07:00
Gyuho Lee
1a83c6ad80 mvcc: remove unused parameters
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-09 15:42:45 -07:00
Gyuho Lee
5165344981 mvcc: use latest revision to tombstone
We replace/insert into in-memory B-tree, which means
we only keep a single node per key thus do not support
delete by revision on B-tree. So, (*keyIndex).tombstone
has always been marked with latest revision.

tombstone with key's modified revision panics:

panic: store.keyindex: put with unexpected smaller revision [{2 0} / {2 0}]

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-09 09:07:39 -07:00
Gyuho Lee
03ef9745a9 mvcc: add more structured logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-04 13:15:51 -07:00
Gyuho Lee
4d863dac5a mvcc: support structured logging in compact restore
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-02 11:57:23 -07:00
Gyuho Lee
3df30b9c7f mvcc: fix "unconvert" warnings
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-30 15:32:16 -07:00
jocalvert
f176427791 mvcc: Clone the key index for compaction and lock on each item
For compaction, clone the original B-tree for traversal purposes, so as
not to hold the lock for the duration of compaction. This preserves
read/write throughput by not blocking when the index tree is large
(> 1M entries).
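
The locking pattern can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: a map with a copied key slice stands in for the cloned B-tree, the `index` type is hypothetical, and "compaction" here simply drops revisions below `rev`, which is simpler than the real semantics:

```go
package main

import (
	"fmt"
	"sync"
)

// index is a hypothetical stand-in for the key index: key -> revisions.
type index struct {
	mu   sync.Mutex
	keys map[string][]int64
}

// compact drops revisions below rev. It snapshots the key set under a short
// lock, then takes the lock once per item, so other callers can interleave.
func (ix *index) compact(rev int64) {
	ix.mu.Lock()
	snapshot := make([]string, 0, len(ix.keys))
	for k := range ix.keys {
		snapshot = append(snapshot, k)
	}
	ix.mu.Unlock()

	for _, k := range snapshot {
		ix.mu.Lock()
		if revs, ok := ix.keys[k]; ok {
			kept := revs[:0] // filter in place
			for _, r := range revs {
				if r >= rev {
					kept = append(kept, r)
				}
			}
			if len(kept) == 0 {
				delete(ix.keys, k)
			} else {
				ix.keys[k] = kept
			}
		}
		ix.mu.Unlock()
	}
}

func main() {
	ix := &index{keys: map[string][]int64{"a": {1, 2, 5}, "b": {1}}}
	ix.compact(3)
	fmt.Println(len(ix.keys), ix.keys["a"])
}
```

The trade-off is that the traversal sees a point-in-time key set, so each item is re-checked under the lock before it is modified.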

mvcc: add comment for index compaction lock
mvcc: explicitly unlock store to do index compaction synchronously
mvcc: formatting index bench
mvcc: add release note for index compaction changes
mvcc: add license header
2018-04-18 13:29:27 -07:00
Gyuho Lee
c00c6cb685 mvcc: support structured logger
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-04-16 17:36:00 -07:00
Gyuho Lee
6c40b2b5d4 mvcc/backend: defrag to block concurrent read requests while resetting tx
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-16 03:29:18 -04:00
Gyuho Lee
8a518b01c4 *: revert "internal/mvcc" change
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-26 17:11:40 -08:00
Gyuho Lee
80d15948bc *: move "mvcc" to "internal/mvcc"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-01-26 11:14:41 -08:00
Gyuho Lee
349a377a67 *: move "lease" to "internal/lease"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-01-26 11:09:29 -08:00
Manjunath A Kumatagi
89221a25b8 mvcc: Fix Govet errors 2018-01-25 02:30:37 -05:00
Iwasaki Yudai
0b1b82aff2 mvcc: check nil before setting FillPercent to avoid panic
Since CreateBucketIfNotExists() can return nil when it gets an error,
FillPercent must be accessed only after a nil check, to avoid
a panic.
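
A sketch of the guard, using hypothetical `tx` and `bucket` types in place of the real bolt types so the example stays self-contained. Only the names `CreateBucketIfNotExists` and `FillPercent` come from the message above:

```go
package main

import (
	"errors"
	"fmt"
)

// bucket and tx are hypothetical stand-ins for the bolt bucket and transaction.
type bucket struct{ FillPercent float64 }

type tx struct{ writable bool }

// CreateBucketIfNotExists mimics the failure mode: nil bucket plus an error.
func (t *tx) CreateBucketIfNotExists(name []byte) (*bucket, error) {
	if !t.writable {
		return nil, errors.New("tx not writable")
	}
	return &bucket{}, nil
}

// createBucket sets FillPercent only after the error check, so a nil bucket
// is never dereferenced.
func createBucket(t *tx, name []byte, fill float64) (*bucket, error) {
	b, err := t.CreateBucketIfNotExists(name)
	if err != nil {
		return nil, err
	}
	b.FillPercent = fill
	return b, nil
}

func main() {
	_, err := createBucket(&tx{writable: false}, []byte("key"), 0.9)
	fmt.Println("read-only tx:", err)

	b, _ := createBucket(&tx{writable: true}, []byte("key"), 0.9)
	fmt.Println("writable tx:", b.FillPercent)
}
```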
2018-01-08 11:34:34 -08:00
Gyuho Lee
82a164e3b9 mvcc: make test struct fields unexported
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-31 13:20:41 -08:00
Connor Peet
fc3b59046f mvcc: allow clients to assign watcher IDs
This allows for watchers to be created concurrently
without needing potentially complex and latency-adding
queuing on the client.
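
One way such an API could look, as a sketch under assumptions: the `watchStream` type, the `autoWatchID` sentinel, and the error value below are hypothetical and are not taken from this commit.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// autoWatchID is a hypothetical sentinel meaning "let the server pick an ID".
const autoWatchID int64 = 0

var errDuplicateID = errors.New("watcher ID already in use")

// watchStream is a hypothetical, simplified registry of watcher IDs.
type watchStream struct {
	mu     sync.Mutex
	nextID int64
	ids    map[int64]struct{}
}

func newWatchStream() *watchStream {
	return &watchStream{nextID: 1, ids: make(map[int64]struct{})}
}

// watch registers a watcher under the client-chosen id, or under the next
// free server-chosen ID when id is autoWatchID.
func (ws *watchStream) watch(id int64) (int64, error) {
	ws.mu.Lock()
	defer ws.mu.Unlock()
	if id == autoWatchID {
		// Skip IDs that clients have already claimed.
		for {
			if _, taken := ws.ids[ws.nextID]; !taken {
				break
			}
			ws.nextID++
		}
		id = ws.nextID
		ws.nextID++
	} else if _, taken := ws.ids[id]; taken {
		return -1, errDuplicateID
	}
	ws.ids[id] = struct{}{}
	return id, nil
}

func main() {
	ws := newWatchStream()
	fmt.Println(ws.watch(5))
	fmt.Println(ws.watch(5))
	fmt.Println(ws.watch(autoWatchID))
}
```

Because the client already knows the ID it asked for, it can issue several create requests at once and match responses to requests without waiting for each reply in turn.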

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-31 13:20:40 -08:00
Gyuho Lee
76dd9d56a1 mvcc: clean-up godoc in key_index.go
Minor clean-up.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-18 13:20:00 -08:00
Gyuho Lee
2e95ace82b mvcc: fetch revisions with current revision, not 0, in HashByRev
It was getting revisions with "atRev==0", which makes
"available" from "keep" method always empty since
"walk" on "keyIndex" only returns true.

"available" should be populated with all revisions to be
kept if the compaction happens at the given revision.
But "available" was always empty with "kvindex.Keep(0)",
since "rev.main > atRev==0" holds for every revision.
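
The effect of "atRev==0" can be seen in a reduced sketch. It assumes a single key with one ascending list of revisions, and the `revision` type and `keep` function are simplified stand-ins rather than the real index code:

```go
package main

import "fmt"

// revision is a simplified stand-in for an mvcc revision.
type revision struct{ main, sub int64 }

// keep returns the revision of one key that would survive a compaction at
// atRev: the newest revision whose main is at or below atRev.
func keep(revs []revision, atRev int64) map[revision]struct{} {
	available := make(map[revision]struct{})
	// Walk from newest to oldest and stop at the first match.
	for i := len(revs) - 1; i >= 0; i-- {
		if revs[i].main <= atRev {
			available[revs[i]] = struct{}{}
			break
		}
	}
	return available
}

func main() {
	revs := []revision{{1, 0}, {3, 0}, {5, 0}}
	fmt.Println(len(keep(revs, 0))) // every rev.main > 0, so nothing is kept
	fmt.Println(keep(revs, 4))      // keeps the revision with main 3
}
```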

Fix https://github.com/coreos/etcd/issues/9022.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-18 12:17:06 -08:00
Gyuho Lee
bcd5390b35 *: regenerate protobuf, grpc-gateway
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-07 21:31:13 -08:00
Gyu-Ho Lee
9154b31bf3 mvcc: move 'keyi' define before holding locks
To make it consistent with other code paths.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-10-05 10:06:28 -07:00
Anthony Romano
4fa1dd196c *: make receiver names consistent 2017-09-12 03:54:04 -07:00
Gyu-Ho Lee
f65aee0759 *: replace 'golang.org/x/net/context' with 'context'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-09-07 13:39:42 -07:00
Anthony Romano
9d79d5fe65 mvcc: don't allocate keys when computing Revisions 2017-08-31 13:23:23 -07:00
Anthony Romano
be7d488982 mvcc: add range benchmark for fetching 100 keys 2017-08-31 13:23:23 -07:00
Anthony Romano
896447ed99 mvcc: only remove watch cancel after cancel completes
If Close() is called before Cancel()'s cancel() completes, the
watch channel will be closed while the watch is still in the
synced list. If there's an event, etcd will try to write to a
closed channel. Instead, remove the watch from the bookkeeping
structures only after cancel completes, so Close() will always
call it.
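
The ordering can be sketched as below. This is an illustration under assumptions rather than the patched code: the `watchStream` type and its `add` and `has` helpers are hypothetical, and each cancel function is wrapped in `sync.Once` so that `Cancel` and `Close` may both call it safely.

```go
package main

import (
	"fmt"
	"sync"
)

// watchStream is a hypothetical, simplified watch stream with bookkeeping.
type watchStream struct {
	mu      sync.Mutex
	cancels map[int64]func()
}

func newWatchStream() *watchStream {
	return &watchStream{cancels: make(map[int64]func())}
}

// add registers an idempotent cancel function for a watch.
func (ws *watchStream) add(id int64, cancel func()) {
	var once sync.Once
	ws.mu.Lock()
	ws.cancels[id] = func() { once.Do(cancel) }
	ws.mu.Unlock()
}

func (ws *watchStream) has(id int64) bool {
	ws.mu.Lock()
	defer ws.mu.Unlock()
	_, ok := ws.cancels[id]
	return ok
}

// Cancel runs the cancel function first and removes the bookkeeping entry
// only afterwards, so a concurrent Close still sees the watch and cancels it.
func (ws *watchStream) Cancel(id int64) bool {
	ws.mu.Lock()
	cancel, ok := ws.cancels[id]
	ws.mu.Unlock()
	if !ok {
		return false
	}
	cancel()
	ws.mu.Lock()
	delete(ws.cancels, id)
	ws.mu.Unlock()
	return true
}

// Close cancels every watch that is still tracked.
func (ws *watchStream) Close() {
	ws.mu.Lock()
	defer ws.mu.Unlock()
	for id, cancel := range ws.cancels {
		cancel()
		delete(ws.cancels, id)
	}
}

func main() {
	ws := newWatchStream()
	ws.add(1, func() { fmt.Println("tracked during cancel:", ws.has(1)) })
	fmt.Println(ws.Cancel(1), ws.has(1))
	ws.Close()
}
```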

Fixes #8443
2017-08-28 17:06:33 -07:00
Anthony Romano
bd53ae5680 mvcc: test concurrently closing watch streams and canceling watches
Triggers a race that causes a write to a closed watch stream channel.
2017-08-28 17:06:32 -07:00
Anthony Romano
f58c0cfb66 mvcc: Revisions() method for index to avoid key allocation
Save another alloc on the one key path.
2017-08-21 11:30:02 -07:00
fengshaobao 00231050
13041c15ba mvcc: sending events after restore
Fixes: #8411
2017-08-21 10:32:49 -07:00
Anthony Romano
8b872196d0 backend: cache buckets in read tx
Saves an alloc and about 10% of Range() time.
2017-08-21 02:16:55 -07:00
Anthony Romano
10b65c97dd mvcc: benchmark Range() on a single key 2017-08-21 00:14:46 -07:00
Anthony Romano
ccd1bb1780 mvcc: test keys gauge is reloaded correctly on restore 2017-08-10 09:21:39 -07:00
Anthony Romano
32866572bf mvcc: reset keys gauge on restore
Fixes #8388
2017-08-10 08:37:50 -07:00
fanmin shi
df5a3d15ce mvcc: increase rev for TestHashKVWhenCompacting 2017-07-31 17:59:49 -07:00
fanmin shi
bb86c327e2 mvcc: HashKV gets keep from kvindex.Keep 2017-07-31 17:59:49 -07:00
fanmin shi
4c2c5b0084 mvcc: add tests for Keep 2017-07-31 17:59:42 -07:00
fanmin shi
7b8fb3cf0a mvcc: add and implement Keep api to index
Keep finds all revisions to be kept for a Compaction at the given rev.
2017-07-31 14:04:03 -07:00
fanmin shi
451b062184 mvcc/backend: add TestBackendWritebackForEach to backend_test.go 2017-07-28 09:39:48 -07:00