135 Commits

Author SHA1 Message Date
Gyuho Lee
f7367d94ff mvcc/backend: clean up mutex, logging
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-19 16:53:31 -07:00
Gyuho Lee
adfd0d3fe7 mvcc: avoid unnecessary metrics update
https://github.com/coreos/etcd/pull/9300

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:51:08 -07:00
Gyuho Lee
a410463a0b mvcc: add "etcd_mvcc_db_total_size_in_use_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:36:18 -07:00
Gyuho Lee
1da3603e31 mvcc: add "etcd_mvcc_db_total_size_in_bytes"
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-03 14:35:48 -07:00
Gyuho Lee
41888ddbaa mvcc: fix panic by allowing future revision watcher from restore operation
This also happens without gRPC proxy.

Fix panic when gRPC proxy leader watcher is restored:

```
go test -v -tags cluster_proxy -cpu 4 -race -run TestV3WatchRestoreSnapshotUnsync

=== RUN   TestV3WatchRestoreSnapshotUnsync
panic: watcher minimum revision 9223372036854775805 should not exceed current revision 16

goroutine 156 [running]:
github.com/coreos/etcd/mvcc.(*watcherGroup).chooseAll(0xc4202b8720, 0x10, 0xffffffffffffffff, 0x1)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:242 +0x3b5
github.com/coreos/etcd/mvcc.(*watcherGroup).choose(0xc4202b8720, 0x200, 0x10, 0xffffffffffffffff, 0xc420253378, 0xc420253378)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watcher_group.go:225 +0x289
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchers(0xc4202b86e0, 0x0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:340 +0x237
github.com/coreos/etcd/mvcc.(*watchableStore).syncWatchersLoop(0xc4202b86e0)
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:214 +0x280
created by github.com/coreos/etcd/mvcc.newWatchableStore
	/home/gyuho/go/src/github.com/coreos/etcd/mvcc/watchable_store.go:90 +0x477
exit status 2
FAIL	github.com/coreos/etcd/integration	2.551s
```

gRPC proxy spawns a watcher with a key "proxy-namespace__lostleader"
and watch revision "int64(math.MaxInt64 - 2)" to detect leader loss.
But, when the partitioned node restores, this watcher triggers
panic with "watcher minimum revision ... should not exceed current ...".

This check was added a long time ago, by my PR, when there was no gRPC proxy:

https://github.com/coreos/etcd/pull/4043#discussion_r48457145

> we can remove this checking actually. it is impossible for a unsynced watching to have a future rev. or we should just panic here.

However, now it's possible that a unsynced watcher has a future
revision, when it was moved from a synced watcher group through
restore operation.

This PR adds "restore" flag to indicate that a watcher was moved
from the synced watcher group with restore operation. Otherwise,
the watcher with future revision in an unsynced watcher group
would still panic.

Example logs with future revision watcher from restore operation:

```
{"level":"info","ts":1527196358.9057755,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
{"level":"info","ts":1527196358.910349,"caller":"mvcc/watcher_group.go:261","msg":"choosing future revision watcher from restore operation","watch-key":"proxy-namespace__lostleader","watch-revision":9223372036854775805,"current-revision":16}
```

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-31 11:41:34 -07:00
Iwasaki Yudai
eaf7d631ad mvcc: restore unsynced watchers
In case syncWatchersLoop() starts before Restore() is called,
watchers already added by that moment are moved to s.synced by the loop.
However, there is a broken logic that moves watchers from s.synced
to s.uncyned without setting keyWatchers of the watcherGroup.
Eventually syncWatchers() fails to pickup those watchers from s.unsynced
and no events are sent to the watchers, because newWatcherBatch() called
in the function uses wg.watcherSetByKey() internally that requires
a proper keyWatchers value.
2018-02-06 11:34:46 -08:00
Iwasaki Yudai
1ae0c0b47d mvcc: check null before set FillPercent not to panic
Since CreateBucketIfNotExists() can return nil when it gets an error,
accessing FillPercent must be done after a nil check, not to cause
a panic.
2018-01-08 13:08:03 -08:00
Gyuho Lee
76dd9d56a1 mvcc: clean-up godoc in key_index.go
Minor clean-up.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-18 13:20:00 -08:00
Gyuho Lee
2e95ace82b mvcc: fetch revisions with current revision, not 0, in HashByRev
It was getting revisions with "atRev==0", which makes
"available" from "keep" method always empty since
"walk" on "keyIndex" only returns true.

"available" should be populated with all revisions to be
kept if the compaction happens with the given revision.
But, "available" was being empty when "kvindex.Keep(0)"
since it's always the case that "rev.main > atRev==0".

Fix https://github.com/coreos/etcd/issues/9022.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-18 12:17:06 -08:00
Gyuho Lee
bcd5390b35 *: regenerate protobuf, grpc-gateway
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2017-12-07 21:31:13 -08:00
Gyu-Ho Lee
9154b31bf3 mvcc: move 'keyi' define before holding locks
To make it consistent with other code paths.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-10-05 10:06:28 -07:00
Anthony Romano
4fa1dd196c *: make receiver names consistent 2017-09-12 03:54:04 -07:00
Gyu-Ho Lee
f65aee0759 *: replace 'golang.org/x/net/context' with 'context'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-09-07 13:39:42 -07:00
Anthony Romano
9d79d5fe65 mvcc: don't allocate keys when computing Revisions 2017-08-31 13:23:23 -07:00
Anthony Romano
be7d488982 mvcc: add range benchmark for fetching 100 keys 2017-08-31 13:23:23 -07:00
Anthony Romano
896447ed99 mvcc: only remove watch cancel after cancel completes
If Close() is called before Cancel()'s cancel() completes, the
watch channel will be closed while the watch is still in the
synced list. If there's an event, etcd will try to write to a
closed channel. Instead, remove the watch from the bookkeeping
structures only after cancel completes, so Close() will always
call it.

Fixes #8443
2017-08-28 17:06:33 -07:00
Anthony Romano
bd53ae5680 mvcc: test concurrently closing watch streams and canceling watches
Triggers a race that causes a write to a closed watch stream channel.
2017-08-28 17:06:32 -07:00
Anthony Romano
f58c0cfb66 mvcc: Revisions() method for index to avoid key allocation
Save another alloc on the one key path.
2017-08-21 11:30:02 -07:00
fengshaobao 00231050
13041c15ba mvcc: sending events after restore
Fixes: #8411
2017-08-21 10:32:49 -07:00
Anthony Romano
8b872196d0 backend: cache buckets in read tx
Saves an alloc and about 10% of Range() time.
2017-08-21 02:16:55 -07:00
Anthony Romano
10b65c97dd mvcc: benchmark Range() on a single key 2017-08-21 00:14:46 -07:00
Anthony Romano
ccd1bb1780 mvcc: test keys gauge is reloaded correctly on restore 2017-08-10 09:21:39 -07:00
Anthony Romano
32866572bf mvcc: reset keys gauge on restore
Fixes #8388
2017-08-10 08:37:50 -07:00
fanmin shi
df5a3d15ce mvcc: increase rev for TestHashKVWhenCompacting 2017-07-31 17:59:49 -07:00
fanmin shi
bb86c327e2 mvcc: HashKV gets keep from kvindex.Keep 2017-07-31 17:59:49 -07:00
fanmin shi
4c2c5b0084 mvcc: add tests for Keep 2017-07-31 17:59:42 -07:00
fanmin shi
7b8fb3cf0a mvcc: add and implement Keep api to index
Keep finds all revisions to be kept for a Compaction at the given rev.
2017-07-31 14:04:03 -07:00
fanmin shi
451b062184 mvcc/backend: add TestBackendWritebackForEach to backend_test.go 2017-07-28 09:39:48 -07:00
fanmin shi
785deebd62 mvcc/backend: enforce ordering for UnsafeForEach in read_tx.go
This pr changes  UnsafeForEach to traverse on boltdb before on the buffer.
This ordering guarantees that UnsafeForEach traverses in the same order
before or after the commit of buffer.
2017-07-28 09:30:23 -07:00
Xiang Li
2a348fb8e9 Merge pull request #8263 from fanminshi/hash_by_rev
api: hash by rev
2017-07-26 11:22:33 -07:00
fanmin shi
8609521ce2 mvcc: add TestHashKVWhenCompacting to kvstore_test 2017-07-26 09:48:29 -07:00
fanmin shi
deca9879c2 mvcc: add HashByRev to kv.go
HashByRev computes the hash of all MVCC keys up to a given revision.
2017-07-25 17:00:46 -07:00
Joe Betz
c06953ae08 mvcc: Add metric for count of db key revisions compacted.
When digging into etcd/boltdb "storage space exceeded" issues, this metric may help answer questions about if/when compactions occured and how much data was freed.
2017-07-20 10:07:56 -07:00
Anthony Romano
e9d096ae6b mvcc: don't allocate end revision while computing range
Use 'nil' since it's only reading a single key. Also preallocates
the result slice based on limit / number of revisions fetched.

Fixes #8208
2017-07-06 15:59:27 -07:00
Gyu-Ho Lee
870302afa6 mvcc/backend: enable 'NoFreelistSync' by default (linux)
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-05 16:10:04 -07:00
Gyu-Ho Lee
318e9c766f *: replace 'boltdb' import paths with 'coreos/bbolt'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-07-05 14:32:13 -07:00
Anthony Romano
522e75cb4f mvcc: use GaugeFunc metric to load db size when requested
Relying on mvcc to set the db size metric can cause it to
miss size changes when a txn commits after the last write
completes before a quiescent period. Instead, load the
db size on demand.

Fixes #8146
2017-06-21 23:58:37 -07:00
Anthony Romano
51a568aa81 mvcc: restore into tree index with one key index
Clobbering the mvcc kvindex with new keyIndexes for each restore
chunk would cause index corruption by dropping historical information.
2017-06-19 12:04:01 -07:00
Anthony Romano
02164874d9 mvcc: test restore and deletes with small chunk sizes 2017-06-19 12:04:01 -07:00
Anthony Romano
7f149d8fb6 mvcc: set db size metric on restore
Fixes #8080
2017-06-16 11:27:34 -07:00
Anthony Romano
da48f1feaf mvcc: create TxnWrites from TxnRead with NewReadOnlyTxnWrite
Already used internally by mvcc, but needed by etcdserver txns.
2017-06-09 09:20:38 -07:00
Anthony Romano
300feea177 Merge pull request #8052 from heyitsanthony/watch-victim-test
mvcc: test watch victim/delay path
2017-06-08 11:10:33 -07:00
Anthony Romano
83b2ea2f60 mvcc: test watch victim/delay path
Current tests don't normally trigger the watch victim path because the
constants are too large; set the constants to small values and hammer
the store to cause watch delivery delays.
2017-06-07 17:02:00 -07:00
Anthony Romano
0352ce79b8 mvcc: count range/put/del operations for txns
Txns were previously only bumping the txn counter; now bumps all operation
counters.
2017-06-07 16:53:50 -07:00
Anthony Romano
fd71da47d1 mvcc: remove unused store.Equals function 2017-06-07 09:25:42 -07:00
Anthony Romano
ef63abdf7f mvcc: don't use pointer for storeTxnRead in storeTxnWrite
Saves an allocation when creating a storeTxnWrite.
2017-06-06 09:51:57 -07:00
Anthony Romano
6846e49edf Merge pull request #7859 from heyitsanthony/cache-consistent-get
mvcc: cache consistent index
2017-05-26 10:52:53 -07:00
Anthony Romano
ac4855e911 mvcc: benchmark ConsistentIndex 2017-05-26 09:49:40 -07:00
Anthony Romano
73dee0bec4 mvcc: cache consistentIndex
Called on every entry apply and boltdb requests aren't free.
2017-05-26 09:49:40 -07:00
Anthony Romano
0506f49f9e backend: don't hold boltdb read txn lock on cursor scanning
Large fetches hold the lock when they do not need to do so.
2017-05-26 09:28:08 -07:00