191 Commits

Author SHA1 Message Date
Marek Siarkowicz
f96957adba tests: Add compact failpoints
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-11-15 14:59:03 +01:00
Benjamin Wang
4f824336ad etcdserver: add two failpoints for backend
1. before and after create boltDB transaction;
2. before and after writebuf back to read buffer;

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-15 08:09:05 +08:00
Benjamin Wang
3f18816e7d etcdserver: add gofail points before and after OnPreCommitUnsafe
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-14 11:02:18 +08:00
Benjamin Wang
5a3ef953eb etcdserver: call the OnPreCommitUnsafe in unsafeCommit
`unsafeCommit` is called by both `(*batchTxBuffered) commit` and
`(*backend) defrag`. When users perform the defragmentation
operation, etcd doesn't update the consistent index. If etcd
crashes(e.g. panicking) in the process for whatever reason, then
etcd replays the WAL entries starting from the latest snapshot,
accordingly it may re-apply entries which might have already been
applied, eventually the revision isn't consistent with other members.

Refer to discussion in https://github.com/etcd-io/etcd/pull/14685

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-11 10:57:15 +08:00
Marek Siarkowicz
2a1055c7f3 raft: Remove dependency on etcd api
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-11-08 13:56:46 +01:00
Cenk Alti
580a86ebe5
server: add more context to panic message
Signed-off-by: Cenk Alti <cenkalti@gmail.com>
2022-10-31 20:29:15 -04:00
Benjamin Wang
d15a9d0edc
Merge pull request #14457 from jbml/hashbyrev_compact_main
etcdserver: fix corruption check when server has just been compacted
2022-10-13 15:17:38 +08:00
Benjamin Wang
abef537a90
Merge pull request #14515 from spongecaptain/btree-generics
upate:use google/btree in the genric way
2022-09-27 16:44:13 +08:00
wathenjiang
319db38b0a update: add benchmark test
benchmark result:
(1) master branch
$ go test -bench='BenchmarkIndexPut$' -count=5
goos: darwin
goarch: amd64
pkg: go.etcd.io/etcd/server/v3/storage/mvcc
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkIndexPut-12             1000000              2591 ns/op
BenchmarkIndexPut-12             1000000              2531 ns/op
BenchmarkIndexPut-12             1000000              2536 ns/op
BenchmarkIndexPut-12             1000000              2546 ns/op
BenchmarkIndexPut-12             1000000              2538 ns/op
PASS
ok      go.etcd.io/etcd/server/v3/storage/mvcc  167.439s

$ go test -bench='BenchmarkIndexGet$' -count=5
goos: darwin
goarch: amd64
pkg: go.etcd.io/etcd/server/v3/storage/mvcc
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkIndexGet-12             1000000              2021 ns/op
BenchmarkIndexGet-12             1000000              2029 ns/op
BenchmarkIndexGet-12             1000000              2044 ns/op
BenchmarkIndexGet-12             1000000              1973 ns/op
BenchmarkIndexGet-12             1000000              2027 ns/op
PASS
ok      go.etcd.io/etcd/server/v3/storage/mvcc  177.815s

(2) google/btree in the generic way
$ go test -bench='BenchmarkIndexPut$' -count=5
goos: darwin
goarch: amd64
pkg: go.etcd.io/etcd/server/v3/storage/mvcc
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkIndexPut-12             1000000              2477 ns/op
BenchmarkIndexPut-12             1000000              2380 ns/op
BenchmarkIndexPut-12             1000000              2360 ns/op
BenchmarkIndexPut-12             1000000              2396 ns/op
BenchmarkIndexPut-12             1000000              2382 ns/op
PASS
ok      go.etcd.io/etcd/server/v3/storage/mvcc  165.841s

$ go test -bench='BenchmarkIndexGet$' -count=5
goos: darwin
goarch: amd64
pkg: go.etcd.io/etcd/server/v3/storage/mvcc
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkIndexGet-12             1000000              1985 ns/op
BenchmarkIndexGet-12             1000000              1914 ns/op
BenchmarkIndexGet-12             1000000              1900 ns/op
BenchmarkIndexGet-12             1000000              1905 ns/op
BenchmarkIndexGet-12             1000000              1894 ns/op
PASS
ok      go.etcd.io/etcd/server/v3/storage/mvcc  177.573s

Signed-off-by: wathenjiang <wathenjiang@tencent.com>
2022-09-27 14:33:02 +08:00
Spongecaptain
c53dfc7c5b upate:use google/btree in the genric way
Signed-off-by: wathenjiang <wathenjiang@tencent.com>
2022-09-27 10:16:15 +08:00
Kafuu Chino
f1d4935e91 *: avoid closing a watch with ID 0 incorrectly
Signed-off-by: Kafuu Chino <KafuuChinoQ@gmail.com>

add test
2022-09-26 20:30:33 +08:00
Jeremy Leach
cc1e245368 etcdserver: fix corruption check when server has just been compacted
When a key-value store corruption check happens immediately after a
compaction, the revision at which the key-value store hash is computed,
is the compacted revision itself.
In that case, the hash computation logic was incorrect because it
returned an ErrCompacted error; this error should instead be returned when
the revision at which the key-value store is hashed, is strictly lower
than the compacted revision.

Fixes #14325

Signed-off-by: Jeremy Leach <44558776+jbml@users.noreply.github.com>
2022-09-24 22:20:26 +10:00
SimFG
5702765729 wal: Fix the walWriteBytes metric
Signed-off-by: SimFG <1142838399@qq.com>
2022-09-22 19:23:06 +08:00
Benjamin Wang
7f10dccbaf Bump go 1.19: update all the dependencies and go.sum files
1. run ./scripts/fix.sh;
2. cd tools/mod; gofmt -w . & go mod tidy;

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-22 08:47:46 +08:00
demoManito
a9c3d56508 etcd: remove redundant type conversion
Signed-off-by: demoManito <1430482733@qq.com>
2022-09-20 11:26:02 +08:00
demoManito
72cf0cc04a etcd: modify declaring empty slices
declare an empty slice to var s []int replace  s :=[]int{}, https://github.com/golang/go/wiki/CodeReviewComments#declaring-empty-slices

Signed-off-by: demoManito <1430482733@qq.com>
2022-09-16 14:41:14 +08:00
lovehhf
3b585e94fc mvcc: Remove unused revisions and change comment rev to modified
Signed-off-by: Hongfei Huang <853885165@qq.com>
2022-09-14 23:36:54 +08:00
Vladimir Sokolov
a4f140c9fa testing: fix TestOpenWithMaxIndex cleanup
A WAL object was closed by defer, however the WAL was rewritten afterwards,
so defer closed already closed WAL but not the new one. It caused a data
race between writing file and cleaning up a temporary test directory,
which led to a non-deterministic bug.

Fixes #14332

Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
2022-09-07 21:30:16 +03:00
Abirdcfly
08a9d1da07
chore: remove duplicate word in comments
Signed-off-by: Abirdcfly <fp544037857@gmail.com>
2022-08-27 13:39:48 +08:00
Benjamin Wang
fff5d00ccf
Merge pull request #14149 from lavacat/main-txn-panic
server: don't panic in readonly serializable txn
2022-08-14 05:41:57 +08:00
Bogdan Kanivets
43bb9d5c22 server: don't panic in readonly serializable txn
Problem: We pass grpc context down to applier in readonly serializable txn.
This context can be cancelled for example due to timeout.
This will trigger panic inside applyTxn

Solution: Only panic for transactions with write operations

fixes https://github.com/etcd-io/etcd/issues/14110

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-08-09 00:46:50 -07:00
Benjamin Wang
3dd7d3f9af enhance the WAL file related error
The `ErrFileNotFound` was used for for three cases:
1. There is no any WAL files (probably due to no read permission);
2. There is no WAL files matching the snapshot index;
3. The WAL file seqs do not increase continuously.

It's not good for debug when users see the `ErrFileNotFound` error,
so in this PR, a different error is returned for each case above.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-03 05:37:30 +08:00
Marek Siarkowicz
6697fca97d server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-26 09:31:14 +02:00
Marek Siarkowicz
264498258b tests: Move CorruptBBolt to testutil
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-25 13:59:30 +02:00
Benjamin Wang
33347b6845
Merge pull request #13954 from mitake/perm-cache-lock
server/auth: protect rangePermCache with a mutex
2022-07-03 07:01:46 +08:00
Hitoshi Mitake
de09174a3f server/auth: protect rangePermCache with a RW lock
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-07-02 23:23:13 +09:00
Benjamin Wang
e0998f49e1
Merge pull request #14124 from wayblink/index-op
mvcc:add ut for Revisions/CountRevisions and remove RangeSince as it …
2022-06-21 12:41:25 +08:00
wayblink
428d21f5eb mvcc:add ut for Revisions/CountRevisions and remove RangeSince as it is not used
Signed-off-by: wayblink <wayasxxx@gmail.com>
2022-06-21 11:41:44 +08:00
Benjamin Wang
8038e876d3 replace ioutil with os package
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-17 10:11:30 +08:00
Benjamin Wang
ccf477d12b restrict the max size of each WAL entry to the remaining size of the file
Currently the max size of each WAL entry is hard coded as 10MB. If users
set a value > 10MB for the flag --max-request-bytes, then etcd may run
into a situation that it successfully processes a big request, but fails
to decode it when replaying the WAL file on startup.

On the other hand, we can't just remove the limitation, because if a
WAL entry is somehow corrupted, and its recByte is a huge value, then
etcd may run out of memory. So the solution is to restrict the max size
of each WAL entry as a dynamic value, which is the remaining size of
the WAL file.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-17 05:41:49 +08:00
SimFG
d83925e357 schedule: Provide logs when the fifo job panic happens
To make the fifo scheduler better debuggability.

Signed-off-by: SimFG <1142838399@qq.com>
2022-06-15 20:58:17 +08:00
Marek Siarkowicz
71bba3c761
Merge pull request #14106 from SimFG/remove_case
wal: remove the repeated test case
2022-06-14 12:13:00 +02:00
SimFG
402188a98c wal: remove the repeated test case
The TestFilePipelineFailLockFile case maybe test the line 77 in the alloc function. But the LockFile will not throw an error when the file lock is hold. And TestFilePipelineFailPreallocate and TestFilePipelineFailLockFile are exactly the same except for the comments.

Signed-off-by: SimFG <1142838399@qq.com>
2022-06-14 17:26:20 +08:00
Marek Siarkowicz
2c12954158
Merge pull request #14049 from serathius/compact-hash
Calculate hash during compaction
2022-06-14 11:10:19 +02:00
Marek Siarkowicz
9612fc1194 tests: Unify TestCompactionHash and extend it to also Delete keys and Defrag
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:19 +02:00
Marek Siarkowicz
0e739da9a4 server: Cache compaction hash for HashByRev API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:18 +02:00
Marek Siarkowicz
2b090e86a6 server: Extract hasher to separate interface
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:18 +02:00
Marek Siarkowicz
80828b593a server: Remove duplicated compaction revision
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:18 +02:00
Marek Siarkowicz
34a02ba621 server: Return revision range that hash was calcualted for
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:17 +02:00
Marek Siarkowicz
76d3c527ae server: Store real rv range in hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
8f7b053480 server: Move adjusting revision to hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
19941654fe server: Pass revision as int
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
887e53e61a server: Calculate hash during compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
7381c81a7e server: Fix range in mock not returning same number of keys and values
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
08a43178fa server: Move reading KV index inside scheduleCompaction function
Makes it easier to test hash match between scheduleCompaction and
HashByRev.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
638ca1006a server: Return error from scheduleCompaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
0f90359c4b server: Refactor hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
e62c358793 server: Extract kvHash struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
fcaf76dbca server: Move unsafeHashByRev to new hash.go file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
0984878ae7 server: Extract unsafeHashByRev function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00