Benjamin Wang
ea3d78faae
Merge pull request #14505 from ahrtr/revert_14400
...
etcdserver: revert the etcdserver side change for the data loss in one node cluster
2022-09-22 17:06:10 +08:00
Benjamin Wang
9097e61b40
etcdserver: revert the etcdserver side change for the data loss on one node cluster
...
Now that the raft-side change has been merged, we need to revert the
etcdserver-side change.
Refer to
https://github.com/etcd-io/etcd/pull/14413
https://github.com/etcd-io/etcd/pull/14400
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-22 15:19:20 +08:00
Benjamin Wang
997260a832
Merge pull request #14463 from ahrtr/bump_go_1.19
...
Bump golang version to 1.19.1
2022-09-22 09:25:58 +08:00
Benjamin Wang
dd7d30017c
Bump go 1.19: revert the change to pkg/adt/interval_tree.go
...
Some comments in the file are automatically reformatted into an ugly
style because their hierarchical structure is lost. After removing the
leading numbers from the comments, `go fmt` no longer reformats them.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-22 08:47:46 +08:00
Benjamin Wang
7f10dccbaf
Bump go 1.19: update all the dependencies and go.sum files
...
1. run ./scripts/fix.sh;
2. cd tools/mod; gofmt -w . & go mod tidy;
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-22 08:47:46 +08:00
Benjamin Wang
cb5f7276c3
Bump go 1.19: upgrade go version to 1.19.1 in the pipeline
...
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-22 08:47:46 +08:00
Benjamin Wang
cd0b1d0c66
Bump go 1.19: upgrade go version to 1.19 in all go.mod files
...
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-22 08:47:46 +08:00
Benjamin Wang
31d9664cb5
Merge pull request #14413 from tbg/raft-single-voter
...
raft: don't emit unstable CommittedEntries
2022-09-22 08:43:37 +08:00
Marek Siarkowicz
026794495f
Merge pull request #14494 from demoManito/remove/redundant-type-conversion
...
etcd: remove redundant type conversion
2022-09-21 11:34:19 +02:00
Benjamin Wang
6333f375a7
Merge pull request #14488 from serathius/update-fix
...
Improve static analysis fixing scripts
2022-09-21 06:20:08 +08:00
Benjamin Wang
2441a24cee
Merge pull request #14493 from demoManito/style/format-import-order
...
etcd: format import order
2022-09-21 06:03:31 +08:00
Marek Siarkowicz
bea478266e
makefile: Rename targets update* to fix* to distinguish from update_dep
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-20 13:58:17 +02:00
Marek Siarkowicz
5bfda80836
makefile: test the update target
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-20 13:57:59 +02:00
Marek Siarkowicz
bb139b15f8
makefile: Don't run update_dep.sh as it's not a check
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-20 13:55:51 +02:00
Marek Siarkowicz
05104ee9a7
makefile: Remove verify-revive as it is already run by golangci
...
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-20 13:55:51 +02:00
demoManito
f67ec10779
etcd: format import order
...
golang CodeReviewComments:
https://github.com/golang/go/wiki/CodeReviewComments#imports
Signed-off-by: demoManito <1430482733@qq.com>
2022-09-20 18:41:39 +08:00
Tobias Grieger
d56676c9b3
raft: benchmark results for ./benchmark put
...
I ran this PR against its main merge-base twice (on my 2021 Mac M1 pro),
and in both cases this PR was slightly faster, using the benchmark
invocation from [^1].
2819.6 vs 2808.4 requests/sec (PR vs main, first run)
2873.1 vs 2835 requests/sec (PR vs main, second run)
Full output below.
----
Script:
```
killall etcd
rm -rf default.etcd
scripts/build.sh
nohup ./bin/etcd --quota-backend-bytes=4300000000 &
sleep 10
f=bench-$(git log -1 --pretty=%s | sed -E 's/[^A-Za-z0-9]+/_/g').txt
go run ./tools/benchmark txn-put --endpoints="http://127.0.0.1:2379 " --clients=200 --conns=200 --key-space-size=4000000000 --key-size=128 --val-size=10240 --total=200000 --rate=40000 | tee "${f}"
```
PR:
```
Summary:
Total: 70.9320 secs.
Slowest: 0.3003 secs.
Fastest: 0.0044 secs.
Average: 0.0707 secs.
Stddev: 0.0437 secs.
Requests/sec: 2819.6030 (second run: 2873.0935)
Response time histogram:
0.0044 [1] |
0.0340 [2877] |
0.0636 [119485] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0932 [17436] |∎∎∎∎∎
0.1228 [27364] |∎∎∎∎∎∎∎∎∎
0.1524 [20349] |∎∎∎∎∎∎
0.1820 [10214] |∎∎∎
0.2116 [1248] |
0.2412 [564] |
0.2707 [318] |
0.3003 [144] |
Latency distribution:
10% in 0.0368 secs.
25% in 0.0381 secs.
50% in 0.0416 secs.
75% in 0.0998 secs.
90% in 0.1375 secs.
95% in 0.1571 secs.
99% in 0.1850 secs.
99.9% in 0.2650 secs.
```
main:
```
Summary:
Total: 71.2152 secs.
Slowest: 0.6926 secs.
Fastest: 0.0040 secs.
Average: 0.0710 secs.
Stddev: 0.0461 secs.
Requests/sec: 2808.3903 (second run: 2834.98)
Response time histogram:
0.0040 [1] |
0.0728 [125816] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.1417 [59127] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.2105 [13476] |∎∎∎∎
0.2794 [1125] |
0.3483 [137] |
0.4171 [93] |
0.4860 [193] |
0.5549 [4] |
0.6237 [16] |
0.6926 [12] |
Latency distribution:
10% in 0.0367 secs.
25% in 0.0379 secs.
50% in 0.0417 secs.
75% in 0.0993 secs.
90% in 0.1367 secs.
95% in 0.1567 secs.
99% in 0.1957 secs.
99.9% in 0.4361 secs.
```
[^1]: https://github.com/etcd-io/etcd/pull/14394#issuecomment-1229606410
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:42 +02:00
Tobias Grieger
9ad36eecab
fixup! address comments
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:42 +02:00
Tobias Grieger
304e260038
raft: benchmark results
...
```
for sha in :/^Revert :/BenchmarkRawNode :/^raft:.directly; do git checkout raft-single-voter && git checkout $(git log -n 1 '--pretty=format:%H' $sha) && f=$(git log -1 --pretty=%s | sed -E 's/[^A-Za-z0-9]+/_/g').txt && go test -run - -count 10 -bench BenchmarkRawNode -benchmem -benchtime=100000x . > $f; done; git checkout raft-single-voter
```
The two possible solutions (directly updating progress and calling
maybeCommit in `(*raft).advance` vs calling `r.Step`) are identical. In
fact, we've gotten a tiny bit better with the `.Step` solution in terms
of not calling `firstIndex` as much, in the common case of not being a
single voter.
```
$ benchstat raft_directly_update_leader_in_advance.txt Revert_raft_directly_update_leader_in_advance_.txt
name                     old time/op        new time/op        delta
RawNode/single-voter-10  482ns ± 2%         742ns ± 1%         +54.02%   (p=0.000 n=9+9)
RawNode/two-voters-10    1.29µs ± 1%        1.31µs ± 2%        +1.70%    (p=0.000 n=9+10)

name                     old firstIndex/op  new firstIndex/op  delta
RawNode/single-voter-10  4.00 ± 0%          5.00 ± 0%          +25.00%   (p=0.000 n=10+10)
RawNode/two-voters-10    10.0 ± 0%          9.0 ± 0%           -10.00%   (p=0.000 n=10+10)

name                     old lastIndex/op   new lastIndex/op   delta
RawNode/single-voter-10  1.00 ± 0%          2.00 ± 0%          +100.00%  (p=0.000 n=10+10)
RawNode/two-voters-10    2.00 ± 0%          2.00 ± 0%          ~         (all equal)

name                     old ready/op       new ready/op       delta
RawNode/single-voter-10  1.00 ± 0%          2.00 ± 0%          +100.00%  (p=0.000 n=10+10)
RawNode/two-voters-10    2.00 ± 0%          2.00 ± 0%          ~         (all equal)

name                     old term/op        new term/op        delta
RawNode/single-voter-10  0.00 ± 0%          0.00 ± 0%          ~         (all equal)
RawNode/two-voters-10    1.00 ± 0%          1.00 ± 0%          ~         (all equal)

name                     old alloc/op       new alloc/op       delta
RawNode/single-voter-10  372B ± 0%          388B ± 0%          +4.30%    (p=0.000 n=10+10)
RawNode/two-voters-10    964B ± 0%          964B ± 0%          ~         (all equal)

name                     old allocs/op      new allocs/op      delta
RawNode/single-voter-10  4.00 ± 0%          5.00 ± 0%          +25.00%   (p=0.000 n=10+10)
RawNode/two-voters-10    7.00 ± 0%          7.00 ± 0%          ~         (all equal)
```
We then compare the `.Step` solution against the previous "status quo"
that prematurely emitted uncommitted entries for command application
below.
Importantly, we don't regress in the case of multiple peers. We actually
gain slightly in terms of `lastIndex` calls, but run a bit more code;
acceptable.
In the single-voter case, since we now need two Ready handling cycles
per op instead of one, we see additional calls to `lastIndex` and
`firstIndex` as well as slightly increased allocations. These are
expected, and they are trade-offs we're willing to make to avoid
correctness problems. Note that the benchmark intentionally forces full
processing of each individual entry, so some of the new overhead would
likely amortize on a singleton voter seeing high throughput, as multiple
proposals could share the Ready cycles.
```
$ benchstat raft_add_BenchmarkRawNode.txt Revert_raft_directly_update_leader_in_advance_.txt
name                     old time/op        new time/op        delta
RawNode/single-voter-10  482ns ± 2%         742ns ± 1%         +54.02%   (p=0.000 n=9+9)
RawNode/two-voters-10    1.29µs ± 1%        1.31µs ± 2%        +1.70%    (p=0.000 n=9+10)

name                     old firstIndex/op  new firstIndex/op  delta
RawNode/single-voter-10  4.00 ± 0%          5.00 ± 0%          +25.00%   (p=0.000 n=10+10)
RawNode/two-voters-10    10.0 ± 0%          9.0 ± 0%           -10.00%   (p=0.000 n=10+10)

name                     old lastIndex/op   new lastIndex/op   delta
RawNode/single-voter-10  1.00 ± 0%          2.00 ± 0%          +100.00%  (p=0.000 n=10+10)
RawNode/two-voters-10    2.00 ± 0%          2.00 ± 0%          ~         (all equal)

name                     old ready/op       new ready/op       delta
RawNode/single-voter-10  1.00 ± 0%          2.00 ± 0%          +100.00%  (p=0.000 n=10+10)
RawNode/two-voters-10    2.00 ± 0%          2.00 ± 0%          ~         (all equal)

name                     old term/op        new term/op        delta
RawNode/single-voter-10  0.00 ± 0%          0.00 ± 0%          ~         (all equal)
RawNode/two-voters-10    1.00 ± 0%          1.00 ± 0%          ~         (all equal)

name                     old alloc/op       new alloc/op       delta
RawNode/single-voter-10  372B ± 0%          388B ± 0%          +4.30%    (p=0.000 n=10+10)
RawNode/two-voters-10    964B ± 0%          964B ± 0%          ~         (all equal)

name                     old allocs/op      new allocs/op      delta
RawNode/single-voter-10  4.00 ± 0%          5.00 ± 0%          +25.00%   (p=0.000 n=10+10)
RawNode/two-voters-10    7.00 ± 0%          7.00 ± 0%          ~         (all equal)
```
`tools/benchmark put`:
```
Summary[main]: | Summary[this PR]:
Total: 284.4443 secs. | Total: 288.1100 secs.
Slowest: 0.1626 secs. | Slowest: 0.1456 secs.
Fastest: 0.0027 secs. | Fastest: 0.0018 secs.
Average: 0.0284 secs. | Average: 0.0288 secs.
Stddev: 0.0178 secs. | Stddev: 0.0182 secs.
Requests/sec: 35.1563 | Requests/sec: 34.7090 [=0.98727681809x main]
Response time histogram: | Response time histogram:
0.0027 [1] | | 0.0018 [1] |
0.0187 [137] | | 0.0162 [34] |
0.0347 [7895] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ | 0.0305 [7938] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.0507 [86] | | 0.0449 [103] |
0.0667 [1328] |∎∎∎∎∎∎ | 0.0593 [1056] |∎∎∎∎∎
0.0827 [480] |∎∎ | 0.0737 [420] |∎∎
0.0987 [45] | | 0.0881 [370] |∎
0.1147 [18] | | 0.1025 [48] |
0.1306 [7] | | 0.1168 [19] |
0.1466 [2] | | 0.1312 [6] |
0.1626 [1] | | 0.1456 [5] |
Latency distribution: | Latency distribution:
10% in 0.0195 secs. | 10% in 0.0194 secs.
25% in 0.0198 secs. | 25% in 0.0198 secs.
50% in 0.0201 secs. | 50% in 0.0201 secs.
75% in 0.0210 secs. | 75% in 0.0214 secs.
90% in 0.0585 secs. | 90% in 0.0589 secs.
95% in 0.0727 secs. | 95% in 0.0731 secs.
99% in 0.0762 secs. | 99% in 0.0788 secs.
99.9% in 0.1244 secs. | 99.9% in 0.1240 secs.
```
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:42 +02:00
Tobias Grieger
3c3e30a30e
Revert "raft: directly update leader in advance"
...
This reverts commit d73a986e4edb15ef9dbfc994f1cbf5e96694d877, which
was added only for benchmarking purposes.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:42 +02:00
Tobias Grieger
67c3522893
raft: directly update leader in advance
...
This makes the alternative option of implementing the leader's self-ack
of entry append the default.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:42 +02:00
Tobias Grieger
894e5cb685
move ctx param to the front
...
to appease linter
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:42 +02:00
Tobias Grieger
f62b9d5e19
remove TestNodeReadIndex
...
This is tested directly at the level of `RawNode` in
`TestRawNodeReadIndex`. `*node` is a thin wrapper around `RawNode` so
this is sufficient.
The reason to remove the test is that it now incurs data races
since it's not possible to adjust the `readStates` and `step`
fields while the node is running, and there is no primitive
to synchronize with its goroutine. This could all be fixed
but it's not worth it.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 09:01:40 +02:00
Tobias Grieger
f7dcb9ec2a
TestInteraction
...
Reviewed the diff in detail.
The changes here were benign, just the extra raft cycle.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
f7b0a6ad33
TestRawNodeBoundedLogGrowthWithPartition
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
02efe5135d
TestRawNodeStart
...
Now also sees the extra Ready cycle.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
79bf3b0df4
TestRawNodeJointAutoLeave
...
This now needs an additional Ready cycle to apply the previous conf change,
so the finalizing conf change does too.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
fbe4d40086
TestLeaderTransferIgnoreProposal
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
182e1a371d
TestReadOnlyWithLearner
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
b462fd15c2
TestMsgAppRespWaitReset
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
ff837f3a0b
TestProposal
...
Check `lastIndex` instead of `committed`.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
15abe294e7
TestDueling{Pre,}Candidates
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
d6f3e88a52
TestSingleNodeCommit
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
99adcaa299
TestLearnerLogReplication
...
Needed to call `(*raft).advance` on `n1` so that it would actually
commit the entry.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
7060d75527
TestLeaderOnlyCommitsLogFromCurrentTerm
...
Leader only acks to itself on `(*raft).advance` so we have to
make this test a bit more like the real thing.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
9ff144ef75
TestProgressLeader
...
This was expecting the progress of the leader to be updated as a
result of MsgProp but it is now happening in `(*raft).advance`.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
bd46776f03
Commit + apply all in nextEnts
...
This fixes essentially all tests using this, since now they don't have
to do anything special about the extra cycle introduced for single
nodes.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
f10579d3b5
TestLeaderAcknowledgeCommit
...
This needed to call `(*raft).advance` so that the leader would
self-ack the entries.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
b2dba1c86c
TestNodeAdvance
...
Switched this to baking the conf changes into the initial state
to have fewer cycles to walk through in the test.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
0d9a6061c3
TestNodeReadIndex
...
Needs to ignore the injected MsgAppResp.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
14a76d755f
TestNodeStart
...
This now sees the extra append-then-commit cycle.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
873cdf3fa6
TestNodeProposeWaitDropped
...
The test just needs to ignore the MsgAppResp.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
1a81b27bed
TestNodePropose{,Config}
...
This test now observes the `MsgAppResp` injected in `(*raft).advance`.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
87a9b80d7b
TestNodeProposeAddDuplicateNode
...
This needed to apply entries from CommittedEntries, not Entries.
Previously the test got away with it because the two slices were
equal. Now it hung: when it proposed the second conf change, the
first one hadn't been applied yet, so the proposal was dropped.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
931fec3b6d
TestCommitPagination
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
1f39a8fe79
raft: teach readyWithTimeout to log received Readys
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
36860f863f
TestLeaderAcknowledgeCommit
...
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
dad8208a4d
raft: avoid panics during *node tests
...
`StartNode` runs a naked goroutine, so it's impossible to test against
it in a way that will reliably produce contained test failures when
assertions are hit on the `(*node).run` goroutine.
This commit introduces a harness that we can use in tests to wrap
this goroutine and allow it to defer errors to `*testing.T`.
Note that tests of `Node` still need to be architected carefully
since it's easy to produce a deadlock in them should things not
go exactly as planned.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
169f4c3cc7
raft: don't emit unstable CommittedEntries
...
See https://github.com/etcd-io/etcd/issues/14370 .
When run in a single-voter configuration, prior to this PR
raft would emit `Ready`s that contained a proposed `Entry`
simultaneously in `CommittedEntries` and `Entries`.
To be correct, this requires users of the raft library to enforce an
ordering between appending to the log and notifying the client about
`CommittedEntries` also present in `Entries`. This was easy to miss.
Walk back this behavior to arrive at a simpler contract: what's
emitted in `CommittedEntries` is truly committed, i.e. present
in stable storage on a quorum of voters.
This in turn pessimizes the single-voter case: rather than fully
handling an `Entry` in just one `Ready`, now two are required,
and in particular one has to do extra work to save on allocations.
We accept this as a good tradeoff, since raft primarily serves
multi-voter configurations.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00
Tobias Grieger
21be9fa337
raft: add single_node InteractionEnv test case
...
Showcases the current behavior and the changes made in future commits for [^1].
The test demonstrates that a single-voter raft instance will emit an
entry as committed while it still needs to be appended to the log.
[^1]: https://github.com/etcd-io/etcd/issues/14370
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
2022-09-20 08:59:37 +02:00