1331 Commits

Author SHA1 Message Date
Benjamin Wang
1683231216
Merge pull request #15667 from fuweid/deflake-issue-15545-TestV3WatchRestoreSnapshotUnsync
tests: deflake TestV3WatchRestoreSnapshotUnsync
2023-04-11 06:00:42 +08:00
Wei Fu
536953ec6c tests: deflake TestV3WatchRestoreSnapshotUnsync
TestV3WatchRestoreSnapshotUnsync sets up a three-member cluster.
After a leader is elected and before any client update requests are
served, each member's log is at index 8: 3 x ConfChange +
3 x ClusterMemberAttrSet + 1 x ClusterVersionSet.

Based on the config (SnapshotCount: 10, CatchUpCount: 5), the test needs
to issue enough update requests to trigger a snapshot at least twice.

T1: L(snapshot-index: 11, compacted-index:  6) F_m0(index: 8)
T2: L(snapshot-index: 22, compacted-index: 17) F_m0(index: 8, out of date)

After member0 recovers from the network partition, it rejects the
leader's append request and returns a hint (index:8, term:x). If this
happens after the second snapshot, the leader finds that index:8 has
already been compacted and is forced to send a snapshot.

However, the client only issues 15 update requests, and the leader
doesn't finish the second snapshot in time. Since the latest
compacted-index is still 6, the leader can replicate index:9 to member0
from its log instead of sending a snapshot.
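
The arithmetic behind T1 and T2 can be sketched as follows. This is a
simplified model built only from the numbers in this message, not the
server's actual code: the names snapshotPoints and needsSnapshot are
hypothetical, and the rule "snapshot once more than SnapshotCount entries
have been applied since the last snapshot, then compact CatchUpCount
entries behind it" is an assumption chosen to match the indexes above.

```go
package main

import "fmt"

// snapshotPoint is a hypothetical record of one snapshot and the
// compaction that follows it.
type snapshotPoint struct {
	SnapshotIndex  uint64
	CompactedIndex uint64
}

// snapshotPoints replays applied indexes 1..lastApplied and records a
// snapshot whenever more than snapshotCount entries have been applied
// since the previous snapshot (assumed trigger rule).
func snapshotPoints(lastApplied, snapshotCount, catchUpCount uint64) []snapshotPoint {
	var points []snapshotPoint
	var snapIndex uint64
	for applied := uint64(1); applied <= lastApplied; applied++ {
		if applied-snapIndex <= snapshotCount {
			continue
		}
		snapIndex = applied
		// Keep catchUpCount entries behind the snapshot for slow followers.
		compacted := uint64(1)
		if snapIndex > catchUpCount {
			compacted = snapIndex - catchUpCount
		}
		points = append(points, snapshotPoint{snapIndex, compacted})
	}
	return points
}

// needsSnapshot reports whether a follower whose log matches the leader
// up to followerMatch can no longer be caught up from the leader's log.
func needsSnapshot(followerMatch, compactedIndex uint64) bool {
	return followerMatch < compactedIndex
}

func main() {
	// 8 bootstrap entries plus 15 client updates, as in the failing run.
	for _, p := range snapshotPoints(23, 10, 5) {
		fmt.Printf("snapshot-index=%d compacted-index=%d follower(index 8) needs snapshot: %v\n",
			p.SnapshotIndex, p.CompactedIndex, needsSnapshot(8, p.CompactedIndex))
	}
}
```

Under this model only the second compaction (compacted-index 17) puts
index 8 out of reach, which is why the test depends on that compaction
having completed before member0 rejoins.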

```bash
cd tests/integration
CLUSTER_DEBUG=true go test -v -count=1 -run TestV3WatchRestoreSnapshotUnsync ./
...

INFO    m2.raft 3da8ba707f1a21a4 became leader at term 2        {"member": "m2"}
...
INFO    m2      triggering snapshot     {"member": "m2", "local-member-id": "3da8ba707f1a21a4", "local-member-applied-index": 22, "local-member-snapshot-index": 11, "local-member-snapshot-count": 10, "snapshot-forced": false}
...

cluster.go:1359: network partition between: 99626fe5001fde8b <-> 1c964119da6db036
cluster.go:1359: network partition between: 99626fe5001fde8b <-> 3da8ba707f1a21a4
cluster.go:416: WaitMembersForLeader

INFO    m0.raft 99626fe5001fde8b became follower at term 2      {"member": "m0"}
INFO    m0.raft raft.node: 99626fe5001fde8b elected leader 3da8ba707f1a21a4 at term 2   {"member": "m0"}
DEBUG   m2.raft 3da8ba707f1a21a4 received MsgAppResp(rejected, hint: (index 8, term 2)) from 99626fe5001fde8b for index 23      {"member": "m2"}
DEBUG   m2.raft 3da8ba707f1a21a4 decreased progress of 99626fe5001fde8b to [StateReplicate match=8 next=9 inflight=15]  {"member": "m2"}

DEBUG   m0      Applying entries        {"member": "m0", "num-entries": 15}
DEBUG   m0      Applying entry  {"member": "m0", "index": 9, "term": 2, "type": "EntryNormal"}

....

INFO    m2      saved snapshot  {"member": "m2", "snapshot-index": 22}
INFO    m2      compacted Raft logs     {"member": "m2", "compact-index": 17}
```

To fix this issue, the patch uses a log monitor to watch for "compacted
Raft logs" and expects each of the two remaining members to compact its
log twice.

Fixes: #15545

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-10 22:27:58 +08:00
Marek Siarkowicz
7153a8f2f4
Merge pull request #15646 from serathius/robustness-readme-watch-issue
tests/robustness: Document analysing watch issue
2023-04-07 23:45:42 +02:00
Marek Siarkowicz
a5a5862e0b tests: Make using etcdctl explicit in e2e tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-06 13:29:37 +02:00
Benjamin Wang
8b1cd036ff security: remove password after authenticating the user
fix https://nvd.nist.gov/vuln/detail/CVE-2021-28235

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 17:11:54 +08:00
Benjamin Wang
801bb4c6df test: add an e2e test to reproduce https://nvd.nist.gov/vuln/detail/CVE-2021-28235
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 16:47:31 +08:00
Benjamin Wang
2d0d3c3fdf security: bump go to 1.19.8 to fix four CVEs
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 13:38:58 +08:00
Marek Siarkowicz
2d9aeec91f
Merge pull request #15645 from serathius/tests-cleanup-alternative-binaries
tests/framework: Cleanup alternative binaries in e2e tests
2023-04-06 07:33:17 +02:00
Marek Siarkowicz
540d012e5e tests/robustness: Ensure that etcdctl binary is provided
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-05 23:04:20 +02:00
Marek Siarkowicz
1e41d95ab2 tests/robustness: Document analysing watch issue
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-05 22:40:47 +02:00
Marek Siarkowicz
651873cf7b tests/framework: Cleanup alternative binaries in e2e tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-05 15:32:31 +02:00
Peter Wortmann
42a2643df9 tests/robustness: Reproduce issue #15220
This issue is somewhat easily reproduced simply by bombarding the
server with requests for progress notifications, which eventually
leads to one being delivered ahead of the payload message. This is
then caught by the watch response validation code previously added by
Marek Siarkowicz.
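
The property being violated can be sketched as a small check. This is
not the validation code referred to above; the watchResponse type and
validateProgress function are hypothetical, and the rule assumed here is
that a progress notification for revision R promises that every event at
or below R has already been delivered.

```go
package main

import "fmt"

// watchResponse is a hypothetical, simplified stand-in for one response
// received on a watch stream.
type watchResponse struct {
	progressNotify bool    // true for a progress notification
	revision       int64   // revision carried in the response header
	eventRevisions []int64 // revisions of the events in this response
}

// validateProgress returns an error if an event arrives at or below a
// revision that an earlier progress notification already announced.
func validateProgress(responses []watchResponse) error {
	var lastProgress int64
	for i, resp := range responses {
		for _, rev := range resp.eventRevisions {
			if rev <= lastProgress {
				return fmt.Errorf("response %d: event at revision %d delivered after progress notification for revision %d",
					i, rev, lastProgress)
			}
		}
		if resp.progressNotify && resp.revision > lastProgress {
			lastProgress = resp.revision
		}
	}
	return nil
}

func main() {
	// The notification for revision 5 overtakes the event at revision 5.
	raced := []watchResponse{
		{revision: 4, eventRevisions: []int64{4}},
		{progressNotify: true, revision: 5},
		{revision: 5, eventRevisions: []int64{5}},
	}
	fmt.Println(validateProgress(raced))
}
```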

Signed-off-by: Peter Wortmann <peter.wortmann@skao.int>
2023-04-05 11:23:02 +01:00
Peter Wortmann
af25936fb7 tests/integration: Demonstrate manual progress notification race
This fails almost every time, because the progress notification
request catches the watcher in an unsynchronised state.

Signed-off-by: Peter Wortmann <peter.wortmann@skao.int>
2023-04-05 11:19:07 +01:00
Marek Siarkowicz
5bae6b1e44 tests/robustness: Detect trigger timeout and exit
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-04 15:23:58 +02:00
James Blair
1227754284 Cancel watch if cluster not healthy before or after injecting failpoints.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-04-04 13:58:17 +02:00
Marek Siarkowicz
6582e349db tests: Enforce timeout on failpoints
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-04 12:25:07 +02:00
Marek Siarkowicz
523f235c82
Merge pull request #15603 from serathius/robustness-finish-with-success
tests: Ensure that operation history finishes with successful request
2023-04-04 12:03:36 +02:00
Benjamin Wang
32acc662c9
Merge pull request #15638 from ahrtr/dependency_20230404
Bump some dependencies
2023-04-04 17:11:26 +08:00
Marek Siarkowicz
6a5d326519 tests: Ensure that operation history finishes with successful request
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-04 09:40:17 +02:00
Marek Siarkowicz
5e0119eadc
Merge pull request #15636 from lavacat/main-test-watch-delay
tests: increase maxWatchDelay to prevent flaky TestWatchDelay*
2023-04-04 09:38:03 +02:00
Marek Siarkowicz
138fae6246
Merge pull request #15632 from serathius/fix-comparing-etcd-version
tests: Fix comparing etcd version
2023-04-04 09:34:55 +02:00
Marek Siarkowicz
8b6bf90c0d
Merge pull request #15580 from chaochn47/fix_flaking_auth_member_remove_test
fix flaking auth member remove test
2023-04-04 09:34:16 +02:00
Marek Siarkowicz
4fab20aa75
Merge pull request #15618 from serathius/robustness-fix-periodic-etcd-version
tests: Fix building incorrect etcd version and make switch strict
2023-04-04 09:30:20 +02:00
Benjamin Wang
072c5cb5da dependency: bump google.golang.org/protobuf from 1.28.1 to 1.30.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-04 15:28:09 +08:00
Benjamin Wang
56284d5dfe dependency: bump github.com/golang/protobuf from 1.5.2 to 1.5.3
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-04 15:21:22 +08:00
Benjamin Wang
0c66fc9f29 dependency: bump go.uber.org/multierr from 1.9.0 to 1.11.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-04 15:15:32 +08:00
Bogdan Kanivets
757910958e tests: increase maxWatchDelay to prevent flaky TestWatchDelay*
value is selected empirically after spot checking some logs of flaky workflows

fixes: https://github.com/etcd-io/etcd/issues/15634
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2023-04-03 21:49:36 -07:00
Chao Chen
caed563e08 fix flaking auth member remove test
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-04-03 17:41:08 -07:00
Marek Siarkowicz
69afcd1960 tests: Fix comparing etcd version
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 21:13:36 +02:00
Marek Siarkowicz
6f4e5f316e
Merge pull request #15592 from serathius/cleanup-endpoints
tests: Cleanup endpoints
2023-04-03 16:00:44 +02:00
Marek Siarkowicz
9c72ecb1f9 tests: Fix building incorrect etcd version and make switch strict
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 15:06:10 +02:00
Benjamin Wang
e57dcd5ceb test: fix typo in robustness test
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-03 18:46:32 +08:00
Marek Siarkowicz
0cbd56e8b6 tests: Cleanup endpoints
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 12:18:54 +02:00
Marek Siarkowicz
7c7f636aea
Merge pull request #15615 from serathius/robustness-snapshot-older-version
tests/robustness: Support running snapshot tests on older versions
2023-04-03 12:13:01 +02:00
Marek Siarkowicz
029315f57e tests/robustness: Support running snapshot tests on older versions
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-03 10:43:06 +02:00
Hitoshi Mitake
4da39e4b1e
Merge pull request #15294 from mitake/range-check
server/auth: disallow creating empty permission ranges
2023-04-03 09:03:50 +09:00
Marek Siarkowicz
03214c0239 Revert "tests/robustness: Disable testing network blackhole until #15595 is fixed"
This reverts commit 013e25fab9f76f0c1a00459555fe42b33f379eb9.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-01 16:32:20 +02:00
Marek Siarkowicz
71ba0873e3 tests/robustness: Encrypt peer traffic to prevent proxy manipulating packets
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-01 16:31:53 +02:00
Marek Siarkowicz
4529f01876
Merge pull request #15601 from serathius/robustness-disable-blackhole
tests/robustness: Disable testing network blackhole until #15595 is fixed
2023-03-31 15:04:24 +02:00
Marek Siarkowicz
013e25fab9 tests/robustness: Disable testing network blackhole until #15595 is fixed
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-31 13:55:58 +02:00
Marek Siarkowicz
be7be34800 client: Hide v2 client package
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-31 10:26:11 +02:00
Marek Siarkowicz
e11a32366e
Merge pull request #15544 from jmhbnz/remove_e2e_calc
Remove e2e from coverage calculation
2023-03-30 16:26:36 +02:00
Marek Siarkowicz
0bd0b6b0b5
Merge pull request #15446 from serathius/separate-grpc-server
Allow user to separate http and grpc server
2023-03-30 11:52:25 +02:00
James Blair
870d478844
Merge e2e spawn files.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-30 22:38:00 +13:00
Marek Siarkowicz
4340cbb4aa
Merge pull request #15575 from serathius/ensure-watch
tests: Ensure watch catches all events generated in traffic
2023-03-30 10:28:22 +02:00
Marek Siarkowicz
65add8cec4 tests: Test separate http port connection multiplexing
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 09:49:45 +02:00
Marek Siarkowicz
bf12179a5a server: Add --listen-client-http-urls flag to allow running grpc server separate from http server
The difference in load configuration for the watch delay tests shows how
large the impact is. Even with the random write scheduler, grpc served
under the http server can only handle 500 KB with a 2 second delay. A
separate grpc server, on the other hand, easily handles 10, 100 or even
1000 MB within 100 milliseconds.

The priority write scheduler that was used in most previous releases
is far worse than the random one.

Tests are configured with only 5 MB to avoid flakes and to avoid taking
too long to fill etcd.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-30 09:49:45 +02:00
James Blair
5faad23812
Merge branch 'main' into remove_e2e_calc
2023-03-30 16:46:31 +13:00
James Blair
4b87bb1852
Remove coverage implementation for ctl_v3_watch test.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-30 15:44:17 +13:00
James Blair
3c40a68d09
Remove nocov flags for e2e tests.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-30 15:37:09 +13:00