1488 Commits

Author SHA1 Message Date
James Blair
5a5b5a1c5d
dependency: bump github.com/prometheus/client_golang from 1.15.0 to 1.15.1
Signed-off-by: James Blair <mail@jamesblair.net>
2023-05-16 09:26:44 +12:00
James Blair
1798730cd8
dependency: bump golang.org/x/crypto from 0.8.0 to 0.9.0
Signed-off-by: James Blair <mail@jamesblair.net>
2023-05-16 08:39:19 +12:00
Marek Siarkowicz
6e53792568 tests/robustness: Implement Kubernetes optimistic concurrency operations
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-15 13:45:27 +02:00
Marek Siarkowicz
911c40a347 tests/robustness: Implement kubernetes list watch protocol
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-15 10:11:05 +02:00
Marek Siarkowicz
05ed91d76d
Merge pull request #15892 from lavacat/main-mono-clock-watch
tests/robustness: use monotonic clock for watch events
2023-05-15 09:25:41 +02:00
Benjamin Wang
2df32102ca
Merge pull request #15835 from yellowzf/grpcproxy_fix_memberlist_results_not_update_when_proxy_node_down
grpcproxy: fix memberlist results not update when proxy node down
2023-05-15 13:34:05 +08:00
yellowzf
ca221208d2 grpcproxy: fix memberlist results not update when proxy node down
If start grpc proxy with --resolver-prefix, memberlist will return all alive proxy nodes, when one grpc proxy node is down, it is expected to not return the down node, but it is still return

Signed-off-by: yellowzf <zzhf3311@163.com>
2023-05-15 10:59:02 +08:00
Bogdan Kanivets
c338882d7a tests/robustness: use monotonic clock for watch events
see: https://github.com/etcd-io/etcd/pull/15323
For consistency watch events should also use only time-measurement operations.

fixes: https://github.com/etcd-io/etcd/issues/15328
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2023-05-14 12:58:13 -07:00
Benjamin Wang
52dfd4bbed
Merge pull request #15867 from chaochn47/auth_test_split_8
migrate e2e auth tests to common #8
2023-05-13 14:21:37 +08:00
Chao Chen
c846b087db migrate e2e auth tests to common #8
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-05-12 22:51:47 -07:00
Marek Siarkowicz
2a0c989662
Merge pull request #15882 from serathius/robustness-txn-fields
tests/robustness: Improve naming of Txn fields
2023-05-12 13:34:02 +02:00
Marek Siarkowicz
831ce4c3cf tests/robustness: Improve naming of Txn fields
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-12 13:10:25 +02:00
Benjamin Wang
67ec1a4d30
Merge pull request #15862 from pchan/bump_dependency
dependency: bump dependabot dependencies
2023-05-12 06:59:47 +08:00
Marek Siarkowicz
e9900f6fff tests/robustness: Separate stream id from client id and improve AppendableHistory doc
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-11 21:03:52 +02:00
Marek Siarkowicz
9a922091ed
Merge pull request #15873 from serathius/robustness-safeguards
tests/robustness: Add safeguards to client and history
2023-05-11 13:37:42 +02:00
Marek Siarkowicz
962e15038e tests/robustness: Add safeguards to client and history
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-11 13:12:09 +02:00
Marek Siarkowicz
d77c618e6a
Merge pull request #15874 from serathius/robustness-fix-traffic-pointer
tests/robustness: Fix pointer causing all cluster tests using kuberne…
2023-05-11 11:16:21 +02:00
Benjamin Wang
05b663fbe8
Merge pull request #15828 from chaochn47/add_leadership_transfer_coverage
tests/e2e: add graceful shutdown test
2023-05-11 07:39:25 +08:00
Benjamin Wang
e3db9dc616
Merge pull request #15868 from jmhbnz/main
tests: Deflake TestEtcdGrpcResolverRoundRobin
2023-05-11 05:08:53 +08:00
Marek Siarkowicz
165a76b506 tests/robustness: Fix pointer causing all cluster tests using kubernetes traffic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-10 16:08:08 +02:00
Marek Siarkowicz
dd248518d1 tests/robustness: Move request progress field from traffic to watch config and pass testScenario to reduce number of arguments
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-10 11:43:02 +02:00
James Blair
3f5ad36039
Deflake TestEtcdGrpcResolverRoundRobin.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-05-10 21:03:01 +12:00
Chao Chen
f31d0eafb9 tests/e2e: add graceful shutdown test
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-05-09 17:08:53 -07:00
Benjamin Wang
b404d25d84
Merge pull request #15741 from AngstyDuck/set-default-value-for-AutoCompactionMode
server: default value for config file field auto-compaction-mode is n…
2023-05-10 05:44:16 +08:00
AngstyDuck
a7344da7d3 server: default value for config file field auto-compaction-mode is now 'periodic'; added additional checks if auto-compaction-mode is undefined
Signed-off-by: AngstyDuck <solsticedante@gmail.com>
2023-05-09 23:10:44 +08:00
Prasad Chandrasekaran
c863f1f8c0 dependency: bump dependabot dependencies
Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2023-05-09 18:38:35 +05:30
Marek Siarkowicz
ad20230e07 test/robustness: Create dedicated traffic package
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-09 10:50:13 +02:00
Marek Siarkowicz
f6161673af
Merge pull request #15851 from serathius/robustness-generic
tests/robustness: Make weighted pick random generic
2023-05-09 10:36:11 +02:00
Marek Siarkowicz
b14b468661 tests/robustness: Make weighted pick random generic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-08 19:58:38 +02:00
Marek Siarkowicz
7c68be4cf3 tests/robustness: Implement Range limit and count
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-07 09:32:07 +02:00
Marek Siarkowicz
40f71ef3c6 tests/robustness: Implement delete request for kubernetes scenario
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-05 13:40:46 +02:00
Marek Siarkowicz
92366a5338 tests/robustness: Split model code into deterministic and non-deterministic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
Co-authored-by: chao <54131596+chaochn47@users.noreply.github.com>
2023-05-05 12:25:10 +02:00
Marek Siarkowicz
cfe154209c tests/robustness: Separate describe model functions to dedicated file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-04 14:03:18 +02:00
Marek Siarkowicz
2c16812841
Merge pull request #15817 from serathius/robustness-k8s-1
tests/robustness: Implement first step in validating the Kubernetes-etcd contract
2023-05-04 13:52:25 +02:00
Marek Siarkowicz
9b5680c5f1 tests/robustness: Implement first step in validating the Kubernetes-etcd contract.
* Use mod revision for optimistic concurrency.
* Introduce range requests as more general then get
* Add kubernetes specific traffic generation, for now using pull, but
  expected to evolve to use watch.
* Introduce kubernetes specific test scenario

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-04 13:26:54 +02:00
Hitoshi Mitake
49b59cc8e5
Merge pull request #15656 from mitake/lease-timetolive-auth
protect LeaseTimeToLive with RBAC
2023-05-02 23:02:29 +09:00
Benjamin Wang
b089474b01
Merge pull request #15795 from jmhbnz/deflake-roundrobin-resolver-test
tests: Deflake TestEtcdGrpcResolverRoundRobin
2023-05-02 06:09:46 +08:00
James Blair
b9533ca98b
Deflake TestEtcdGrpcResolverRoundRobin.
Increase request to 1000 to increase sample size/reduce variability and increase tolerance threshold from 10 to 15%.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-04-29 14:14:16 +12:00
Wei Fu
09d053e035 tests/robustness: tune timeout policy
In a [scheduled test][1], the error shows

```
2023-04-19T11:16:15.8166316Z     traffic.go:96: rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout
```

According to [grpc-keepalive@v1.51.0][2], each frame from server will
fresh the `lastRead` and it won't file `Ping` frame to server. But the
client used by [`tombstone` request][3] might hit the race. Since we use
5ms as timeout, the client might not receive the result of `Ping` from
server in time. The keepalive will mark it timeout and close the
connection.

I didn't reproduce it in my local. If we add the sleep before update
`lastRead`, it can reproduce it sometimes. Still investigating this
part.

```diff
diff --git a/internal/transport/http2_client.go b/internal/transport/http2_client.go
index d518b07e..bee9c00a 100644
--- a/internal/transport/http2_client.go
+++ b/internal/transport/http2_client.go
@@ -1560,6 +1560,7 @@ func (t *http2Client) reader(errCh chan<- error) {
                t.controlBuf.throttle()
                frame, err := t.framer.fr.ReadFrame()
                if t.keepaliveEnabled {
+                       time.Sleep(2 * time.Millisecond)
                        atomic.StoreInt64(&t.lastRead, time.Now().UnixNano())
                }
                if err != nil {
```

`DialKeepAliveTime` is always >= [10s][4]. I think we should increase
the timeout to avoid flaky caused by unstable env.

And in a [scheduled test][5], the error shows

```
logger.go:130: 2023-04-22T10:45:52.646Z	INFO	Failed to trigger failpoint	{"failpoint": "blackhole", "error": "context deadline exceeded"}
```

Before sending `Status` to member, the client doesn't [pick][6] the
connection in time (100ms) and returns the error.

The `waitTillSnapshot` is used to ensure that it is good enough to
trigger snapshot transfer. And we have 1min timeout for
injectFailpoints, so I think we can remove the 100ms timeout to reduce
unnecessary stop.

```
injectFailpoints(1min timeout)
  failpoint.Inject
    triggerBlockhole.Trigger
      blackhole
        waitTillSnapshot
```

> NOTE: I didn't reproduce it either. :(

Reference:

[1]: <https://github.com/etcd-io/etcd/actions/runs/4741737098/jobs/8419176899>
[2]: <eeb9afa1f6/internal/transport/http2_client.go (L1647)>
[3]: <7450cd886d/tests/robustness/traffic.go (L94)>
[4]: <eeb9afa1f6/dialoptions.go (L445)>
[5]: <https://github.com/etcd-io/etcd/actions/runs/4772033408/jobs/8484334015>
[6]: <eeb9afa1f6/clientconn.go (L932)>

REF: #15763

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-29 07:03:47 +08:00
Benjamin Wang
c7d81acaf0 test: forcibly save data on pinicking
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-27 14:54:35 +08:00
Hitoshi Mitake
c9b368119e tests: e2e and integration test for timetolive
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
2023-04-26 20:35:20 +09:00
James Blair
18e3acae0e
Add new test for round robin resolver.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-04-25 18:44:24 +12:00
Benjamin Wang
4a8817bfb0
Merge pull request #15737 from jmhbnz/update-dependencies
Bump dependencies identified by dependabot
2023-04-21 06:35:08 +08:00
James Blair
04f3e9cb9a
dependency: bump golang.org/x/crypto from 0.7.0 to 0.8.0
Signed-off-by: James Blair <mail@jamesblair.net>
2023-04-21 05:34:21 +12:00
James Blair
042e2e9a57
dependency: bump github.com/prometheus/client_golang from 1.14.0 to 1.15.0
Signed-off-by: James Blair <mail@jamesblair.net>
2023-04-21 05:14:40 +12:00
Wei Fu
50aa00b203 tests: make log monitor as common helper
It's followup of #15667.

This patch is to use zaptest/observer as base to provide a similar
function to pkg/expect.Expect.

The test env

```bash
11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
mkdir /sys/fs/cgroup/etcd-followup-15667
echo 0-2 | tee /sys/fs/cgroup/etcd-followup-15667/cpuset.cpus # three cores
```

Before change:

* memory.peak: ~ 681 MiB
* Elapsed (wall clock) time (h:mm:ss or m:ss): 6:14.04

After change:

* memory.peak: ~ 671 MiB
* Elapsed (wall clock) time (h:mm:ss or m:ss): 6:13.07

Based on the test result, I think it's safe to be enabled by default.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-18 09:00:24 +08:00
Wei Fu
9f034fbaa8 chore: use tools/mod to lock the cfssl cmd version
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-13 12:06:31 +08:00
Wei Fu
8cd5969248 chore: use strict mode for tests/*/*.sh
REF: #15514

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-13 12:05:39 +08:00
Wei Fu
78d2ead804 chore: deprecate tests/manual folder
REF: #15514

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-13 12:05:39 +08:00
Marek Siarkowicz
e48ef9ea6a tests/robustness: Disable blackholing traffic till snapshot for v3.4.X
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-04-12 14:39:57 +02:00