183 Commits

Author SHA1 Message Date
Marek Siarkowicz
9fc438cb6b tests/robustness: Add List and StaleList requests to etcd traffic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-21 18:03:41 +02:00
Marek Siarkowicz
fd3e338d88
Merge pull request #16115 from serathius/robustness-kubernetes-tune
tests/robustness: Tune Kubernetes tests to reduce number of delete requests
2023-06-20 11:16:02 +02:00
Marek Siarkowicz
519617cfd0 tests/robustness: Tune Kubernetes tests to reduce number of delete requests
Having too many delete requests is bad as they are not unique requests, so
linearization is more prone to timeout on them.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-20 09:45:23 +02:00
Marek Siarkowicz
1217548acf tests/robustness: Separate traffic name from cluster setup in test name
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-20 09:16:36 +02:00
Marek Siarkowicz
9c659eb4e0
Merge pull request #16072 from serathius/robustness-stale-read
Validate stale read
2023-06-19 18:22:08 +02:00
Marek Siarkowicz
1663600bec tests/robustness: Validate stale get requests by replaying etcd state
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-19 14:17:38 +02:00
Marek Siarkowicz
09b9f889e7 tests/robustness: Refactor etcd traffic client
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-19 12:08:17 +02:00
Marek Siarkowicz
b7e7811ba4
Merge pull request #16091 from serathius/robustness-stale-read-1
tests/robustness: Implement stale reads without validation
2023-06-19 11:21:48 +02:00
Marek Siarkowicz
5e7349b44c
Merge pull request #16094 from serathius/robustness-retry-failpoint
Robustness retry failpoint
2023-06-19 09:10:46 +02:00
Marek Siarkowicz
43b2477c28 tests/robustness: Retry injecting failpoint
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-17 17:30:20 +02:00
Marek Siarkowicz
96987d8b5e tests/robustness: Implement stale reads without validation
For now we just validate the stale read revision, but not the response content.
The reason is that the etcd model only stores the latest version of keys, and
no history like real etcd.

Validating stale read contents needs to be done outside of the model,
as storing the whole history is just too costly for linearization validation.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-16 21:17:37 +02:00
Marek Siarkowicz
6f2a5b710f
Merge pull request #16096 from serathius/robustness-limit-to-fresh-state
tests/robustness: Limit model to start only from fresh state
2023-06-16 21:15:48 +02:00
Marek Siarkowicz
57258759c6
Merge pull request #16085 from serathius/robustness-disable-blackhole
tests/robustness: Disable blackhole until snapshot for v3.5 and v3.4
2023-06-16 21:15:26 +02:00
Marek Siarkowicz
ea3255b477 tests/robustness: Limit model to start only from fresh state
It is just too complicated to support starting from non-empty etcd.
The existing implementation was very naive to assume that we can build
the full state from just one request. We might consider implementing
validation of non-empty history in the future, but for now setting
this limit should clean up the code and speed up development.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-16 13:50:20 +02:00
Marek Siarkowicz
fb16bca44a tests/robustness: Disable blackhole until snapshot for v3.5 and v3.4
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-16 13:26:24 +02:00
Marek Siarkowicz
34cbf4cd6f tests/robustness: Allow errors and unknown responses in deterministic model
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-15 22:25:48 +02:00
Marek Siarkowicz
6979318108 tests/robustness: Make Range a proper request type to allow setting Range.Revision != 0 for stale reads
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-14 13:55:45 +02:00
Marek Siarkowicz
974655e02c tests/robustness: Rename operations const to separate from RequestType
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-14 13:24:14 +02:00
Marek Siarkowicz
da49157b20
Merge pull request #16066 from serathius/robusness-validate
tests/robustness: Extract validation to separate package
2023-06-14 11:54:46 +02:00
Marek Siarkowicz
7bbc738ec4 tests/robustness: Extract validation to separate package
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-06-14 09:14:27 +02:00
Marek Siarkowicz
a268a67e45 tests/robustness: Move get to list of randomized operations
2023-06-13 21:09:05 +02:00
Marek Siarkowicz
a6ab774458
Merge pull request #16044 from serathius/robusness-empty
tests/robustness: Assume starting from empty etcd instead of throwing out first failed request
2023-06-12 10:18:34 +02:00
Marek Siarkowicz
b366cda70f
Merge pull request #16046 from serathius/robusness-test-diff
tests/robustness: Provide a response diff in model test to make debugging easier
2023-06-12 10:15:43 +02:00
Marek Siarkowicz
f410c6e6df tests/robustness: Provide a response diff in model test to make debugging easier
Signed-off-by: Marek Siarkowicz <serathius@users.noreply.github.com>
2023-06-09 22:42:17 +02:00
Marek Siarkowicz
53af854871 tests/robustness: Assume starting from empty etcd instead of throwing out first failed request
Signed-off-by: Marek Siarkowicz <serathius@users.noreply.github.com>
2023-06-09 22:38:16 +02:00
Marek Siarkowicz
f91f6d8414 tests/robustness: Put traffic type on second place before cluster size in test name
Signed-off-by: Marek Siarkowicz <serathius@users.noreply.github.com>
2023-06-09 22:30:53 +02:00
Marek Siarkowicz
16bf0f6641 tests/robustness: Use traffic.RecordingClient in watch
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-25 22:17:23 +02:00
Marek Siarkowicz
4872b679a5 tests/robustness: Expect revisions to be unique for Kubernetes Traffic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-23 15:51:10 +02:00
Marek Siarkowicz
f3c9db9c46
Merge pull request #15893 from serathius/robustness-validate-client-watch
tests/robustness: Validate all etcd watches opened to etcd
2023-05-16 10:51:55 +02:00
Marek Siarkowicz
6429f47631 tests/robustness: Validate all etcd watches opened to etcd
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-16 10:28:01 +02:00
Marek Siarkowicz
112aad1ea7 tests/robustness: Unify model test cases
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-16 10:13:08 +02:00
Marek Siarkowicz
6e53792568 tests/robustness: Implement Kubernetes optimistic concurrency operations
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-15 13:45:27 +02:00
Marek Siarkowicz
911c40a347 tests/robustness: Implement kubernetes list watch protocol
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-15 10:11:05 +02:00
Bogdan Kanivets
c338882d7a tests/robustness: use monotonic clock for watch events
see: https://github.com/etcd-io/etcd/pull/15323
For consistency watch events should also use only time-measurement operations.

fixes: https://github.com/etcd-io/etcd/issues/15328
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2023-05-14 12:58:13 -07:00
Marek Siarkowicz
2a0c989662
Merge pull request #15882 from serathius/robustness-txn-fields
tests/robustness: Improve naming of Txn fields
2023-05-12 13:34:02 +02:00
Marek Siarkowicz
831ce4c3cf tests/robustness: Improve naming of Txn fields
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-12 13:10:25 +02:00
Marek Siarkowicz
e9900f6fff tests/robustness: Separate stream id from client id and improve AppendableHistory doc
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-11 21:03:52 +02:00
Marek Siarkowicz
9a922091ed
Merge pull request #15873 from serathius/robustness-safeguards
tests/robustness: Add safeguards to client and history
2023-05-11 13:37:42 +02:00
Marek Siarkowicz
962e15038e tests/robustness: Add safeguards to client and history
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-11 13:12:09 +02:00
Marek Siarkowicz
165a76b506 tests/robustness: Fix pointer causing all cluster tests to use Kubernetes traffic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-10 16:08:08 +02:00
Marek Siarkowicz
dd248518d1 tests/robustness: Move request progress field from traffic to watch config and pass testScenario to reduce number of arguments
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-10 11:43:02 +02:00
Marek Siarkowicz
ad20230e07 test/robustness: Create dedicated traffic package
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-09 10:50:13 +02:00
Marek Siarkowicz
f6161673af
Merge pull request #15851 from serathius/robustness-generic
tests/robustness: Make weighted pick random generic
2023-05-09 10:36:11 +02:00
Marek Siarkowicz
b14b468661 tests/robustness: Make weighted pick random generic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-08 19:58:38 +02:00
Marek Siarkowicz
7c68be4cf3 tests/robustness: Implement Range limit and count
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-07 09:32:07 +02:00
Marek Siarkowicz
40f71ef3c6 tests/robustness: Implement delete request for kubernetes scenario
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-05 13:40:46 +02:00
Marek Siarkowicz
92366a5338 tests/robustness: Split model code into deterministic and non-deterministic
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
Co-authored-by: chao <54131596+chaochn47@users.noreply.github.com>
2023-05-05 12:25:10 +02:00
Marek Siarkowicz
cfe154209c tests/robustness: Separate describe model functions to dedicated file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-04 14:03:18 +02:00
Marek Siarkowicz
9b5680c5f1 tests/robustness: Implement first step in validating the Kubernetes-etcd contract.
* Use mod revision for optimistic concurrency.
* Introduce range requests as more general than get.
* Add Kubernetes-specific traffic generation, for now using pull, but
  expected to evolve to use watch.
* Introduce a Kubernetes-specific test scenario.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-05-04 13:26:54 +02:00
Wei Fu
09d053e035 tests/robustness: tune timeout policy
In a [scheduled test][1], the error shows

```
2023-04-19T11:16:15.8166316Z     traffic.go:96: rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout
```

According to [grpc-keepalive@v1.51.0][2], each frame from the server
refreshes `lastRead`, so the client won't fire a `Ping` frame to the
server. But the client used by the [`tombstone` request][3] might hit a
race. Since we use 5ms as the timeout, the client might not receive the
result of the `Ping` from the server in time. The keepalive will then
mark the connection as timed out and close it.

I couldn't reproduce it locally. If we add a sleep before updating
`lastRead`, it reproduces sometimes. Still investigating this
part.

```diff
diff --git a/internal/transport/http2_client.go b/internal/transport/http2_client.go
index d518b07e..bee9c00a 100644
--- a/internal/transport/http2_client.go
+++ b/internal/transport/http2_client.go
@@ -1560,6 +1560,7 @@ func (t *http2Client) reader(errCh chan<- error) {
                t.controlBuf.throttle()
                frame, err := t.framer.fr.ReadFrame()
                if t.keepaliveEnabled {
+                       time.Sleep(2 * time.Millisecond)
                        atomic.StoreInt64(&t.lastRead, time.Now().UnixNano())
                }
                if err != nil {
```

`DialKeepAliveTime` is always >= [10s][4]. I think we should increase
the timeout to avoid flakiness caused by an unstable environment.

And in a [scheduled test][5], the error shows

```
logger.go:130: 2023-04-22T10:45:52.646Z	INFO	Failed to trigger failpoint	{"failpoint": "blackhole", "error": "context deadline exceeded"}
```

Before sending `Status` to the member, the client doesn't [pick][6] the
connection in time (100ms) and returns an error.

`waitTillSnapshot` is used to ensure that conditions are good enough to
trigger a snapshot transfer. Since we already have a 1min timeout for
injectFailpoints, I think we can remove the 100ms timeout to avoid
stopping unnecessarily.

```
injectFailpoints(1min timeout)
  failpoint.Inject
    triggerBlackhole.Trigger
      blackhole
        waitTillSnapshot
```

> NOTE: I didn't reproduce it either. :(

Reference:

[1]: <https://github.com/etcd-io/etcd/actions/runs/4741737098/jobs/8419176899>
[2]: <eeb9afa1f6/internal/transport/http2_client.go (L1647)>
[3]: <7450cd886d/tests/robustness/traffic.go (L94)>
[4]: <eeb9afa1f6/dialoptions.go (L445)>
[5]: <https://github.com/etcd-io/etcd/actions/runs/4772033408/jobs/8484334015>
[6]: <eeb9afa1f6/clientconn.go (L932)>

REF: #15763

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-04-29 07:03:47 +08:00