Mirroristas/etcd

mirror of https://github.com/etcd-io/etcd.git synced 2024-09-27 06:25:44 +00:00

Author	SHA1	Message	Date
Wei Fu	07effc4d0a	*: fix revive linter Remove old revive_pass in the bash scripts and migirate the revive.toml into golangci linter_settings. Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-09-24 14:21:11 +08:00
Marek Siarkowicz	11da84a1d1	tests/robustness: Implement loading client reports Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-06-28 15:35:17 +02:00
Marek Siarkowicz	26cd2bc017	tests/robustness: Store whole watch operations Want to keep watch requests to properly validate reliability of watch stream. Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-06-24 18:15:50 +02:00
Marek Siarkowicz	7bbc738ec4	tests/robustness: Extract validation to separate package Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-06-14 09:14:27 +02:00
Marek Siarkowicz	16bf0f6641	tests/robustness: Use traffic.RecordingClient in watch Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-05-25 22:17:23 +02:00
Marek Siarkowicz	4872b679a5	tests/robustness: Expect revions to be unique for Kubernetes Traffic Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-05-23 15:51:10 +02:00
Marek Siarkowicz	6429f47631	tests/robustness: Validate all etcd watches opened to etcd Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-05-16 10:28:01 +02:00
Marek Siarkowicz	911c40a347	tests/robustness: Implement kubernetes list watch protocol Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-05-15 10:11:05 +02:00
Bogdan Kanivets	c338882d7a	tests/robustness: use monotonic clock for watch events see: https://github.com/etcd-io/etcd/pull/15323 For consistency watch events should also use only time-measurement operations. fixes: https://github.com/etcd-io/etcd/issues/15328 Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>	2023-05-14 12:58:13 -07:00
Marek Siarkowicz	831ce4c3cf	tests/robustness: Improve naming of Txn fields Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-05-12 13:10:25 +02:00
Marek Siarkowicz	dd248518d1	tests/robustness: Move request progress field from traffic to watch config and pass testScenario to reduce number of arguments Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-05-10 11:43:02 +02:00
Marek Siarkowicz	92366a5338	tests/robustness: Split model code into deterministic and non-deterministic Signed-off-by: Marek Siarkowicz <siarkowicz@google.com> Co-authored-by: Benjamin Wang <wachao@vmware.com> Co-authored-by: chao <54131596+chaochn47@users.noreply.github.com>	2023-05-05 12:25:10 +02:00
Wei Fu	09d053e035	tests/robustness: tune timeout policy In a [scheduled test][1], the error shows ``` 2023-04-19T11:16:15.8166316Z traffic.go:96: rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout ``` According to [grpc-keepalive@v1.51.0][2], each frame from server will fresh the `lastRead` and it won't file `Ping` frame to server. But the client used by [`tombstone` request][3] might hit the race. Since we use 5ms as timeout, the client might not receive the result of `Ping` from server in time. The keepalive will mark it timeout and close the connection. I didn't reproduce it in my local. If we add the sleep before update `lastRead`, it can reproduce it sometimes. Still investigating this part. ```diff diff --git a/internal/transport/http2_client.go b/internal/transport/http2_client.go index d518b07e..bee9c00a 100644 --- a/internal/transport/http2_client.go +++ b/internal/transport/http2_client.go @@ -1560,6 +1560,7 @@ func (t http2Client) reader(errCh chan<- error) { t.controlBuf.throttle() frame, err := t.framer.fr.ReadFrame() if t.keepaliveEnabled { + time.Sleep(2 time.Millisecond) atomic.StoreInt64(&t.lastRead, time.Now().UnixNano()) } if err != nil { ``` `DialKeepAliveTime` is always >= [10s][4]. I think we should increase the timeout to avoid flaky caused by unstable env. And in a [scheduled test][5], the error shows ``` logger.go:130: 2023-04-22T10:45:52.646Z INFO Failed to trigger failpoint {"failpoint": "blackhole", "error": "context deadline exceeded"} ``` Before sending `Status` to member, the client doesn't [pick][6] the connection in time (100ms) and returns the error. The `waitTillSnapshot` is used to ensure that it is good enough to trigger snapshot transfer. And we have 1min timeout for injectFailpoints, so I think we can remove the 100ms timeout to reduce unnecessary stop. ``` injectFailpoints(1min timeout) failpoint.Inject triggerBlockhole.Trigger blackhole waitTillSnapshot ``` > NOTE: I didn't reproduce it either. :( Reference: [1]: <https://github.com/etcd-io/etcd/actions/runs/4741737098/jobs/8419176899> [2]: <`eeb9afa1f6/internal/transport/http2_client.go (L1647)`> [3]: <`7450cd886d/tests/robustness/traffic.go (L94)`> [4]: <`eeb9afa1f6/dialoptions.go (L445)`> [5]: <https://github.com/etcd-io/etcd/actions/runs/4772033408/jobs/8484334015> [6]: <`eeb9afa1f6/clientconn.go (L932)`> REF: #15763 Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-04-29 07:03:47 +08:00
Marek Siarkowicz	1e41d95ab2	tests/robustness: Document analysing watch issue Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-04-05 22:40:47 +02:00
Peter Wortmann	42a2643df9	tests/robustness: Reproduce issue #15220 This issue is somewhat easily reproduced simply by bombarding the server with requests for progress notifications, which eventually leads to one being delivered ahead of the payload message. This is then caught by the watch response validation code previously added by Marek Siarkowicz. Signed-off-by: Peter Wortmann <peter.wortmann@skao.int>	2023-04-05 11:23:02 +01:00
Marek Siarkowicz	6582e349db	tests: Enfoce timeout on failpoints Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-04-04 12:25:07 +02:00
Marek Siarkowicz	0cbd56e8b6	tests: Cleanup endpoints Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-04-03 12:18:54 +02:00
Marek Siarkowicz	4340cbb4aa	Merge pull request #15575 from serathius/ensure-watch tests: Ensure watch catches all events generated in traffic	2023-03-30 10:28:22 +02:00
Marek Siarkowicz	ad688b2a85	tests: Ensure watch catches all events generated in traffic Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-03-29 11:41:10 +02:00
Marek Siarkowicz	c54521156e	tests: Refactor watch validation Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-03-28 17:32:34 +02:00
Marek Siarkowicz	d475cf81a0	tests: Rename linearizability tests to robustness Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2023-02-26 14:36:18 +01:00

21 Commits