147 Commits

Author SHA1 Message Date
Madhav Jivrajani
b51a834645 tests/robustness: allow persisting result reports for successful runs
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-14 16:28:47 +05:30
Madhav Jivrajani
cdd018ad2a tests/robustness: add a robustness test for validating create events
Split off validating create events from the prevKV test.
The added test checks the following two properties:
- A create event should not exist in our past history
- A non-create event should exist in our past history

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-14 16:28:44 +05:30
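
The two rules listed in commit cdd018ad2a can be sketched as a single pass over an ordered event list. This is a minimal illustration, not the code from the commit: the `Event` type, its fields, and `validateCreateEvents` are hypothetical names, and the sketch assumes "past history" means the set of keys already seen earlier in the same list. Deletes are ignored for brevity.

```go
package main

import "fmt"

// Event is a hypothetical, simplified watch event.
// It is not the type used by the etcd robustness tests.
type Event struct {
	Key      string
	Revision int64
	IsCreate bool
}

// validateCreateEvents checks two rules against the events seen so far:
//   - a create event must not refer to a key already in the history
//   - a non-create event must refer to a key already in the history
//
// Assumption: deletes are not modelled, so a key never leaves the history.
func validateCreateEvents(events []Event) error {
	seen := make(map[string]bool)
	for i, ev := range events {
		switch {
		case ev.IsCreate && seen[ev.Key]:
			return fmt.Errorf("event %d: create for key %q that already exists in history", i, ev.Key)
		case !ev.IsCreate && !seen[ev.Key]:
			return fmt.Errorf("event %d: non-create for key %q that is missing from history", i, ev.Key)
		}
		seen[ev.Key] = true
	}
	return nil
}

func main() {
	history := []Event{
		{Key: "a", Revision: 2, IsCreate: true},
		{Key: "a", Revision: 3},
		{Key: "b", Revision: 4}, // update without a preceding create
	}
	fmt.Println(validateCreateEvents(history))
}
```

A real validator would also have to account for deletes followed by re-creation and for watches that start after a key was created; both are left out of this sketch.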
Madhav Jivrajani
4fa07a1c8a tests/robustness: make merging histories work on []PersistedEvent
Event histories after merging serve as an authoritative list of
events that can be seen as ones persisted by etcd, so we don't need
PrevValue as part of it.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-14 15:44:08 +05:30
Madhav Jivrajani
9aad6700d5 tests/robustness: add robustness test for watch with PrevKV()
Kubernetes relies on the PrevKV() option in the watches it opens
against etcd. This commit adds a robustness test to validate the
same.

A watch response returned with PrevKV() is valid if:
The value in current event's prevKV matches the previous
event's value of the same key if this is not a create event.

There are cases where there can be a prevKV for a create event
as well, for example if a watch is opened after the key is created.
Since we don't simulate that case, we don't check for it.

Further, this adjusts revision numbers such that we can successfully create
a new replay. This is needed now since we will have unit tests with
and without PrevKV co-existing, and we require creation of a
new replay every time we validate PrevKV.

We also regenerate the test data so that prevKV exists in it.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-13 22:55:57 +05:30
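
The validity rule stated in commit 9aad6700d5 (a non-create event's prevKV must match the previous event's value for the same key) can be sketched as follows. This is an illustration under stated assumptions, not the test from the commit: the `Event` type, the `PrevValue` pointer field, and `validatePrevKV` are hypothetical. As in the commit message, create events are skipped, and the sketch also skips events whose key has no earlier event in the list.

```go
package main

import "fmt"

// Event is a hypothetical, simplified watch event carrying an optional
// previous value. It is not the type used by the etcd robustness tests.
type Event struct {
	Key       string
	Value     string
	PrevValue *string // nil when the response carried no prevKV
	IsCreate  bool
}

// validatePrevKV checks that, for every non-create event, PrevValue equals
// the value of the previous event on the same key. Create events and keys
// with no earlier event in the list are not checked.
func validatePrevKV(events []Event) error {
	last := make(map[string]string) // key -> value of its most recent event
	for i, ev := range events {
		if want, ok := last[ev.Key]; ok && !ev.IsCreate {
			if ev.PrevValue == nil {
				return fmt.Errorf("event %d: key %q has no prevKV, want %q", i, ev.Key, want)
			}
			if *ev.PrevValue != want {
				return fmt.Errorf("event %d: key %q prevKV is %q, want %q", i, ev.Key, *ev.PrevValue, want)
			}
		}
		last[ev.Key] = ev.Value
	}
	return nil
}

func main() {
	v1 := "v1"
	stale := "v0"
	good := []Event{
		{Key: "a", Value: "v1", IsCreate: true},
		{Key: "a", Value: "v2", PrevValue: &v1},
	}
	bad := []Event{
		{Key: "a", Value: "v1", IsCreate: true},
		{Key: "a", Value: "v2", PrevValue: &stale},
	}
	fmt.Println(validatePrevKV(good))
	fmt.Println(validatePrevKV(bad))
}
```

Using a pointer for `PrevValue` keeps "no prevKV returned" distinct from "prevKV with an empty value", which a plain string could not express.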
Madhav Jivrajani
f0f4e8a4e8 tests/robustness: fix out of index panic in model replay
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-01 16:14:35 +05:30
Marek Siarkowicz
4d3108246d
Merge pull request #17260 from serathius/validate-watch-without-event-history
Validate watch even if event history cannot be created
2024-01-25 16:01:01 +01:00
Marek Siarkowicz
f0d73c9d12 Separate robustness test scenarios and increase number of times we run exploratory tests in nightly
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-16 17:17:54 +01:00
Marek Siarkowicz
c37991cf8b Validate watch even if event history cannot be created
Creation of event history requires each client to return consistent
events. If clients observed an inconsistent view of some revision, merging
will fail and return a diff between two clients. This, however, doesn't
provide a hint on what kind of issue happened.

This PR helps in cases where there is an error in a single watch
stream (like event duplication) by running normal watch validation even
without full event history.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-16 16:04:03 +01:00
Marek Siarkowicz
3471ef133d Add an e2e test and robustness failpoint around recovering from snapshot backend
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-04 15:25:24 +01:00
Jongwoo Han
08d799c4cc
Correct typo from 'Kuberntes' to 'Kubernetes'
Signed-off-by: Jongwoo Han <jongwooo.han@gmail.com>
2023-12-20 18:09:31 +09:00
Benjamin Wang
3ab54f720f install gofail in module-aware mode and ignore go.mod file
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-11 12:37:05 +00:00
Marek Siarkowicz
5175652a8e Abort if failpoint injection failed
If one of the nodes is unhealthy, the test would never finish as watchers
would never reach max revision.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-03 17:26:51 +01:00
Marek Siarkowicz
b71686d1e6 Refactor mocking rand
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-15 10:26:39 +01:00
Siyuan Zhang
834fac9fb2 robustness test: add with functions of randomizable config params in robustness test
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-15 02:08:07 +00:00
ZhouJianMS
55516234d3 exclude sleep failpoint from 1 node scenario
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-11-13 16:19:44 +08:00
ZhouJianMS
d208985aec error handling for gofailpoint
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-11-03 19:25:17 +08:00
ZhouJianMS
827dc18682 Add IO stall failpoint in raft loop
Signed-off-by: ZhouJianMS <zhoujian@microsoft.com>
2023-11-03 16:42:33 +08:00
Marek Siarkowicz
45fb4565e3
Merge pull request #16786 from serathius/robustness-drop-packet
Implement random packet dropping
2023-10-19 08:44:23 +02:00
Marek Siarkowicz
aa28a69ce0 Implement random packet dropping
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-18 10:14:43 +02:00
Wei Fu
aea1cd0077 feat: enable unparam lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-10-17 21:24:13 +08:00
Marek Siarkowicz
e51b639520
Merge pull request #16766 from serathius/robustness-member-replace
Add member replace failpoint to robustness tests
2023-10-17 13:36:21 +02:00
Marek Siarkowicz
5fed813f2e
Merge pull request #16767 from serathius/robustness-main-test
Make the main_test the entrypoint and move scenario generation to separate file
2023-10-17 13:09:16 +02:00
Marek Siarkowicz
7e8bb15ccb Add member replace failpoint to robustness tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-17 11:17:49 +02:00
Marek Siarkowicz
0d83a72cf5 Split failpoints file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-17 09:51:43 +02:00
Marek Siarkowicz
452e820516 Make the main_test the entrypoint and move scenario generation to separate file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-16 22:10:41 +02:00
Marek Siarkowicz
d6e376b6c6 Move failpoints to separate package
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-16 20:57:31 +02:00
Marek Siarkowicz
4791964173
Merge pull request #16757 from serathius/minimal-time
Use the minimal time event was observed on watch
2023-10-16 11:25:30 +02:00
Marek Siarkowicz
841731bbf0 Fix linearization failure not causing test failure
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-13 18:56:22 +02:00
Marek Siarkowicz
4c7b8dbc94 Use the minimal time event was observed on watch
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-13 18:17:29 +02:00
Marek Siarkowicz
57d9a7eec6
Merge pull request #16756 from serathius/robustness-reorder-validation
Refactor and reorder validation to avoid reporting multiple correlated failures
2023-10-13 18:12:25 +02:00
Marek Siarkowicz
b02798e946 Refactor and reorder validation to avoid reporting multiple correlated failures
It doesn't make sense to report watch failure if key value operations
are not linearizable.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-13 14:06:13 +02:00
Marek Siarkowicz
de39c75053
Merge pull request #16711 from serathius/robustness-fix-profile
Fix providing profile to robustness tests
2023-10-09 09:52:55 +02:00
Marek Siarkowicz
b4d54922eb Fix providing profile to robustness tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-08 21:27:22 +02:00
Marek Siarkowicz
f5e82260da Fix parsing failpoint names when failpoint has value set
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-07 18:20:18 +02:00
Marek Siarkowicz
05a77032fc Inject sleep during etcd bootstrap to reproduce https://github.com/etcd-io/etcd/issues/16666
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-07 12:31:56 +02:00
Marek Siarkowicz
c2655b4112 Fix watch validation assuming that the client requests an older watch revision
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-05 14:09:43 +02:00
Marek Siarkowicz
11b441b605 Reuse embed.Config in e2e cluster config
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-01 09:56:31 +02:00
Wei Fu
4704a5af3a *: fix unused issue
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-25 19:37:18 +08:00
Wei Fu
07effc4d0a *: fix revive linter
Remove old revive_pass in the bash scripts and migrate the revive.toml
into golangci linter_settings.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-24 14:21:11 +08:00
Wei Fu
aa97484166 *: enable goimports in verify-lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-21 21:14:09 +08:00
Wei Fu
9c3edfa0af *: fix staticcheck lint
Changed TraceKey/StartTimeKey/TokenFieldNameGRPCKey to struct{} to
follow the correct usage of context. Similar patch to #8901.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-21 11:24:26 +08:00
Wei Fu
5e3910d96c *: fix govet-shadow lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-19 20:24:01 +08:00
Benjamin Wang
700411d838
Merge pull request #16601 from fuweid/fix-nakedret-lint
*: fix nakedret lint
2023-09-18 10:00:25 +01:00
Wei Fu
e72c2c40d4 *: fix nakedret lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-17 23:52:41 +08:00
chenyahui
c0aa3b613b Use any instead of interface{}
Signed-off-by: chenyahui <cyhone@qq.com>
2023-09-17 17:41:58 +08:00
Wei Fu
3f6a5c0bb1 *: enable larger runner
Use ubuntu-latest-8-cores larger runner to support lazyfs in robustness
CI.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-18 22:21:00 +08:00
Marek Siarkowicz
a2bd589cdb tests/robustness: Reduce minimal QPS to eliminate flakes
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-08-05 19:27:10 +02:00
Marek Siarkowicz
eb32d9cccc tests: Add support for lazyfs
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-07-27 13:42:38 +02:00
Wei Fu
516e096a97 tests/robustness: enhance compact failpoint
If the cluster serves requests slowly, the database has few revisions
and Compact won't trigger BatchCommit. Add a loop to check that
the last revision is big enough to trigger the panic.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-07-26 21:32:16 +08:00
Marek Siarkowicz
8fca6ebdb2 tests/robustness: Prevent too many concurrent non-unique writes which are causing linearization to timeout
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-07-03 14:39:23 +02:00