235 Commits

Author SHA1 Message Date
Marek Siarkowicz
f194f4723c Reduce number of concurrent clients to 8 and compactions to avoid flakes
Not hitting minimal QPS is expected to be caused by the introduction of
compaction. Let's avoid it for high throughput test cases.

Reducing number of clients to avoid linearization timeouts.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-08-15 13:57:06 +02:00
Marek Siarkowicz
44b6c03ec0 Ensure proper gofail package version in robustness tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-08-02 09:24:49 +02:00
Marek Siarkowicz
fdf8fde387 Remove flake caused failpoint in watch disrupting progress notifies
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-08-01 21:19:03 +02:00
Marek Siarkowicz
4488f2c9b6
Merge pull request #18252 from fykaa/reduce-concurrency-high-traffic
Reduce client concurrency for high traffic robustness tests
2024-07-31 17:35:37 +02:00
Siyuan Zhang
cded6b0ac6 robustness: remove head rev match in validateGotAtLeastOneProgressNotify
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-07-03 09:03:54 -07:00
Faeka Ansari
e97bc39060 Reduce client concurrency for high traffic robustness tests
Signed-off-by: Faeka Ansari <faeka6@gmail.com>
2024-06-29 15:00:04 +05:30
Marek Siarkowicz
c41e02f7b6 Add failpoint name to test name, allowing us to track per-failpoint failures in testgrid
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-27 18:35:04 +02:00
Marek Siarkowicz
1870222f41 Separate persisted responses without knowing their revision to prevent duplicating state during linearization
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-24 21:38:27 +02:00
Marek Siarkowicz
35f4556b59 Add tests for patching history to check output and return values
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-24 20:41:32 +02:00
Marek Siarkowicz
4fe227c46c Disable robustness test detection of #18089 to allow detecting other issues
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-22 11:01:47 +02:00
Chun-Hung Tseng
f21f074baa
Use $(MAKE) instead of make
Recursive make commands should always use the variable MAKE, as
the value of this variable is the file name with which make was invoked

Reference:
- https://www.gnu.org/software/make/manual/html_node/MAKE-Variable.html

Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
2024-06-18 19:16:30 +02:00
Madhav Jivrajani
5c2422ba05 tests/robustness: fix access of ChoiceWeight
Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-06-18 13:44:40 +05:30
Marek Siarkowicz
c70e0e4f55
Merge pull request #18181 from serathius/robustness-compact-lazyfs
Avoid sending Compact request when LazyFS is enabled
2024-06-18 09:26:41 +02:00
Marek Siarkowicz
2deefb081b
Merge pull request #18060 from siyuanfoundation/robust
robustness: change mixedVersionOption to use ChoiceWeight.
2024-06-18 08:49:47 +02:00
Marek Siarkowicz
2e04ee77b6 Avoid sending Compact request when LazyFS is enabled
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-18 08:36:24 +02:00
Siyuan Zhang
fff58bb809 robustness: change mixedVersionOption to use ChoiceWeight.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-06-17 15:53:47 -07:00
Marek Siarkowicz
5e42ed9b22 Reproduce issue #17529
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-15 19:40:23 +02:00
Siyuan Zhang
aaa6e9ef8c robustness: Separate compaction and LazyFS test scenario for cluster size 1.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-06-14 13:33:50 -07:00
Ivan Valdes
d73cc2bb65
tests/robustness: update documentation to reflect Prow migration
Signed-off-by: Ivan Valdes <ivan@vald.es>
2024-06-13 10:20:32 -07:00
Marek Siarkowicz
2c56e8edc1
Merge pull request #18107 from serathius/e2e-error-log
Improve e2e error reporting
2024-06-07 13:58:59 +02:00
Marek Siarkowicz
5959110f4a Implement Compaction support in robustness test
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-07 10:33:57 +02:00
Marek Siarkowicz
3c5684967f Improve e2e error reporting
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Co-authored-by: James Blair <mail@jamesblair.net>
Co-authored-by: chao <chaochn@amazon.com>
2024-06-07 10:24:52 +02:00
Marek Siarkowicz
2ffaf5fba4
Merge pull request #18133 from serathius/robustness-connection-reset
Ignore connection reset error when triggering a failpoint
2024-06-07 10:23:14 +02:00
Wei Fu
fc1863086c tests/robustness: unlock Delete/LeaseRevoke ops
We should return the token to that bucket if `nonUniqueWriteLimiter.Take()`
returns true. After unlocking Delete/LeaseRevoke ops, the model should be
updated for the replay function. There are two updates for `toWatchEvents`.

1. When a leaseRevoke op has deleted a few keys, we should generate
   `delete-operation` events in alphabetical order of the deleted
   keys.
2. When a putWithLease op hits a non-existent lease, we should ignore that
   update event.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-06-06 20:40:08 +08:00
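
The first `toWatchEvents` update described in fc1863086c above can be illustrated with a minimal Go sketch. This is not the code from the commit: the `WatchEvent` type and the `deleteEventsForLeaseRevoke` helper are hypothetical, simplified stand-ins, and the only behavior taken from the commit message is that one `delete-operation` event is generated per deleted key, in alphabetical key order.

```go
package main

import (
	"fmt"
	"sort"
)

// WatchEvent is a hypothetical, simplified stand-in for the event type
// used by the robustness model; it is not the real etcd type.
type WatchEvent struct {
	Type     string
	Key      string
	Revision int64
}

// deleteEventsForLeaseRevoke (hypothetical helper) returns one delete event
// per key that was attached to the revoked lease. Go map iteration order is
// random, so the keys are sorted first to get a deterministic, alphabetical
// event order.
func deleteEventsForLeaseRevoke(keys map[string]struct{}, revision int64) []WatchEvent {
	sorted := make([]string, 0, len(keys))
	for k := range keys {
		sorted = append(sorted, k)
	}
	sort.Strings(sorted)

	events := make([]WatchEvent, 0, len(sorted))
	for _, k := range sorted {
		events = append(events, WatchEvent{Type: "delete-operation", Key: k, Revision: revision})
	}
	return events
}

func main() {
	keys := map[string]struct{}{"b": {}, "a": {}, "c": {}}
	for _, ev := range deleteEventsForLeaseRevoke(keys, 7) {
		fmt.Printf("%s %s rev=%d\n", ev.Type, ev.Key, ev.Revision)
	}
}
```

Sorting before emitting events matters because a replay that iterated the map directly could produce a different event order on each run, which would make watch validation nondeterministic.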
Marek Siarkowicz
b8eeaacbcb Ignore connection reset error when triggering a failpoint
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-06-05 17:58:46 +02:00
Marek Siarkowicz
67743348dc
Merge pull request #18054 from siyuanfoundation/robust
workflow: change the target of make test-robustness to test-robustness-main
2024-05-22 19:54:11 +02:00
Siyuan Zhang
8dcb198f14 workflow: change the target of make test-robustness to test-robustness-main
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-22 09:25:46 -07:00
Marek Siarkowicz
aaa9f15f23 Increase robustness test request timeout to 200ms
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-22 18:00:15 +02:00
Marek Siarkowicz
1d367fbae6
Merge pull request #17923 from siyuanfoundation/robust
Add randomness in robustness cluster process version to test mixed version scenarios.
2024-05-22 14:07:22 +02:00
Siyuan Zhang
0f94c2ca4f robustness: add mix version scenario with fixed leader.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-21 17:42:12 +00:00
Siyuan Zhang
b54d7552a7 robustness: add mix version option in exploratoryScenarios.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-21 16:57:53 +00:00
Siyuan Zhang
cde6cd006d e2e: add flag to pass specific binary path for last release.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-21 16:57:48 +00:00
Marek Siarkowicz
3fb36d9ae2 Allow gofail trigger to fail as long as the member stops running
This is required for compaction-based failpoints, to allow the traffic
to send a compaction request that causes etcd to crash before the failpoint
executes the trigger.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-21 18:46:35 +02:00
Akshay Nanavare
8eb91d0e15 add error constants in validate pkg
Signed-off-by: Akshay Nanavare <nakshay303@gmail.com>
2024-05-14 20:19:08 +05:30
Marek Siarkowicz
d8bb19327b Prevent picking a failpoint that waits till snapshot that doesn't support lower snapshot catchup entries but allow reproducing issue #15271
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-13 12:08:42 +02:00
James Blair
2e70fc7d1d
Automate labels for tests pull requests.
Signed-off-by: James Blair <mail@jamesblair.net>
2024-05-12 22:24:21 +12:00
Marek Siarkowicz
dc35b6493f
Merge pull request #17966 from serathius/robustness-relax
Relax assumptions about all client request persisted in WAL to only require first and last request to be persisted
2024-05-09 17:51:27 +02:00
Marek Siarkowicz
e1244f19d6
Merge pull request #17918 from serathius/robustness-serializable-validation-test
Add tests to serializable operations validation
2024-05-09 11:07:49 +02:00
Marek Siarkowicz
b8ffc5e8c0
Merge pull request #17967 from serathius/robustness-update-readme
Update the robustness README and fix the #14370 reproduction case
2024-05-09 10:05:27 +02:00
Marek Siarkowicz
c4ff2c20bd
Merge pull request #17965 from serathius/makefile-cache
Fix caching by not depending on PHONY target in non-PHONY target
2024-05-08 16:47:45 +02:00
Marek Siarkowicz
f5c0e785a7 Fix caching by not depending on PHONY target in non-PHONY target
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 14:29:28 +02:00
Marek Siarkowicz
b883f839f1 Add tests to serializable operations validation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 12:29:55 +02:00
Marek Siarkowicz
bb398a0e6c
Merge pull request #17889 from serathius/robustness-operations-failpoints
Robustness operations failpoints
2024-05-08 11:37:22 +02:00
Marek Siarkowicz
be9758e2bc Update the robustness README and fix the #14370 reproduction case
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 11:31:28 +02:00
Marek Siarkowicz
7181c7532f Relax assumptions about all client request persisted in WAL to only require first and last request to be persisted
This assumption is not true during durability issues like #14370.
In reality we want to avoid situations where the WAL was truncated; for
that it's enough to ensure that the first and last operations are
present.

Found it when running `make test-robustness-issue14370` and instead of
getting `Model is not linearizable` I got that assumptions were broken.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 10:40:38 +02:00
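
The relaxed assumption from 7181c7532f above can be sketched in Go as follows. This is an illustration only, not the validation code from the commit: requests are modeled as plain strings, and the `firstAndLastPersisted` helper is a hypothetical name. The one idea taken from the commit message is that only the first and last client requests must be found among the persisted requests, rather than all of them.

```go
package main

import "fmt"

// firstAndLastPersisted (hypothetical helper) reports whether the first and
// the last client request both appear among the requests recovered from the
// WAL. Requests in between may be missing, which can happen during
// durability issues. Requests are modeled as strings for simplicity.
func firstAndLastPersisted(clientRequests, persisted []string) bool {
	if len(clientRequests) == 0 {
		return true // nothing was sent, so nothing can be missing
	}
	seen := make(map[string]struct{}, len(persisted))
	for _, r := range persisted {
		seen[r] = struct{}{}
	}
	_, hasFirst := seen[clientRequests[0]]
	_, hasLast := seen[clientRequests[len(clientRequests)-1]]
	return hasFirst && hasLast
}

func main() {
	client := []string{"put a", "put b", "put c"}
	// A request in the middle is missing, but the boundaries are present.
	fmt.Println(firstAndLastPersisted(client, []string{"put a", "put c"}))
	// The last request is missing, which suggests a truncated WAL.
	fmt.Println(firstAndLastPersisted(client, []string{"put a", "put b"}))
}
```

Under this check, a missing middle request no longer trips the assumption, so the run can proceed to the linearizability check instead of failing early.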
Siyuan Zhang
dd79332cf6 robustness: add 2 more log lines when persistClientReports
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-06 09:53:39 -07:00
Marek Siarkowicz
c4e3b61a1c Record operation from failpoint injection
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-01 19:20:22 +02:00
Marek Siarkowicz
1e7dd97e3b Add LeaseRevoke request to WAL parsing
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-27 12:04:27 +02:00
Marek Siarkowicz
b36d9b2156
Merge pull request #17731 from serathius/robustness-wal-validate-watch
Robustness wal validate watch
2024-04-26 08:37:33 +02:00
Marek Siarkowicz
2de719dea4 Use WAL persisted request to validate watch
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-25 21:11:37 +02:00