197 Commits

Author SHA1 Message Date
Marek Siarkowicz
e1244f19d6
Merge pull request #17918 from serathius/robustness-serializable-validation-test
Add tests to serializable operations validation
2024-05-09 11:07:49 +02:00
Marek Siarkowicz
b8ffc5e8c0
Merge pull request #17967 from serathius/robustness-update-readme
Update the robustness README and fix the #14370 reproduction case
2024-05-09 10:05:27 +02:00
Marek Siarkowicz
c4ff2c20bd
Merge pull request #17965 from serathius/makefile-cache
Fix caching by not depending on PHONY target in non-PHONY target
2024-05-08 16:47:45 +02:00
Marek Siarkowicz
f5c0e785a7 Fix caching by not depending on PHONY target in non-PHONY target
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 14:29:28 +02:00
Marek Siarkowicz
b883f839f1 Add tests to serializable operations validation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 12:29:55 +02:00
Marek Siarkowicz
bb398a0e6c
Merge pull request #17889 from serathius/robustness-operations-failpoints
Robustness operations failpoints
2024-05-08 11:37:22 +02:00
Marek Siarkowicz
be9758e2bc Update the robustness README and fix the #14370 reproduction case
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-08 11:31:28 +02:00
Siyuan Zhang
dd79332cf6 robustness: add 2 more log lines when persistClientReports
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-05-06 09:53:39 -07:00
Marek Siarkowicz
c4e3b61a1c Record operation from failpoint injection
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-05-01 19:20:22 +02:00
Marek Siarkowicz
1e7dd97e3b Add LeaseRevoke request to WAL parsing
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-27 12:04:27 +02:00
Marek Siarkowicz
b36d9b2156
Merge pull request #17731 from serathius/robustness-wal-validate-watch
Robustness wal validate watch
2024-04-26 08:37:33 +02:00
Marek Siarkowicz
2de719dea4 Use WAL persisted request to validate watch
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-25 21:11:37 +02:00
Marek Siarkowicz
9027014adb
Merge pull request #17827 from siyuanfoundation/flaky
robustness: Add option to not overwrite results dir.
2024-04-24 22:48:02 +02:00
Siyuan Zhang
3c3b76cea1 robustness: not overwrite results dir by giving each dir a unique name.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-04-24 12:36:55 -07:00
Marek Siarkowicz
96d619459b
Merge pull request #17734 from MadhavJivrajani/toolchain-directive
tests: set GOTOOLCHAIN var for report validation
2024-04-24 09:25:07 +02:00
Madhav Jivrajani
856847d89b tests: set GOTOOLCHAIN var for report validation
Set GOTOOLCHAIN directive in order to successfully run tests
from root. Else, go will try and download a family of releases
(of the form 1.x), which are not published binaries.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-04-23 14:44:51 +05:30
Marek Siarkowicz
9fcde37447 Persist member data with lazyfs enabled
Discovered that turning off LazyFS before creating the report might result in
an empty server directory. This PR moves cluster shutdown to a defer executed
after we generate the report, and copies the data from the lazyfs directory.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-23 10:19:05 +02:00
Marek Siarkowicz
062a0ea057
Merge pull request #17825 from serathius/robustness-qps
Don't require minimal for failpoint injection period
2024-04-22 19:03:42 +02:00
Marek Siarkowicz
fa9e9504ad Handle watch responses with error
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-21 20:49:49 +02:00
Marek Siarkowicz
a097a3b39d
Merge pull request #17810 from serathius/robustness-revisions-between-progress
Validate revisions between progress notify
2024-04-21 20:04:25 +02:00
Marek Siarkowicz
f285330d46 Don't require minimal for failpoint injection period
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-20 10:34:51 +02:00
Marek Siarkowicz
964680c8d0 Validate delivery of events between progress notifies
Simplifying bookmarkable to just validate revision order between events
and progress notifies.

Use reliable to validate if events are missing, but still report
broken resumable if first event after revision is missing. It's easier
to have one place that validates event slices.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-19 10:42:54 +02:00
Marek Siarkowicz
5a8c8b703b
Merge pull request #17807 from serathius/robustness-resumable-revision-zero
Resumable handles watch with revision zero
2024-04-16 19:41:53 +02:00
Marek Siarkowicz
dc187ce6e8 Validate bookmarkable checks the last event before progress notify
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-16 09:17:40 +02:00
Marek Siarkowicz
94a47a7cbd Resumable handles watch with revision zero
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-15 20:23:51 +02:00
Marek Siarkowicz
042e7d1a0c Add filter validation to ensure watch only includes events within selector
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-15 20:05:08 +02:00
Marek Siarkowicz
a95a307698 Add tests to watch validation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-14 21:38:03 +02:00
Marek Siarkowicz
569693be8d Utilize WAL to patch operation history
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-14 12:09:38 +02:00
Marek Siarkowicz
452445e2d8
Merge pull request #17781 from serathius/robustness-read-limit
Remove limit from read requests after a failed write
2024-04-14 12:05:23 +02:00
Marek Siarkowicz
2e6eebef85
Merge pull request #17759 from serathius/robustness-assumptions
Add explicit checks for assumptions in robustness test validation
2024-04-13 00:19:25 +02:00
Marek Siarkowicz
d0bf8ddca4 Improve description for Kubernetes CAS operations
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-12 16:18:31 +02:00
Marek Siarkowicz
cadfc407e9 Remove limit from read requests after a failed write
Limit can cause multiple requests due to pagination.
For reads after a failed write we would like to return to normal write
request as soon as possible.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-12 15:01:17 +02:00
Marek Siarkowicz
f8de338ab2 Add explicit checks for assumptions in robustness test validation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-12 14:18:22 +02:00
Marek Siarkowicz
bfbfee0afa
Merge pull request #17768 from serathius/robustness-success-rate
[Robustness] Collect failed read operations to calculate request success rate
2024-04-12 09:46:20 +02:00
Marek Siarkowicz
718d5ba2b4 Calculate request success rate to provide signal to performance debugging
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-11 09:36:17 +02:00
Marek Siarkowicz
ae7f79fd63 Refactor append from appendFailed and appendSuccesfull
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-11 09:36:17 +02:00
Marek Siarkowicz
65130c6d21 Refactor merge successful and failed operation in history
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-10 21:11:46 +02:00
Marek Siarkowicz
229275d46e Refactor appendSuccesful and appendFailed methods to match
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-10 10:33:19 +02:00
Marek Siarkowicz
41ac7e33a1 Don't cache test-robustness-reports
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-09 15:59:58 +02:00
Marek Siarkowicz
6cb4c3f90d Document re-evaluating existing robustness test reports
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-08 16:58:12 +02:00
Marek Siarkowicz
3a23994fbf Make no failpoint error more readable
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-07 15:13:59 +02:00
Marek Siarkowicz
e2bb8c698f Limit a timeout in testing robustness validation
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-04-06 12:28:57 +02:00
Marek Siarkowicz
09769c4be7
Merge pull request #17642 from fuweid/fix-17506
*: LeaseTimeToLive returns error if leader changed
2024-04-02 14:55:41 +02:00
Wei Fu
d3bb6f688b *: LeaseTimeToLive returns error if leader changed
The old leader demotes its lessor, and the expiry time of every lease will be
updated. Instead of returning an incorrect remaining TTL, we should return
an error to force the client to retry.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-03-26 18:55:01 +08:00
Ivan Valdes
0976398964 tests/robustness: address golangci var-naming issues
Signed-off-by: Ivan Valdes <ivan@vald.es>
2024-03-25 16:27:05 -07:00
thirdkeyword
fbda591866 fix some typos
Signed-off-by: thirdkeyword <fliterdashen@gmail.com>
2024-03-25 10:34:44 +08:00
Chao Chen
405862e807 Fix event loss after compaction
Signed-off-by: Chao Chen <chaochn@amazon.com>
2024-03-15 14:22:37 -07:00
Madhav Jivrajani
0b27570368 tests/robustness: use WithRequireLeader in Kubernetes traffic
Kubernetes uses WithRequireLeader to modify the context passed
to the Watch() and RequestProgress() calls in order to ensure
that a leader is present in the cluster before serving the request.

This commit mimics that behaviour in the Kubernetes traffic.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-22 15:01:33 +05:30
Marek Siarkowicz
3a351c2fec Revert "tests/robustness: check for compaction before prevKV validation"
This reverts commit 5d7f58d14be1da5135d158dad6fc43391cbf6283.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-02-21 16:15:40 +01:00
Madhav Jivrajani
5d7f58d14b tests/robustness: check for compaction before prevKV validation
We can check for the condition that Kubernetes checks for, i.e.
prevKV can be nil iff the event is not a create event, only if
we know whether compaction has occurred or not. If compaction has
occurred, prevKV can be nil and that is completely valid.

This commit checks if compaction took place during the test run
by querying the /metrics endpoint. Based on whether compaction occurred,
we now check the Kubernetes condition in the prevKV robustness test.

Signed-off-by: Madhav Jivrajani <madhav.jiv@gmail.com>
2024-02-19 17:05:59 +05:30