TestLeasingDeleteRangeContendTxn tests RangeDelete while the target
resources are being updated. When `txnLeasing` issues a
server-side transaction, it must ensure that the mod revision of every key is
less than what it saw. If the compare fails, it retries the
server-side transaction until it succeeds. I believe the test case is
trying to verify how `txnLeasing` handles this race.
Before patch #15401, the resource-updating goroutine kept updating until
the RangeDelete finished. The test case is flaky because the two goroutines
share one `ctx`, and the grpc-go client does not wait for the response once
`ctx` has been canceled.
For example,
| DelLease Goroutine | PutLease Goroutine | ETCD Server | Key/0 Status |
| -- | --- | -- | -- |
| deleted | | | version = 0 |
| | send update(key/0=123) req | received update(key/0=123) req | version = 0 |
| cancel | | | version = 0 |
| | exit because of cancel | | version = 0 |
| get key/0 by putkv | | | version = 0 |
| | | applied update(key/0=123) | version = 1 |
| get key/0 by raw-cli | | | version = 1 |
So `raw-cli` gets `[key/0=123]` while `putkv` gets `[]`. If `putkv`
sends two update requests to the etcd server and the last one is canceled
before it is applied, the error looks like:
```
expected [key:"key/0" version:2 value:"123" ], got [key:"key/0" version:1 value:"123" ]
```
The resource-updating goroutine should not share the ctx with RangeDelete here.
I also revert the change on the current main branch, because there the
resource-updating goroutine only updates 8 times and might exit before
`RangeDelete`. In that case, `txnLeasing` never has to handle the race.
Fixes: #15352
Signed-off-by: Wei Fu <fuweid89@gmail.com>
The huge (100k+) value was justified when storev2 was being dumped completely with every snapshot.
With storev2 being decommissioned we can checkpoint more frequently for faster recovery.
Signed-off-by: James Blair <mail@jamesblair.net>
Fixes etcd-io#15352.
Depending on the goroutine scheduling, the expected count of 8 might not
have been reached yet. This ensures the routine won't stop earlier than
that.
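One way to express "do not stop before the expected count" is sketched below. It is only an illustration under assumptions: `updateAtLeast`, `min`, and `stop` are hypothetical names and are not taken from the test itself.

```go
package main

import "fmt"

// updateAtLeast is a hypothetical loop: it calls update repeatedly and only
// honours the stop signal once at least min updates have been performed.
func updateAtLeast(min int, stop <-chan struct{}, update func()) int {
	n := 0
	for {
		if n >= min {
			select {
			case <-stop:
				return n
			default:
			}
		}
		update()
		n++
	}
}

func main() {
	stop := make(chan struct{})
	close(stop) // stop is already requested before the loop starts
	fmt.Println(updateAtLeast(8, stop, func() {})) // prints 8
}
```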
Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
Problem: time.Now() uses wall clock reading. See https://github.com/golang/go/blob/master/src/time/time.go#L17
"later time-telling operations use the wall clock reading, but later time-measuring operations, specifically comparisons and subtractions, use the monotonic clock reading."
This can cause 'Return' to precede 'Call' and operations from different clients to be ordered incorrectly.
Solution: use the same base time for all clients and only use 'time-measuring' operations to record timestamps for history.
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>