Unlike SnapshotWithVersion, client.Snapshot doesn't wait for the first
response, so the server might only open the db after we have already closed
the connection or shut the server down. We can read a few bytes to ensure
the server has opened the boltdb.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Changed TraceKey/StartTimeKey/TokenFieldNameGRPCKey to struct{} types to
follow the recommended usage of context keys. This is a similar patch to #8901.
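For illustration, a minimal sketch of the context-key pattern being adopted
(type and function names here are illustrative, not the exact etcd identifiers):
```go
package ctxkeys

import "context"

// traceKey is an unexported empty-struct key type: it cannot collide with
// context keys defined by other packages, unlike string or int constants.
type traceKey struct{}

// WithTrace stores a trace ID on the context under the private key type.
func WithTrace(ctx context.Context, traceID string) context.Context {
	return context.WithValue(ctx, traceKey{}, traceID)
}

// TraceFrom retrieves the trace ID, reporting whether one was set.
func TraceFrom(ctx context.Context) (string, bool) {
	v, ok := ctx.Value(traceKey{}).(string)
	return v, ok
}
```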
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Fix unexpected blocking when using Barrier.Wait(): for example,
NewBarrier(client, "a").Wait() blocks if key "a" does not exist but "a0" does, even though it should return immediately.
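A rough sketch of the intended semantics, written as a standalone helper
rather than the actual recipe code (waitBarrier is an assumed name): query
the exact key so "a0" can never be mistaken for "a", and return immediately
when the key is absent.
```go
package barriersketch

import (
	"context"

	"go.etcd.io/etcd/api/v3/mvccpb"
	clientv3 "go.etcd.io/etcd/client/v3"
)

// waitBarrier gets the exact key (no prefix-style options), returns right
// away when it is absent, and otherwise blocks until that key is deleted.
func waitBarrier(ctx context.Context, cli *clientv3.Client, key string) error {
	resp, err := cli.Get(ctx, key)
	if err != nil {
		return err
	}
	if len(resp.Kvs) == 0 {
		return nil // key "a" absent; an existing "a0" must not keep us waiting
	}
	wch := cli.Watch(ctx, key, clientv3.WithRev(resp.Header.Revision+1))
	for wresp := range wch {
		if err := wresp.Err(); err != nil {
			return err
		}
		for _, ev := range wresp.Events {
			if ev.Type == mvccpb.DELETE {
				return nil
			}
		}
	}
	return ctx.Err()
}
```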
Signed-off-by: zhangwenkang <zwenkang@vmware.com>
1. rename confChangeCh to raftAdvancedC
2. rename waitApply to confChanged
3. add comments and test assertion
Signed-off-by: Chao Chen <chaochn@amazon.com>
Increase the request count to 1000 to increase the sample size and reduce variability, and raise the tolerance threshold from 10% to 15%.
Signed-off-by: James Blair <mail@jamesblair.net>
TestLeasingDeleteRangeContendTxn tests RangeDelete while the target
resources are being updated. When `txnLeasing` wants a server-side
transaction, it needs to ensure that the mod revision of every key involved
is less than what it has already seen. If the compare fails, it retries the
server-side transaction until it succeeds. I believe the test case is meant
to verify how `txnLeasing` handles this race.
Before patch #15401, the resource-updating goroutine kept updating until
the RangeDelete finished. The test case is flaky because the two goroutines
share one `ctx`, and the grpc-go client won't wait for a response once that
`ctx` has been canceled.
For example,
| DelLease Goroutine | PutLease Goroutine | ETCD Server | Key/0 Status |
| -- | --- | -- | -- |
| deleted | | | version = 0 |
| | send update(key/0=123) req | received update(key/0=123) req | version = 0 |
| cancel | | | version = 0 |
| | exit because of cancel | | version = 0 |
| get key/0 by putkv | | | version = 0 |
| | | applied update(key/0=123) | version = 1 |
| get key/0 by raw-cli | | | version = 1 |
So `raw-cli` gets `[key/0=123]` while `putkv` gets `[]`. If `putkv`
applies two update requests to the etcd server and the last one is canceled
before it is applied, the error looks like:
```
expected [key:"key/0" version:2 value:"123" ], got [key:"key/0" version:1 value:"123" ]
```
The resource-updating goroutine should not share its ctx with RangeDelete here.
I also revert the current main branch behavior, because the resource-updating
goroutine only performs 8 updates and might exit before `RangeDelete` runs;
in that case `txnLeasing` never has to handle the race at all.
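A rough sketch of the resulting pattern, with assumed names (rawCli for the
raw client, leasingKV for the leasing-wrapped KV); not the verbatim test code:
```go
package leasingsketch

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// contendDeleteRange runs the updater with its own context and keeps it
// writing until the DeleteRange side explicitly stops it.
func contendDeleteRange(rawCli *clientv3.Client, leasingKV clientv3.KV) error {
	updaterCtx, stopUpdater := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() {
		defer close(done)
		for i := 0; updaterCtx.Err() == nil; i++ {
			// Keep updating key/0 so the leasing txn's compare keeps failing
			// and retrying, which is the race the test wants to exercise.
			if _, err := rawCli.Put(updaterCtx, "key/0", fmt.Sprintf("%d", i)); err != nil {
				return
			}
		}
	}()

	// DeleteRange uses its own context, so canceling the updater later
	// cannot abandon an in-flight update on a shared ctx.
	_, err := leasingKV.Delete(context.Background(), "key/", clientv3.WithPrefix())

	stopUpdater()
	<-done
	return err
}
```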
Fixes: #15352
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Fixes etcd-io#15352.
Depending on goroutine scheduling, the expected count of 8 might not have
been reached yet. This ensures the goroutine won't stop before reaching it.
Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
Fix comments flagged by goword in the Go `_test` package files that the
shell function go_srcs_in_module lists, per the changes in #14827.
Helps with #14827.
Signed-off-by: Bhargav Ravuri <bhargav.ravuri@infracloud.io>
Fix comments flagged by goword in the Go test files that the shell
function go_srcs_in_module lists, per the changes in #14827.
Helps with #14827.
Signed-off-by: Bhargav Ravuri <bhargav.ravuri@infracloud.io>
Check the values of myKey and myRev first in the Unlock method to guard against Unlock being called without Lock; otherwise keys under pfx may be deleted by mistake.
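A rough sketch of the guard, written as a standalone helper (safeUnlock and
its parameters are illustrative, not the actual concurrency.Mutex code):
```go
package mutexsketch

import (
	"context"
	"errors"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// safeUnlock refuses to delete anything when Lock never populated
// myKey/myRev; otherwise the delete could hit keys under the shared prefix.
func safeUnlock(ctx context.Context, cli *clientv3.Client, myKey string, myRev int64) error {
	if myKey == "" || myRev <= 0 {
		return errors.New("unlock called without a prior successful lock")
	}
	_, err := cli.Delete(ctx, myKey)
	return err
}
```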
Signed-off-by: chenyahui <cyhone@qq.com>
Check the client count before creating the ephemeral key, and do not
create the key if there are already too many clients. After creating the
key, check the count again: if the total number of kvs is bigger than the
expected count, check the rev of the current key and act accordingly based
on that rev. If its rev is within the first ${count}, it's a valid client;
otherwise it should fail.
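An illustrative sketch of the admission logic described above; the helper
name, key layout, and error value are assumptions, not the actual code:
```go
package admission

import (
	"context"
	"errors"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

var errTooManyClients = errors.New("too many clients")

// registerClient creates an ephemeral key under pfx only while fewer than
// `count` clients exist, and validates its own slot by create revision if
// the post-create check finds more than `count` keys.
func registerClient(ctx context.Context, cli *clientv3.Client, pfx string, count int, leaseID clientv3.LeaseID) (string, error) {
	// 1. Refuse up front if there are already too many clients.
	countResp, err := cli.Get(ctx, pfx, clientv3.WithPrefix(), clientv3.WithCountOnly())
	if err != nil {
		return "", err
	}
	if int(countResp.Count) >= count {
		return "", errTooManyClients
	}

	// 2. Create our ephemeral key under the prefix, attached to the lease.
	myKey := fmt.Sprintf("%s/%x", pfx, leaseID)
	putResp, err := cli.Put(ctx, myKey, "", clientv3.WithLease(leaseID))
	if err != nil {
		return "", err
	}
	myRev := putResp.Header.Revision

	// 3. Re-check: when more than `count` kvs exist, only the first `count`
	// by create revision count as valid clients.
	resp, err := cli.Get(ctx, pfx, clientv3.WithPrefix(),
		clientv3.WithSort(clientv3.SortByCreateRevision, clientv3.SortAscend))
	if err != nil {
		return "", err
	}
	if len(resp.Kvs) > count {
		valid := false
		for i := 0; i < count; i++ {
			if resp.Kvs[i].CreateRevision == myRev {
				valid = true
				break
			}
		}
		if !valid {
			// Lost the race: clean up our own key and fail.
			_, _ = cli.Delete(ctx, myKey)
			return "", errTooManyClients
		}
	}
	return myKey, nil
}
```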
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Since the HTTP/2 spec defines the stream's receive window size and maximum
frame size, the underlying gRPC client can pre-read data from the server
even if the application layer hasn't read it yet. The freshly initialized
cluster has a ~20 KiB snapshot, which that lower layer can pre-read in full.
We should increase the snapshot's size so that io.Copy still returns the
canceled or timeout error.
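A rough sketch of the idea (key names and sizes are illustrative, not the
actual test values):
```go
package snapsize

import (
	"context"
	"fmt"
	"strings"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// inflateSnapshot writes enough data that the snapshot comfortably exceeds
// the HTTP/2 flow-control window, so gRPC cannot buffer the whole stream and
// io.Copy still observes a canceled or timed-out context.
func inflateSnapshot(ctx context.Context, cli *clientv3.Client) error {
	val := strings.Repeat("x", 1024) // 1 KiB per value
	for i := 0; i < 512; i++ {       // roughly 512 KiB in total
		if _, err := cli.Put(ctx, fmt.Sprintf("pad-%03d", i), val); err != nil {
			return err
		}
	}
	return nil
}
```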
Fixes: #14477
Signed-off-by: Wei Fu <fuweid89@gmail.com>