Backport of PR #16822, commits f7e488dc9262685d6624755e0d3bb0a655863248,
67f17166bf2ba337dafb8e0ea8eea5f74a990767,
and f7ff898fd6c2d6dbb54278343073aa4fa5f46a03.
Co-authored-by: Benjamin Wang <benjamin.wang@broadcom.com>
Signed-off-by: Ivan Valdes <ivan@vald.es>
To avoid inconsistant behavior during cluster upgrade we are feature
gating persistance behind cluster version. This should ensure that
all cluster members are upgraded to v3.6 before changing behavior.
To allow backporting this fix to v3.5 we are also introducing flag
--experimental-enable-lease-checkpoint-persist that will allow for
smooth upgrade in v3.5 clusters with this feature enabled.
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Current checkpointing mechanism is buggy. New checkpoints for any lease
are scheduled only until the first leader change. Added fix for that
and a test that will check it.
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Update to Go 1.12.5 testing. Remove deprecated unused and gosimple
pacakges, and mask staticcheck 1006. Also, fix unconvert errors related
to unnecessary type conversions and following staticcheck errors:
- remove redundant return statements
- use for range instead of for select
- use time.Since instead of time.Now().Sub
- omit comparison to bool constant
- replace T.Fatal and T.Fatalf in tests with T.Error and T.Fatalf respectively because the goroutine calls T.Fatal must be called in the same goroutine as the test
- fix error strings that should not be capitalized
- use sort.Strings(...) instead of sort.Sort(sort.StringSlice(...))
- use he status code of Canceled instead of grpc.ErrClientConnClosing which is deprecated
- use use status.Errorf instead of grpc.Errorf which is deprecated
Related #10528#10438
Instead of unconditionally randomizing, extend leases on promotion
if too many leases expire within the same time span. If the server
has few leases or spread out expires, there will be no extension.
Randomize the very first expiry on lease recovery
to prevent recovered leases from expiring all at
the same time.
Address https://github.com/coreos/etcd/issues/8096.
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
Revoke expects the BatchTx lock to be held when holding the TxnDeleter
because it updates the lease bucket. The tests don't hold the lock so
it may race with the backend commit loop.
Fixes#7662
The new leader needs to refresh with an extened TTL to gracefully handle
the potential concurrent leader issue. Clients might still send keep alive
to old leader until the old leader itself gives up leadership at most after
an election timeout.
When raft broadcasts a Grant to all nodes, all nodes must
agree on the same lease ID. Otherwise, attaching a key to
a lease will fail since the lease ID is node-dependent.
Basic support for lease operations like create and revoke.
We still need to:
1. attach keys to leases in KV implmentation if lease field is set
2. leader periodically removes expired leases
3. leader serves keepAlive requests and follower forwards keepAlive
requests to leader.