The first issue is to support linearizable lease renewal request.
The second issue is to prevent the consistent_index coming from
a snapshot is overwritten by the local old value.
When etcdserver receives a LeaseRenew request, it may be still in
progress of processing the LeaseGrantRequest on exact the same
leaseID. Accordingly it may return a TTL=0 to client due to the
leaseID not found error. So the leader should wait for the appliedID
to be available before processing client requests.
humanize.Time() drops precision resulting in some events reporting they took
"now" time to complete. Using time.Duration.String() results in accurate
duration being reported.
Fixes#13905
Reason to store CI and term in backend was to make db fully independent
snapshot, it was never meant to interfere with apply logic. Skip of CI
was introduced for v2->v3 migration where we wanted to prevent it from
decreasing when replaying wal in
https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to
apply flow during refactor in
https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.
Consistency index and term should only be negotiated and used by raft to make
decisions. Their values should only driven by raft state machine and
backend should only be responsible for storing them.
Currently there are a handful of tests within etcd that silently fail
because LeakDetection will skip the test before it manages to hit this
check.
Here we move the check to the beginning of the process to highlight
these cases earlier, and to avoid them accidentally presenting as leaks.
Removed the fields consistentIdx and consistentTerm from struct EtcdServer,
and added applyingIndex and applyingTerm into struct consistentIndex in
package cindex. We may remove the two fields completely if we decide to
remove the OnPreCommitUnsafe, and it will depend on the performance test
result.
Previously the SetConsistentIndex() is called during the apply workflow,
but it's outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to db.
Eventually etcd runs into the data inconsistency issue.
In this commit, we move the SetConsistentIndex into a txPostLockHook, so
it will be executed inside the transaction lock.
We don't consistently reach the same etcd server during the lifetime of
a test and in some cases, this means that this test will flake if an
etcd server was slow to update its state and the test hits the outdated
server.
Here we switch to using an `Eventually` case which will wait upto a
second for the expected result before failing - with a 10ms gap between
invocations.
```
[tests(dani/leasefix)] $ gotestsum -- ./common -tags integration -count 100 -timeout 15m -run TestLeaseGrantAndList
✓ common (2m26.71s)
DONE 1600 tests in 147.258s
```