Suppose a lease grant request from a follower goes through and is followed by a lease lookup or renewal. The leader might not have applied the lease grant request locally yet, so it might not find the lease when serving the lookup or renewal request, which results in a "lease not found" error. To fix this issue, we force the leader to apply up to its pending committed index before looking up the lease.
Fix #6978
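
The idea of waiting for the applied index to catch up with the committed index before a lookup can be sketched as follows. This is a minimal illustration, not etcd's implementation: `leaderState`, `applyGrant`, and `lookup` are hypothetical names, and the apply loop is simulated by a single goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// leaderState is a hypothetical stand-in for the leader; none of these
// names are etcd's actual API.
type leaderState struct {
	mu        sync.Mutex
	cond      *sync.Cond
	committed uint64          // highest index known to be committed
	applied   uint64          // highest index applied locally
	leases    map[int64]int64 // lease ID -> TTL in seconds
}

func newLeaderState() *leaderState {
	s := &leaderState{leases: make(map[int64]int64)}
	s.cond = sync.NewCond(&s.mu)
	return s
}

// applyGrant simulates the apply loop installing a lease at log index idx.
func (s *leaderState) applyGrant(idx uint64, id, ttl int64) {
	s.mu.Lock()
	s.leases[id] = ttl
	s.applied = idx
	s.mu.Unlock()
	s.cond.Broadcast()
}

var errLeaseNotFound = errors.New("lease not found")

// lookup waits until every entry committed at call time has been applied,
// and only then consults the lease table.
func (s *leaderState) lookup(id int64) (int64, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	target := s.committed
	for s.applied < target {
		s.cond.Wait()
	}
	ttl, ok := s.leases[id]
	if !ok {
		return 0, errLeaseNotFound
	}
	return ttl, nil
}

func main() {
	s := newLeaderState()
	s.committed = 1 // the grant is committed but not yet applied
	go s.applyGrant(1, 7, 10)
	ttl, err := s.lookup(7)
	fmt.Println(ttl, err)
}
```

Without the wait loop in `lookup`, the call could race with `applyGrant` and return the "lease not found" error described above.
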
This commit protects membership change operations with auth. Only
users that have the root role can issue these operations.
Implements https://github.com/coreos/etcd/issues/6899
If we promote the lessor before we finish applying all
entries from the last term, we might incorrectly renew
leases that have already been revoked.
Here is an example:
- Term 1: revoke lease A accepted by raft
- Old leader failed, new election happened
- Term 2: promote
- Term 2: keep alive A succeeds. A now has 10 seconds TTL
- Term 2: revoke lease A from Term 1 got committed and applied
- Term 2: the lease A with 10 seconds TTL is revoked
To solve this, the new leader MUST apply all entries from the old term
before promoting its lessor to start accepting renew requests.
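
The ordering rule above can be sketched with a toy log and lessor. The types and the function `promoteAfterOldTerm` are hypothetical and only replay the example; they are not etcd's lessor API.

```go
package main

import "fmt"

// entry is a simplified log entry; revoke holds the lease ID to revoke,
// or 0 for an entry that does not touch leases.
type entry struct {
	term   uint64
	revoke int64
}

// lessor is a toy lease manager that only serves renewals once promoted.
type lessor struct {
	primary bool
	leases  map[int64]bool
}

func (l *lessor) renew(id int64) error {
	if !l.primary {
		return fmt.Errorf("not primary lessor")
	}
	if !l.leases[id] {
		return fmt.Errorf("lease %d not found", id)
	}
	return nil
}

// promoteAfterOldTerm applies every entry from terms older than newTerm
// and only then promotes the lessor.
func promoteAfterOldTerm(l *lessor, log []entry, newTerm uint64) {
	for _, e := range log {
		if e.term >= newTerm {
			break
		}
		if e.revoke != 0 {
			delete(l.leases, e.revoke)
		}
	}
	l.primary = true
}

func main() {
	l := &lessor{leases: map[int64]bool{1: true}}
	log := []entry{{term: 1, revoke: 1}}

	fmt.Println(l.renew(1)) // rejected: lessor is not promoted yet
	promoteAfterOldTerm(l, log, 2)
	fmt.Println(l.renew(1)) // rejected: lease 1 was revoked in term 1
}
```

Because the revoke from term 1 is applied before promotion, the keep alive in term 2 can no longer extend a lease that is about to disappear.
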
When 1000 leases expire at the same time, etcd takes more than 5 seconds to clean them up. This means that even after the leases have expired, keys associated with them are still accessible. I increase the deletion throughput by parallelizing the lease deletion process.
All outstanding goroutines now go into the etcdserver waitgroup. Goroutines are
shut down with a "stopping" channel, which is closed when the run() goroutine
shuts down. The done channel will only close once the waitgroup is totally cleared.
If a user upgrades etcd from 2.3.x to 3.0 and shuts down the
cluster immediately without triggering any new backend writes,
then the consistent index in the backend would be zero.
The user cannot restart etcdserver due to today's strict index
match checking. We now have to loosen this a bit for this case.
kv.commit updates the consistent index in the backend. When
executing in parallel with apply, it might grab the tx lock
after apply updates the consistent index and before apply
starts to execute the operation. If the server dies right
after kv.commit, the consistent index is updated but the operation
is not executed. If we restart the etcd server, etcd will skip
the operation. :(
There are a few other places that we need to take care of,
but let us fix this first.
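
One way to close this window, shown here purely as an assumption about the general approach and not as the code in this commit, is to hold the tx lock across both the consistent index update and the operation so that a commit can never observe one without the other. The `backend` type below is a toy model.

```go
package main

import (
	"fmt"
	"sync"
)

// backend is a toy model with a pending (in-memory) tx and a "disk" copy.
type backend struct {
	mu           sync.Mutex // stands in for the tx lock
	pendingIndex uint64
	pendingKV    map[string]string
	diskIndex    uint64
	diskKV       map[string]string
}

func newBackend() *backend {
	return &backend{pendingKV: map[string]string{}, diskKV: map[string]string{}}
}

// apply updates the consistent index and executes the operation while
// holding the lock, so commit cannot run between the two steps.
func (b *backend) apply(index uint64, key, value string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pendingIndex = index
	b.pendingKV[key] = value
}

// commit flushes the pending index and pending writes together.
func (b *backend) commit() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.diskIndex = b.pendingIndex
	for k, v := range b.pendingKV {
		b.diskKV[k] = v
	}
}

func main() {
	b := newBackend()
	b.apply(1, "foo", "bar")
	b.commit()
	fmt.Println(b.diskIndex, b.diskKV["foo"])
}
```
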
If a server isn't serving txn requests from a client, the server
doesn't need the result of range requests in the txn.
This is a follow-up commit to
https://github.com/coreos/etcd/pull/5689
Most fields accessed with sync/atomic functions are 64-bit aligned, but a couple
are not. This makes the comments out of date and therefore misleading.
Affected fields are reordered, and comments are scrubbed and updated.
This commit lets etcdserver skip needless log entry applying. If the
result of log applying isn't required by the node (client that issued
the request isn't talking with the node) and the operation has no side
effects, applying can be skipped.
This helps reduce disk I/O on followers and is useful for
a cluster that processes many serializable gets.
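
The skip condition can be sketched as a simple predicate. The `request` type, the `waiting` set, and `shouldApply` are hypothetical names used only to illustrate the rule stated above.

```go
package main

import "fmt"

// request is a simplified log entry payload.
type request struct {
	id       uint64
	readOnly bool // true when applying has no side effects
}

// shouldApply reports whether this node has to apply the request: either a
// local client is waiting for the result, or the request has side effects.
func shouldApply(r request, waiting map[uint64]bool) bool {
	return waiting[r.id] || !r.readOnly
}

func main() {
	waiting := map[uint64]bool{1: true} // request 1 was issued through this node

	fmt.Println(shouldApply(request{id: 1, readOnly: true}, waiting))  // result needed locally
	fmt.Println(shouldApply(request{id: 2, readOnly: true}, waiting))  // can be skipped
	fmt.Println(shouldApply(request{id: 3, readOnly: false}, waiting)) // has side effects
}
```
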
Currently auth tokens are generated randomly in the replicated state
machine layer. This means an auth token generated on node A cannot be
used on node B, which is problematic for load balancing and
failover. This commit moves the token generation logic from the state
machine to the API layer (before raft) and lets all nodes share a single
token.
The Raft log index is also added to a token to ensure uniqueness of
the token and to detect activation of the token in the cluster (some
nodes can receive the token before generating and installing it
in their state machines).
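
A sketch of how an index-carrying token might be built and checked. The "random part, dot, index" layout and the function names are assumptions made for illustration; the commit message does not specify the token format.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// makeToken appends the Raft log index to a random token body.
// The "<body>.<index>" layout is an assumed format.
func makeToken(body string, index uint64) string {
	return fmt.Sprintf("%s.%d", body, index)
}

// tokenIndex extracts the log index embedded in a token.
func tokenIndex(token string) (uint64, error) {
	i := strings.LastIndex(token, ".")
	if i < 0 {
		return 0, fmt.Errorf("malformed token %q", token)
	}
	return strconv.ParseUint(token[i+1:], 10, 64)
}

// isActive reports whether a node that has applied entries up to applied
// has already installed the token in its state machine.
func isActive(token string, applied uint64) (bool, error) {
	idx, err := tokenIndex(token)
	if err != nil {
		return false, err
	}
	return idx <= applied, nil
}

func main() {
	tok := makeToken("abcd", 42)
	early, _ := isActive(tok, 41) // node has not applied index 42 yet
	ready, _ := isActive(tok, 42)
	fmt.Println(tok, early, ready)
}
```

A node that sees a token whose index is ahead of its applied index can tell that the token exists in the cluster but is not yet installed locally, instead of rejecting it as invalid.
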
This commit also lets authStore hold the simple token state. This
is required for unit tests: the state of the simple tokens must be
cleaned up after each test (otherwise a subsequent test can
create a duplicate token, which causes a panic).