previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.
suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
finds them don't match, and then panic.
FIXES#7834
Replacing "etcdserver: fill a response header in auth RPCs"
The revision should be set at the time of "apply",
not in later RPC layer.
Fix https://github.com/coreos/etcd/issues/7691
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
Fix https://github.com/coreos/etcd/issues/7470.
This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes) slow, it
could receive subsequent 'MsgSnap' requests from leader.
etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
This commit adds a mechanism of handling a case of expired auth token
to clientv3. If a server returns an error code
grpc.codes.Unauthenticated, newRetryWrapper() tries to get a new token
and use it as an option of PerRPCCredential.
Fixes https://github.com/coreos/etcd/issues/7012
GetUser would not propagate to the minority node, causing TestCtlV2GetRoleUser to
run CreateUser instead of UpdateUser. Instead, use quorum get to fetch the
current state of auth.
Fixes#7069
Removing the periodic SYNC calls broke the health endpoint since the
raft index stops updating. Instead, don't bother monitoring the
raft index; issue a QGET directly to get a consensus response.
Fixes#6985
suppose a lease granting request from a follower goes through and followed by a lease look up or renewal, the leader might not apply the lease grant request locally. So the leader might not find the lease from the lease look up or renewal request which will result lease not found error. To fix this issue, we force the leader to apply its pending commited index before looking up lease.
FIX#6978
This commit protects membership change operations with auth. Only
users that have root role can issue the operations.
Implements https://github.com/coreos/etcd/issues/6899
Would retry a few times before returning a not primary error that
the client should never see. Instead, use proper timeouts and
then return a request timeout error on failure.
Fixes#6922
This exists to prevent sending too many requests that
would lead into applier falling behind Raft accepting-proposal.
Based on recent benchmarks, etcd was able to process high workloads
(2 million writes with 1K concurrent clients).
The limit 1000 is too conservative to test those high workloads.