Connection pausing added another exit condition in the listener
path, causing the bridge to leak connections instead of closing
them when signalled to close. Also adds some additional Close
paranoia.
Fixes#7823
previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.
suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
finds them don't match, and then panic.
FIXES#7834
Turns out the optimization to ignore setting the init rev for
current revision watches breaks some ordering assumptions. Since
Watch only returns a channel once it gets a response, it should
bind the revision at the time of the first create response.
Was causing TestWatchReconnInit to fail.
Replacing "etcdserver: fill a response header in auth RPCs"
The revision should be set at the time of "apply",
not in later RPC layer.
Fix https://github.com/coreos/etcd/issues/7691
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
run goroutine was resetting a field for no reason and without holding a lock.
This patch cleans up the run goroutine management to make the start/stop path
less racey in general.
Dual locking doesn't really give a convincing performance improvement and
the lock ordering makes it impossible to safely check if the TTL keeper
is enabled or not.
Fixes#7722
etcd passes 'url.URL.Host' to 'SelfCert' which contains
client, peer port. 'net.ParseIP("127.0.0.1:2379")' returns
'nil', and the client on this self-cert will see errors
of '127.0.0.1 because it doesn't contain any IP SANs'
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
advance() should use rs.req.Entries[0].Data as the context instead of
req.Context for deletion. Since req.Context is never set, there won't be
any context being deleted from pendingReadIndex; results mem leak.
FIXES#7571
When a grpc watch stream is torn down, it will join on its logical substream
goroutines by waiting for each to close a channel. This doesn't guarantee
the substream is fully exited, though, but only about to exit and can be
waiting to resume even after Watch.Close finishes. Instead, use a
waitgroup.Done at the very end of the substream defer.
Fixes#7573
Use the path/filepath package instead of the path package. The
path package assumes slash-separated paths, which doesn't work
on Windows. But path/filepath manipulates filename paths in a way
that's compatible across OSes.
Fix https://github.com/coreos/etcd/issues/7512.
If a server starts and aborts due to config error,
it is possible to get stuck in ReadyNotify waits.
This adds select case to get notified on stop channel.
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
Fix https://github.com/coreos/etcd/issues/7470.
This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes) slow, it
could receive subsequent 'MsgSnap' requests from leader.
etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
If substream is closing but outc is still open while reconnecting, then outc
would only be closed once the watch client would connect or once the watch
client is closed. This was leading to deadlocks in the proxy tests. Instead,
close immediately if the context is canceled.
Fixes#7503
In cases of multiple endpoints, it's possible member add would get a its
member list from a member that has not yet recognized the membership
update. Instead, confirm that the member list response is from the
member that acked the member add or from a member that has synced
with the cluster following the member add.
Fixes#7498