Thanks to this the logs:
- are automatically printed if the test fails.
- are in pretty consistent format.
- are annotated by 'member' information of the cluster emitting them.
Side changes:
- Set propert default got DefaultWarningApplyDuration (used to be '0')
- Name the members based on their 'place' on the list (as opposed to
'random')
This change makes the etcd package compatible with the existing Go
ecosystem for module versioning.
Used this tool to update package imports:
https://github.com/KSubedi/gomove
It has often been tedious to test the interactions between multi-member
Raft groups, especially when many steps were required to reach a certain
scenario. Often, this boilerplate was as boring as it is hard to write
and hard to maintain, making it attractive to resort to shortcuts
whenever possible, which in turn tended to undercut how meaningful and
maintainable the tests ended up being - that is, if the tests were even
written, which sometimes they weren't.
This change introduces a datadriven framework specifically for testing
deterministically the interaction between multiple members of a raft group
with the goal of reducing the friction for writing these tests to near
zero.
In the near term, this will be used to add thorough testing for joint
consensus (which is already available today, but wildly undertested),
but just converting an existing test into this framework has shown that
the concise representation and built-in inspection of log messages
highlights unexpected behavior much more readily than the previous unit
tests did (the test in question is `snapshot_succeed_via_app_resp`; the
reader is invited to compare the old and new version of it).
The main building block is `InteractionEnv`, which holds on to the state
of the whole system and exposes various relevant methods for
manipulating it, including but not limited to adding nodes, delivering
and dropping messages, and proposing configuration changes. All of this
is extensible so that in the future I hope to use it to explore the
phenomena discussed in
https://github.com/etcd-io/etcd/issues/7625#issuecomment-488798263
which requires injecting appropriate "crash points" in the Ready
handling loop. Discussions of the "what if X happened in state Y"
can quickly be made concrete by "scripting up an interaction test".
Additionally, this framework is intentionally not kept internal to the
raft package.. Though this is in its infancy, a goal is that it should
be possible for a suite of interaction tests to allow applications to
validate that their Storage implementation behaves accordingly, simply
by running a raft-provided interaction suite against their Storage.
etcd is going to support incremental snapshot, and we design to let it
send at most one snapshot out at first stage. So when one snapshot is in
flight, snapshot request will return error.
When failing to get snapshot when sending MsgSnap, raft prints out
related log and abort sending this message.
etcd now compact raft storage asynchronously, and append entry to raft
storage may happen at the same time. Add the lock to fix the bug that
the entries saved in storage may be organized in a wrong way.
limit the max size of entries sent per message.
Lower the cost at probing state as we limit the size per message;
lower the penalty when aggressively decrease to a too low next.
This panic can never be reached when using raft.Node, because we only
read from propc when there is a leader. However, it is possible to see
this error when using raft the raft object directly (as in MultiNode),
and in this case it is better to simply drop the proposal (as if we had
sent it to a leader that immediately vanished).
Add an error return to MemoryStorage.Append for consistency.
Memory storage should append all entries that have greater index
than the snap.Matedata.Index. We first truncate the old parts of
incoming entries. Then truncate the existing entries in the storage.
At last, we append the incoming entries to the existing entries.
Compaction is now treated as an implementation detail of Storage
implementations; Node.Compact() and related functionality have been
removed. Ready.Snapshot is now used only for incoming snapshots.
A return value has been added to ApplyConfChange to allow applications
to track the node information that must be stored in the snapshot.
raftpb.Snapshot has been split into Snapshot and SnapshotMetadata, to
allow the full snapshot data to be read from disk only when needed.
raft.Storage has new methods Snapshot, ApplySnapshot, HardState, and
SetHardState. The Snapshot and HardState parameters have been removed
from RestartNode() and will now be loaded from Storage instead.
The only remaining difference between StartNode and RestartNode is that
the former bootstraps an initial list of Peers.
The first entry in the log is a dummy which is used for matchTerm
but may not have an actual payload. This change permits Storage
implementations to treat this term value specially instead of
storing it as a dummy Entry.
Storage.FirstIndex() no longer includes the term-only entry.
This reverses a recent decision to create entry zero as initially
unstable; Storage implementations are now required to make
Term(0) == 0 and the first unstable entry is now index 1.
stableTo(0) is no longer allowed.
This entry is now persisted through the normal flow instead of appearing
in the stored log at creation time. This is how things worked before
the Storage interface was introduced. (see coreos/etcd#1689)
Callers must in general have a reference to their Storage objects to
transfer entries from Ready to Storage, so it doesn't make sense to
create a hidden Storage for them.
By explicitly creating Storage objects in tests we can remove a
few casts of raftLog's storage field.
This change splits the raftLog.entries array into an in-memory
"unstable" list and a pluggable interface for retrieving entries that
have been persisted to disk. An in-memory implementation of this
interface is provided which behaves the same as the old version;
in a future commit etcdserver could replace the MemoryStorage with
one backed by the WAL.