Also rename hasNextEnts to hasNextCommittedEnts.
Also rename maxNextEntsSize to maxNextCommittedEntsSize.
Pure refactor.
Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This commit makes the rarely used `raftpb.Message.Snapshot` field nullable.
In doing so, it reduces the memory size of a `raftpb.Message` message from
264 bytes to 128 bytes — a 52% reduction in size.
While this commit does not change the protobuf encoding, it does change
how that encoding is used. `(gogoproto.nullable) = false` instruct the
generated proto marshaling logic to always encode a value for the field,
even if that value is empty. `(gogoproto.nullable) = true` instructs the
generated proto marshaling logic to omit an encoded value for the field
if the field is nil.
This raises compatibility concerns in both directions. Messages encoded
by new binary versions without a `Snapshot` field will be decoded as an
empty field by old binary versions. In other words, old binary versions
can't tell the difference. However, messages encoded by old binary versions
with an empty Snapshot field will be decoded as a non-nil, empty field by
new binary versions. As a result, new binary versions need to be prepared
to handle such messages.
While Message.Snapshot is not intentionally part of the external interface
of this library, it was possible for users of the library to access it and
manipulate it. As such, this change may be considered a breaking change.
Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This commit removes the verbose comparisons in tests, in favor of using
assert/require helpers from the testify packages.
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Make the field name and comment clearer on the fact that it's used both in
StateProbe and StateReplicate. The old name ProbeSent was slightly confusing,
and also triggered thinking that it's used only in StateProbe.
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
When Inflights to a particular node is full, i.e. MaxInflightMsgs for the
append messages flow is saturated, it is still necessary to continue sending
MsgApp to ensure progress. Currently this is achieved by "forgetting" the first
in-flight message in the window, which frees up quota for one new MsgApp.
This new message is constructed in such a way that it potentially has multiple
entries, or a large entry. The effect of this is that the in-flight limitations
can be exceeded arbitrarily, for as long as the flow to this node continues
being saturated. In particular, if a follower is stuck, the leader will keep
sending entries to it.
This commit makes the MsgApp empty when Inflights is saturated, and prevents
the described leakage of Entries to slow followers.
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Previously, Progress update on MsgApp send was scattered across raft.go and
tracker/progress.go. This commit better encapsulates this logic in the Progress
type.
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
- avoid large indented blocks, leave the main block unindented
- declare pb.Message inlined in the sending call
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
This commit adds a data-driven test which simulates conditions under which Raft
messages flow to a particular node is throttled while in StateReplicate. The
test demonstrates that MsgApp messages with non-empty Entries may "leak" to a
paused stream every time there is successful heartbeat exchange.
Signed-off-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
These are nonsensical and a network implementation is not required
to handle them correctly, so panic instead of sending them out.
Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This commit simplifies the logic added in 37c7e4d to auto-leave joint
configurations. It does so by making the following adjustments to the
code:
- remove the `oldApplied <= r.pendingConfIndex` condition. This does
not seem necessary. When a node first attempts to auto-leave a joint
config, it will bump `r.pendingConfIndex` when proposing. In cases
where `oldApplied >= r.pendingConfIndex`, the proposal must have
already been applied. Reviewers should double check this.
- use raft.Step instead of custom proposal code. This code was already
present in stepLeader, so there was no reason to duplicate it. This
would have avoided bugs like the one we fixed in #14538.
- use `confChangeToMsg` to generate message, to centralize the creation
of all `MsgProp{EntryConfChange}` messages.
Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This commit ensures that the raft leader eagerly broadcasts a MsgApp to
each follower when initiating an automatic transition out of a joint
configuration. This had been missed previously, which could lead to
delayed completion of an auto-transition.
Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This commit updates the `RawNode`'s `prevHardSt` to the new HardState in
`acceptReady` instead of on `Advance`. This aligns the handling of
`prevHardSt` with that of `prevSoftSt` (and other fields like `msgs`)
and simplifies the logic in `Advance`.
Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
This reverts commit d73a986e4edb15ef9dbfc994f1cbf5e96694d877, which
was added only for benchmarking purposes.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
This makes the alternative option of implementing the leader's self-ack
of entry append the default.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
This is tested directly at the level of `RawNode` in
`TestRawNodeReadIndex`. `*node` is a thin wrapper around `RawNode` so
this is sufficient.
The reason to remove the test is that it now incurs data races
since it's not possible to adjust the `readStates` and `step`
fields while the node is running, and there is no primitive
to synchronize with its goroutine. This could all be fixed
but it's not worth it.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
This now needs an additional Ready cycle to apply the previous conf change,
so the finalizing conf change does too.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Leader only acks to itself on `(*raft).advance` so we have to
make this test a bit more like the real thing.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
This was expecting the progress of the leader to be updated as a
result of MsgProp but it is now happening in `(*raft).advance`.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
This fixes essentially all tests using this, since now they don't have
to do anything special about the extra cycle introduced for single
nodes.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Switched this to baking the conf changes into the initial state
to have fewer cycles to walk through in the test.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>
This needed to apply entries from CommittedEntries, not Entries.
Previously the test got away with it because the two slices were
equal. Now it was hanging because when it proposed the second
conf change the first one hadn't applied yet, and so it got dropped,
and the test would hang.
Signed-off-by: Tobias Grieger <tobias.b.grieger@gmail.com>