971 Commits

Author SHA1 Message Date
caoming
9668536124 raft: add a test case in TestStorageAppend 2018-11-15 16:41:36 +08:00
Andrew Werner
e4af2be5bb raft: separate MaxCommittedSizePerReady config from MaxSizePerMsg
Prior to this change, MaxSizePerMsg was used both to cap the total byte size of
entries in messages as well as the total byte size of entries passed through
CommittedEntries in the Ready struct. This change adds a new Config parameter
MaxCommittedSizePerReady which defaults to MaxSizePerMsg and controls the second
of the settings described above.
2018-11-14 09:59:09 -05:00
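A minimal sketch of the defaulting behavior described in the commit above. The `config` struct and `withDefaults` helper are hypothetical stand-ins, not the real raft.Config API, and treating a zero value as "unset" is an assumption made for illustration.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the two size limits named in the
// commit message; it is not the real raft.Config.
type config struct {
	MaxSizePerMsg            uint64 // cap on entry bytes per append message
	MaxCommittedSizePerReady uint64 // cap on entry bytes in CommittedEntries
}

// withDefaults returns a copy of c in which an unset (zero, by assumption)
// MaxCommittedSizePerReady falls back to MaxSizePerMsg.
func withDefaults(c config) config {
	if c.MaxCommittedSizePerReady == 0 {
		c.MaxCommittedSizePerReady = c.MaxSizePerMsg
	}
	return c
}

func main() {
	// Only the message limit is set: the committed limit inherits it.
	fmt.Println(withDefaults(config{MaxSizePerMsg: 1 << 20}))
	// Both are set: the two limits stay independent.
	fmt.Println(withDefaults(config{MaxSizePerMsg: 1 << 20, MaxCommittedSizePerReady: 64 << 20}))
}
```

Keeping the two limits separate lets a caller send small messages while still receiving larger batches of committed entries per Ready.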
Shin'ya Ueoka
aa4313a55a *: fix github links 2018-11-10 11:14:18 +09:00
Xiang Li
c0e04700cf
Merge pull request #10230 from manishrjain/master
raft: Explain ReportSnapshot and Propose behavior
2018-11-01 06:48:45 +08:00
Manish R Jain
4aa72ca1d3
raft: Explain ReportSnapshot and Propose behavior
Update godocs for node interface, explaining the behavior of ReportSnapshot and Propose.
2018-10-31 15:37:55 -07:00
Xiang Li
798955d4d6
Merge pull request #10209 from ping40/d1024
raft: Fix comment on TestLeaderBcastBeat
2018-10-25 14:42:16 -07:00
Gyuho Lee
b7ed4165ea raft: fix godoc in tests
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-24 23:23:32 -07:00
Gyuho Lee
965ba5ca8b
Merge pull request #10203 from ping40/doc1022_2
raft: fix description in UT
2018-10-24 23:21:02 -07:00
ping40
10255cf196 raft: Fix comment on TestLeaderBcastBeat 2018-10-24 16:56:10 +08:00
Gyuho Lee
86b933311d
Merge pull request #10205 from gyuho/testing-prow
OWNERS: experiment
2018-10-22 16:07:27 -07:00
Gyuho Lee
c561f8310e OWNERS: experiment
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-10-22 12:49:08 -07:00
Tobias Schottdorf
ad49c8fd98 raft: fix bug in unbounded log growth prevention mechanism
The previous code was using the proto-generated `Size()` method to
track the size of an incoming proposal at the leader. This includes
the Index and Term, which were mutated after the call to `Size()`
when appending to the log. Additionally, it was not taking into
account that an ignored configuration change would ignore the
original proposal and append an empty entry instead.

As a result, a fully committed Raft group could end up with a non-
zero tracked uncommitted Raft log counter that would eventually hit
the ceiling and drop all future proposals indiscriminately. It would
also immediately imply that proposals exceeding the threshold alone
would get refused (as the "first uncommitted proposal" gets special
treatment and is always allowed in).

Track only the size of the payload actually appended to the Raft log
instead.

For context, see:
https://github.com/cockroachdb/cockroach/issues/31618#issuecomment-431374938
2018-10-22 11:28:39 +02:00
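The accounting problem described in the commit above can be illustrated with a small sketch. The `entry` type, `encodedSize`, and `payloadSize` are hypothetical; `encodedSize` only imitates a size that depends on Term and Index and is not the proto-generated `Size()` method.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a Raft log entry.
type entry struct {
	Term, Index uint64
	Data        []byte
}

// uvarintLen returns the number of bytes needed to varint-encode v.
func uvarintLen(v uint64) int {
	n := 1
	for v >= 0x80 {
		v >>= 7
		n++
	}
	return n
}

// encodedSize imitates a serialized size that includes Term and Index,
// so it changes when those fields are assigned at append time.
func encodedSize(e entry) int {
	return uvarintLen(e.Term) + uvarintLen(e.Index) + len(e.Data)
}

// payloadSize counts only the proposal payload, which append never mutates.
func payloadSize(e entry) int {
	return len(e.Data)
}

func main() {
	proposed := entry{Data: make([]byte, 10)}
	appended := proposed
	appended.Term, appended.Index = 5, 300 // assigned when appending to the log

	// A counter incremented with the first value and decremented with the
	// second never returns to zero.
	fmt.Println("encoded:", encodedSize(proposed), "then", encodedSize(appended))
	// The payload size is the same at both points.
	fmt.Println("payload:", payloadSize(proposed), "then", payloadSize(appended))
}
```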
ping40
de470991e1 raft: fix description in UT 2018-10-22 13:59:50 +08:00
Nathan VanBenschoten
73c20cc1b7 raft: Fix comment on sendHeartbeat 2018-10-14 00:03:43 -04:00
Nathan VanBenschoten
7be7ac5a5d raft: Fix spelling in doc.go 2018-10-13 23:25:05 -04:00
Nathan VanBenschoten
f89b06dc6d raft: provide protection against unbounded Raft log growth
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.

This change adds a safeguard to protect against this case where a
leader's log can grow without bound during loss of quorum scenarios.
It does so by introducing a new, optional `MaxUncommittedEntriesSize`
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.

See cockroachdb/cockroach#27772
2018-10-13 23:25:05 -04:00
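A rough sketch of the quota described in the commit above. The `leader` type, its fields, and `errProposalDropped` are hypothetical names for illustration. Two details are assumptions in this sketch: a zero limit is treated as "no limit", and a proposal is always admitted when nothing is currently uncommitted.

```go
package main

import (
	"errors"
	"fmt"
)

// errProposalDropped is a hypothetical stand-in for ErrProposalDropped.
var errProposalDropped = errors.New("proposal dropped")

// leader tracks the aggregate payload size of its uncommitted log tail.
type leader struct {
	maxUncommittedSize uint64 // 0 means "no limit" in this sketch
	uncommittedSize    uint64
}

// propose admits the payload unless it would push the uncommitted tail past
// the limit. A proposal arriving while the tail is empty is always admitted.
func (l *leader) propose(data []byte) error {
	s := uint64(len(data))
	if l.maxUncommittedSize > 0 && l.uncommittedSize > 0 &&
		l.uncommittedSize+s > l.maxUncommittedSize {
		return errProposalDropped
	}
	l.uncommittedSize += s
	return nil
}

// commit releases quota for a committed payload, saturating at zero.
func (l *leader) commit(data []byte) {
	s := uint64(len(data))
	if s > l.uncommittedSize {
		l.uncommittedSize = 0
		return
	}
	l.uncommittedSize -= s
}

func main() {
	l := &leader{maxUncommittedSize: 8}
	p := []byte("123456")
	fmt.Println(l.propose(p)) // admitted
	fmt.Println(l.propose(p)) // would exceed the limit while nothing commits
	l.commit(p)
	fmt.Println(l.propose(p)) // admitted again once quota is released
}
```

A caller following the retry pattern would treat the dropped error as a signal to back off and re-propose later.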
Ben Darnell
08e88c6693
Merge pull request #10063 from tschottdorf/fix-commit-pagination
raft: fix correctness bug in CommittedEntries pagination
2018-10-02 12:39:29 -04:00
Peter Mattis
66ee394527 raft: fix Ready.MustSync logic
The previous logic was erroneously setting Ready.MustSync to true when
the hard state had not changed because we were comparing an empty hard
state to the previous hard state. In combination with another misfeature
in CockroachDB (unnecessary writing of empty batches), this was causing
a steady stream of synchronous writes to disk.
2018-09-19 16:33:16 -04:00
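The comparison problem described in the commit above can be sketched as follows. The `hardState` type and `mustSync` function are simplified, hypothetical versions written for illustration; the sketch assumes an unchanged hard state is reported to the caller as an empty value.

```go
package main

import "fmt"

// hardState is a simplified, hypothetical version of Raft's persistent state.
type hardState struct {
	Term, Vote, Commit uint64
}

// mustSync reports whether a synchronous disk write is needed: either new
// entries are being persisted or the term or vote changed.
func mustSync(cur, prev hardState, numEntries int) bool {
	return numEntries != 0 || cur.Vote != prev.Vote || cur.Term != prev.Term
}

func main() {
	prev := hardState{Term: 3, Vote: 1, Commit: 10}

	// Assumed behavior: an unchanged hard state is reported as empty.
	reported := hardState{}

	// Comparing the empty value with prev looks like a change.
	fmt.Println("empty vs prev:", mustSync(reported, prev, 0))

	// Comparing the actual current state with prev shows nothing changed.
	actual := prev
	fmt.Println("actual vs prev:", mustSync(actual, prev, 0))
}
```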
Gyuho Lee
c2b3c54370 raft: fix link typo
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-09-06 09:20:22 -07:00
Tobias Schottdorf
7a8ab37bfd raft: fix correctness bug in CommittedEntries pagination
In #9982, a mechanism to limit the size of `CommittedEntries` was
introduced. The way this mechanism worked was that it would load
applicable entries (passing the max size hint) and would emit a
`HardState` whose commit index was truncated to match the limitation
applied to the entries. Unfortunately, this was subtly incorrect
when the user-provided `Entries` implementation didn't exactly
match what Raft uses internally. Depending on whether a `Node` or
a `RawNode` was used, this would either lead to regressing the
HardState's commit index or outright forgetting to apply entries,
respectively.

Asking implementers to precisely match the Raft size limitation
semantics was considered but looks like a bad idea as it puts
correctness squarely in the hands of downstream users. Instead, this
PR removes the truncation of `HardState` when limiting is active
and tracks the applied index separately. This removes the old
paradigm (that the previous code tried to work around) that the
client will always apply all the way to the commit index, which
isn't true when commit entries are paginated.

See [1] for more on the discovery of this bug (CockroachDB's
implementation of `Entries` returns one more entry than Raft's when the
size limit hits).

[1]: https://github.com/cockroachdb/cockroach/issues/28918#issuecomment-418174448
2018-09-04 14:52:23 +02:00
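A minimal sketch of paginated application with a separately tracked applied index, as described in the commit above. `entry`, `limitSize`, and `applyAll` are hypothetical helpers written for this illustration, and the sketch assumes `log[i].Index == i+1` and `commit <= len(log)`.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for a Raft log entry.
type entry struct {
	Index uint64
	Data  []byte
}

// limitSize returns the longest prefix of ents whose total payload size is
// at most maxSize, but always at least one entry so progress is possible.
func limitSize(ents []entry, maxSize uint64) []entry {
	if len(ents) == 0 {
		return ents
	}
	size := uint64(len(ents[0].Data))
	n := 1
	for ; n < len(ents); n++ {
		size += uint64(len(ents[n].Data))
		if size > maxSize {
			break
		}
	}
	return ents[:n]
}

// applyAll hands out committed entries page by page. The commit index is
// never truncated; the applied index advances by whatever each page
// actually contained.
func applyAll(log []entry, commit, maxSize uint64) (pages int, applied uint64) {
	for applied < commit {
		page := limitSize(log[applied:commit], maxSize)
		applied = page[len(page)-1].Index
		pages++
	}
	return pages, applied
}

func main() {
	log := make([]entry, 5)
	for i := range log {
		log[i] = entry{Index: uint64(i + 1), Data: make([]byte, 4)}
	}
	pages, applied := applyAll(log, 5, 8)
	fmt.Println("pages:", pages, "applied:", applied)
}
```

Because the applied index is derived from the entries actually returned, a storage implementation that returns one entry more or fewer per page still leads to every committed entry being applied exactly once in this sketch.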
Gyuho Lee
bb60f8ab1d raft: change import paths to "go.etcd.io/etcd"
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2018-08-28 17:47:52 -07:00
Zhao Haiyuan
6ee880eb5b raft: fix typo in test 2018-08-22 23:48:47 +08:00
Xiang Li
11dd0b583b
Merge pull request #9982 from bdarnell/pagination
raft: Introduce CommittedEntries pagination
2018-08-11 09:12:46 +08:00
Ben Darnell
a9e7c1e11f raft: Make flow control more aggressive
We allow multiple in-flight append messages, but prior to this change
the only way we'd ever send them is if there is a steady stream of new
proposals. Catching up a follower that is far behind would be
unnecessarily slow (this is exacerbated by a quirk of CockroachDB's
use of raft which limits our ability to catch up via snapshot in some
cases).

See cockroachdb/cockroach#27983
2018-08-08 11:10:54 -04:00
Ben Darnell
0a670b7c9b raft: Introduce CommittedEntries pagination
The MaxSizePerMsg setting is now used to limit the size of
Ready.CommittedEntries. This prevents out-of-memory errors if the raft
log has become very large and commits all at once.
2018-08-07 12:54:34 -04:00
Ben Darnell
bc14deecca raft: Add a test for MaxSizePerMsg feature
Ensure that this limit is respected when generating MsgApp messages.
2018-08-06 16:52:16 -04:00
Nathan VanBenschoten
0a415cf0d6 raft: don't allocate slice and sort on every commit 2018-07-25 23:42:16 -04:00
Gyuho Lee
7aaaa0d82f raft: do not use underscore in var name
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-05 10:25:47 -07:00
Gyuho Lee
0249c39cb3 raft: remove unnecessary type conversion
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-07-05 10:12:45 -07:00
Ben Darnell
20422c5b4d raft: Really avoid scanning raft log in becomeLeader
I meant to do this in #9073, but sent the PR before it was finished.
The last log index is known directly; there is no need to fetch any
entries here.
2018-06-26 15:29:51 -04:00
Gyuho Lee
1136ba0e0d raft: fix logger variadic parameter
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-06-15 13:10:58 -07:00
Gyuho Lee
9054786553 Revert "raft: fix logger Panic variadic parameter"
This reverts commit 5a94aba33eeb504e7036a27268c67f6a1796445e.
2018-06-15 13:10:58 -07:00
sudeesh john
e07d19e549 raft: fix logger Panic variadic parameter
"# github.com/coreos/etcd/raft
raft/logger.go:117: missing ... in args forwarded to print-like function"

A new check was added to Go that verifies function parameters:
c006036075 (diff-8fa5b0d6191706747ef5773f895781c9)
2018-06-15 13:10:58 -07:00
Xiang Li
357308bfcd
Merge pull request #9679 from lorneli/lorneli-raft-dev
raft: describe the purpose of lockedRand
2018-05-26 22:03:18 -07:00
lorneli
a083282482 raft: describe the purpose of lockedRand
Struct lockedRand wraps rand.Rand with mutex lock because it's
accessed by multiple raft groups.
2018-05-26 21:59:24 +08:00
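A minimal sketch of the pattern described in the commit above: a `rand.Rand` guarded by a mutex so that several goroutines can share it. The `newLockedRand` constructor and the seed value are assumptions added for this example.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// lockedRand wraps a *rand.Rand with a mutex, because a *rand.Rand created
// with rand.New is not safe for concurrent use on its own.
type lockedRand struct {
	mu   sync.Mutex
	rand *rand.Rand
}

// newLockedRand is a hypothetical constructor for this sketch.
func newLockedRand(seed int64) *lockedRand {
	return &lockedRand{rand: rand.New(rand.NewSource(seed))}
}

// Intn returns a random int in [0, n) while holding the lock.
func (r *lockedRand) Intn(n int) int {
	r.mu.Lock()
	v := r.rand.Intn(n)
	r.mu.Unlock()
	return v
}

func main() {
	r := newLockedRand(42)
	var wg sync.WaitGroup
	// Several goroutines stand in for multiple raft groups sharing one source.
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = r.Intn(10)
		}()
	}
	wg.Wait()
	fmt.Println("sample:", r.Intn(10))
}
```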
Xiang Li
20cf7f4d5b
Merge pull request #9671 from lorneli/raft-test
raft: merge test cases of pre-candidate with the normal one
2018-05-24 08:27:07 -07:00
Gyuho Lee
e7adfb0ebf raft: use different parameters for tests
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-05-09 15:42:45 -07:00
lorneli
3d12e36c7e raft: merge test cases of pre-candidate with the normal one
So result checking just compares the expected with output and
becomes more readable.
2018-05-01 17:08:37 +08:00
Jia Zhan
d14b705355 raft: fix a few comments 2018-04-27 11:25:06 -07:00
Vincent Lee
f0dffb4163 raft: Propose in raft node waits for the proposal result so we can fail fast when dropping a proposal. 2018-04-03 11:04:09 +08:00
Kostas Christidis
438163feb4 raft: fix failing tests in rafttest
Tests in `rafttest` would fail because they referred to field `Id` instead of
`ID`. This PR fixes that.

Closes #9504.

Signed-off-by: Kostas Christidis <kostas@christidis.io>
2018-03-28 15:12:29 -04:00
Gyuho Lee
8aae8c1c9c raft: document disruptive rejoining server, add tests
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-03-06 09:54:29 -08:00
Gyuho Lee
d808b4686c raft: fix typo in raft_test.go
Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-26 10:03:25 -08:00
Gyuho Lee
01db389ea8 raft: document why reuse candidate's term for vote response in stepCandidate
"stepCandidate" should reuse candidate's own term, not term in Message,
because pre-vote is requested with future term.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-21 16:11:01 -08:00
Gyuho Lee
38846c220a raft: use leader's term when candidate becomes follower
`raft.Step` already ensures that when `m.Term > r.Term`,
candidate reverts back to follower with its term being
reset with `m.Term`, thus it's always true that
`m.Term == r.Term` in `stepCandidate`.

This just makes `r.becomeFollower` calls consistent.

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-21 16:10:52 -08:00
Gyuho Lee
2b7c12fb12 raft: reuse "last index" in "appendEntry"
No need to call "lastIndex" again.
"append" call already returns "lastIndex".

Signed-off-by: Gyuho Lee <gyuhox@gmail.com>
2018-02-05 21:26:45 -08:00
Xiang Li
d54f281b26
Merge pull request #8525 from shuaili87/pre-vote-compatible
raft: fix deadlock during PreVote migration process
2018-01-26 16:34:59 -08:00
Manjunath A Kumatagi
c27998db97 raft: fix govet errors 2018-01-25 04:51:38 -05:00
Ben Darnell
4e0291ff91 raft: Clarify conditions for granting votes and prevotes.
This includes one theoretical logic change: A node that knows the
leader of the current term will no longer grant votes, even if it has
not yet voted in this term. It also adds a `m.Type == MsgPreVote`
guard on the `m.Term > r.Term` check, which was previously thought to
be incorrect (see #8517) but was actually just unclear.

Closes #8517
Closes #8571
2018-01-23 15:05:11 -05:00
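The vote-granting conditions described in the commit above might be sketched as below. The `node` and `voteRequest` types and the `canVote` function are hypothetical simplifications; the check that the candidate's log is up to date is deliberately left out.

```go
package main

import "fmt"

// none marks "no vote cast" or "no known leader" in this sketch.
const none uint64 = 0

// node holds the state that matters for granting a vote.
type node struct {
	term, vote, lead uint64
}

// voteRequest is a simplified vote or pre-vote message.
type voteRequest struct {
	from, term uint64
	preVote    bool
}

// canVote reports whether n may grant the request, ignoring log freshness.
func canVote(n node, m voteRequest) bool {
	return n.vote == m.from || // repeat of a vote already cast for this sender
		(n.vote == none && n.lead == none) || // no vote cast and no known leader
		(m.preVote && m.term > n.term) // pre-vote for a future term
}

func main() {
	knowsLeader := node{term: 2, vote: none, lead: 3}
	fresh := node{term: 2}

	// A node that knows the leader of the current term does not grant a vote,
	// even though it has not voted in this term.
	fmt.Println(canVote(knowsLeader, voteRequest{from: 1, term: 2}))
	// A node with no vote and no known leader may grant it.
	fmt.Println(canVote(fresh, voteRequest{from: 1, term: 2}))
	// A pre-vote for a future term is considered regardless of the leader.
	fmt.Println(canVote(knowsLeader, voteRequest{from: 1, term: 3, preVote: true}))
}
```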
Kostas Christidis
97fad42d81 docs: fix invalid reference in Raft README
Code snippet in Raft README refers to non-existent field `State`. Fixed
the reference by setting it to `HardState`.
2018-01-17 16:03:32 -05:00