We are worried about breaking backwards compatibility for any
application out there that may have relied on the old behavior. Their
RawNode invocation would have been broken by the removal of the peers
argument, so it would not have changed silently; an associated comment
tells callers how to fix it.
This is the first (maybe not last) step in cleaning up the bootstrap
code around StartNode.
Initializing a Raft group for the first time is awkward, since a
configuration has to be pulled from thin air. The way this is solved
today is unclean: The app is supposed to pass peers to StartNode(),
we add configuration changes for them to the log, immediately pretend
that they are applied, but actually leave them unapplied (to give the
app a chance to observe them, though if the app did decide to not apply
them things would really go off the rails), and then return control to
the app. The app will then process the initial Readys and as a result
the configuration will be persisted to disk; restarts of the node then
use RestartNode which doesn't take any peers.
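For context, here is a minimal sketch of that pattern as an application sees it, assuming a MemoryStorage and the go.etcd.io/etcd/raft import path (which varies by version); tick values and the peer list are illustrative:

```go
package main

import "go.etcd.io/etcd/raft"

func startOrRestart(id uint64, storage *raft.MemoryStorage, firstBoot bool, peers []raft.Peer) raft.Node {
	c := &raft.Config{
		ID:              id,
		ElectionTick:    10,
		HeartbeatTick:   1,
		Storage:         storage,
		MaxSizePerMsg:   1 << 20,
		MaxInflightMsgs: 256,
	}
	if firstBoot {
		// First start: the configuration is pulled "from thin air" by passing
		// the peer list; StartNode appends the corresponding conf changes to
		// the log, and processing the initial Readys persists them.
		return raft.StartNode(c, peers)
	}
	// Any later start: the configuration already lives in Storage, so no
	// peers are passed.
	return raft.RestartNode(c)
}
```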
The code that did this lived awkwardly in two places fairly deep down
the callstack, though it was really only necessary in StartNode(). This
commit refactors things to make this more obvious: only StartNode does
this dance now. In particular, RawNode does not support this at all any
more; it expects the app to set up its Storage correctly.
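As a rough illustration of what "setting up its Storage correctly" can mean, an application might seed a MemoryStorage with a snapshot carrying the bootstrap membership before constructing the RawNode. This is only a sketch, and the exact ConfState field names (e.g. Voters vs. Nodes) differ between raft versions:

```go
import (
	"go.etcd.io/etcd/raft"
	"go.etcd.io/etcd/raft/raftpb"
)

// preseedStorage is a sketch: it injects the initial membership via a
// snapshot so that a RawNode can be created without the removed peers
// argument. Field names follow recent raft versions and may differ.
func preseedStorage(voters []uint64) (*raft.MemoryStorage, error) {
	s := raft.NewMemoryStorage()
	err := s.ApplySnapshot(raftpb.Snapshot{
		Metadata: raftpb.SnapshotMetadata{
			Index:     1,
			Term:      1,
			ConfState: raftpb.ConfState{Voters: voters},
		},
	})
	return s, err
}
```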
Future work may provide helpers to make this "preseeding" of the Storage
more user-friendly. It isn't entirely straightforward to do so since
the Storage interface doesn't provide the right accessors for this
purpose. Briefly speaking, we want to make sure that a non-bootstrapped
node can never catch up via the log so that we can implicitly use one
of the "skipped" log entries to represent the configuration change into
the bootstrap configuration. This is an invasive change that affects
all consumers of raft, and it is of lower urgency since the code (post
this commit) already encapsulates the complexity sufficiently.
It has always bugged me that any new feature essentially needed to be
tested twice due to the two ways in which apps can use raft (`*node` and
`*RawNode`). Due to upcoming testing work for joint consensus, now is a
good time to rectify this somewhat.
This commit removes most logic from `(*node).run` and uses `*RawNode`
internally. This simplifies the logic and also led (via debugging) to
some insight on how the semantics of the approaches differ, which is now
documented in the comments.
Put all the logic related to applying a configuration change in one
place in preparation for adding joint consensus.
This inspired various TODOs.
I had to rewrite TestSnapshotSucceedViaAppResp since it was relying
on a snapshot applied to the leader, which is now prevented.
Mechanically extract `progressTracker`, `Progress`, and `inflights`
to their own package named `tracker`. Add lots of comments in the
process, and take the opportunity to rename and clarify various
fields.
The suggested pattern for Raft proposals is that they be retried
periodically until they succeed. This turns out to be an issue
when a leader cannot commit entries because the leader will continue
to append re-proposed entries to its log without committing anything.
This can result in the uncommitted tail of a leader's log growing
without bound until it is able to commit entries.
This change adds a safeguard to protect against this case where a
leader's log can grow without bound during loss-of-quorum scenarios.
It does so by introducing a new, optional `MaxUncommittedEntriesSize`
configuration. This config limits the max aggregate size of uncommitted
entries that may be appended to a leader's log. Once this limit
is exceeded, proposals will begin to return ErrProposalDropped
errors.
See cockroachdb/cockroach#27772
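A sketch of how an application opts in and handles the new error, assuming the usual raft imports; the concrete limit and the retry policy are illustrative:

```go
// newConfig caps the uncommitted portion of the leader's log.
func newConfig(id uint64, storage raft.Storage) *raft.Config {
	return &raft.Config{
		ID:              id,
		ElectionTick:    10,
		HeartbeatTick:   1,
		Storage:         storage,
		MaxSizePerMsg:   1 << 20,
		MaxInflightMsgs: 256,
		// Aggregate byte limit on uncommitted entries in the leader's log;
		// zero keeps the previous unbounded behavior.
		MaxUncommittedEntriesSize: 1 << 30,
	}
}

// propose submits data and tells the caller whether it should re-propose
// later because the proposal was dropped.
func propose(ctx context.Context, n raft.Node, data []byte) (retry bool, err error) {
	err = n.Propose(ctx, data)
	if err == raft.ErrProposalDropped {
		// The leader refused the proposal, e.g. because its uncommitted log
		// exceeded MaxUncommittedEntriesSize; back off and re-propose later.
		return true, nil
	}
	return false, err
}
```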
The previous logic was erroneously setting Ready.MustSync to true when
the hard state had not changed because we were comparing an empty hard
state to the previous hard state. In combination with another misfeature
in CockroachDB (unnecessary writing of empty batches), this was causing
a steady stream of synchronous writes to disk.
In #9982, a mechanism to limit the size of `CommittedEntries` was
introduced. The way this mechanism worked was that it would load
applicable entries (passing the max size hint) and would emit a
`HardState` whose commit index was truncated to match the limitation
applied to the entries. Unfortunately, this was subtly incorrect
when the user-provided `Entries` implementation didn't exactly
match what Raft uses internally. Depending on whether a `Node` or
a `RawNode` was used, this would either lead to regressing the
HardState's commit index or outright forgetting to apply entries,
respectively.
Asking implementers to precisely match the Raft size limitation
semantics was considered but looks like a bad idea as it puts
correctness squarely in the hands of downstream users. Instead, this
PR removes the truncation of `HardState` when limiting is active
and tracks the applied index separately. This removes the old
paradigm (that the previous code tried to work around) that the
client will always apply all the way to the commit index, which
isn't true when committed entries are paginated.
See [1] for more on the discovery of this bug (CockroachDB's
implementation of `Entries` returns one more entry than Raft's when the
size limit hits).
[1]: https://github.com/cockroachdb/cockroach/issues/28918#issuecomment-418174448
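For reference, here is a sketch of the size-limiting rule raft applies internally, which downstream `Entries` implementations would have had to mirror exactly: include entries while the running byte size stays within maxSize, but always return at least one entry when any are available. This is an illustration of the rule, not the package's actual code:

```go
// limitSize keeps a prefix of ents whose cumulative Size() fits in maxSize,
// but never drops the first entry, so progress is always possible.
func limitSize(ents []raftpb.Entry, maxSize uint64) []raftpb.Entry {
	if len(ents) == 0 {
		return ents
	}
	size := ents[0].Size()
	limit := 1
	for ; limit < len(ents); limit++ {
		size += ents[limit].Size()
		if uint64(size) > maxSize {
			break
		}
	}
	return ents[:limit]
}
```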
The MaxSizePerMsg setting is now used to limit the size of
Ready.CommittedEntries. This prevents out-of-memory errors if the raft
log has become very large and commits all at once.
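With that limit in place, a large committed backlog is simply delivered across several Readys; the usual consumption loop needs no changes. A sketch, with HardState and snapshot persistence as well as message sending elided, and with a caller-provided apply function:

```go
// readyLoop is a sketch of the standard Ready consumption loop.
func readyLoop(n raft.Node, storage *raft.MemoryStorage, apply func(raftpb.Entry)) {
	for {
		rd := <-n.Ready()
		// Persist new entries (a real loop also persists HardState/Snapshot).
		if err := storage.Append(rd.Entries); err != nil {
			panic(err)
		}
		// With MaxSizePerMsg set, CommittedEntries is capped per Ready; the
		// remainder shows up in later Readys after Advance, so a very large
		// committed backlog is applied in chunks instead of all at once.
		for _, ent := range rd.CommittedEntries {
			apply(ent)
		}
		n.Advance()
	}
}
```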
Scanning the uncommitted portion of the raft log to determine whether
there are any pending config changes can be expensive. In
cockroachdb/cockroach#18601, we've seen that a new leader can spend so
much time scanning its log post-election that it fails to send
its first heartbeats in time to prevent a second election from
starting immediately.
Instead of tracking whether a pending config change exists with a
boolean, this commit tracks the latest log index at which a pending
config change *could* exist. This is a less expensive solution to
the problem, and the impact of false positives should be minimal since
a newly-elected leader should be able to quickly commit the tail of
its log.
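Conceptually the change looks like the following self-contained sketch; the real raft fields and methods are named and wired differently:

```go
// confTracker illustrates tracking the latest index at which a pending conf
// change *could* exist, instead of a boolean flag.
type confTracker struct {
	pendingConfIndex uint64 // highest index that might hold an unapplied conf change
	applied          uint64 // index of the last applied entry
	lastIndex        uint64 // index of the last entry in the log
}

// onBecomeLeader is conservative: assume a conf change could be anywhere in
// the existing log, which avoids scanning the uncommitted tail entirely.
func (c *confTracker) onBecomeLeader() {
	c.pendingConfIndex = c.lastIndex
}

// canProposeConfChange returns false while an earlier conf change might still
// be unapplied. A false positive only delays the proposal until the
// newly-elected leader has applied its log tail.
func (c *confTracker) canProposeConfChange() bool {
	return c.pendingConfIndex <= c.applied
}

// onConfChangeAppended records where a newly proposed conf change landed.
func (c *confTracker) onConfChangeAppended(index uint64) {
	c.pendingConfIndex = index
}
```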
If the node is stopped, Status can hang forever because there is no
event loop to answer. So, just return an empty status to avoid deadlocks.
Fixes #6855
Signed-off-by: Alexander Morozov <lk4d4math@gmail.com>
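The shape of the fix, roughly sketched from the description above; the channel names mirror raft's internals but are shown only as an illustration:

```go
// Status asks the run loop for the current status, but falls back to an
// empty Status once the node has been stopped and no event loop remains.
func (n *node) Status() Status {
	c := make(chan Status)
	select {
	case n.status <- c: // handled by the run loop, which replies on c
		return <-c
	case <-n.done: // node stopped: avoid blocking forever
		return Status{}
	}
}
```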
The relevant structures are properly aligned; however, there is no comment
highlighting the need to keep them aligned, as is present elsewhere in the
codebase.
This adds a note to keep the alignment, in line with similar comments in the codebase.
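The note follows the usual Go convention for 64-bit fields accessed atomically; the struct and field names below are illustrative:

```go
type counters struct {
	// Keep 64-bit fields at the top of the struct: they must be 64-bit
	// aligned for atomic access, which is only guaranteed for the first
	// word on 32-bit platforms (386, ARM).
	applied uint64
	term    uint64

	// Other, smaller fields follow.
	paused bool
}
```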
Those three log statements in node.go have not been using the logger that was passed via `raft.Config`, but instead the default raft logger. This changes them to use the proper logger.
raft node should set the initial prev hard state to empty.
Otherwise it will not send the first hard state to the application
until the state changes again.
This commit fixes the issue. It introduces a small overhead, in that
the same state might be sent to the application twice when restarting.
But this is fine.
raft relies on the link layer to report the status of the sent snapshot.
If the snapshot is still sending, the replication to that remote peer will
be paused. If the snapshot finishes sending, the replication will begin
optimistically after electionTimeout. If the snapshot fails, raft will
try to resend it.
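In practice the application's transport is what reports this back, e.g. via Node.ReportSnapshot. A sketch, where Transport stands in for the application's own snapshot-sending machinery:

```go
// Transport is a stand-in for the application's snapshot transport.
type Transport interface {
	Send(to uint64, m raftpb.Message) error
}

// sendSnapshot streams a snapshot message to a follower and tells raft how it
// went, so replication to that peer can resume (or the snapshot be retried).
func sendSnapshot(n raft.Node, t Transport, to uint64, m raftpb.Message) {
	status := raft.SnapshotFinish
	if err := t.Send(to, m); err != nil {
		status = raft.SnapshotFailure
	}
	n.ReportSnapshot(to, status)
}
```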