232 Commits

Author SHA1 Message Date
Xiang Li
1279e495f0 raft: make the repeated log message under bad path debug level 2015-06-05 17:29:24 -07:00
Xiang Li
1561b85bf3 raft: drop the raft prefix in logging 2015-06-02 12:50:42 -07:00
Xiang Li
b3fb052ad4 raft: make peers a prviate field in raft.Config 2015-03-24 11:10:07 -07:00
Xiang Li
d9b5b56c82 raft: make raft configurable 2015-03-23 09:55:19 -07:00
Xiang Li
4a64373225 raft: add flow control for progress
Each progress has a inflighs sliding window. When the progress
is in replicate state, inflights will control the sending speed
of the leader.

The leader can have at most maxInflight number of inflight
messages for each replicate progress. Receving a appResp moves
forward the sliding window. Heartbeat response free one
slot if the window is full.
2015-03-20 20:04:33 -07:00
Xiang Li
2adb58f9de raft: move progress to progress.go 2015-03-19 10:05:04 -07:00
Xiang Li
7571b2cde2 raft: limit the size of msgApp
limit the max size of entries sent per message.
Lower the cost at probing state as we limit the size per message;
lower the penalty when aggressively decrease to a too low next.
2015-03-18 15:59:30 -07:00
Yicheng Qin
67194c0b22 raft: introduce progress states 2015-03-18 08:16:32 -07:00
Xiang Li
39731724ff Merge pull request #2485 from yichengq/337
raft: fall back to bad path when unreachable
2015-03-11 14:16:39 -07:00
Yicheng Qin
be0bf2a2bd raft: fall back to bad path when unreachable 2015-03-11 13:21:23 -07:00
Xiang Li
c643967a41 raft: reply with the commit index when receives a smaller append message
Follower should not reject the append message with a smaller index than its commit
index. Or it will trigger the leader's resending logic, which might have a high cost.
2015-03-10 22:32:36 -07:00
Xiang Li
a2be25cba4 Merge pull request #2460 from xiang90/raft-logger
raft: introduce logger interface
2015-03-09 08:00:21 -07:00
Xiang Li
97579e2e1d raft: introduce logger interface 2015-03-08 21:36:32 -07:00
Xiang Li
7fe608532a raft: do not reset vote if term is not changed
raft MUST keep the voting information for the same term. reset
should not reset vote if term is not changed.
2015-03-07 22:31:20 -08:00
Yicheng Qin
b4b9b9118a rafthttp: report MsgSnap status 2015-03-02 09:38:11 -08:00
Yicheng Qin
09f181f585 raft: log unreachable remote node 2015-03-01 16:47:49 -08:00
Xiang Li
9b4d52ee73 raft: do not resend snapshot if not necessary
raft relies on the link layer to report the status of the sent snapshot.
If the snapshot is still sending, the replication to that remote peer will
be paused. If the snapshot finish sending, the replication will begin
optimistically after electionTimeout. If the snapshot fails, raft will
try to resend it.
2015-02-28 11:41:58 -08:00
Xiang Li
2185ac5ac8 raft: cleanup unreachable 2015-02-28 11:35:16 -08:00
Xiang Li
2af33fd494 raft: add reportUnreachable 2015-02-28 10:45:22 -08:00
Xiang Li
bff2ccaa22 Merge pull request #2170 from xiang90/remove_log
raft: remove default verbose logging
2015-01-27 15:58:53 -08:00
Xiang Li
553379e82b raft: remove default verbose logging 2015-01-27 15:57:44 -08:00
Ben Darnell
33d2400063 raft: Send any waiting appends after receiving MsgAppResp.
This addresses a problem that comes up in the cockroach tests,
in which the order of messages may lead to deadlocks (due to
the fact that we don't have regular heartbeat timers in most
of our tests).
2015-01-27 17:43:29 -05:00
Jonathan Boulle
f1ed69e883 *: switch to line comments for copyright
Build tags are not compatible with block comments.
Also adds copyright header to a few places it was missing.
2015-01-26 09:53:30 -08:00
Ben Darnell
8c3a6508e9 raft: Add applied to the newRaft log message. 2015-01-22 12:04:40 -05:00
Ben Darnell
59214978a2 raft: Add applied index as an argument to newRaft and RestartNode. 2015-01-22 11:38:05 -05:00
Xiang Li
003b97a60f raft: public progress struct in raft 2015-01-20 10:26:22 -08:00
Ben Darnell
2e1c36cdd9 raft: introduce MsgHeartbeatResp.
Now that heartbeats are distinct from MsgApp{,Resp}, the retries
currently performed in stepLeader's MsgAppResp section are only
performed on an actual MsgAppResp (or a new MsgProp). This means
that it may take a long time to recover from a dropped MsgAppResp
in a quiet cluster.

This commit adds a dedicated heartbeat response message. This message
does not convey the follower's current log position because the
MsgHeartbeat does not include the leaders term and index. Upon receipt
of a heartbeat response, the leader may retry the latest MsgApp if it
believes the follower to be behind.
2015-01-14 17:34:10 -05:00
Ben Darnell
9972e62d94 raft: Use <= instead of < for heartbeat ticks.
In code outside the raft package, we cannot call raft.bcastHeartbeat
directly. Instead, to control heartbeats we set heartbeatInterval to 1
and call Tick().
2015-01-14 15:27:32 -05:00
Xiang Li
35b907ac58 raft: add lastIndex as rejectHint
Add the lastindex of the raft log as reject hint, so the leader can
bypass the greater index probing and decrease the next index directly
to last + 1.
2015-01-01 19:04:07 -08:00
Xiang Li
fc96a9e4a7 raft: remove unnecessary funcs in raft.go 2014-12-25 17:04:33 -08:00
Xiang Li
88767d913d raft: leader waits for the reply of previous message when follower is not in good path.
It is reasonable for the leader to wait for the reply before sending out the next
msgApp or msgSnap for the follower in bad path. Or the leader will send out useless
messages if the previous message is rejected or the previous message is a snapshot.
Especially for the snapshot case, the leader will be 100% to send out duplicate message
including the snapshot, which is a huge waste.

This commit implement a timeout based wait mechanism. The timeout for normal msgApp is a
heartbeatTimeout and the timeout for snapshot is electionTimeout(snapshot is larger). We
can implement a piggyback mechanism(application notifies the msg lost) in the future
if necessary.
2014-12-18 15:01:50 -08:00
Xiang Li
044e35b814 raft: use newRaft 2014-12-15 11:25:35 -08:00
Xiang Li
c586d5012c raft: log term as %d 2014-12-14 10:06:45 -08:00
Xiang Li
2c2e032155 Merge pull request #1908 from bdarnell/error-fixes
raft: remove panic when we see a proposal with no leader.
2014-12-11 13:58:51 -08:00
Ben Darnell
b26856b603 raft: add detail to "no leader" log message 2014-12-11 15:07:32 -05:00
Xiang Li
89cba625d6 Merge pull request #1897 from xiang90/raft
raft: get rid of the using of defer in critical path
2014-12-10 21:24:38 -08:00
Yicheng Qin
e89cc25c50 Merge pull request #1901 from yichengq/260
rafthttp: batch MsgProp
2014-12-10 21:16:07 -08:00
Yicheng Qin
3867c72c8a raft: support to do multiple proposals in one message 2014-12-10 20:00:59 -08:00
Ben Darnell
fa247d09cc raft: remove panic when we see a proposal with no leader.
This panic can never be reached when using raft.Node, because we only
read from propc when there is a leader. However, it is possible to see
this error when using raft the raft object directly (as in MultiNode),
and in this case it is better to simply drop the proposal (as if we had
sent it to a leader that immediately vanished).

Add an error return to MemoryStorage.Append for consistency.
2014-12-10 17:34:40 -05:00
Xiang Li
96de9776b7 raft: get rid of allocation 2014-12-10 13:41:04 -08:00
Xiang Li
a5efbf826d raft: drop nodes in softState 2014-12-09 11:43:52 -08:00
Xiang Li
ba45637ba3 raft: group step funcs 2014-12-08 15:29:54 -08:00
Xiang Li
099f4f10ea raft: one line 2014-12-08 15:28:48 -08:00
Xiang Li
8ead428e76 raft: group getter funcs 2014-12-08 15:24:34 -08:00
Xiang Li
f73d059d80 raft: group configuration related funcs 2014-12-08 15:23:21 -08:00
Xiang Li
25313b1210 raft: move poll close to campaign 2014-12-08 15:21:57 -08:00
Xiang Li
d52c66ad42 raft: removed unused func 2014-12-08 15:20:43 -08:00
Xiang Li
62ed1de10d raft: refactoring logging 2014-12-08 15:16:02 -08:00
Xiang Li
6cb7f2d9e9 raft: print out log when creating a newraft 2014-12-08 14:37:39 -08:00
Ben Darnell
ea4d645a83 raft: Ignore redundant addNode calls.
This avoids clobbering any state when bootstrapping entries are
applied twice.
2014-12-05 17:15:50 -05:00