165 Commits

Author SHA1 Message Date
Yicheng Qin
8ee1bf31d6 raft: use IsEmptySnap to check the empty snapshot 2014-11-25 00:37:21 -08:00
Ben Darnell
9ddd8ee539 Rename Storage.HardState back to InitialState and include ConfState.
This fixes integration/migration_test.go (and highlights the fact that
we need some more raft-level testing of restoring from snapshots).
2014-11-21 17:22:20 -05:00
Ben Darnell
0d680d0e6b Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  rafthttp: fix import
  raft: should not decrease match and next when handling out of order msgAppResp
  Fix migration to allow snapshots to have the right IDs
  add snapshotted integration test
  fix test import loop
  fix import loop, add set to types, and fix comments
  etcdserver: autodetect v0.4 WALs and upgrade them to v0.5 automatically
  wal: add a bench for write entry
  rafthttp: add streaming server and client
  dep: use vendored imports in codegangsta/cli
  dep: bump golang.org/x/net/context

Conflicts:
	etcdserver/server.go
	etcdserver/server_test.go
	migrate/snapshot.go
2014-11-21 15:40:11 -05:00
Xiang Li
063c5c77a0 raft: should not decrease match and next when handling out of order msgAppResp 2014-11-20 17:58:23 -08:00
Ben Darnell
b29240baf0 Merge remote-tracking branch 'coreos/master' into merge
* coreos/master:
  scripts: build-docker tag and use ENTRYPOINT
  scripts: build-release add etcd-migrate
  create .godir
  raft: optimistically increase the next if the follower is already matched
  raft: add handleHeartbeat handleHeartbeat commits to the commit index in the message. It never decreases the commit index of the raft state machine.
  rafthttp: send takes raft message instead of bytes
  *: add rafthttp pkg into test list
  raft: include commitIndex in heartbeat
  rafthttp: move server stats in raftHandler to etcdserver
  *: etcdhttp.raftHandler -> rafthttp.RaftHandler
  etcdserver: rename sender.go -> sendhub.go
  *: etcdserver.sender -> rafthttp.Sender

Conflicts:
	raft/log.go
	raft/raft_paper_test.go
2014-11-19 17:05:16 -05:00
Ben Darnell
355ee4f393 raft: Integrate snapshots into the raft.Storage interface.
Compaction is now treated as an implementation detail of Storage
implementations; Node.Compact() and related functionality have been
removed. Ready.Snapshot is now used only for incoming snapshots.

A return value has been added to ApplyConfChange to allow applications
to track the node information that must be stored in the snapshot.

raftpb.Snapshot has been split into Snapshot and SnapshotMetadata, to
allow the full snapshot data to be read from disk only when needed.

raft.Storage has new methods Snapshot, ApplySnapshot, HardState, and
SetHardState. The Snapshot and HardState parameters have been removed
from RestartNode() and will now be loaded from Storage instead.
The only remaining difference between StartNode and RestartNode is that
the former bootstraps an initial list of Peers.
2014-11-19 16:40:26 -05:00
Xiang Li
b50f331558 Merge pull request #1744 from xiang90/next
raft: optimistically increase the next if the follower is already matched
2014-11-19 13:21:11 -08:00
Xiang Li
4c1fd07311 raft: optimistically increase the next if the follower is already matched
This is useful since we want to pipeline the appendEntry requests. Without
enabling optimistic increasing, the second pipelining appendEntry request
will include the entries the first one has already sent out. We decrease
the next directly to match if the leader receives a rejection for a matched
follower. This happens if one pipelining request get lost and following ones
arrives at the follower.
2014-11-18 13:41:38 -08:00
Xiang Li
bd4cfa2a07 raft: add handleHeartbeat
handleHeartbeat commits to the commit index in the message. It never decreases the
commit index of the raft state machine.
2014-11-18 08:34:06 -08:00
Xiang Li
b93d87f17f raft: include commitIndex in heartbeat 2014-11-17 16:19:28 -08:00
Ben Darnell
64d9bcabf1 Add Storage.Term() method and hide the first entry from other methods.
The first entry in the log is a dummy which is used for matchTerm
but may not have an actual payload. This change permits Storage
implementations to treat this term value specially instead of
storing it as a dummy Entry.

Storage.FirstIndex() no longer includes the term-only entry.

This reverses a recent decision to create entry zero as initially
unstable; Storage implementations are now required to make
Term(0) == 0 and the first unstable entry is now index 1.
stableTo(0) is no longer allowed.
2014-11-17 16:54:12 -05:00
Ben Darnell
b29c512f50 Merge remote-tracking branch 'coreos/master' into log-storage-interface
* coreos/master: (27 commits)
  pkg/wait: move wait to pkg/wait
  etcdserver: do not add/remove/update local member to/from sender hub
  etcdserver: not record attributes when add member
  raft: add a test for proposeConfChange
  raft: block Stop() on n.done, support idempotency
  raft: add a test for node proposal
  integration: add increase cluster size test
  integration: remove unnecessary t.Testing argument
  raft: stop the node synchronously
  integration: fix test to propagate NewServer errors
  etcdserver: move peer URLs check to config
  etcdserver: ensure initial-advertise-peer-urls match initial-cluster
  raft: add a test for node.Tick
  raft: add comment string for TestNodeStart
  etcdserver: use member instead of node at etcd level
  raft: nodes return sorted ids
  raft: update unstable when calling stableTo with 0
  *: support updating advertise-peer-url Users might want to update the peerurl of the etcd member in several cases. For example, if the IP address of the physical machine etcd running on is changed, user need to update the adversite-pee-rurl accordingly. This commit makes etcd support updating the advertise-peer-url of its members.
  transport: create a tls listener only if the tlsInfo is not empty and the scheme is HTTPS
  etcdserver: use member pointer for all tests
  ...

Conflicts:
	etcdserver/server.go
	raft/log.go
	raft/log_test.go
	raft/node.go
2014-11-13 14:21:09 -05:00
Ben Darnell
54b07d7974 Remove raft.loadEnts and the ents parameter to raft.RestartNode.
The initial entries are now provided via the Storage interface.
2014-11-12 18:31:19 -05:00
Yicheng Qin
78cbb1512c raft: nodes return sorted ids
This makes raft.softState return the same result when its soft state is
not changed.
2014-11-11 22:58:15 -08:00
Ben Darnell
25b6590547 raft: introduce log storage interface.
This change splits the raftLog.entries array into an in-memory
"unstable" list and a pluggable interface for retrieving entries that
have been persisted to disk. An in-memory implementation of this
interface is provided which behaves the same as the old version;
in a future commit etcdserver could replace the MemoryStorage with
one backed by the WAL.
2014-11-10 17:40:39 -05:00
Jonathan Boulle
b99633207c raft: minor cleanup in comments 2014-10-30 10:01:44 -07:00
Jonathan Boulle
0cf0cb3d02 raft: add tests for progress.maybeDecr 2014-10-29 22:57:25 -07:00
Jonathan Boulle
81304b2b7e raft: move test helper function into tests 2014-10-29 14:04:45 -07:00
Xiang Li
74c257f63d Merge pull request #1419 from xiangli-cmu/raft_log_test
raft: add test for findConflict
2014-10-27 14:30:36 -07:00
Yicheng Qin
b986a52579 raft: use raft-specific rand.Rand instead of global one 2014-10-27 12:32:11 -07:00
Xiang Li
94f701cf95 raft: refactor isUpToDate and add a test 2014-10-25 20:34:14 -07:00
Soheil Hassas Yeganeh
09e9618b02 raft: change raftLog.maybeAppend to return the last new index
As per @unihorn's comment on #1366, we change raftLog.maybeAppend to
return the last new index of entries in maybeAppend.
2014-10-23 15:42:47 -04:00
Soheil Hassas Yeganeh
233617bea2 raft: Make MsgAppRes ack only the last index in MsgApp
As explained in #1366, the leader will fail to transmit the missed
logs if the leader receives a hearbeat response from a follower
that is not yet matched in the leader. In other words, there are
append responses that do not explicitly reject an append but
implied a gap.

This commit is based on @xiangli-cmu's idea. We should only acknowledge
upto the index of logs in the append message. This way responses to
heartbeats would never interfer with the log synchronization because
their log index is always 0.

Fixes #1366
2014-10-23 14:56:17 -04:00
Yicheng Qin
e200d2a8e2 etcdserver/raft: remove msgDenied, removedNodes, shouldStop
The future plan is to do all these in etcdserver level.
2014-10-20 15:13:18 -07:00
Jonathan Boulle
7a4d42166b *: add license header to all source files 2014-10-17 15:41:22 -07:00
Jonathan Boulle
8168fed825 etcdserver: add ServerStats and LeaderStats
This adds the remaining two stats endpoints: `/v2/stats/self`, for
various statistics on the EtcdServer, and `/v2/stats/leader`, for
statistics on a leader's followers.

By and large most of the stats code is copied across from 0.4.x, updated
where necessary to integrate with the new decoupling of raft from
transport.

This does not satisfactorily resolve the question of name vs ID. In the
old world, names were unique in the cluster and transmitted over the
wire, so they could be used safely in all statistics. In the new world,
a given EtcdServer only knows its own name, and it is instead IDs that
are communicated among the cluster members. Hence in most places here we
simply substitute a string-encoded ID in place of name, and only where
possible do we retain the actual given name of the EtcdServer.
2014-10-16 10:43:44 -07:00
Yicheng Qin
8cd6030a1d etcdserver: add checking when apply conf change 2014-10-16 09:49:26 -07:00
Yicheng Qin
eb2dd1892f raft: add RemovedNodes to SoftState 2014-10-15 10:53:07 -07:00
Yicheng Qin
32c38820c1 raft: protobuf messageType 2014-10-13 11:13:43 -07:00
Xiang Li
1eb1020717 raft: fix raft test 2014-10-09 14:42:29 +08:00
Xiang Li
af5b8c6c44 raft: int64 -> uint64 2014-10-09 14:26:43 +08:00
Xiang Li
38af14b0f4 Merge pull request #1260 from coreos/snap_rm
raft: save removed nodes in snapshot
2014-10-09 08:00:33 +08:00
Xiang Li
abe97e49d5 raft: more comment 2014-10-09 07:02:05 +08:00
Xiang Li
73f2aaf98f raft: removedSlice -> removedNodes 2014-10-09 06:55:25 +08:00
Xiang Li
c67fd14fe8 Merge pull request #1257 from bdarnell/cleanups
Raft: assorted cleanups (golint and go vet)
2014-10-09 05:55:21 +08:00
Xiang Li
7b61565c0a raft: save removed nodes in snapshot 2014-10-08 15:33:55 +08:00
Xiang Li
1cd3345e00 raft: address issues with election timeout 2014-10-08 07:41:17 +08:00
Ben Darnell
1083ce8f73 raft: remove misleading labels in array definition
Since these are arrays instead of maps, the "keys" here are actually
(useless) goto labels. What really matters is that the ordering is
the same between the constant declarations and the array.
2014-10-07 18:44:06 -04:00
Ben Darnell
3ad0df3722 Raft: fix problems reported by golint. 2014-10-07 18:44:06 -04:00
Xiang Li
d7d6f84f64 raft: rand election timeout 2014-10-07 20:12:49 +08:00
Xiang Li
45e4a8643a raft: add tests for raft.compact 2014-10-07 16:03:11 +08:00
Xiang Li
5587e0d73f raft: compact takes index and nodes parameters
Before this commit, compact always compact log at current appliedindex of raft.
This prevents us from doing non-blocking snapshot since we have to make snapshot
and compact atomically. To prepare for non-blocking snapshot, this commit make
compact supports index and nodes parameters. After completing snapshot, the applier
should call compact with the snapshot index and the nodes at snapshot index to do
a compaction at snapsohot index.
2014-10-07 16:03:11 +08:00
Xiang Li
dc9cb4b4ba raft: fix send
send should not attach current term to msgProp. Send should simply do proxy for msgProp without
changing its term. msgProp has a special term 0, which indicates that it is a local message.
2014-10-06 04:48:35 +08:00
Xiang Li
70bf464cd6 raft: add comment to decrTo 2014-10-03 13:42:34 +08:00
Xiang Li
16ba77767e raft: do not decrease nextIndex and send entries for stale reply 2014-10-03 13:41:27 +08:00
Yicheng Qin
8490904f20 Merge pull request #1224 from unihorn/149
raft: msg.Denied -> msg.Reject
2014-10-02 12:24:09 -07:00
Yicheng Qin
fff918c672 raft: msg.Denied -> msg.Reject
Change the field name because it has msgDenied already.
2014-10-02 12:22:12 -07:00
Yicheng Qin
e4a6c9651a raft: add removed
The usage of removed:
1. tell removed node about its removal explicitly using msgDenied
2. prevent removed node disrupt cluster progress by launching leader election

It is set when apply node removal, or receive msgDenied.
2014-10-01 14:57:38 -07:00
Xiang Li
d7b4e44a66 raft: heartbeat is only response for maintaining leader dominance 2014-09-29 16:57:43 -07:00
Xiang Li
51529cc3f2 raft: remove index field in msg AppResp 2014-09-28 21:13:53 -07:00