338 Commits

Author SHA1 Message Date
Brandon Philips
22c944d8ef chore(server): bump 0.4.0+git 2014-05-20 20:55:57 -07:00
Brandon Philips
a2d16b52bb chore(server): bump to 0.4.1 2014-05-20 20:46:46 -07:00
Brandon Philips
62560f9959 fix(server): add user facing remove API
This was accidently removed as we refactored the standy stuff. Re-add this
user facing remove endpoint that matches the config endpoints.
2014-05-20 20:01:10 -07:00
Brandon Philips
cc37c58103 chore(server): bump to 0.4.0+git 2014-05-20 17:10:28 -07:00
Brandon Philips
07d1eb0edb chore(server): bump to 0.4.0 2014-05-20 17:09:22 -07:00
Xiang Li
1e7a7b11dd Merge pull request #799 from xiangli-cmu/deny_unknow_peer
hack(server): notify removed peers when they try to become candidates
2014-05-20 13:37:14 -07:00
Yicheng Qin
934c28d498 fix(peer_server): set store and registry when setting raft server
New raft server needs new store and registry.
2014-05-20 13:12:12 -07:00
Xiang Li
189fece683 hack(server): notify removed peers when they try to become candidates
A peer might be removed during a network partiton. When it comes back it
will not have received any of the log entries that would have notified
it of its removal and go onto propose a vote. This will disrupt the
cluster and the cluster should give the machine feedback that it is no
longer a member.

The term of a denied vote is MaxUint64. The notification of the removal
is a raft event. These two modification are quick heck.

In reaction to this notification the machine should shutdown. In this
case the shutdown just moves it towards becoming a standby server.
2014-05-20 10:17:32 -07:00
Brandon Philips
1084e51320 Merge pull request #786 from unihorn/91
feat(standby_server): write cluster info to disk
2014-05-18 10:08:52 -07:00
Yicheng Qin
84f71b6c87 chore(standby_server): remove error return
because standby server should be started in best efforts.
2014-05-16 18:07:49 -04:00
Yicheng Qin
71679bcf56 feat(standby_server): make atomic move for file
to avoid the risk of writing out a corrupted file.
2014-05-16 01:00:07 -04:00
Yicheng Qin
a824be4c14 feat(standby_server): save/load Running into disk 2014-05-16 00:10:15 -04:00
Yicheng Qin
35cc81e22f feat(standby_server): save/load syncInterval to disk 2014-05-15 23:57:58 -04:00
Yicheng Qin
716496ec42 chore(standby_server): still sleep for the first time 2014-05-15 23:18:59 -04:00
Yicheng Qin
b7d9fdbd39 feat(standby_server): write cluster info to disk
For better fault tolerance and availability.
2014-05-15 07:47:15 -04:00
Brandon Philips
7cf8a4a8d0 Merge pull request #779 from unihorn/89
feat: implement standby mode
2014-05-14 10:03:03 -07:00
Yicheng Qin
851026362a chore(standby_server): let syncInterval represent in second unit
This is done to keep consistency with other namings.
2014-05-14 10:13:05 -04:00
Yicheng Qin
f6591b95c7 chore(standby): minor changes based on comments 2014-05-13 22:19:52 -04:00
Yicheng Qin
403f709ebd chore(cluster_config): set default timeout to 5s
Or the leader death could let the standbys down for a rather long time.
2014-05-13 16:13:44 -04:00
Yicheng Qin
c0027bfc78 feat(cluster_config): change field from int to float64
This is modified for better flexibility, especially for testing.
2014-05-12 22:42:18 -04:00
Yicheng Qin
6a64141962 fix(TestV1Watch): ensure server has started 2014-05-09 15:42:18 -07:00
Yicheng Qin
5367c1c998 chore(standby): minor changes based on comments 2014-05-09 15:38:03 -07:00
Yicheng Qin
c6b1a738c3 feat(option): add cluster config option
It will be used when creating a brand-new cluster.
2014-05-09 15:22:11 -07:00
Yicheng Qin
6d4f018887 chore(cluster_config): rename SyncClusterInterval to SyncInterval
for better naming
2014-05-09 13:28:21 -07:00
Yicheng Qin
765cd5d8b3 refactor(find_cluster): make it simpler 2014-05-09 02:27:04 -07:00
Yicheng Qin
baadf63912 feat: implement standby mode
Change log:
1. PeerServer
- estimate initial mode from its log through removedInLog variable
- refactor FindCluster to return the estimation
- refactor Start to call FindCluster explicitly
- move raftServer start and cluster init from FindCluster to Start
- remove stopNotify from PeerServer because it is not used anymore
2. Etcd
- refactor Run logic to fit the specification
3. ClusterConfig
- rename promoteDelay to removeDelay for better naming
- add SyncClusterInterval field to ClusterConfig
- commit command to set default cluster config when cluster is created
- store cluster config info into key space for consistency
- reload cluster config when reboot
4. add StandbyServer
5. Error
- remove unused EcodePromoteError
2014-05-09 01:56:55 -07:00
Yicheng Qin
f1c13e2d9d Merge pull request #774 from unihorn/83
feat(join): check cluster conditions before join
2014-05-08 14:08:38 -07:00
Yicheng Qin
6c950eaf97 Merge pull request #772 from unihorn/81
feat(peer_server): stop service when removed
2014-05-08 14:02:09 -07:00
Yicheng Qin
5c7a963cf0 chore(peer_server): adjust code to make it more clear 2014-05-08 13:20:46 -07:00
Yicheng Qin
c92231c91a Merge branch 'master' of github.com:coreos/etcd
Conflicts:
	server/peer_server_handlers.go
2014-05-08 13:17:51 -07:00
Yicheng Qin
e960a0e03c chore(client): minor changes based on comments
The changes are made on error handling, comments and constant.
2014-05-08 13:15:10 -07:00
Yicheng Qin
015d228b04 Merge pull request #763 from unihorn/77
fix(raft_server_stats): set startTime when init
2014-05-08 12:28:44 -07:00
Yicheng Qin
b3e66ee980 fix(TestV2Watch): ensure server has started 2014-05-08 12:18:08 -07:00
Yicheng Qin
bc4a98c386 Merge pull request #776 from unihorn/85
feat(peer_server): add State field to machineMessage
2014-05-08 11:53:26 -07:00
Yicheng Qin
04f09d2fd0 feat(peer_server): add State field to machineMessage
State field indicates the state of each machine.
For now, its value could be follower or leader.
2014-05-08 10:25:39 -07:00
Yicheng Qin
0558b546ff fix(registry): fetch peers from store instead of cache
The current cache implmentation may contain removed machines, so we
fetch peers from store for correctness.
2014-05-08 08:44:32 -07:00
Yicheng Qin
5465201292 chore(peer_server): more explanation for asyncRemove 2014-05-07 16:31:17 -07:00
Yicheng Qin
ae81f843f1 refactor(client): remove useless logic in redirection 2014-05-07 16:09:08 -07:00
Yicheng Qin
c9ce14c857 chore(peer_server): set client transporter separately
It also moves the hack on timeout from raft transporter to
client transporter.
2014-05-07 13:26:05 -07:00
Yicheng Qin
bed20b7837 chore(peer_server): add more function description 2014-05-07 12:51:41 -07:00
Yicheng Qin
206881bfec fix(peer_server): check running status before start/stop
This makes peer server more robust.
2014-05-07 12:44:48 -07:00
Yicheng Qin
001b1fcd46 feat(join): check cluster conditions before join 2014-05-07 11:46:21 -07:00
Yicheng Qin
4e14604e5c refactor(server): add Client struct
This is used to send request to web API.
It will do this behavior a lot in standby mode, so I abstract this
struct first.
2014-05-07 11:46:15 -07:00
Yicheng Qin
ba36a16bc5 feat(peer_server): stop service when removed
It doesn't modify the exit logic, but makes external code know
when removal happens and be able to determine what it should do.
2014-05-07 10:00:27 -07:00
Yicheng Qin
997e7d3bf4 Merge pull request #771 from unihorn/80
refactor(peer_server): remove standby mode in peer server
2014-05-07 09:57:02 -07:00
Yicheng Qin
17e299995c refactor(peer_server): remove standby mode in peer server 2014-05-07 09:10:09 -07:00
Yicheng Qin
d78116c35b Merge pull request #675 from unihorn/56
fix(peer_server): exit all server goroutines in Stop()
2014-05-07 08:09:14 -07:00
Yicheng Qin
6516cf854c chore(server): rename daemon to startRoutine
For better understanding.
2014-05-07 07:51:44 -07:00
Yicheng Qin
e55512f60b fix(peer_server): graceful stop for peer server run
Peer server will be started and stopped repeatedly in the design.
This step ensures its stop doesn't affect the next start.
The patch includes goroutine stop and timer trigger remove.
2014-05-07 07:43:27 -07:00
Yicheng Qin
c692a8f0a7 fix(raft_server_stats): set startTime when init
This helps one-node cluster get rid of bogus startTime.
2014-04-29 09:59:02 -07:00