347 Commits

Author SHA1 Message Date
Brandon Philips
a974bbfe4f chore(server): bump to 0.4.2+git 2014-06-02 15:26:06 -07:00
Brandon Philips
99dcc8c322 chore(server): bump back to 0.4.2 2014-06-02 15:25:03 -07:00
Brandon Philips
707174b56a chore(server): bump to 0.4.2+git 2014-06-02 14:19:52 -07:00
Brandon Philips
ce92cc3dc5 feat(CHANGELOG): bump to v0.4.2 2014-06-02 14:17:38 -07:00
Yicheng Qin
2387ef3f21 Merge pull request #819 from unihorn/97
fix(server): joinIndex is not set after recovery from full outage
2014-06-02 11:04:07 -07:00
Yicheng Qin
d5bfca9465 Merge pull request #814 from unihorn/91
fix(server/v2): set correct content-type for etcdError response
2014-06-02 10:38:36 -07:00
Yicheng Qin
d7768635fd fix(server): set joinIndex when recovered 2014-05-31 10:03:39 -07:00
Yicheng Qin
4bebb538eb fix(standby_server): able to join the cluster containing itself
Standby server will switch to peer server if it finds that
it has been contained in the cluster.
2014-05-30 14:03:49 -07:00
Yicheng Qin
db4c5e0eaa fix(server/v2): set correct content-type for etcdError response
"net/http".Error reset the content type, so we get rid of it and
write our own one.
2014-05-29 14:18:50 -07:00
Brandon Philips
22c944d8ef chore(server): bump 0.4.0+git 2014-05-20 20:55:57 -07:00
Brandon Philips
a2d16b52bb chore(server): bump to 0.4.1 2014-05-20 20:46:46 -07:00
Brandon Philips
62560f9959 fix(server): add user facing remove API
This was accidently removed as we refactored the standy stuff. Re-add this
user facing remove endpoint that matches the config endpoints.
2014-05-20 20:01:10 -07:00
Brandon Philips
cc37c58103 chore(server): bump to 0.4.0+git 2014-05-20 17:10:28 -07:00
Brandon Philips
07d1eb0edb chore(server): bump to 0.4.0 2014-05-20 17:09:22 -07:00
Xiang Li
1e7a7b11dd Merge pull request #799 from xiangli-cmu/deny_unknow_peer
hack(server): notify removed peers when they try to become candidates
2014-05-20 13:37:14 -07:00
Yicheng Qin
934c28d498 fix(peer_server): set store and registry when setting raft server
New raft server needs new store and registry.
2014-05-20 13:12:12 -07:00
Xiang Li
189fece683 hack(server): notify removed peers when they try to become candidates
A peer might be removed during a network partiton. When it comes back it
will not have received any of the log entries that would have notified
it of its removal and go onto propose a vote. This will disrupt the
cluster and the cluster should give the machine feedback that it is no
longer a member.

The term of a denied vote is MaxUint64. The notification of the removal
is a raft event. These two modification are quick heck.

In reaction to this notification the machine should shutdown. In this
case the shutdown just moves it towards becoming a standby server.
2014-05-20 10:17:32 -07:00
Brandon Philips
1084e51320 Merge pull request #786 from unihorn/91
feat(standby_server): write cluster info to disk
2014-05-18 10:08:52 -07:00
Yicheng Qin
84f71b6c87 chore(standby_server): remove error return
because standby server should be started in best efforts.
2014-05-16 18:07:49 -04:00
Yicheng Qin
71679bcf56 feat(standby_server): make atomic move for file
to avoid the risk of writing out a corrupted file.
2014-05-16 01:00:07 -04:00
Yicheng Qin
a824be4c14 feat(standby_server): save/load Running into disk 2014-05-16 00:10:15 -04:00
Yicheng Qin
35cc81e22f feat(standby_server): save/load syncInterval to disk 2014-05-15 23:57:58 -04:00
Yicheng Qin
716496ec42 chore(standby_server): still sleep for the first time 2014-05-15 23:18:59 -04:00
Yicheng Qin
b7d9fdbd39 feat(standby_server): write cluster info to disk
For better fault tolerance and availability.
2014-05-15 07:47:15 -04:00
Brandon Philips
7cf8a4a8d0 Merge pull request #779 from unihorn/89
feat: implement standby mode
2014-05-14 10:03:03 -07:00
Yicheng Qin
851026362a chore(standby_server): let syncInterval represent in second unit
This is done to keep consistency with other namings.
2014-05-14 10:13:05 -04:00
Yicheng Qin
f6591b95c7 chore(standby): minor changes based on comments 2014-05-13 22:19:52 -04:00
Yicheng Qin
403f709ebd chore(cluster_config): set default timeout to 5s
Or the leader death could let the standbys down for a rather long time.
2014-05-13 16:13:44 -04:00
Yicheng Qin
c0027bfc78 feat(cluster_config): change field from int to float64
This is modified for better flexibility, especially for testing.
2014-05-12 22:42:18 -04:00
Yicheng Qin
6a64141962 fix(TestV1Watch): ensure server has started 2014-05-09 15:42:18 -07:00
Yicheng Qin
5367c1c998 chore(standby): minor changes based on comments 2014-05-09 15:38:03 -07:00
Yicheng Qin
c6b1a738c3 feat(option): add cluster config option
It will be used when creating a brand-new cluster.
2014-05-09 15:22:11 -07:00
Yicheng Qin
6d4f018887 chore(cluster_config): rename SyncClusterInterval to SyncInterval
for better naming
2014-05-09 13:28:21 -07:00
Yicheng Qin
765cd5d8b3 refactor(find_cluster): make it simpler 2014-05-09 02:27:04 -07:00
Yicheng Qin
baadf63912 feat: implement standby mode
Change log:
1. PeerServer
- estimate initial mode from its log through removedInLog variable
- refactor FindCluster to return the estimation
- refactor Start to call FindCluster explicitly
- move raftServer start and cluster init from FindCluster to Start
- remove stopNotify from PeerServer because it is not used anymore
2. Etcd
- refactor Run logic to fit the specification
3. ClusterConfig
- rename promoteDelay to removeDelay for better naming
- add SyncClusterInterval field to ClusterConfig
- commit command to set default cluster config when cluster is created
- store cluster config info into key space for consistency
- reload cluster config when reboot
4. add StandbyServer
5. Error
- remove unused EcodePromoteError
2014-05-09 01:56:55 -07:00
Yicheng Qin
f1c13e2d9d Merge pull request #774 from unihorn/83
feat(join): check cluster conditions before join
2014-05-08 14:08:38 -07:00
Yicheng Qin
6c950eaf97 Merge pull request #772 from unihorn/81
feat(peer_server): stop service when removed
2014-05-08 14:02:09 -07:00
Yicheng Qin
5c7a963cf0 chore(peer_server): adjust code to make it more clear 2014-05-08 13:20:46 -07:00
Yicheng Qin
c92231c91a Merge branch 'master' of github.com:coreos/etcd
Conflicts:
	server/peer_server_handlers.go
2014-05-08 13:17:51 -07:00
Yicheng Qin
e960a0e03c chore(client): minor changes based on comments
The changes are made on error handling, comments and constant.
2014-05-08 13:15:10 -07:00
Yicheng Qin
015d228b04 Merge pull request #763 from unihorn/77
fix(raft_server_stats): set startTime when init
2014-05-08 12:28:44 -07:00
Yicheng Qin
b3e66ee980 fix(TestV2Watch): ensure server has started 2014-05-08 12:18:08 -07:00
Yicheng Qin
bc4a98c386 Merge pull request #776 from unihorn/85
feat(peer_server): add State field to machineMessage
2014-05-08 11:53:26 -07:00
Yicheng Qin
04f09d2fd0 feat(peer_server): add State field to machineMessage
State field indicates the state of each machine.
For now, its value could be follower or leader.
2014-05-08 10:25:39 -07:00
Yicheng Qin
0558b546ff fix(registry): fetch peers from store instead of cache
The current cache implmentation may contain removed machines, so we
fetch peers from store for correctness.
2014-05-08 08:44:32 -07:00
Yicheng Qin
5465201292 chore(peer_server): more explanation for asyncRemove 2014-05-07 16:31:17 -07:00
Yicheng Qin
ae81f843f1 refactor(client): remove useless logic in redirection 2014-05-07 16:09:08 -07:00
Yicheng Qin
c9ce14c857 chore(peer_server): set client transporter separately
It also moves the hack on timeout from raft transporter to
client transporter.
2014-05-07 13:26:05 -07:00
Yicheng Qin
bed20b7837 chore(peer_server): add more function description 2014-05-07 12:51:41 -07:00
Yicheng Qin
206881bfec fix(peer_server): check running status before start/stop
This makes peer server more robust.
2014-05-07 12:44:48 -07:00