Brandon Philips
22c944d8ef
chore(server): bump 0.4.0+git
2014-05-20 20:55:57 -07:00
Brandon Philips
a2d16b52bb
chore(server): bump to 0.4.1
2014-05-20 20:46:46 -07:00
Brandon Philips
62560f9959
fix(server): add user facing remove API
...
This was accidentally removed as we refactored the standby stuff. Re-add this
user facing remove endpoint that matches the config endpoints.
2014-05-20 20:01:10 -07:00
Brandon Philips
cc37c58103
chore(server): bump to 0.4.0+git
2014-05-20 17:10:28 -07:00
Brandon Philips
07d1eb0edb
chore(server): bump to 0.4.0
2014-05-20 17:09:22 -07:00
Xiang Li
1e7a7b11dd
Merge pull request #799 from xiangli-cmu/deny_unknow_peer
...
hack(server): notify removed peers when they try to become candidates
2014-05-20 13:37:14 -07:00
Yicheng Qin
934c28d498
fix(peer_server): set store and registry when setting raft server
...
New raft server needs new store and registry.
2014-05-20 13:12:12 -07:00
Xiang Li
189fece683
hack(server): notify removed peers when they try to become candidates
...
A peer might be removed during a network partition. When it comes back it
will not have received any of the log entries that would have notified
it of its removal, and it will go on to propose a vote. This will disrupt
the cluster, so the cluster should give the machine feedback that it is no
longer a member.
The term of a denied vote is MaxUint64. The notification of the removal
is a raft event. These two modifications are quick hacks.
In reaction to this notification the machine should shut down. In this
case the shutdown just moves it towards becoming a standby server.
2014-05-20 10:17:32 -07:00
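The mechanism described in 189fece683 can be pictured with a small sketch. It is not the code from the commit: the `voteResponse` type and the helpers `denyUnknownPeer` and `isRemovalNotice` are hypothetical names chosen for illustration. Only the use of MaxUint64 as the term of a denied vote comes from the commit message.

```go
package main

import (
	"fmt"
	"math"
)

// voteResponse is a hypothetical, simplified stand-in for a raft vote reply.
type voteResponse struct {
	Term        uint64
	VoteGranted bool
}

// removedTerm is the sentinel term the commit message describes for a
// denied vote.
const removedTerm uint64 = math.MaxUint64

// denyUnknownPeer builds the reply a cluster member would send to a
// candidate that is no longer part of the cluster (assumed behavior).
func denyUnknownPeer() voteResponse {
	return voteResponse{Term: removedTerm, VoteGranted: false}
}

// isRemovalNotice reports whether a vote reply tells the candidate that it
// has been removed, as opposed to an ordinary denial.
func isRemovalNotice(r voteResponse) bool {
	return !r.VoteGranted && r.Term == removedTerm
}

func main() {
	resp := denyUnknownPeer()
	if isRemovalNotice(resp) {
		fmt.Println("removed from cluster: stop the peer server and fall back to standby")
	}
}
```

An ordinary denial carries a real term, so a check like `isRemovalNotice` would tell the two cases apart without adding a new message type, which is presumably why the message calls it a quick hack.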
Brandon Philips
1084e51320
Merge pull request #786 from unihorn/91
...
feat(standby_server): write cluster info to disk
2014-05-18 10:08:52 -07:00
Yicheng Qin
84f71b6c87
chore(standby_server): remove error return
...
because the standby server should be started on a best-effort basis.
2014-05-16 18:07:49 -04:00
Yicheng Qin
71679bcf56
feat(standby_server): make atomic move for file
...
to avoid the risk of writing out a corrupted file.
2014-05-16 01:00:07 -04:00
Yicheng Qin
a824be4c14
feat(standby_server): save/load Running to disk
2014-05-16 00:10:15 -04:00
Yicheng Qin
35cc81e22f
feat(standby_server): save/load syncInterval to disk
2014-05-15 23:57:58 -04:00
Yicheng Qin
716496ec42
chore(standby_server): still sleep for the first time
2014-05-15 23:18:59 -04:00
Yicheng Qin
b7d9fdbd39
feat(standby_server): write cluster info to disk
...
For better fault tolerance and availability.
2014-05-15 07:47:15 -04:00
Brandon Philips
7cf8a4a8d0
Merge pull request #779 from unihorn/89
...
feat: implement standby mode
2014-05-14 10:03:03 -07:00
Yicheng Qin
851026362a
chore(standby_server): express syncInterval in seconds
...
This is done to keep consistency with other namings.
2014-05-14 10:13:05 -04:00
Yicheng Qin
f6591b95c7
chore(standby): minor changes based on comments
2014-05-13 22:19:52 -04:00
Yicheng Qin
403f709ebd
chore(cluster_config): set default timeout to 5s
...
Otherwise the death of the leader could leave the standbys down for a rather long time.
2014-05-13 16:13:44 -04:00
Yicheng Qin
c0027bfc78
feat(cluster_config): change field from int to float64
...
This is modified for better flexibility, especially for testing.
2014-05-12 22:42:18 -04:00
Yicheng Qin
6a64141962
fix(TestV1Watch): ensure server has started
2014-05-09 15:42:18 -07:00
Yicheng Qin
5367c1c998
chore(standby): minor changes based on comments
2014-05-09 15:38:03 -07:00
Yicheng Qin
c6b1a738c3
feat(option): add cluster config option
...
It will be used when creating a brand-new cluster.
2014-05-09 15:22:11 -07:00
Yicheng Qin
6d4f018887
chore(cluster_config): rename SyncClusterInterval to SyncInterval
...
for better naming
2014-05-09 13:28:21 -07:00
Yicheng Qin
765cd5d8b3
refactor(find_cluster): make it simpler
2014-05-09 02:27:04 -07:00
Yicheng Qin
baadf63912
feat: implement standby mode
...
Change log:
1. PeerServer
- estimate initial mode from its log through removedInLog variable
- refactor FindCluster to return the estimation
- refactor Start to call FindCluster explicitly
- move raftServer start and cluster init from FindCluster to Start
- remove stopNotify from PeerServer because it is not used anymore
2. Etcd
- refactor Run logic to fit the specification
3. ClusterConfig
- rename promoteDelay to removeDelay for better naming
- add SyncClusterInterval field to ClusterConfig
- commit command to set default cluster config when cluster is created
- store cluster config info into key space for consistency
- reload cluster config on reboot
4. add StandbyServer
5. Error
- remove unused EcodePromoteError
2014-05-09 01:56:55 -07:00
Yicheng Qin
f1c13e2d9d
Merge pull request #774 from unihorn/83
...
feat(join): check cluster conditions before join
2014-05-08 14:08:38 -07:00
Yicheng Qin
6c950eaf97
Merge pull request #772 from unihorn/81
...
feat(peer_server): stop service when removed
2014-05-08 14:02:09 -07:00
Yicheng Qin
5c7a963cf0
chore(peer_server): adjust code to make it clearer
2014-05-08 13:20:46 -07:00
Yicheng Qin
c92231c91a
Merge branch 'master' of github.com:coreos/etcd
...
Conflicts:
server/peer_server_handlers.go
2014-05-08 13:17:51 -07:00
Yicheng Qin
e960a0e03c
chore(client): minor changes based on comments
...
The changes affect error handling, comments, and a constant.
2014-05-08 13:15:10 -07:00
Yicheng Qin
015d228b04
Merge pull request #763 from unihorn/77
...
fix(raft_server_stats): set startTime when init
2014-05-08 12:28:44 -07:00
Yicheng Qin
b3e66ee980
fix(TestV2Watch): ensure server has started
2014-05-08 12:18:08 -07:00
Yicheng Qin
bc4a98c386
Merge pull request #776 from unihorn/85
...
feat(peer_server): add State field to machineMessage
2014-05-08 11:53:26 -07:00
Yicheng Qin
04f09d2fd0
feat(peer_server): add State field to machineMessage
...
State field indicates the state of each machine.
For now, its value could be follower or leader.
2014-05-08 10:25:39 -07:00
Yicheng Qin
0558b546ff
fix(registry): fetch peers from store instead of cache
...
The current cache implementation may contain removed machines, so we
fetch peers from the store for correctness.
2014-05-08 08:44:32 -07:00
Yicheng Qin
5465201292
chore(peer_server): more explanation for asyncRemove
2014-05-07 16:31:17 -07:00
Yicheng Qin
ae81f843f1
refactor(client): remove useless logic in redirection
2014-05-07 16:09:08 -07:00
Yicheng Qin
c9ce14c857
chore(peer_server): set client transporter separately
...
It also moves the hack on timeout from raft transporter to
client transporter.
2014-05-07 13:26:05 -07:00
Yicheng Qin
bed20b7837
chore(peer_server): add more function description
2014-05-07 12:51:41 -07:00
Yicheng Qin
206881bfec
fix(peer_server): check running status before start/stop
...
This makes peer server more robust.
2014-05-07 12:44:48 -07:00
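A guard of the kind 206881bfec describes could look like the sketch below. It is an illustration under assumptions, not the commit's code: the `server` type, the `running` flag, and `errAlreadyRunning` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errAlreadyRunning is a hypothetical error returned by a duplicate Start.
var errAlreadyRunning = errors.New("server already running")

// server is a hypothetical stand-in for the peer server.
type server struct {
	mu      sync.Mutex
	running bool
}

// Start refuses to start a server that is already running.
func (s *server) Start() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.running {
		return errAlreadyRunning
	}
	s.running = true
	return nil
}

// Stop is a no-op when the server is not running, so calling it twice
// is harmless.
func (s *server) Stop() {
	s.mu.Lock()
	defer s.mu.Unlock()
	if !s.running {
		return
	}
	s.running = false
}

func main() {
	s := &server{}
	fmt.Println(s.Start()) // first start succeeds
	fmt.Println(s.Start()) // second start is rejected
	s.Stop()
	s.Stop() // safe to call again
}
```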
Yicheng Qin
001b1fcd46
feat(join): check cluster conditions before join
2014-05-07 11:46:21 -07:00
Yicheng Qin
4e14604e5c
refactor(server): add Client struct
...
This is used to send requests to the web API.
Standby mode will do this a lot, so I abstract it into a struct
first.
2014-05-07 11:46:15 -07:00
Yicheng Qin
ba36a16bc5
feat(peer_server): stop service when removed
...
It doesn't modify the exit logic, but lets external code know
when removal happens so that it can decide what to do.
2014-05-07 10:00:27 -07:00
Yicheng Qin
997e7d3bf4
Merge pull request #771 from unihorn/80
...
refactor(peer_server): remove standby mode in peer server
2014-05-07 09:57:02 -07:00
Yicheng Qin
17e299995c
refactor(peer_server): remove standby mode in peer server
2014-05-07 09:10:09 -07:00
Yicheng Qin
d78116c35b
Merge pull request #675 from unihorn/56
...
fix(peer_server): exit all server goroutines in Stop()
2014-05-07 08:09:14 -07:00
Yicheng Qin
6516cf854c
chore(server): rename daemon to startRoutine
...
For better understanding.
2014-05-07 07:51:44 -07:00
Yicheng Qin
e55512f60b
fix(peer_server): graceful stop for peer server run
...
By design, the peer server will be started and stopped repeatedly.
This step ensures that stopping it doesn't affect the next start.
The patch stops the goroutines and removes the timer trigger.
2014-05-07 07:43:27 -07:00
Yicheng Qin
c692a8f0a7
fix(raft_server_stats): set startTime when init
...
This helps a one-node cluster get rid of a bogus startTime.
2014-04-29 09:59:02 -07:00