Cole Gleason
f4f429d4e3
docs(cluster-size): remove outdated references to flag max-cluster-size
2014-06-16 09:41:37 -07:00
Brandon Philips
dc1f4adcd0
chore(server): bump to 0.4.3+git
2014-06-07 18:17:54 -07:00
Brandon Philips
9970141f76
chore(server): bump to 0.4.3
2014-06-07 18:17:05 -07:00
Brandon Philips
16c2bcf951
chore(server): go fmt
...
blame me for not running test first.
2014-06-07 18:03:22 -07:00
Brandon Philips
1c958f8fc3
fix(server): reduce the screaming heartbeat logs
...
Currently the only way we know that a peer isn't getting a heartbeat is
an edge-triggered event from go-raft on every missed heartbeat. This
means that we need to do some bookkeeping in order to do exponential
backoff.
The upside is that, instead of screaming thousands of log lines before a
machine hits the default removal delay of 30 minutes, it logs only ~100.
2014-06-07 17:47:10 -07:00
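The bookkeeping described in the commit above can be illustrated with a minimal Go sketch. It is not the code from this commit: `missTracker`, `recordMiss`, and `reset` are hypothetical names, and logging only on power-of-two miss counts is one assumed way to get exponential backoff from an event that fires on every missed heartbeat.

```go
package main

import "fmt"

// missTracker is a hypothetical helper: it counts consecutive missed
// heartbeats per peer and decides which of them are worth logging.
type missTracker struct {
	missed map[string]uint64
}

func newMissTracker() *missTracker {
	return &missTracker{missed: make(map[string]uint64)}
}

// recordMiss registers one missed heartbeat for peer and reports whether
// this miss should be logged. Only counts that are powers of two
// (1, 2, 4, 8, ...) are logged, so the gap between log lines doubles.
func (t *missTracker) recordMiss(peer string) (count uint64, shouldLog bool) {
	t.missed[peer]++
	n := t.missed[peer]
	return n, n&(n-1) == 0
}

// reset clears the counter once the peer answers a heartbeat again.
func (t *missTracker) reset(peer string) {
	delete(t.missed, peer)
}

func main() {
	tracker := newMissTracker()
	logged := 0
	for i := 0; i < 1000; i++ {
		if n, ok := tracker.recordMiss("peer1"); ok {
			logged++
			fmt.Printf("peer1 has missed %d heartbeats\n", n)
		}
	}
	fmt.Println("log lines written:", logged)
}
```

With this scheme the number of log lines grows with the logarithm of the number of misses rather than linearly. The exact backoff schedule used by the server may differ.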
Yicheng Qin
ed58193ebe
chore(server): set DefaultRemoveDelay to 30mins
...
Its value was 5s before, which could remove the node insanely fast.
2014-06-06 16:57:35 -07:00
Yicheng Qin
fbcfe8e1c4
Merge pull request #807 from Shopify/raft-server-stats-struct-field-tag-fix
...
style(server): changed a LeaderInfo struct field from "startTime" to "StartTime"
2014-06-05 12:45:34 -07:00
Brandon Philips
a974bbfe4f
chore(server): bump to 0.4.2+git
2014-06-02 15:26:06 -07:00
Brandon Philips
99dcc8c322
chore(server): bump back to 0.4.2
2014-06-02 15:25:03 -07:00
Brandon Philips
707174b56a
chore(server): bump to 0.4.2+git
2014-06-02 14:19:52 -07:00
Brandon Philips
ce92cc3dc5
feat(CHANGELOG): bump to v0.4.2
2014-06-02 14:17:38 -07:00
Yicheng Qin
2387ef3f21
Merge pull request #819 from unihorn/97
...
fix(server): joinIndex is not set after recovery from full outage
2014-06-02 11:04:07 -07:00
Yicheng Qin
d5bfca9465
Merge pull request #814 from unihorn/91
...
fix(server/v2): set correct content-type for etcdError response
2014-06-02 10:38:36 -07:00
Yicheng Qin
d7768635fd
fix(server): set joinIndex when recovered
2014-05-31 10:03:39 -07:00
Yicheng Qin
4bebb538eb
fix(standby_server): able to join the cluster containing itself
...
A standby server switches to a peer server if it finds that
it is already contained in the cluster.
2014-05-30 14:03:49 -07:00
Yicheng Qin
db4c5e0eaa
fix(server/v2): set correct content-type for etcdError response
...
"net/http".Error resets the content type, so we get rid of it and
write our own one.
2014-05-29 14:18:50 -07:00
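The content-type problem from the commit above can be reproduced with a short Go sketch. The helper names (`writeJSONError`, `contentTypeViaHTTPError`, `contentTypeViaOwnWriter`) and the example body are hypothetical and only illustrate the behaviour of `http.Error`; they are not the actual etcdError code.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeJSONError is a hypothetical replacement for http.Error that keeps
// the JSON content type instead of overwriting it.
func writeJSONError(w http.ResponseWriter, status int, body string) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	fmt.Fprintln(w, body)
}

// contentTypeViaHTTPError sets a JSON content type first and then calls
// http.Error, which replaces it with a plain-text content type.
func contentTypeViaHTTPError() string {
	rec := httptest.NewRecorder()
	rec.Header().Set("Content-Type", "application/json")
	http.Error(rec, `{"message":"example"}`, http.StatusNotFound)
	return rec.Result().Header.Get("Content-Type")
}

// contentTypeViaOwnWriter uses the hand-written helper instead.
func contentTypeViaOwnWriter() string {
	rec := httptest.NewRecorder()
	writeJSONError(rec, http.StatusNotFound, `{"message":"example"}`)
	return rec.Result().Header.Get("Content-Type")
}

func main() {
	fmt.Println("http.Error:    ", contentTypeViaHTTPError())
	fmt.Println("writeJSONError:", contentTypeViaOwnWriter())
}
```

The first call reports `text/plain; charset=utf-8` even though the handler asked for JSON, which is why writing the status and body by hand is the simpler fix.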
marc.barry
673d90728e
style(server): changed a LeaderInfo struct field from "startTime" to "StartTime"
...
Changed the LeaderInfo struct "start time" field from "startTime" to "StartTime" so that it is an exported identifier. This required adding the `json:"startTime"` structure field tag so that the encoding/json package correctly performs JSON encoding (i.e. the correct property name --> startTime).
2014-05-21 11:19:56 -04:00
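The effect of the field tag described in the commit above can be shown with a small Go sketch. The two struct types and their `Name` field are illustrative assumptions, not the actual LeaderInfo definition; only the `startTime` to `StartTime` rename and the `json:"startTime"` tag come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// leaderInfoBefore mimics the old layout: startTime is unexported, so
// encoding/json silently skips it.
type leaderInfoBefore struct {
	Name      string `json:"leader"`
	startTime time.Time
}

// leaderInfoAfter exports the field and uses a tag to keep the original
// lower-case property name in the JSON output.
type leaderInfoAfter struct {
	Name      string    `json:"leader"`
	StartTime time.Time `json:"startTime"`
}

var started = time.Date(2014, 5, 21, 11, 19, 56, 0, time.UTC)

// encodeBefore returns the JSON produced by the old, unexported layout.
func encodeBefore() string {
	b, err := json.Marshal(leaderInfoBefore{Name: "node1", startTime: started})
	if err != nil {
		panic(err)
	}
	return string(b)
}

// encodeAfter returns the JSON produced by the exported, tagged layout.
func encodeAfter() string {
	b, err := json.Marshal(leaderInfoAfter{Name: "node1", StartTime: started})
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encodeBefore()) // {"leader":"node1"}
	fmt.Println(encodeAfter())  // {"leader":"node1","startTime":"2014-05-21T11:19:56Z"}
}
```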
Brandon Philips
22c944d8ef
chore(server): bump 0.4.0+git
2014-05-20 20:55:57 -07:00
Brandon Philips
a2d16b52bb
chore(server): bump to 0.4.1
2014-05-20 20:46:46 -07:00
Brandon Philips
62560f9959
fix(server): add user facing remove API
...
This was accidentally removed as we refactored the standby stuff. Re-add this
user-facing remove endpoint that matches the config endpoints.
2014-05-20 20:01:10 -07:00
Brandon Philips
cc37c58103
chore(server): bump to 0.4.0+git
2014-05-20 17:10:28 -07:00
Brandon Philips
07d1eb0edb
chore(server): bump to 0.4.0
2014-05-20 17:09:22 -07:00
Xiang Li
1e7a7b11dd
Merge pull request #799 from xiangli-cmu/deny_unknow_peer
...
hack(server): notify removed peers when they try to become candidates
2014-05-20 13:37:14 -07:00
Yicheng Qin
934c28d498
fix(peer_server): set store and registry when setting raft server
...
New raft server needs new store and registry.
2014-05-20 13:12:12 -07:00
Xiang Li
189fece683
hack(server): notify removed peers when they try to become candidates
...
A peer might be removed during a network partition. When it comes back it
will not have received any of the log entries that would have notified
it of its removal, and it will go on to propose a vote. This will disrupt the
cluster, and the cluster should give the machine feedback that it is no
longer a member.
The term of a denied vote is MaxUint64. The notification of the removal
is a raft event. These two modifications are quick hacks.
In reaction to this notification the machine should shut down. In this
case the shutdown just moves it towards becoming a standby server.
2014-05-20 10:17:32 -07:00
Brandon Philips
1084e51320
Merge pull request #786 from unihorn/91
...
feat(standby_server): write cluster info to disk
2014-05-18 10:08:52 -07:00
Yicheng Qin
84f71b6c87
chore(standby_server): remove error return
...
because the standby server should be started on a best-effort basis.
2014-05-16 18:07:49 -04:00
Yicheng Qin
71679bcf56
feat(standby_server): make atomic move for file
...
to avoid the risk of writing out a corrupted file.
2014-05-16 01:00:07 -04:00
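The atomic move mentioned in the commit above is commonly done by writing to a temporary file and renaming it over the target. The following Go sketch shows that pattern under stated assumptions: `writeFileAtomic`, `demo`, and the `cluster_info` file name are hypothetical, it uses `os.CreateTemp` from newer Go releases, and the rename is assumed to be atomic as it is on POSIX file systems. It is not the standby server's actual code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to a temporary file in the same directory as
// path and then renames it over path, so readers see either the old or the
// new content but never a partially written file.
func writeFileAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".tmp-*")
	if err != nil {
		return err
	}
	// Clean up the temporary file if anything below fails. After a
	// successful rename the temporary name is gone and Remove is a no-op.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	// Flush to stable storage before the rename makes the file visible.
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo overwrites the same file twice and returns its final content.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "standby")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "cluster_info")
	for _, content := range []string{"v1", "v2"} {
		if err := writeFileAtomic(path, []byte(content)); err != nil {
			return "", err
		}
	}
	b, err := os.ReadFile(path)
	return string(b), err
}

func main() {
	content, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("final content:", content)
}
```

Creating the temporary file in the target's own directory keeps both names on the same file system, which the rename needs in order to replace the file in a single step.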
Yicheng Qin
a824be4c14
feat(standby_server): save/load Running into disk
2014-05-16 00:10:15 -04:00
Yicheng Qin
35cc81e22f
feat(standby_server): save/load syncInterval to disk
2014-05-15 23:57:58 -04:00
Yicheng Qin
716496ec42
chore(standby_server): still sleep for the first time
2014-05-15 23:18:59 -04:00
Yicheng Qin
b7d9fdbd39
feat(standby_server): write cluster info to disk
...
For better fault tolerance and availability.
2014-05-15 07:47:15 -04:00
Brandon Philips
7cf8a4a8d0
Merge pull request #779 from unihorn/89
...
feat: implement standby mode
2014-05-14 10:03:03 -07:00
Yicheng Qin
851026362a
chore(standby_server): express syncInterval in seconds
...
This is done to keep consistency with other namings.
2014-05-14 10:13:05 -04:00
Yicheng Qin
f6591b95c7
chore(standby): minor changes based on comments
2014-05-13 22:19:52 -04:00
Yicheng Qin
403f709ebd
chore(cluster_config): set default timeout to 5s
...
Otherwise the leader's death could leave the standbys down for a rather long time.
2014-05-13 16:13:44 -04:00
Yicheng Qin
c0027bfc78
feat(cluster_config): change field from int to float64
...
This is modified for better flexibility, especially for testing.
2014-05-12 22:42:18 -04:00
Yicheng Qin
6a64141962
fix(TestV1Watch): ensure server has started
2014-05-09 15:42:18 -07:00
Yicheng Qin
5367c1c998
chore(standby): minor changes based on comments
2014-05-09 15:38:03 -07:00
Yicheng Qin
c6b1a738c3
feat(option): add cluster config option
...
It will be used when creating a brand-new cluster.
2014-05-09 15:22:11 -07:00
Yicheng Qin
6d4f018887
chore(cluster_config): rename SyncClusterInterval to SyncInterval
...
for better naming
2014-05-09 13:28:21 -07:00
Yicheng Qin
765cd5d8b3
refactor(find_cluster): make it simpler
2014-05-09 02:27:04 -07:00
Yicheng Qin
baadf63912
feat: implement standby mode
...
Change log:
1. PeerServer
- estimate initial mode from its log through removedInLog variable
- refactor FindCluster to return the estimation
- refactor Start to call FindCluster explicitly
- move raftServer start and cluster init from FindCluster to Start
- remove stopNotify from PeerServer because it is not used anymore
2. Etcd
- refactor Run logic to fit the specification
3. ClusterConfig
- rename promoteDelay to removeDelay for better naming
- add SyncClusterInterval field to ClusterConfig
- commit command to set default cluster config when cluster is created
- store cluster config info into key space for consistency
- reload cluster config when reboot
4. add StandbyServer
5. Error
- remove unused EcodePromoteError
2014-05-09 01:56:55 -07:00
Yicheng Qin
f1c13e2d9d
Merge pull request #774 from unihorn/83
...
feat(join): check cluster conditions before join
2014-05-08 14:08:38 -07:00
Yicheng Qin
6c950eaf97
Merge pull request #772 from unihorn/81
...
feat(peer_server): stop service when removed
2014-05-08 14:02:09 -07:00
Yicheng Qin
5c7a963cf0
chore(peer_server): adjust code to make it clearer
2014-05-08 13:20:46 -07:00
Yicheng Qin
c92231c91a
Merge branch 'master' of github.com:coreos/etcd
...
Conflicts:
server/peer_server_handlers.go
2014-05-08 13:17:51 -07:00
Yicheng Qin
e960a0e03c
chore(client): minor changes based on comments
...
The changes are made on error handling, comments and constant.
2014-05-08 13:15:10 -07:00
Yicheng Qin
015d228b04
Merge pull request #763 from unihorn/77
...
fix(raft_server_stats): set startTime when init
2014-05-08 12:28:44 -07:00
Yicheng Qin
b3e66ee980
fix(TestV2Watch): ensure server has started
2014-05-08 12:18:08 -07:00