149 Commits

Author SHA1 Message Date
Xiang Li
aaedf32c04 fix(test/remove_node_test.go) fix a deadlock in the test
The go-etcd client waits for a response from the paused node, and the test waits for that response before continuing.
Actually we do not even need that small test, since we will check the machine status afterwards.
2014-05-20 14:34:59 -07:00
Yicheng Qin
9e5b12f591 tests(remove_node): add TestRemovePausedNode 2014-05-20 11:01:14 -07:00
Yicheng Qin
71679bcf56 feat(standby_server): make atomic move for file
to avoid the risk of writing out a corrupted file.
2014-05-16 01:00:07 -04:00
Yicheng Qin
b7d9fdbd39 feat(standby_server): write cluster info to disk
For better fault tolerance and availability.
2014-05-15 07:47:15 -04:00
Yicheng Qin
fc77b3e9e6 fix(simple_snapshot_test): enlarge reasonable index range 2014-05-13 22:28:28 -04:00
Yicheng Qin
403f709ebd chore(cluster_config): set default timeout to 5s
Otherwise the leader's death could keep the standbys down for a rather long time.
2014-05-13 16:13:44 -04:00
Yicheng Qin
5367c1c998 chore(standby): minor changes based on comments 2014-05-09 15:38:03 -07:00
Yicheng Qin
6d4f018887 chore(cluster_config): rename SyncClusterInterval to SyncInterval
for better naming
2014-05-09 13:28:21 -07:00
Yicheng Qin
baadf63912 feat: implement standby mode
Change log:
1. PeerServer
- estimate initial mode from its log through removedInLog variable
- refactor FindCluster to return the estimation
- refactor Start to call FindCluster explicitly
- move raftServer start and cluster init from FindCluster to Start
- remove stopNotify from PeerServer because it is not used anymore
2. Etcd
- refactor Run logic to fit the specification
3. ClusterConfig
- rename promoteDelay to removeDelay for better naming
- add SyncClusterInterval field to ClusterConfig
- commit command to set default cluster config when cluster is created
- store cluster config info into key space for consistency
- reload cluster config when reboot
4. add StandbyServer
5. Error
- remove unused EcodePromoteError
2014-05-09 01:56:55 -07:00
Yicheng Qin
04f09d2fd0 feat(peer_server): add State field to machineMessage
State field indicates the state of each machine.
For now, its value could be follower or leader.
2014-05-08 10:25:39 -07:00
Xiang Li
b56aa62bcc Merge pull request #773 from unihorn/82
tests(snapshot): expand reasonable range for index
2014-05-07 13:02:41 -04:00
Yicheng Qin
c4cd86e094 tests(snapshot): expand reasonable range for index
Snapshot files were created with the names '0_503.ss' and '0_1010.ss' when testing.
2014-05-07 09:41:36 -07:00
Yicheng Qin
17e299995c refactor(peer_server): remove standby mode in peer server 2014-05-07 09:10:09 -07:00
Yicheng Qin
ece25833aa Merge pull request #738 from unihorn/68
feat(peer_server): forbid rejoining with different name
2014-04-18 11:49:36 -07:00
Yicheng Qin
000e3ba651 chore(rejoin_test): rewrite some printout 2014-04-18 10:48:14 -07:00
Yicheng Qin
b742af56ec Merge pull request #723 from unihorn/63
tests: add TestJoinThroughFollower
2014-04-17 19:44:46 -07:00
Yicheng Qin
b17703a9e4 chore(tests/join): adjust output 2014-04-17 19:28:58 -07:00
Yicheng Qin
0c95e1eabb feat(peer_server): forbid rejoining with different name
Otherwise it will confuse the cluster, especially the heartbeats between nodes.
2014-04-17 15:46:33 -07:00
Yicheng Qin
732fb7c160 tests(rejoin): add TestReplaceWithDifferentPeerAddress
The functionality has not been implemented yet.
2014-04-17 10:17:26 -07:00
Yicheng Qin
273c293645 fix(server): rejoin cluster with different ip 2014-04-17 10:16:30 -07:00
Yicheng Qin
67600603c5 chore: rename proxy mode to standby mode
It makes the name more reasonable.
2014-04-17 08:04:42 -07:00
Yicheng Qin
adf4acf947 chore: gofmt go files 2014-04-15 09:42:25 -07:00
Yicheng Qin
d88b52c5f3 fix(tests/v1_migration): correct HTTP response
The bug was introduced in 03839ca8 by mistake.
2014-04-15 09:25:14 -07:00
Yicheng Qin
8bcfb2ecaf Merge pull request #707 from unihorn/62
fix(peer_server): recover from outage with discovery
2014-04-14 13:58:43 -07:00
Yicheng Qin
03839ca806 fix(peer_server): recover from outage with discovery
This patch also refactors the find-cluster process.
It is based on @xiangli-cmu's commits in issue 627.
2014-04-14 13:56:47 -07:00
Yicheng Qin
de9c318436 tests: add TestJoinThroughFollower 2014-04-14 13:41:45 -07:00
Xiang Li
bc70cdc242 tests(snapshot_test) loosen the timing assumption for snapshot test
Tests run slowly on drone after enabling the race option.
2014-04-11 19:49:57 -04:00
Xiang Li
fc84da29e8 fix(internal_version_test.go) protect the checkedVersion by a lock 2014-04-10 23:35:55 -04:00
Xiang Li
2817baf3f8 fix(discovery_test.go) protect the garbageHandler by a lock 2014-04-10 23:28:40 -04:00
Yicheng Qin
1c40f327be test(snapshot): test restart with snapshot 2014-04-04 16:45:02 -07:00
Yicheng Qin
0a4b6570e1 chore(tests): start TLS cluster slowly to evade problem 2014-04-04 10:57:11 -07:00
Xiang Li
5b4a473f14 Merge pull request #667 from unihorn/53
chore(tests): test TestTLSMultiNodeKillAllAndRecovery now
2014-03-27 22:40:09 -04:00
Yicheng Qin
4e747d24dd chore(tests): test TestTLSMultiNodeKillAllAndRecovery now
It is fixed.
2014-03-27 17:06:18 -07:00
Tomás Senart
b6053d6a86 Making code formatting consistent.
$ gofmt -s -w  && goimports -w
2014-03-27 14:19:08 +01:00
Blake Mizerany
4bce3e4810 test(tests/functional): skip TestTLSMultiNodeKillAllAndRecovery until fixed 2014-03-26 19:22:59 -07:00
Ben Johnson
62b89a128a Merge branch 'master' of https://github.com/coreos/etcd into proxy
Conflicts:
	config/config.go
	server/peer_server.go
	server/transporter.go
	tests/server_utils.go
2014-03-24 15:30:14 -07:00
Ben Johnson
174b9ff343 bump(github.com/goraft/raft): 6bf34b9
Move from coreos/raft to goraft/raft and update to latest.
2014-03-24 15:09:47 -07:00
Ben Johnson
7d4fda550d Machine join/remove v2 API. 2014-03-18 16:25:21 -06:00
Yicheng Qin
50d9e6a7fd chore(fixtures/ca): make all certificates generated by etcd-ca 2014-03-17 12:32:55 -07:00
Ben Johnson
c0a59b3a27 Add minimum active size and promote delay. 2014-03-10 14:44:04 -06:00
Ben Johnson
3fff1a8dcd Add /machines and /machines/:name endpoints. 2014-03-06 15:11:31 -07:00
Ben Johnson
c8d6b26dfd Add auto-demotion after peer inactivity. 2014-03-03 11:15:05 -07:00
Yicheng Qin
69adb78433 fix(transporter): CancelRequest doesn't work on HTTPS connections blocked
This is currently a workaround; it should be fixed in Go 1.3.
2014-02-27 14:31:46 -08:00
Yicheng Qin
e99bc99dcc fix(tests/multi_node_kill_all_and_recovery): wait for cluster to build over 2014-02-27 14:31:46 -08:00
Ben Johnson
fddbf35df2 Add automatic node promotion / demotion. 2014-02-25 10:02:01 -07:00
Ben Johnson
f5698d3566 Proxy promotion. 2014-02-24 17:01:04 -07:00
Ben Johnson
1d961b8e56 Add proxy mode. 2014-02-22 15:02:20 -07:00
Yicheng Qin
04f21b5976 Merge pull request #569 from unihorn/5
Ordering and functionality of `-discovery` `-peers` and data dir to find peers
2014-02-17 14:34:53 -08:00
Yicheng Qin
3a4df1612c feat(discovery): adjust boot order to find peers
The boot order for peers is -discovery, -peers, log data, forming
new cluster itself.

Special rules:
1. If discovery succeeds, etcd uses only the peers specified by the discovery URL.
2. Etcd would fail when meeting bad -discovery, no -peers and log data.

Add TestDiscoveryDownNoBackupPeersWithDataDir as the test.
2014-02-17 12:53:39 -08:00
Yicheng Qin
bd56b15b6e fix(tests/discovery): use host as -peers parameter instead of url
Otherwise it cannot test the functionality correctly.
Moreover, add TestDiscoveryNoWithBackupPeers as the test for it.
2014-02-14 18:23:41 -08:00