2
0
mirror of https://github.com/etcd-io/etcd.git synced 2024-09-27 06:25:44 +00:00

509 Commits

Author SHA1 Message Date
Joe Betz
4b53ab0909 test: Clean agent directories on disk before functional test runs, not after
This is primarily so CI tooling can capture the agent logs after the functional tester runs.
2017-11-21 11:41:57 -08:00
fanmin shi
be171fa424 etcdserver: apply() sets consistIndex for any entry type
previously, apply() doesn't set consistIndex for EntryConfChange type.
this causes a misalignment between consistIndex and applied index
where EntryConfChange entry results setting applied index but not consistIndex.

suppose that addMember() is called and leader reflects that change.
1. applied index and consistIndex is now misaligned.
2. a new follower node joined.
3. leader sends the snapshot to follower
	where the applied index is the snapshot metadata index.
4. follower node saves the snapshot and database(includes consistIndex) from leader.
5. restarting follower loads snapshot and database.
6. follower checks snapshot metadata index(same as applied index) and database consistIndex,
	finds them don't match, and then panic.

FIXES 
2017-05-03 09:22:48 -07:00
Anthony Romano
8c3c1b4a9c *: use filepath.Join for files 2017-03-23 09:53:56 -07:00
Gyu-Ho Lee
3221454cab etcdserver: remove possibly compacted entry look-up
Fix https://github.com/coreos/etcd/issues/7470.

This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes) slow, it
could receive subsequent 'MsgSnap' requests from leader.

etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-18 07:56:18 -07:00
Hitoshi Mitake
98c60e8faa auth, etcdserver: let maintenance services require root role
This commit lets maintenance services require root privilege. It also
moves AuthInfoFromCtx() from etcdserver to auth pkg for cleaning purpose.
2017-02-14 13:42:31 -08:00
vimalk78
5fac6b8d15 etcdserver: resume compactor only if leader 2017-01-04 05:01:14 +05:30
fanmin shi
2a1bae0c2a etcdserver: consistent naming in raftReadyHandler 2016-12-29 11:27:16 -08:00
fanmin shi
2faf72f47c etcdserver: rework update committed index logic 2016-12-27 10:11:40 -08:00
fanmin shi
fef4a79528 lease: force leader to apply its pending committed index for lease operations
suppose a lease granting request from a follower goes through and followed by a lease look up or renewal, the leader might not apply the lease grant request locally. So the leader might not find the lease from the lease look up or renewal request which will result lease not found error. To fix this issue, we force the leader to apply its pending commited index before looking up lease.

FIX 
2016-12-22 14:24:38 -08:00
Xiang Li
35fd5dc9fc Merge pull request from mitake/auth-member
protect membership change RPCs with auth
2016-12-15 08:04:31 -08:00
Hitoshi Mitake
86d7390804 auth, etcdserver: protect membership change operations with auth
This commit protects membership change operations with auth. Only
users that have root role can issue the operations.

Implements https://github.com/coreos/etcd/issues/6899
2016-12-15 22:54:20 +09:00
Anthony Romano
2c06def8ca etcdserver, embed, v2http: move pprof setup to embed
Seems like a better place for prof setup since it's not specific to v2.
2016-12-09 12:37:35 -08:00
Xiang Li
2f96a68a20 etcdserver: do not send v2 sync if ttl keys do not exist 2016-12-07 14:48:15 -08:00
Vimal Kumar
dfe853ebff auth: add a timeout mechanism to simple token 2016-11-28 17:21:13 +05:30
Xiang Li
c33d04fb54 etcdserver: print out warning when waiting for file lock 2016-11-01 17:55:16 -07:00
Gyu-Ho Lee
6ec03d3f7c etcdserver: move 'EtcdServer.send' to raft.go
Clear 'TODO'
2016-10-26 16:26:00 -07:00
Gyu-Ho Lee
0c61d8804a etcdserver: make WaitGroup.Add sync with Wait 2016-10-12 13:11:35 -07:00
Gyu-Ho Lee
e011ea25ca etcdserver: separate EtcdServer from raftNode 2016-10-07 13:18:39 -07:00
Xiang Li
0f0c048e29 etcdserver: fix early lessor promotion issue
If we promote the lessor before finish applying all
entries from the last term, we might incorrectly renew
the already revoked leases.

Here is an example:

- Term 1: revoke lease A accepted by raft
- Old leader failed, new election happened
- Term 2: promote
- Term 2: keep alive A succeed. A now has 10 seconds TTL
- Term 2: revoke lease A from Term 1 got committed and applied
- Term 2: the lease A with 10 seconds TTL is revoked

To solve this, the new leader MUST apply all entries from old term
before promote its lessor to start accept renew requests.
2016-10-05 14:41:47 -07:00
Xiang Li
e3e3993022 etcdserver: support read index
Use read index to achieve l-read.
2016-09-27 13:41:40 +08:00
fanmin shi
690a0b6f00 etcdserver: parallelize expired leases process
When 1000 leases expired at the same time, etcd takes more than 5 seconds to clean them. This means that even after the leases have expired, keys associated with leases are still accessible. I increase the deletion throughput by parallelizing leases deletion process.
2016-09-19 16:17:49 -07:00
Anthony Romano
3866e78c26 etcdserver: tighten up goroutine management
All outstanding goroutines now go into the etcdserver waitgroup. goroutines are
shutdown with a "stopping" channel which is closed when the run() goroutine
shutsdown. The done channel will only close once the waitgroup is totally cleared.
2016-09-19 12:10:41 -07:00
Xiang Li
771ee43169 etcdserver: allow zero kv index for cluster upgrade
If a user upgrades etcd from 2.3.x to 3.0 and shutdown the
cluster immediately without triggering any new backend writes,
then the consistent index in backend would be zero.

The user cannot restart etcdserver due to today's strick index
match checking. We now have to lose this a bit for this case.
2016-08-30 11:28:18 -07:00
Xiang Li
7f3d4bfae5 etcdserver: kv.commit needs to be serialized with apply
kv.commit updates the consistent index in backend. When
executing in parallel with apply, it might grab tx lock
after apply update the consistent index and before apply
starts to execute the opeartion. If the server dies right
after kv.commit, the consistent is updated but the opeartion
is not executed. If we restart etcd server, etcd will skip
the operation. :(

There are a few other places that we need to take care of,
but let us fix this first.
2016-08-23 09:16:09 -07:00
Xiang Li
83de13e4a8 etcdserver: support apply wait 2016-08-19 16:18:35 -07:00
Xiang Li
d0fa390048 etcdserver: improve logging for leadership transfer 2016-08-17 11:40:46 -07:00
Gyu-Ho Lee
64a0e34602 etcdserver: transfer leadership when stopping 2016-08-13 14:31:58 -07:00
Gyu-Ho Lee
c6c6cfb502 etcdserver: implement 'CutPeer', 'MendPeer' 2016-08-12 07:38:52 -07:00
Anthony Romano
a1ce07a321 etcdserver: reject member removal that breaks the current active quorum 2016-08-10 17:00:39 -07:00
Anthony Romano
9063ce5e3f etcdserver, embed: stricter reconfig checking
Make --strict-reconfig-check a default and check if cluster is healthy when
adding a member.
2016-08-05 16:59:25 -07:00
Xiang Li
d69d438289 *: minor cleanup for lease 2016-08-04 20:39:32 -07:00
Xiang Li
29a077bdbe etcdserver: always recover lessor first 2016-08-04 08:06:19 -07:00
Anthony Romano
bf71497537 etcdserver, lease: tie lease min ttl to election timeout 2016-08-02 13:06:57 -07:00
Anthony Romano
06da46c4ee etcdserver: apply serialized requests outside auth apply lock
Fixes 
2016-07-30 22:00:49 -07:00
Anthony Romano
13c2d32061 Merge pull request from heyitsanthony/fix-version-race
etcdserver, api, membership: don't race on setting version
2016-07-27 08:56:39 -07:00
Hitoshi Mitake
0090573749 etcdserver: skip range requests in txn if the result is needless
If a server isn't serving txn requests from a client, the server
doesn't need the result of range requests in the txn.

This is a succeeding commit of
https://github.com/coreos/etcd/pull/5689
2016-07-26 19:49:07 -07:00
Anthony Romano
de2c3ec3db etcdserver, api, membership: don't race on setting version
Fixes 
2016-07-26 18:21:40 -07:00
Xiang Li
2d761d64a4 etcdserver: set applied index correctly 2016-07-16 11:44:18 -07:00
Xiang Li
27b03f0ed5 *: deny proposals when there is a huge gap between apply/commit 2016-07-14 10:02:55 -07:00
Xiang Li
f65e75e4b3 *: remove unnecessary data upgrade code 2016-07-11 15:11:56 -07:00
Hitoshi Mitake
c47689d98f Merge pull request from mitake/skip-apply
RFC: etcdserver, pkg: skip needless log entry applying
2016-07-10 01:23:35 +09:00
Jared Hulbert
f78d4713ea etcdserver: atomic access alignment
Most fields accessed with sync/atomic functions are 64bit aligned, but a couple
are not.  This makes comments out of date and therefore misleading.

Affected fields reordered, comments scrubbed and updated.
2016-07-08 11:20:47 -07:00
Hitoshi Mitake
abb20ec51f etcdserver, pkg: skip needless log entry applying
This commit lets etcdserver skip needless log entry applying. If the
result of log applying isn't required by the node (client that issued
the request isn't talking with the node) and the operation has no side
effects, applying can be skipped.

It would contribute to reduce disk I/O on followers and be useful for
a cluster that processes much serializable get.
2016-07-08 15:16:45 +09:00
Xiang Li
8a8a8253fa etcdserver: commit before sending snapshot 2016-07-03 13:54:05 -07:00
Anthony Romano
b7f5f8fc99 etcdserver: exit on missing backend only if semver is >= 3.0.0 2016-07-01 09:10:01 -07:00
Xiang Li
9614dc6e71 etcdserver: check index of the kv when restarting 2016-06-27 10:27:27 -07:00
Xiang Li
891ddcba6e etcdserver: refuse to restart if backend file is missing 2016-06-26 21:16:51 -07:00
Gyu-Ho Lee
d37e564eaa etcdserver: use TouchDirAll 2016-06-19 11:26:52 -07:00
Xiang Li
57474697af etcdserver: add applied metrics 2016-06-17 11:52:50 -07:00
Xiang Li
699e76b631 etcdserver: only pause compaction when sending snapshot 2016-06-16 08:57:02 -07:00