570 Commits

Author SHA1 Message Date
Anthony Romano
714b48a4b4 etcdserver: initialize raftNode with constructor
raftNode was being initialized in start(), which was causing
hangs when trying to stop the etcd server since the stop channel
would not be initialized in time for the stop call. Instead,
setup non-configurable bits in a constructor.

Fixes #7668
2017-04-18 09:33:59 -07:00
Anthony Romano
d9ec6b4d22 *: return updated member list in v3 rpcs
Now it's possible to atomically know the new member configuration from
issuing a membership change RPC.
2017-04-12 16:24:51 -07:00
Anthony Romano
8ad935ef2c etcdserver: use cancelable context for server initiated requests 2017-03-31 19:19:33 -07:00
Anthony Romano
7ef75e373a Merge pull request #7525 from heyitsanthony/big-backend
etcdserver, backend: configure mmap size based on quota
2017-03-23 10:06:00 -07:00
Hitoshi Mitake
5594f695bc e2e, etcdserver: fix wrong usages of ordinal
They must be "ordinary".
2017-03-21 23:50:16 +09:00
Anthony Romano
5e4b008106 *: base initial mmap size on quota size 2017-03-17 15:38:49 -07:00
Anthony Romano
2f1542c06d *: use filepath.Join for files 2017-03-16 07:46:06 -07:00
Gyu-Ho Lee
80c10e150f etcdserver: remove possibly compacted entry look-up
Fix https://github.com/coreos/etcd/issues/7470.

This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes), it
could receive subsequent 'MsgSnap' requests from leader.

etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-15 12:46:56 -07:00
Xiang
7f0733cf46 etcdserver: candidate should wait for applying all configuration changes 2017-03-14 17:20:20 -07:00
Anthony Romano
58da8b17ee etcdserver: support mvcc txn 2017-03-08 20:54:15 -08:00
Hitoshi Mitake
f8a290e7ca *: support jwt token in v3 auth API
This commit adds jwt token support in v3 auth API.

Remaining major ToDos:
- Currently token type isn't hidden from etcdserver. In the near
  future the information should be completely invisible from
  etcdserver package.
- Configurable expiration of token. Currently tokens can be valid
  until keys are changed.

How to use:
1. generate keys for signing and verfying jwt tokens:
 $ openssl genrsa -out app.rsa 1024
 $ openssl rsa -in app.rsa -pubout > app.rsa.pub
2.  add command line options to etcd like below:
--auth-token-type jwt \
--auth-jwt-pub-key app.rsa.pub --auth-jwt-priv-key app.rsa \
--auth-jwt-sign-method RS512
3. launch etcd cluster

Below is a performance comparison of serializable read w/ and w/o jwt
token. Every (3) etcd node is executed on a single machine. Signing
method is RS512 and key length is 1024 bit. As the results show, jwt
based token introduces a performance overhead but it would be
acceptable for a case that requires authentication.

w/o jwt token auth (no auth):

Summary:
  Total:        1.6172 secs.
  Slowest:      0.0125 secs.
  Fastest:      0.0001 secs.
  Average:      0.0002 secs.
  Stddev:       0.0004 secs.
  Requests/sec: 6183.5877

Response time histogram:
  0.000 [1]     |
  0.001 [9982]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.003 [1]     |
  0.004 [1]     |
  0.005 [0]     |
  0.006 [0]     |
  0.008 [6]     |
  0.009 [0]     |
  0.010 [1]     |
  0.011 [5]     |
  0.013 [3]     |

Latency distribution:
  10% in 0.0001 secs.
  25% in 0.0001 secs.
  50% in 0.0001 secs.
  75% in 0.0001 secs.
  90% in 0.0002 secs.
  95% in 0.0002 secs.
  99% in 0.0003 secs.

w/ jwt token auth:

Summary:
  Total:        2.5364 secs.
  Slowest:      0.0182 secs.
  Fastest:      0.0002 secs.
  Average:      0.0003 secs.
  Stddev:       0.0005 secs.
  Requests/sec: 3942.5185

Response time histogram:
  0.000 [1]     |
  0.002 [9975]  |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.004 [0]     |
  0.006 [1]     |
  0.007 [11]    |
  0.009 [2]     |
  0.011 [4]     |
  0.013 [5]     |
  0.015 [0]     |
  0.016 [0]     |
  0.018 [1]     |

Latency distribution:
  10% in 0.0002 secs.
  25% in 0.0002 secs.
  50% in 0.0002 secs.
  75% in 0.0002 secs.
  90% in 0.0003 secs.
  95% in 0.0003 secs.
  99% in 0.0004 secs.
2017-03-06 19:46:03 -08:00
Gyu-Ho Lee
3d75395875 *: remove never-unused vars, minor lint fix
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
2017-03-06 14:59:12 -08:00
Hitoshi Mitake
0191509637 auth, etcdserver: authenticate clients based on certificate CommonName
This commit lets v3 auth mechanism authenticate clients based on
CommonName of certificate like v2 auth.
2017-01-31 17:22:12 +09:00
Xiang Li
699b1e5b3a Merge pull request #7160 from xiang90/snapshotcount
etcdserver: increase snapshot to 100,000
2017-01-14 16:53:44 -08:00
Hitoshi Mitake
9886e9448e auth, etcdserver: let maintenance services require root role
This commit lets maintenance services require root privilege. It also
moves AuthInfoFromCtx() from etcdserver to auth pkg for cleaning purpose.
2017-01-14 19:36:24 +09:00
Xiang Li
c5a9d54835 etcdserver: increase snapshot to 100,000
Keep more wal entries in memory for fast follower recovery.
10,000 was a too small number that triggers quite a few snapshots.
ZK proves that 100,000 is a reasonable number for even old less prowerful
machines.

Eventually we should provide both count and max memory (for large entries).
2017-01-13 18:05:25 -08:00
vimalk78
5fac6b8d15 etcdserver: resume compactor only if leader 2017-01-04 05:01:14 +05:30
fanmin shi
2a1bae0c2a etcdserver: consistent naming in raftReadyHandler 2016-12-29 11:27:16 -08:00
fanmin shi
2faf72f47c etcdserver: rework update committed index logic 2016-12-27 10:11:40 -08:00
fanmin shi
fef4a79528 lease: force leader to apply its pending committed index for lease operations
suppose a lease granting request from a follower goes through and followed by a lease look up or renewal, the leader might not apply the lease grant request locally. So the leader might not find the lease from the lease look up or renewal request which will result lease not found error. To fix this issue, we force the leader to apply its pending commited index before looking up lease.

FIX #6978
2016-12-22 14:24:38 -08:00
Xiang Li
35fd5dc9fc Merge pull request #6903 from mitake/auth-member
protect membership change RPCs with auth
2016-12-15 08:04:31 -08:00
Hitoshi Mitake
86d7390804 auth, etcdserver: protect membership change operations with auth
This commit protects membership change operations with auth. Only
users that have root role can issue the operations.

Implements https://github.com/coreos/etcd/issues/6899
2016-12-15 22:54:20 +09:00
Anthony Romano
2c06def8ca etcdserver, embed, v2http: move pprof setup to embed
Seems like a better place for prof setup since it's not specific to v2.
2016-12-09 12:37:35 -08:00
Xiang Li
2f96a68a20 etcdserver: do not send v2 sync if ttl keys do not exist 2016-12-07 14:48:15 -08:00
Vimal Kumar
dfe853ebff auth: add a timeout mechanism to simple token 2016-11-28 17:21:13 +05:30
Xiang Li
c33d04fb54 etcdserver: print out warning when waiting for file lock 2016-11-01 17:55:16 -07:00
Gyu-Ho Lee
6ec03d3f7c etcdserver: move 'EtcdServer.send' to raft.go
Clear 'TODO'
2016-10-26 16:26:00 -07:00
Gyu-Ho Lee
0c61d8804a etcdserver: make WaitGroup.Add sync with Wait 2016-10-12 13:11:35 -07:00
Gyu-Ho Lee
e011ea25ca etcdserver: separate EtcdServer from raftNode 2016-10-07 13:18:39 -07:00
Xiang Li
0f0c048e29 etcdserver: fix early lessor promotion issue
If we promote the lessor before finish applying all
entries from the last term, we might incorrectly renew
the already revoked leases.

Here is an example:

- Term 1: revoke lease A accepted by raft
- Old leader failed, new election happened
- Term 2: promote
- Term 2: keep alive A succeed. A now has 10 seconds TTL
- Term 2: revoke lease A from Term 1 got committed and applied
- Term 2: the lease A with 10 seconds TTL is revoked

To solve this, the new leader MUST apply all entries from old term
before promote its lessor to start accept renew requests.
2016-10-05 14:41:47 -07:00
Xiang Li
e3e3993022 etcdserver: support read index
Use read index to achieve l-read.
2016-09-27 13:41:40 +08:00
fanmin shi
690a0b6f00 etcdserver: parallelize expired leases process
When 1000 leases expired at the same time, etcd takes more than 5 seconds to clean them. This means that even after the leases have expired, keys associated with leases are still accessible. I increase the deletion throughput by parallelizing leases deletion process.
2016-09-19 16:17:49 -07:00
Anthony Romano
3866e78c26 etcdserver: tighten up goroutine management
All outstanding goroutines now go into the etcdserver waitgroup. goroutines are
shutdown with a "stopping" channel which is closed when the run() goroutine
shutsdown. The done channel will only close once the waitgroup is totally cleared.
2016-09-19 12:10:41 -07:00
Xiang Li
771ee43169 etcdserver: allow zero kv index for cluster upgrade
If a user upgrades etcd from 2.3.x to 3.0 and shutdown the
cluster immediately without triggering any new backend writes,
then the consistent index in backend would be zero.

The user cannot restart etcdserver due to today's strick index
match checking. We now have to lose this a bit for this case.
2016-08-30 11:28:18 -07:00
Xiang Li
7f3d4bfae5 etcdserver: kv.commit needs to be serialized with apply
kv.commit updates the consistent index in backend. When
executing in parallel with apply, it might grab tx lock
after apply update the consistent index and before apply
starts to execute the opeartion. If the server dies right
after kv.commit, the consistent is updated but the opeartion
is not executed. If we restart etcd server, etcd will skip
the operation. :(

There are a few other places that we need to take care of,
but let us fix this first.
2016-08-23 09:16:09 -07:00
Xiang Li
83de13e4a8 etcdserver: support apply wait 2016-08-19 16:18:35 -07:00
Xiang Li
d0fa390048 etcdserver: improve logging for leadership transfer 2016-08-17 11:40:46 -07:00
Gyu-Ho Lee
64a0e34602 etcdserver: transfer leadership when stopping 2016-08-13 14:31:58 -07:00
Gyu-Ho Lee
c6c6cfb502 etcdserver: implement 'CutPeer', 'MendPeer' 2016-08-12 07:38:52 -07:00
Anthony Romano
a1ce07a321 etcdserver: reject member removal that breaks the current active quorum 2016-08-10 17:00:39 -07:00
Anthony Romano
9063ce5e3f etcdserver, embed: stricter reconfig checking
Make --strict-reconfig-check a default and check if cluster is healthy when
adding a member.
2016-08-05 16:59:25 -07:00
Xiang Li
d69d438289 *: minor cleanup for lease 2016-08-04 20:39:32 -07:00
Xiang Li
29a077bdbe etcdserver: always recover lessor first 2016-08-04 08:06:19 -07:00
Anthony Romano
bf71497537 etcdserver, lease: tie lease min ttl to election timeout 2016-08-02 13:06:57 -07:00
Anthony Romano
06da46c4ee etcdserver: apply serialized requests outside auth apply lock
Fixes #6010
2016-07-30 22:00:49 -07:00
Anthony Romano
13c2d32061 Merge pull request #6045 from heyitsanthony/fix-version-race
etcdserver, api, membership: don't race on setting version
2016-07-27 08:56:39 -07:00
Hitoshi Mitake
0090573749 etcdserver: skip range requests in txn if the result is needless
If a server isn't serving txn requests from a client, the server
doesn't need the result of range requests in the txn.

This is a succeeding commit of
https://github.com/coreos/etcd/pull/5689
2016-07-26 19:49:07 -07:00
Anthony Romano
de2c3ec3db etcdserver, api, membership: don't race on setting version
Fixes #6029
2016-07-26 18:21:40 -07:00
Xiang Li
2d761d64a4 etcdserver: set applied index correctly 2016-07-16 11:44:18 -07:00
Xiang Li
27b03f0ed5 *: deny proposals when there is a huge gap between apply/commit 2016-07-14 10:02:55 -07:00