Fix https://github.com/coreos/etcd/issues/7470.
This patch removes unnecessary term look-up in
'createMergedSnapshotMessage', which can trigger panic
if raft entry at etcdProgress.appliedi got compacted
by subsequent 'MsgSnap' messages--if a follower is
being (in this case, network latency spikes), it
could receive subsequent 'MsgSnap' requests from leader.
etcd server-side 'applyAll' routine and raft's Ready
processing routine becomes asynchronous after raft
entries are persisted. And given that raft Ready routine
takes less time to finish, it is possible that second
'MsgSnap' is being handled, while the slow 'applyAll'
is still processing the first(old) 'MsgSnap'. Then raft
Ready routine can compact the log entries at future
index to 'applyAll'. That is how 'createMergedSnapshotMessage'
tried to look up raft term with outdated etcdProgress.appliedi.
Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
This commit adds jwt token support in v3 auth API.
Remaining major ToDos:
- Currently token type isn't hidden from etcdserver. In the near
future the information should be completely invisible from
etcdserver package.
- Configurable expiration of token. Currently tokens can be valid
until keys are changed.
How to use:
1. generate keys for signing and verfying jwt tokens:
$ openssl genrsa -out app.rsa 1024
$ openssl rsa -in app.rsa -pubout > app.rsa.pub
2. add command line options to etcd like below:
--auth-token-type jwt \
--auth-jwt-pub-key app.rsa.pub --auth-jwt-priv-key app.rsa \
--auth-jwt-sign-method RS512
3. launch etcd cluster
Below is a performance comparison of serializable read w/ and w/o jwt
token. Every (3) etcd node is executed on a single machine. Signing
method is RS512 and key length is 1024 bit. As the results show, jwt
based token introduces a performance overhead but it would be
acceptable for a case that requires authentication.
w/o jwt token auth (no auth):
Summary:
Total: 1.6172 secs.
Slowest: 0.0125 secs.
Fastest: 0.0001 secs.
Average: 0.0002 secs.
Stddev: 0.0004 secs.
Requests/sec: 6183.5877
Response time histogram:
0.000 [1] |
0.001 [9982] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.003 [1] |
0.004 [1] |
0.005 [0] |
0.006 [0] |
0.008 [6] |
0.009 [0] |
0.010 [1] |
0.011 [5] |
0.013 [3] |
Latency distribution:
10% in 0.0001 secs.
25% in 0.0001 secs.
50% in 0.0001 secs.
75% in 0.0001 secs.
90% in 0.0002 secs.
95% in 0.0002 secs.
99% in 0.0003 secs.
w/ jwt token auth:
Summary:
Total: 2.5364 secs.
Slowest: 0.0182 secs.
Fastest: 0.0002 secs.
Average: 0.0003 secs.
Stddev: 0.0005 secs.
Requests/sec: 3942.5185
Response time histogram:
0.000 [1] |
0.002 [9975] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
0.004 [0] |
0.006 [1] |
0.007 [11] |
0.009 [2] |
0.011 [4] |
0.013 [5] |
0.015 [0] |
0.016 [0] |
0.018 [1] |
Latency distribution:
10% in 0.0002 secs.
25% in 0.0002 secs.
50% in 0.0002 secs.
75% in 0.0002 secs.
90% in 0.0003 secs.
95% in 0.0003 secs.
99% in 0.0004 secs.
Getting gosimple suggestion while running test script, so this PR is for fixing gosimple S1019 check.
raft/node_test.go:456:40: should use make([]raftpb.Entry, 1) instead (S1019)
raft/node_test.go:457:49: should use make([]raftpb.Entry, 1) instead (S1019)
raft/node_test.go:458:43: should use make([]raftpb.Message, 1) instead (S1019)
Refer https://github.com/dominikh/go-tools/blob/master/cmd/gosimple/README.md#checks for more information.
Keep more wal entries in memory for fast follower recovery.
10,000 was a too small number that triggers quite a few snapshots.
ZK proves that 100,000 is a reasonable number for even old less prowerful
machines.
Eventually we should provide both count and max memory (for large entries).
This commit adds a mechanism of handling a case of expired auth token
to clientv3. If a server returns an error code
grpc.codes.Unauthenticated, newRetryWrapper() tries to get a new token
and use it as an option of PerRPCCredential.
Fixes https://github.com/coreos/etcd/issues/7012