Mirroristas/etcd

mirror of https://github.com/etcd-io/etcd.git synced 2024-09-27 06:25:44 +00:00

Author	SHA1	Message	Date
Gyu-Ho Lee	939337f450	*: add max requests bytes, keepalive to server, blackhole methods to integration Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-11-16 09:05:06 -08:00
Gyu-Ho Lee	d62e39d5ca	*: deprecate "metadata.NewContext" Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-11-16 09:05:06 -08:00
Gyu-Ho Lee	eb1589ad35	*: regenerate proto Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-11-16 09:05:06 -08:00
Anthony Romano	877d0ce469	etcdserver: consolidate error checking for v3_server functions Duplicated error checking code moved into raftRequest/raftRequestOnce.	2017-08-23 14:39:59 -07:00
Anthony Romano	8ab42fb045	*: move v2http handlers without /v2 prefix to etcdhttp Lets --enable-v2=false configurations provide /metrics, /health, etc. Fixes #8167	2017-07-24 09:54:48 -07:00
Iwasaki Yudai	536a5f594b	v3rpc: Let clients establish unlimited streams From go-grpc v1.2.0, the number of max streams per client is set to 100 by default on the server side. That default makes it impossible for third-party proxies and custom clients to establish many streams.	2017-07-12 10:46:33 -07:00
Anthony Romano	a032b3b914	v3rpc: treat nil txn request op as error Fixes #7889	2017-06-20 10:57:41 -07:00
Anthony Romano	c87594f27c	etcdserver: use same ReadView for read-only txns A read-only txn isn't serialized by raft, but it uses a fresh read txn for every mvcc access prior to executing its request ops. If a write txn modifies the keys matching the read txn's comparisons, the read txn may return inconsistent results. To fix, use the same read-only mvcc txn for the duration of the etcd txn. Probably gets a modest txn speedup as well since there are fewer read txn allocations.	2017-06-09 09:50:43 -07:00
Anthony Romano	864ffec88c	v2http: put back /v2/machines and mark as non-deprecated This reverts commit 2bb33181b6c8fbe8109fc668a19ce4ab46c605ec. python-etcd seems to depend on /v2/machines and the maintainer vanished. Plus, it is prefixed with /v2/ so it probably can't be deprecated anyway.	2017-06-08 12:05:59 -07:00
Gyu-Ho Lee	12bc2bba36	etcdserver: add leaseExpired debugging metrics Fix https://github.com/coreos/etcd/issues/8050. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-06-08 11:23:12 -07:00
Anthony Romano	9169ad0d7d	*: fix go tool vet -all -shadow errors	2017-06-06 09:47:06 -07:00
Anthony Romano	c1c9a2c96c	etcdserver: close mvcc.KV on init error path Scheduled compaction will panic if KV is not stopped before closing the backend.	2017-05-23 10:41:37 -07:00
Hitoshi Mitake	4cd5e7ebb2	Merge pull request #7809 from mitake/auth-watch protect watch with auth	2017-05-20 13:23:30 +09:00
Hitoshi Mitake	939912c425	clientv3, etcdserver: support auth in Watch()	2017-05-20 11:34:45 +09:00
Anthony Romano	33c375dc44	*: fill out blank package godocs Mostly one-liner short descriptions, but also includes some typo fixes and some examples.	2017-05-18 09:41:13 -07:00
Xiang	32c252f003	etcdserver: more logging on snapshot close path	2017-05-17 14:48:52 -07:00
Anthony Romano	f6cd4d4f5b	snap, etcdserver: tighten up snapshot path handling Computing the snapshot file path is error prone; snapshot recovery was constructing file paths missing a path separator so the snapshot would never be loaded. Instead, refactor the backend path handling to use helper functions where possible.	2017-05-11 13:46:59 -07:00
fanmin shi	47f5b7c3ad	Merge pull request #7876 from fanminshi/fix_7628 etcdserver: renaming db happens after snapshot persists to wal and snap files	2017-05-09 16:15:41 -07:00
fanmin shi	dfdaf082c5	etcdserver: add a test to ensure renaming db happens before persisting wal and snap files	2017-05-09 14:00:22 -07:00
fanmin shi	8b7b7222dd	etcdserver: renaming db happens after snapshot persists to wal and snap files In the case that a follower receives a snapshot from the leader and crashes before renaming xxx.snap.db to db but after the snapshot has persisted to .wal and .snap, restarting the follower results in loading the old db, new .wal, and new .snap. This causes an index mismatch between the snap metadata index and the consistent index from db. This PR forces an ordering where saving/renaming db must happen after the snapshot is persisted to the wal and snap files. This guarantees that the wal and snap files are newer than db. On server restart, the etcd server checks whether snap index > db consistent index. If so, the etcd server attempts to load xxx.snap.db, where xxx = snap index, if it exists, and panics otherwise. FIXES #7628	2017-05-09 14:00:12 -07:00
Iwasaki Yudai	010ffc0692	v3rpc: remove duplicated error case for lease.ErrLeaseNotFound	2017-05-08 20:09:41 -07:00
fanmin shi	e33b10a666	etcdserver: add a test to ensure config change also update ConsistIndex	2017-05-02 16:51:40 -07:00
fanmin shi	5533c3058a	etcdserver: apply() sets consistIndex for any entry type Previously, apply() didn't set consistIndex for the EntryConfChange type. This causes a misalignment between consistIndex and the applied index, where an EntryConfChange entry results in setting the applied index but not consistIndex. Suppose that addMember() is called and the leader reflects that change. 1. The applied index and consistIndex are now misaligned. 2. A new follower node joins. 3. The leader sends the snapshot to the follower, where the applied index is the snapshot metadata index. 4. The follower node saves the snapshot and database (includes consistIndex) from the leader. 5. Restarting the follower loads the snapshot and database. 6. The follower checks the snapshot metadata index (same as the applied index) and the database consistIndex, finds that they don't match, and panics. FIXES #7834	2017-05-02 14:57:36 -07:00
Anthony Romano	3ce31acda4	v3client: wrap watch ctxs with blank ctx Printing the values in ctx.String() will data race if the value is mutable and doesn't implement String(), which seems to be common. Instead, just return a fixed string instead of computing it; v3client watches don't need as much flexibility for creating separate strings, so separate ctx strings probably aren't necessary at this point. Fixes #7811	2017-04-25 15:03:06 -07:00
Gyu-Ho Lee	327f09fcb4	etcdserver: do not block on raft stopping Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-25 13:35:43 -07:00
Gyu-Ho Lee	91f6aee4f2	etcdserver: ensure waitForApply sync with applyAll Problem is: `Step1`: `etcdserver/raft.go`'s `Ready` process routine sends config-change entries via `r.applyc <- ap` (https://github.com/coreos/etcd/blob/master/etcdserver/raft.go#L193-L203) `Step2`: `etcdserver/server.go`'s `*EtcdServer.run` routine receives this via `ap := <-s.r.apply()` (https://github.com/coreos/etcd/blob/master/etcdserver/server.go#L735-L738) `StepA`: `Step1` proceeds without sync, right after sending `r.applyc <- ap`. `StepB`: `Step2` proceeds without sync, right after `sched.Schedule(s.applyAll(&ep,&ap))`. `StepC`: `etcdserver` tries to sync with `s.applyAll(&ep,&ap)` by calling `rh.waitForApply()`. `rh.waitForApply()` waits for all pending jobs to finish in `pkg/schedule` side. However, the order of `StepA`,`StepB`,`StepC` is not guaranteed. It is possible that `StepC` happens first, and proceeds without waiting on apply. And the restarting member comes back as a leader in single-node cluster, when there is no synchronization between apply-layer and config-change Raft entry apply. Confirmed with more debugging lines below, only reproducible with slow CPU VM (~2 vCPU). ``` ~:24.005397 I \| etcdserver: starting server... 
[version: 3.2.0+git, cluster version: to_be_decided] ~:24.011136 I \| etcdserver: [DEBUG] 29b2d24047a277df waitForApply before ~:24.011194 I \| etcdserver: [DEBUG] 29b2d24047a277df starts wait for 0 pending jobs ~:24.011234 I \| etcdserver: [DEBUG] 29b2d24047a277df finished wait for 0 pending jobs (current pending 0) ~:24.011268 I \| etcdserver: [DEBUG] 29b2d24047a277df waitForApply after ~:24.011348 I \| etcdserver: [DEBUG] [0] 29b2d24047a277df is scheduling conf change on 29b2d24047a277df ~:24.011396 I \| etcdserver: [DEBUG] [1] 29b2d24047a277df is scheduling conf change on 5edf80e32a334cf0 ~:24.011437 I \| etcdserver: [DEBUG] [2] 29b2d24047a277df is scheduling conf change on e32e31e76c8d2678 ~:24.011477 I \| etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on 29b2d24047a277df ~:24.011509 I \| etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on 5edf80e32a334cf0 ~:24.011545 I \| etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on e32e31e76c8d2678 ~:24.012500 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df before ~:24.013014 I \| etcdserver/membership: added member 29b2d24047a277df [unix://127.0.0.1:2100515039] to cluster 9250d4ae34216949 ~:24.013066 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df after ~:24.013113 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df after trigger ~:24.013158 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 5edf80e32a334cf0 before ~:24.013666 W \| etcdserver: failed to send out heartbeat on time (exceeded the 10ms timeout for 11.964739ms) ~:24.013709 W \| etcdserver: server is likely overloaded ~:24.013750 W \| etcdserver: failed to send out heartbeat on time (exceeded the 10ms timeout for 12.057265ms) ~:24.013775 W \| etcdserver: server is likely overloaded ~:24.013950 I \| raft: 29b2d24047a277df is starting a new election at term 4 ~:24.014012 I \| raft: 29b2d24047a277df became candidate at term 5 
~:24.014051 I \| raft: 29b2d24047a277df received MsgVoteResp from 29b2d24047a277df at term 5 ~:24.014107 I \| raft: 29b2d24047a277df became leader at term 5 ~:24.014146 I \| raft: raft.node: 29b2d24047a277df elected leader 29b2d24047a277df at term 5 ``` I am printing out the number of pending jobs before we call `sched.WaitFinish(0)`, and there was no pending jobs, so it returned immediately (before we schedule `applyAll`). This is the root cause to: - https://github.com/coreos/etcd/issues/7595 - https://github.com/coreos/etcd/issues/7739 - https://github.com/coreos/etcd/issues/7802 `sched.WaitFinish(0)` doesn't work when `len(f.pendings)==0` and `f.finished==0`. Config-change is the first job to apply, so `f.finished` is 0 in this case. `f.finished` monotonically increases, so we need `WaitFinish(finished+1)`. And `finished` must be the one before calling `Schedule`. This is safe because `Schedule(applyAll)` is the only place adding jobs to `sched`. Then scheduler waits on the single job of `applyAll`, by getting the current number of finished jobs before sending `Schedule`. Or just make it be blocked until `applyAll` routine triggers on the config-change job. This patch just removes `waitForApply`, and signal `raftDone` to wait until `applyAll` finishes applying entries. 
Confirmed that it fixes the issue, as below: ``` ~:43.198354 I \| rafthttp: started streaming with peer 36cda5222aba364b (stream MsgApp v2 reader) ~:43.198740 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c waitForApply before ~:43.198836 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c starts wait for 0 pending jobs, 1 finished jobs ~:43.200696 I \| integration: launched 3169361310155633349 () ~:43.201784 I \| etcdserver: [DEBUG] [0] 3988bc20c2b2e40c is scheduling conf change on 36cda5222aba364b ~:43.201884 I \| etcdserver: [DEBUG] [1] 3988bc20c2b2e40c is scheduling conf change on 3988bc20c2b2e40c ~:43.201965 I \| etcdserver: [DEBUG] [2] 3988bc20c2b2e40c is scheduling conf change on cf5d6cbc2a121727 ~:43.202070 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on 36cda5222aba364b ~:43.202139 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on 3988bc20c2b2e40c ~:43.202204 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on cf5d6cbc2a121727 ~:43.202444 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) before ~:43.204486 I \| etcdserver/membership: added member 36cda5222aba364b [unix://127.0.0.1:2100913646] to cluster 425d73f1b7b01674 ~:43.204588 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) after ~:43.204703 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) after trigger ~:43.204791 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) before ~:43.205689 I \| etcdserver/membership: added member 3988bc20c2b2e40c [unix://127.0.0.1:2101113646] to cluster 425d73f1b7b01674 ~:43.205783 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) after ~:43.205929 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) after trigger ~:43.206056 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c 
applyConfChange on cf5d6cbc2a121727 (request ID: 0) before ~:43.207353 I \| etcdserver/membership: added member cf5d6cbc2a121727 [unix://127.0.0.1:2100713646] to cluster 425d73f1b7b01674 ~:43.207516 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) after ~:43.207619 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) after trigger ~:43.207710 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on 36cda5222aba364b ~:43.207781 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on 3988bc20c2b2e40c ~:43.207843 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on cf5d6cbc2a121727 ~:43.207951 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished wait for 0 pending jobs (current pending 0, finished 1) ~:43.208029 I \| rafthttp: started HTTP pipelining with peer cf5d6cbc2a121727 ~:43.210339 I \| rafthttp: peer 3988bc20c2b2e40c became active ~:43.210435 I \| rafthttp: established a TCP streaming connection with peer 3988bc20c2b2e40c (stream MsgApp v2 reader) ~:43.210861 I \| rafthttp: started streaming with peer 3988bc20c2b2e40c (writer) ~:43.211732 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c waitForApply after ``` Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-25 10:22:27 -07:00
Anthony Romano	2bb33181b6	v2http: remove deprecated /v2/machines path	2017-04-22 03:11:21 -07:00
Anthony Romano	393e4335b7	*: put gateway stubs into their own packages Fixes #7773	2017-04-19 13:09:06 -07:00
Anthony Romano	d24a763a12	Merge pull request #7771 from heyitsanthony/remove-2.0-version etcdserver: remove 2.0 StatusNotFound version check	2017-04-19 00:57:19 -07:00
Hitoshi Mitake	d3456b5ecd	Merge pull request #7759 from mitake/fix-7724 *: simply ignore ErrAuthNotEnabled in clientv3 if auth is not enabled	2017-04-19 16:07:18 +09:00
Anthony Romano	3d8e2e1171	etcdserver: remove 2.0 StatusNotFound version check	2017-04-18 20:22:56 -07:00
Hitoshi Mitake	e1306bff8f	*: simply ignore ErrAuthNotEnabled in clientv3 if auth is not enabled Fix https://github.com/coreos/etcd/issues/7724	2017-04-19 11:27:14 +09:00
Anthony Romano	714b48a4b4	etcdserver: initialize raftNode with constructor raftNode was being initialized in start(), which was causing hangs when trying to stop the etcd server since the stop channel would not be initialized in time for the stop call. Instead, setup non-configurable bits in a constructor. Fixes #7668	2017-04-18 09:33:59 -07:00
Hitoshi Mitake	ac69e63fa8	etcdserver: fill-in Auth API Header in apply layer Replacing "etcdserver: fill a response header in auth RPCs" The revision should be set at the time of "apply", not in later RPC layer. Fix https://github.com/coreos/etcd/issues/7691 Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-17 14:26:26 -07:00
Gyu-Ho Lee	cfbc5e5c3b	Merge pull request #7706 from gyuho/wait-apply-conf-change etcdserver: wait apply on conf change Raft entry	2017-04-13 16:54:06 -07:00
Gyu-Ho Lee	04354f32ab	etcdserver: wait apply on conf change Raft entry When the apply layer sees a configuration change entry in raft.Ready.CommittedEntries, the server should not proceed until that entry is applied. Otherwise, the follower's raft layer advances, possibly hits its election timeout, and becomes the leader of a single-node cluster before the add-node conf changes of the other nodes are applied. Fix https://github.com/coreos/etcd/issues/7595. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-13 15:59:24 -07:00
Xiang Li	957c9cd1df	Merge pull request #7734 from mitake/status-auth etcdserver: let Status() not require authentication	2017-04-13 15:53:33 -07:00
Hitoshi Mitake	67f2e41f20	etcdserver: let Status() not require authentication The information that can be obtained with the RPC doesn't need to be protected. Fix https://github.com/coreos/etcd/issues/7721	2017-04-13 17:39:09 +09:00
Anthony Romano	d9ec6b4d22	*: return updated member list in v3 rpcs Now it's possible to atomically know the new member configuration from issuing a membership change RPC.	2017-04-12 16:24:51 -07:00
Anthony Romano	78a5eb79b5	*: add swagger and grpc-gateway assets for v3lock and v3election	2017-04-10 15:21:07 -07:00
Anthony Romano	dc8115a534	v3election: Election RPC service Fixes #7589	2017-04-07 16:36:38 -07:00
Anthony Romano	135a40751e	v3rpc: force RangeEnd=nil if length is 0 gRPC will replace empty strings with nil, but for the embedded case it's possible for []byte{} to slip in and confuse the single key / >= key watch logic.	2017-04-07 16:36:38 -07:00
Gyu-Ho Lee	7f2d6b3ef6	clientv3,v3client: add cluster embedded client Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-04 08:14:18 -07:00
Anthony Romano	24e4c94d98	Merge pull request #7640 from heyitsanthony/etcdserver-ctx etcdserver: ctx-ize server initiated requests	2017-04-03 09:07:28 -07:00
Anthony Romano	8ad935ef2c	etcdserver: use cancelable context for server initiated requests	2017-03-31 19:19:33 -07:00
Anthony Romano	833769f59f	v3rpc: return leader loss error if lease stream is canceled Canceling the stream won't cancel the receive since it's using the internal grpc context, not the one assigned by etcd.	2017-03-30 20:18:33 -07:00
Anthony Romano	1ff0b71b30	*: use protoc 3.2.0 Fixes #7631	2017-03-30 13:43:10 -07:00
Asko Kauppi	dae2755253	Documentation: fix typos	2017-03-30 11:41:50 +03:00
Gyu-Ho Lee	0bf110e27f	clientv3,v3client: maintenance to embedded client Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-03-28 14:12:43 -07:00
andelf	54efb460af	etcdserver: fix a typo in bucket name var	2017-03-24 13:11:01 +08:00

1489 Commits