Mirroristas/etcd

mirror of https://github.com/etcd-io/etcd.git synced 2024-09-27 06:25:44 +00:00

Author	SHA1	Message	Date
Gyuho Lee	744c73e019	etcdserver: fix "lease_expired_total" metrics Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-04-10 13:57:17 -07:00
Gyuho Lee	29db853317	etcdserver: replace "hostWhitelist" with "AccessController" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-03-27 06:25:44 -07:00
Gyuho Lee	509cf414f7	etcdserver: remove duplicate "setAppliedIndex" calls Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-03-15 19:35:44 -04:00
Gyuho Lee	4f754c1850	etcdserver: clean up with "RaftStatusGetter" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-03-15 19:30:08 -04:00
Gyuho Lee	9680b8a157	etcdserver: adjust election ticks on restart Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-03-10 19:09:38 -08:00
Gyuho Lee	78918848bd	etcdserver: support Raft Pre-Vote Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-03-06 09:55:55 -08:00
Gyuho Lee	3648649277	etcdserver: add "HostWhitelist" to "ServerConfig" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-28 18:25:28 -08:00
Gyuho Lee	8a518b01c4	*: revert "internal/mvcc" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	9b5d6edc4b	*: revert "internal/raftsnap" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	0e12e888e0	*: move "internal/store" to "etcdserver/v2store" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	dd2f3b0de8	*: revert "internal/lease" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	bb95d190c1	*: revert "internal/auth" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	0850ccbf45	*: revert "internal/version" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	0e65660548	*: revert "internal/discovery" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	19010a7182	*: revert "internal/alarm" change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Gyuho Lee	6bbe107225	*: revert "internal/compactor" package change Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-02-26 17:11:40 -08:00
Xiang	b83244bd35	etcdserver: improve request took too long warning	2018-02-06 12:15:52 -08:00
Gyuho Lee	37546f74ab	*: move "version" to "internal/version" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-29 10:00:20 -08:00
Hitoshi Mitake	6c91766490	*: move "auth" to "internal/auth"	2018-01-29 14:57:35 +09:00
Gyuho Lee	80d15948bc	*: move "mvcc" to "internal/mvcc" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-26 11:14:41 -08:00
Gyuho Lee	349a377a67	*: move "lease" to "internal/lease" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-26 11:09:29 -08:00
Gyuho Lee	880835c02c	*: move "store" to "internal/store" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-26 11:06:22 -08:00
Gyuho Lee	432581c7d0	*: move "discovery" to "internal/discovery" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-25 15:41:17 -08:00
Gyuho Lee	46b9844ca5	*: move "alarm,compactor" to "internal/" Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-25 15:26:21 -08:00
Gyuho Lee	dee39bf786	internal/raftsnap: move "raftsnap" to internal Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-24 10:36:04 -08:00
Gyuho Lee	6a70a931d3	etcdserver: rename "snap" to "raftsnap" package Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2018-01-24 10:26:07 -08:00
dvonthenen	25cdf4ed92	*: expose Raft Applied Index through to "etcdctl endpoint status" Fixed based on feedback Fixed spacing Fix gofmt	2018-01-22 07:37:21 -08:00
Gyuho Lee	85af65eca9	etcdserver: log lease revoke error Signed-off-by: Gyuho Lee <gyuhox@gmail.com>	2017-12-14 21:45:20 -08:00
Gyu-Ho Lee	f65aee0759	*: replace 'golang.org/x/net/context' with 'context' Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-09-07 13:39:42 -07:00
Anthony Romano	758c3c09fd	etcdserver: refactor v2 request processing Makes interfaces more reusable.	2017-08-31 11:47:40 -07:00
Anthony Romano	1d3afd4bb5	etcdhttp, v2http, etcdserver: use etcdserver.{Server,ServerV2} interfaces	2017-08-31 11:47:40 -07:00
Anthony Romano	31381da53a	etcdserver: raise alarm on cluster corruption Fixes #7125	2017-08-22 09:59:59 -07:00
Anthony Romano	478ba2c4f2	etcdserver: consolidate error checking for v3_server functions Duplicated error checking code moved into raftRequest/raftRequestOnce.	2017-07-25 14:28:39 -07:00
Gyu-Ho Lee	61a736a068	etcdserver: check alarms in health handler Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-07-18 15:51:28 -07:00
Gyu-Ho Lee	403ba1dfa7	etcdserver: expose 'transferLeadership' as 'MoveLeader' Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-06-23 12:51:28 -07:00
Xiang Li	0fe8fdcb29	Merge pull request #8123 from yudai/revision_compactor Compactor: Add Revisional compactor	2017-06-22 16:34:28 -07:00
Iwasaki Yudai	a3f8f47422	*: add Revision compactor	2017-06-21 15:41:07 -07:00
Anthony Romano	6ed51dc621	etcdserver, v3rpc: support nested txns	2017-06-21 14:33:15 -07:00
Anthony Romano	dcf52bbfac	etcdserver, embed, integration: don't use pointer for ServerConfig ServerConfig is owned by etcdserver and unshared, so don't pass or store by pointer. Also removes duplicated field 'snapCount'.	2017-06-15 13:02:13 -07:00
Gyu-Ho Lee	45fd8279f0	etcdserver: add leaseExpired debugging metrics Fix https://github.com/coreos/etcd/issues/8050. Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-06-08 10:36:25 -07:00
Hitoshi Mitake	0c655902f2	auth, etcdserver: protect revoking lease with auth Currently clients can revoke any lease without permission. This commit lets etcdserver protect revoking with write permission. This commit adds a mechanism for generating internal token. It is used for indicating that LeaseRevoke was issued internally so it should be able to delete any attached keys.	2017-06-07 17:46:14 -07:00
Anthony Romano	887db5a3db	*: fix go tool vet -all -shadow errors	2017-06-03 21:32:36 -07:00
Anthony Romano	a20e667c5b	Merge pull request #7967 from heyitsanthony/purge-snapdb etcdserver: purge old snap.db files	2017-05-30 16:15:11 -07:00
fanmin shi	9e7740011b	etcdserver: add --max-request-bytes flag	2017-05-25 11:01:38 -07:00
Anthony Romano	c1c9a2c96c	etcdserver: close mvcc.KV on init error path Scheduled compaction will panic if KV is not stopped before closing the backend.	2017-05-23 10:41:37 -07:00
Anthony Romano	ab16fa1f07	etcdserver: purge old snap.db files Lots of garbage db files in #7957. Should purge.	2017-05-22 15:44:21 -07:00
Anthony Romano	f6cd4d4f5b	snap, etcdserver: tighten up snapshot path handling Computing the snapshot file path is error prone; snapshot recovery was constructing file paths missing a path separator so the snapshot would never be loaded. Instead, refactor the backend path handling to use helper functions where possible.	2017-05-11 13:46:59 -07:00
fanmin shi	8b7b7222dd	etcdserver: renaming db happens after snapshot persists to wal and snap files If a follower receives a snapshot from the leader and crashes before renaming xxx.snap.db to db, but after the snapshot has persisted to .wal and .snap, restarting the follower results in loading the old db, new .wal, and new .snap. This causes an index mismatch between the snap metadata index and the consistent index from db. This PR forces an ordering where saving/renaming db must happen after the snapshot is persisted to the wal and snap files. This guarantees that the wal and snap files are newer than db. On server restart, etcd server checks if snap index > db consistent index. If yes, etcd server attempts to load xxx.snap.db, where xxx=snap index, if there is any, and panics otherwise. FIXES #7628	2017-05-09 14:00:12 -07:00
fanmin shi	5533c3058a	etcdserver: apply() sets consistIndex for any entry type Previously, apply() didn't set consistIndex for the EntryConfChange type. This causes a misalignment between consistIndex and applied index, where an EntryConfChange entry results in setting applied index but not consistIndex. Suppose that addMember() is called and the leader reflects that change. 1. applied index and consistIndex are now misaligned. 2. a new follower node joins. 3. leader sends the snapshot to the follower, where the applied index is the snapshot metadata index. 4. follower node saves the snapshot and database (includes consistIndex) from leader. 5. restarting follower loads snapshot and database. 6. follower checks snapshot metadata index (same as applied index) and database consistIndex, finds that they don't match, and then panics. FIXES #7834	2017-05-02 14:57:36 -07:00
Gyu-Ho Lee	91f6aee4f2	etcdserver: ensure waitForApply sync with applyAll Problem is: `Step1`: `etcdserver/raft.go`'s `Ready` process routine sends config-change entries via `r.applyc <- ap` (https://github.com/coreos/etcd/blob/master/etcdserver/raft.go#L193-L203) `Step2`: `etcdserver/server.go`'s `*EtcdServer.run` routine receives this via `ap := <-s.r.apply()` (https://github.com/coreos/etcd/blob/master/etcdserver/server.go#L735-L738) `StepA`: `Step1` proceeds without sync, right after sending `r.applyc <- ap`. `StepB`: `Step2` proceeds without sync, right after `sched.Schedule(s.applyAll(&ep,&ap))`. `StepC`: `etcdserver` tries to sync with `s.applyAll(&ep,&ap)` by calling `rh.waitForApply()`. `rh.waitForApply()` waits for all pending jobs to finish in `pkg/schedule` side. However, the order of `StepA`,`StepB`,`StepC` is not guaranteed. It is possible that `StepC` happens first, and proceeds without waiting on apply. And the restarting member comes back as a leader in single-node cluster, when there is no synchronization between apply-layer and config-change Raft entry apply. Confirmed with more debugging lines below, only reproducible with slow CPU VM (~2 vCPU). ``` ~:24.005397 I \| etcdserver: starting server... 
[version: 3.2.0+git, cluster version: to_be_decided] ~:24.011136 I \| etcdserver: [DEBUG] 29b2d24047a277df waitForApply before ~:24.011194 I \| etcdserver: [DEBUG] 29b2d24047a277df starts wait for 0 pending jobs ~:24.011234 I \| etcdserver: [DEBUG] 29b2d24047a277df finished wait for 0 pending jobs (current pending 0) ~:24.011268 I \| etcdserver: [DEBUG] 29b2d24047a277df waitForApply after ~:24.011348 I \| etcdserver: [DEBUG] [0] 29b2d24047a277df is scheduling conf change on 29b2d24047a277df ~:24.011396 I \| etcdserver: [DEBUG] [1] 29b2d24047a277df is scheduling conf change on 5edf80e32a334cf0 ~:24.011437 I \| etcdserver: [DEBUG] [2] 29b2d24047a277df is scheduling conf change on e32e31e76c8d2678 ~:24.011477 I \| etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on 29b2d24047a277df ~:24.011509 I \| etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on 5edf80e32a334cf0 ~:24.011545 I \| etcdserver: [DEBUG] 29b2d24047a277df scheduled conf change on e32e31e76c8d2678 ~:24.012500 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df before ~:24.013014 I \| etcdserver/membership: added member 29b2d24047a277df [unix://127.0.0.1:2100515039] to cluster 9250d4ae34216949 ~:24.013066 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df after ~:24.013113 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 29b2d24047a277df after trigger ~:24.013158 I \| etcdserver: [DEBUG] 29b2d24047a277df applyConfChange on 5edf80e32a334cf0 before ~:24.013666 W \| etcdserver: failed to send out heartbeat on time (exceeded the 10ms timeout for 11.964739ms) ~:24.013709 W \| etcdserver: server is likely overloaded ~:24.013750 W \| etcdserver: failed to send out heartbeat on time (exceeded the 10ms timeout for 12.057265ms) ~:24.013775 W \| etcdserver: server is likely overloaded ~:24.013950 I \| raft: 29b2d24047a277df is starting a new election at term 4 ~:24.014012 I \| raft: 29b2d24047a277df became candidate at term 5 
~:24.014051 I \| raft: 29b2d24047a277df received MsgVoteResp from 29b2d24047a277df at term 5 ~:24.014107 I \| raft: 29b2d24047a277df became leader at term 5 ~:24.014146 I \| raft: raft.node: 29b2d24047a277df elected leader 29b2d24047a277df at term 5 ``` I am printing out the number of pending jobs before we call `sched.WaitFinish(0)`, and there were no pending jobs, so it returned immediately (before we schedule `applyAll`). This is the root cause of: - https://github.com/coreos/etcd/issues/7595 - https://github.com/coreos/etcd/issues/7739 - https://github.com/coreos/etcd/issues/7802 `sched.WaitFinish(0)` doesn't work when `len(f.pendings)==0` and `f.finished==0`. Config-change is the first job to apply, so `f.finished` is 0 in this case. `f.finished` monotonically increases, so we need `WaitFinish(finished+1)`, and `finished` must be the value read before calling `Schedule`. This is safe because `Schedule(applyAll)` is the only place adding jobs to `sched`. Then the scheduler waits on the single job of `applyAll`, by getting the current number of finished jobs before sending `Schedule`. Or just make it block until the `applyAll` routine triggers on the config-change job. This patch just removes `waitForApply`, and signals `raftDone` to wait until `applyAll` finishes applying entries. 
Confirmed that it fixes the issue, as below: ``` ~:43.198354 I \| rafthttp: started streaming with peer 36cda5222aba364b (stream MsgApp v2 reader) ~:43.198740 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c waitForApply before ~:43.198836 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c starts wait for 0 pending jobs, 1 finished jobs ~:43.200696 I \| integration: launched 3169361310155633349 () ~:43.201784 I \| etcdserver: [DEBUG] [0] 3988bc20c2b2e40c is scheduling conf change on 36cda5222aba364b ~:43.201884 I \| etcdserver: [DEBUG] [1] 3988bc20c2b2e40c is scheduling conf change on 3988bc20c2b2e40c ~:43.201965 I \| etcdserver: [DEBUG] [2] 3988bc20c2b2e40c is scheduling conf change on cf5d6cbc2a121727 ~:43.202070 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on 36cda5222aba364b ~:43.202139 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on 3988bc20c2b2e40c ~:43.202204 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c scheduled conf change on cf5d6cbc2a121727 ~:43.202444 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) before ~:43.204486 I \| etcdserver/membership: added member 36cda5222aba364b [unix://127.0.0.1:2100913646] to cluster 425d73f1b7b01674 ~:43.204588 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) after ~:43.204703 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 36cda5222aba364b (request ID: 0) after trigger ~:43.204791 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) before ~:43.205689 I \| etcdserver/membership: added member 3988bc20c2b2e40c [unix://127.0.0.1:2101113646] to cluster 425d73f1b7b01674 ~:43.205783 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) after ~:43.205929 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on 3988bc20c2b2e40c (request ID: 0) after trigger ~:43.206056 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c 
applyConfChange on cf5d6cbc2a121727 (request ID: 0) before ~:43.207353 I \| etcdserver/membership: added member cf5d6cbc2a121727 [unix://127.0.0.1:2100713646] to cluster 425d73f1b7b01674 ~:43.207516 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) after ~:43.207619 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c applyConfChange on cf5d6cbc2a121727 (request ID: 0) after trigger ~:43.207710 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on 36cda5222aba364b ~:43.207781 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on 3988bc20c2b2e40c ~:43.207843 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished scheduled conf change on cf5d6cbc2a121727 ~:43.207951 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c finished wait for 0 pending jobs (current pending 0, finished 1) ~:43.208029 I \| rafthttp: started HTTP pipelining with peer cf5d6cbc2a121727 ~:43.210339 I \| rafthttp: peer 3988bc20c2b2e40c became active ~:43.210435 I \| rafthttp: established a TCP streaming connection with peer 3988bc20c2b2e40c (stream MsgApp v2 reader) ~:43.210861 I \| rafthttp: started streaming with peer 3988bc20c2b2e40c (writer) ~:43.211732 I \| etcdserver: [DEBUG] 3988bc20c2b2e40c waitForApply after ``` Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>	2017-04-25 10:22:27 -07:00
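
The race described in commit 91f6aee4f2 above can be sketched in a few lines of Go. This is a minimal illustration under stated assumptions, not etcd's code: the `scheduler` type, `racyWait`, `countedWait`, and `signaledWait` are hypothetical names invented here, and the toy scheduler only counts finished jobs. It shows why waiting for 0 finished jobs returns before the job is applied, and how either waiting for `finished+1` or signaling on a channel (the approach the commit message says the patch takes with `raftDone`) closes the gap.

```go
package main

import (
	"fmt"
	"sync"
)

// scheduler is a toy job counter for illustration; it is NOT etcd's pkg/schedule.
type scheduler struct {
	mu       sync.Mutex
	cond     *sync.Cond
	finished int // number of completed jobs; only ever increases
}

func newScheduler() *scheduler {
	s := &scheduler{}
	s.cond = sync.NewCond(&s.mu)
	return s
}

// Finished reports how many jobs have completed so far.
func (s *scheduler) Finished() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.finished
}

// Schedule runs job in the background and counts it once it returns.
func (s *scheduler) Schedule(job func()) {
	go func() {
		job()
		s.mu.Lock()
		s.finished++
		s.mu.Unlock()
		s.cond.Broadcast()
	}()
}

// WaitFinish blocks until at least n jobs have finished.
func (s *scheduler) WaitFinish(n int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for s.finished < n {
		s.cond.Wait()
	}
}

// racyWait waits before the job is scheduled (the "StepC first" ordering)
// and reports whether the job had been applied when the wait returned.
func racyWait() bool {
	s := newScheduler()
	applied := false
	s.WaitFinish(0) // returns at once: zero finished jobs already satisfies n == 0
	seen := applied
	s.Schedule(func() { applied = true })
	s.WaitFinish(1) // drain the job before returning
	return seen
}

// countedWait reads the finished count before Schedule and waits for one more.
func countedWait() bool {
	s := newScheduler()
	applied := false
	before := s.Finished()
	waited := make(chan bool, 1)
	go func() {
		s.WaitFinish(before + 1)
		waited <- applied
	}()
	s.Schedule(func() { applied = true })
	return <-waited
}

// signaledWait drops the counter and blocks on a channel the job closes.
func signaledWait() bool {
	s := newScheduler()
	applied := false
	done := make(chan struct{})
	waited := make(chan bool, 1)
	go func() {
		<-done
		waited <- applied
	}()
	s.Schedule(func() { applied = true; close(done) })
	return <-waited
}

func main() {
	fmt.Println("WaitFinish(0) saw applied:", racyWait())
	fmt.Println("WaitFinish(finished+1) saw applied:", countedWait())
	fmt.Println("done channel saw applied:", signaledWait())
}
```

In this sketch the first call reports `false` and the other two report `true`, regardless of goroutine scheduling, because the waiter in the last two variants cannot proceed until the job itself has run.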

570 Commits