For a cluster with only one member, raft always sends identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually only partially
finishes) the applying workflow.
When the client receives the response, it does not mean etcd has already
successfully saved the data, including to BoltDB and the WAL, because:
1. etcd commits the boltDB transaction periodically instead of on each request;
2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, data loss can occur if etcd crashes immediately after responding
to the client but before boltDB and the WAL have saved the data to disk.
Note that this issue can only happen in clusters with only one member.
For clusters with multiple members it is not an issue, because etcd will
not commit & apply the data before it has been replicated to a majority of
members. When the client receives the response, the data must have been
applied, which further means the data must have been committed.
Note: for clusters with multiple members, raft will never send identical
unstable entries and committed entries to etcdserver.
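Below is a simplified, hypothetical sketch of the single-member ordering
described above (not etcd's actual code); it only illustrates how the client
can be answered before either disk write is durable.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct{ data string }

// saveToWAL stands in for persisting the WAL entry; its fsync may still be pending.
func saveToWAL(e entry) { time.Sleep(time.Millisecond) }

// applyToBackend stands in for applying the committed entry; the boltDB
// transaction is committed periodically, not as part of this request.
func applyToBackend(e entry) {}

func handlePut(wg *sync.WaitGroup, data string) string {
	e := entry{data: data}
	wg.Add(1)
	go func() { // the WAL save runs in parallel with applying the committed entry
		defer wg.Done()
		saveToWAL(e)
	}()
	applyToBackend(e)
	// The client is answered here; a crash at this point can lose the data,
	// because neither the WAL fsync nor the periodic boltDB commit is
	// guaranteed to have completed yet.
	return "OK"
}

func main() {
	var wg sync.WaitGroup
	fmt.Println(handlePut(&wg, "foo=bar"))
	wg.Wait()
}
```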
Signed-off-by: Benjamin Wang <wachao@vmware.com>
In v3.5 it is assumed that the logger is never nil; however, that is still
possible in v3.4. The PR targeting v3.5 was backported to 3.4, which is why
a panic on a nil logger is possible in 3.4. This commit fixes the issue.
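A minimal sketch of one defensive pattern that avoids such a panic
(illustrative only; the names are hypothetical and this is not necessarily
the actual 3.4 fix):

```go
package main

import "go.uber.org/zap"

// server and lg are hypothetical; the point is that callers always get a
// usable logger even when none was configured, as can still happen in v3.4.
type server struct {
	lg *zap.Logger
}

func (s *server) logger() *zap.Logger {
	if s.lg != nil {
		return s.lg
	}
	return zap.NewNop() // never return nil, so callers cannot panic
}

func main() {
	s := &server{} // logger left nil
	s.logger().Info("safe to log even without a configured logger")
}
```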
Fixes #14402
Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
To avoid inconsistent behavior during cluster upgrades, we are feature-gating
persistence behind the cluster version. This should ensure that all cluster
members are upgraded to v3.6 before the behavior changes.
To allow backporting this fix to v3.5, we are also introducing the flag
--experimental-enable-lease-checkpoint-persist, which allows a smooth upgrade
of v3.5 clusters with this feature enabled.
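For illustration, a hedged sketch of enabling this through the embed API; the
config field names below are assumptions derived from the flag names and may
differ between releases.

```go
package main

import (
	"log"

	"go.etcd.io/etcd/server/v3/embed"
)

func main() {
	cfg := embed.NewConfig()
	cfg.Dir = "default.etcd"
	// Assumed embed.Config fields mirroring the CLI flags
	// --experimental-enable-lease-checkpoint and
	// --experimental-enable-lease-checkpoint-persist.
	cfg.ExperimentalEnableLeaseCheckpoint = true
	cfg.ExperimentalEnableLeaseCheckpointPersist = true

	e, err := embed.StartEtcd(cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer e.Close()
	<-e.Server.ReadyNotify()
	log.Println("member is ready; lease checkpoints will be persisted")
}
```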
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
This is a backport of https://github.com/etcd-io/etcd/pull/13435 and is
part of the work for 3.4.20
(https://github.com/etcd-io/etcd/issues/14232).
The original change had a second commit that modified a changelog file.
The 3.4 branch does not include a changelog file, so that part was not
cherry-picked.
Local Testing:
- `make build`
- `make test`
Both succeed.
Signed-off-by: Ramsés Morales <ramses@gmail.com>
To improve debuggability of `agreement among raft nodes before
linearized reading`, we added some tracing inside
`linearizableReadLoop`.
This will allow us to know the timing of `s.r.ReadIndex` vs
`s.applyWait.Wait(rs.Index)`.
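A hedged, self-contained sketch of the kind of timing measurement this
tracing provides; `readIndex` and `applyWait` below are stand-ins, not
etcd's actual API.

```go
package main

import (
	"log"
	"time"
)

// readIndex stands in for s.r.ReadIndex: agreement among raft nodes.
func readIndex() uint64 { time.Sleep(2 * time.Millisecond); return 42 }

// applyWait stands in for s.applyWait.Wait(rs.Index): waiting for the local
// apply loop to catch up to the returned index.
func applyWait(index uint64) { time.Sleep(time.Millisecond) }

func main() {
	start := time.Now()
	idx := readIndex()
	readIndexTook := time.Since(start)

	start = time.Now()
	applyWait(idx)
	applyWaitTook := time.Since(start)

	log.Printf("read-index took %v, apply-wait took %v", readIndexTook, applyWaitTook)
}
```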
Signed-off-by: Chao Chen <chaochn@amazon.com>
Cherry-pick https://github.com/etcd-io/etcd/pull/13932 to 3.4.
When etcdserver receives a LeaseRenew request, it may still be in the
middle of processing the LeaseGrantRequest on exactly the same leaseID.
Accordingly, it may return TTL=0 to the client due to a "leaseID not found"
error. So the leader should wait for the appliedID to be available before
processing client requests.
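A hedged sketch of the "wait until the apply loop catches up before serving
the request" idea; the function and variable names are hypothetical, not
etcd's implementation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// waitApplied blocks until applied() reaches target or the context expires.
func waitApplied(ctx context.Context, applied func() uint64, target uint64) error {
	for applied() < target {
		select {
		case <-ctx.Done():
			return errors.New("timed out waiting for the applied index to catch up")
		case <-time.After(time.Millisecond):
		}
	}
	return nil
}

func main() {
	var appliedIndex atomic.Uint64
	go func() { // simulate the apply loop making progress
		for i := 0; i < 10; i++ {
			time.Sleep(2 * time.Millisecond)
			appliedIndex.Add(1)
		}
	}()

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	// Before serving e.g. a LeaseRenew, make sure the LeaseGrant (say, at
	// index 5) has already been applied locally.
	if err := waitApplied(ctx, appliedIndex.Load, 5); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("applied index caught up; safe to serve the request")
}
```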
Signed-off-by: Benjamin Wang <wachao@vmware.com>
The first bug fix resolves a race condition between a goroutine and a
channel operating on the same leases to be revoked. It is a classic mistake
when using Go channels together with goroutines; please refer to
https://go.dev/doc/effective_go#channels
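For reference, a self-contained illustration of that classic mistake under
pre-Go-1.22 loop-variable semantics (as used by these releases); this is not
the actual lessor code.

```go
package main

import (
	"fmt"
	"sync"
)

type Lease struct{ ID int64 }

func main() {
	leases := []Lease{{1}, {2}, {3}}

	// BUG: every goroutine closes over the shared loop variable, so with
	// pre-1.22 Go they may all send the last lease instead of their own.
	buggy := make(chan Lease, len(leases))
	var wg sync.WaitGroup
	for _, lease := range leases {
		wg.Add(1)
		go func() {
			defer wg.Done()
			buggy <- lease
		}()
	}
	wg.Wait()
	close(buggy)
	for l := range buggy {
		fmt.Println("buggy:", l.ID) // may print 3, 3, 3
	}

	// Fix: pass the value as an argument so each goroutine gets its own copy.
	fixed := make(chan Lease, len(leases))
	for _, lease := range leases {
		wg.Add(1)
		go func(l Lease) {
			defer wg.Done()
			fixed <- l
		}(lease)
	}
	wg.Wait()
	close(fixed)
	for l := range fixed {
		fmt.Println("fixed:", l.ID) // prints 1, 2, 3 in some order
	}
}
```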
The second bug fix resolves the issue that the etcd lessor may continue to
schedule checkpoints after stepping down from the leader role.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Items resolved:
1. fix the vet error: possible misuse of reflect.SliceHeader;
2. fix the vet error: call to (*T).Fatal from a non-test goroutine;
3. bump packages golang.org/x/crypto, net, and sys;
4. bump boltdb from 1.3.3 to 1.3.6;
5. remove the vendor directory;
6. remove Go 1.12.17 and 1.15.15, and add Go 1.16.15 to the pipeline;
7. bump the Go version to 1.16 in go.mod;
8. fix the issue: compile: version go1.16.15 does not match go tool version go1.17.11,
refer to https://github.com/actions/setup-go/issues/107;
9. fix the data race on compactMainRev and watcherGauge (see the sketch after this list);
10. fix test failure for TestLeasingTxnOwnerGet in cluster_proxy mode.
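As a generic sketch of the pattern used to remove a data race like the one in
item 9 (illustrative only; this is not the actual etcd change):

```go
package main

import (
	"fmt"
	"sync"
)

// store guards compactMainRev with a mutex so concurrent readers and writers
// do not race; gauge-style counters can be protected the same way.
type store struct {
	mu             sync.RWMutex
	compactMainRev int64
}

func (s *store) setCompactMainRev(rev int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.compactMainRev = rev
}

func (s *store) CompactMainRev() int64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.compactMainRev
}

func main() {
	s := &store{}
	var wg sync.WaitGroup
	for i := int64(1); i <= 10; i++ {
		wg.Add(1)
		go func(rev int64) {
			defer wg.Done()
			s.setCompactMainRev(rev)
		}(i)
	}
	wg.Wait()
	fmt.Println("compactMainRev:", s.CompactMainRev())
}
```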
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Fix a panic that happens when trying to get the IDs of etcd members in
force-new-cluster mode. The issue happens if the cluster previously had
etcd learner nodes added to it.
Fixes #12285
Similar counts are exposed via Prometheus.
This adds the ones that are perceived by the etcd server, e.g.:
os_fd_limit 120000
os_fd_used 14
process_cpu_seconds_total 0.31
process_max_fds 120000
process_open_fds 17
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
When an etcd instance attempts to perform service discovery and a negative
cluster size is provided, the etcd instance will panic without recovery
because of
This makes it possible to run an etcd node for testing and development
without placing lots of load on the file system.
Fixes #11930.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>