Mirroristas/etcd

mirror of https://github.com/etcd-io/etcd.git synced 2024-09-27 06:25:44 +00:00

Author	SHA1	Message	Date
Marek Siarkowicz	35cbdf3961	server: Extract corruption detection to dedicated struct Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2022-09-07 15:11:55 +02:00
Benjamin Wang	5c8aa08e2c	move consistent_index forward when executing alarmList operation Cherry pick https://github.com/etcd-io/etcd/pull/14419 to 3.5. Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-09-06 12:48:06 +08:00
Benjamin Wang	acb1ee993a	Backport two lease related bug fixes to 3.5 The first bug fix is to resolve the race condition between goroutine and channel on the same leases to be revoked. It's a classic mistake in using Golang channel + goroutine. Please refer to https://go.dev/doc/effective_go#channels The second bug fix is to resolve the issue that etcd lessor may continue to schedule checkpoint after stepping down the leader role.	2022-06-04 14:01:08 +08:00
ahrtr	5c68f2e510	Update conssitent_index when applying fails When clients have no permission to perform whatever operation, then the applying may fail. We should also move consistent_index forward in this case, otherwise the consitent_index may smaller than the snapshot index.	2022-04-20 22:17:49 +08:00
ahrtr	4002aa51bd	set backend to cindex before recovering the lessor in applySnapshot	2022-04-12 15:56:14 +08:00
ahrtr	66c7aab4d3	fix the data inconsistency issue by adding a txPostLockHook into the backend Previously the SetConsistentIndex() is called during the apply workflow, but it's outside the db transaction. If a commit happens between SetConsistentIndex and the following apply workflow, and etcd crashes for whatever reason right after the commit, then etcd commits an incomplete transaction to db. Eventually etcd runs into the data inconsistency issue. In this commit, we move the SetConsistentIndex into a txPostLockHook, so it will be executed inside the transaction lock.	2022-04-08 20:37:34 +08:00
Piotr Tabor	73080a7166	Merge pull request #13501 from ahrtr/reset_ci_after_reload_db_3.5 [3.5] Set the backend again after recovering v3 backend from snapshot	2021-12-06 13:22:22 +01:00
Marek Siarkowicz	d00e89db2e	server: Require either cluster version v3.6 or --experimental-enable-lease-checkpoint-persist to persist lease remainingTTL To avoid inconsistant behavior during cluster upgrade we are feature gating persistance behind cluster version. This should ensure that all cluster members are upgraded to v3.6 before changing behavior. To allow backporting this fix to v3.5 we are also introducing flag --experimental-enable-lease-checkpoint-persist that will allow for smooth upgrade in v3.5 clusters with this feature enabled.	2021-12-02 16:54:10 +01:00
ahrtr	8c81598455	set the backend again after recovering v3 backend from snapshot	2021-11-25 05:45:20 +08:00
Piotr Tabor	c4ebac0c57	applyV2 should reapply on backend only once During review of: https://github.com/etcd-io/etcd/pull/12988 spotted that PUT is actially writing to v3-backend. If we are replaying WAL log, it might happened that backend's applied_index is > than the WAL's log entry. In such situation we should skip applying on backend V3. I think both the methods (setVersion, setMembersAttributes) are in practice idempotent so its not that 'serious' problem, but for formal correctness adding the proper checks.	2021-05-18 23:16:59 -07:00
Chao Chen	2042d2abc4	use v2 api to update cluster version	2021-05-17 13:50:38 -07:00
Piotr Tabor	ab586cd463	Persists Term in the (bbolt) Backend. Additional layer of protection, that allows to validate whether we start replaying log not only from the proper 'index', but also of the right 'term'.	2021-05-13 21:29:01 +02:00
Piotr Tabor	865df75714	Save raftpb.ConfState in the backend. This makes (bbolt) backend a full feature snapshot in term of WAL/raft, i.e. carries: - commit : (applied_index) - confState Benefits: - Backend will be a sufficient point in time definition sufficient to start replaying WAL. We have applied_index & confState in consistent state. - In case of emergency a backend state can be used for recovery	2021-05-13 14:29:36 +02:00
Piotr Tabor	ead81df948	Disallow -v2-deprecation>'not-yet' combined with --enable-v2	2021-05-12 18:09:34 +02:00
Marek Siarkowicz	efc8505739	etcdserver: Implement running defrag if freeable space will exceed privided threshold	2021-05-11 14:00:29 +02:00
Piotr Tabor	2fb6f0a74b	Simplify lease management after cindex update is moved to 'hooks'.	2021-05-04 18:21:23 +02:00
Piotr Tabor	2dbecea5b2	Simplify KVStore interaction with cindex thanks to hooks.	2021-05-04 18:21:23 +02:00
Piotr Tabor	fe3254aee3	Remove explicit authStore->ConsistencyIndex updates, as they are taken care by hook.	2021-05-04 15:38:23 +02:00
Piotr Tabor	50051675f9	Integrate backend::hooks with consistent_index. Every transaction committed to backend is writing most recent consistent_index. Makes sure that even automatically trigger commits of batch-transactions stays "really" consistent a.d. the most recent WAL log index applied.	2021-05-04 15:38:23 +02:00
Piotr Tabor	cedbea6c81	Merge pull request #12904 from wpedrak/limit_mlocked_memory server: replace mlockall with `Mlock` in `--experimental-memory-mlock`	2021-04-29 18:21:24 +02:00
wpedrak	927b3a3152	server: replace mlockall with `Mlock` in `--experimental-memory-mlock` Implementation of `--experimental-memory-mlock` backed by `mlockall` syscall is replaced by `Mlock` flag (backed by mlock syscall) of bboltDB.	2021-04-29 12:08:20 +02:00
Piotr Tabor	f53b70facb	Embed: In case KVStoreHash verification fails, close the backend. In case of failed verification, the server used to keep opened backend (so the file was locked on OS level).	2021-04-29 11:51:25 +02:00
Piotr Tabor	911204cd76	Fix `ETCDCTL_API=2 etcdctl backup --with-v3` consistent index consistency Prior to this CL, `ETCDCTL_API=2 etcdctl backup --with-v3` was readacting WAL log (by removal of some entries), but was NOT updating consistent_index in the backend. Also the WAL editing logic was buggy, as it didn't took in consideration the fact that when TERM changes, there can be entries with duplicated indexes in the log. So its NOT sufficient to subtract number of removed entries to get accurate log indexes. The PR replaces removing and shifting of WAL entries with replacing them with an no-op entries. Thanks to this consistent-index references are staying up to date. The PR also: - updates 'verification' logic to check whether consistent_index does not lag befor last snapshot - env-gated execution of verification framework in `etcdctl backup`. Tested with: ``` (./build.sh && cd tests && EXPECT_DEBUG=TRUE 'env' 'go' 'test' '-timeout=300m' 'go.etcd.io/etcd/tests/v3/e2e' -run=TestCtlV2Backup --count=1000 2>&1 \| tee TestCtlV2BackupV3.log) ```	2021-04-29 11:51:24 +02:00
Piotr Tabor	ea287dd9f8	Merge pull request #12854 from ptabor/20210410-shouldApplyV3 (no)StoreV2 (Part 3): Applying consistency fix: ClusterVersionSet (and co) might get not applied on v2store	2021-04-21 09:31:38 +02:00
Piotr Tabor	06ba0fc5a2	Merge pull request #12846 from pyiyun/fix-snaptmpfile-bug etcdserver: remove temp files in snap dir when etcdserver starting	2021-04-17 12:58:46 +02:00
pyiyun	28a490b09c	etcdserver: remove temp files in snap dir when etcdServer starting When etcd exits abnormally, tmp files will remain in snap dir, so clean up tmp files in snap dir when etcdserver starting. Fixes #12837	2021-04-16 20:30:04 +08:00
Piotr Tabor	d69e46ea47	Make ShouldApplyV3 an enum - not bool	2021-04-13 23:01:03 +02:00
Piotr Tabor	b1c04ce043	Applying consistency fix: ClusterVersionSet (and co) might get no applied on v2store ClusterVersionSet, ClusterMemberAttrSet, DowngradeInfoSet functions are writing both to V2store and backend. Prior this CL there were in a branch not executed if shouldApplyV3 was false, e.g. during restore when Backend is up-to-date (has high consistency-index) while v2store requires replay from WAL log. The most serious consequence of this bug was that v2store after restore could have different index (revision) than the same exact store before restore, so potentially different content between replicas. Also this change is supressing double-applying of Membership (ClusterConfig) changes on Backend (store v3) - that lackilly are not part of MVCC/KeyValue store, so they didn't caused Revisions to be bumped. Inspired by jingyih@ comment: https://github.com/etcd-io/etcd/pull/12820#issuecomment-815299406	2021-04-12 09:43:48 +02:00
wpedrak	3991a8c9fa	etcdserver: replace `forceVersionC` with `FirstCommitInTermNotify`	2021-04-09 11:30:42 +02:00
wpedrak	3d485faac5	etcdserver: resend ReadIndex request on empty apply request Empty apply indicates first commit in current term. It is first time when follower is sure, that it's ReadIndex request can be processed.	2021-04-09 11:30:42 +02:00
Piotr Tabor	931af493cf	Merge pull request #12830 from ptabor/20210405-split-pkg Split client/pkg as dedicated low-dependencies module for client	2021-04-08 01:12:17 +02:00
Piotr Tabor	3bb7acc8cf	Migrate dependencies pkg/foo -> client/pkg/foo	2021-04-07 00:38:47 +02:00
Piotr Tabor	44bd22307e	Merge get_logger() & Logger() method.	2021-03-14 14:05:17 +01:00
Piotr Tabor	fd7fed1511	Move config (ServerConfig) out of etcdserver package. Motivation: - ServerConfig is part of 'embed' public API, while etcdserver is more 'internal' - EtcdServer is already too big and config is pretty wide-spread leaf if we were to split etcdserver (e.g. into pre & post-apply part).	2021-03-11 20:56:22 +01:00
Piotr Tabor	a46a358577	--experimental-memory-mlock support The flag protects etcd memory from being swapped out to disk. This can happen in memory constrained systems where mmaped bbolt area is natural condidate for swapping out. This flag should provide better tail latency on the cost of higher RSS ram usage. If the experiment is successful, the logic should get moved into bbolt layer, where we can protect specific bbolt instances (e.g. avoid protecting both during defragmentation).	2021-03-07 12:32:57 +01:00
Maksim Buldukyan	7e38cfcc8d	raft: makes 'ConnReadTimeout/ConnWriteTimeout' customizable	2021-02-10 10:36:50 +07:00
Yanhao Mo	6d82778a4e	etcdserver: export method EtcdServer.leaderChangedNotify (#12378 )	2021-02-02 18:13:32 +08:00
Piotr Tabor	aaf423e962	server: Update imports. find -name '*.go' \| xargs sed -i --follow-symlinks 's\|etcd/v3/\|etcd/server/v3/\|g'	2020-10-26 13:02:32 +01:00
Piotr Tabor	4a5e9d1261	server: Move server files to 'server' directory. 26 git mv mvcc wal auth etcdserver etcdmain proxy embed/ lease/ server 36 git mv go.mod go.sum server	2020-10-26 12:57:19 +01:00

39 Commits