For a cluster with only one member, raft always sends identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually only partially
finishes) the applying workflow.
When the client receives the response, it doesn't mean etcd has already
successfully persisted the data to boltDB and the WAL, because:
1. etcd commits the boltDB transaction periodically instead of on each request;
2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, data may be lost if etcd crashes immediately after
responding to the client but before boltDB and the WAL have
successfully persisted the data to disk.
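A rough Go sketch of the ordering hazard (illustrative only; entry, saveWAL and apply are made-up stand-ins, not etcd's code): the WAL save runs in parallel with the apply, and the client is answered as soon as the apply finishes, so a crash before the fsync completes can lose an already-acknowledged write.
    package main

    // Illustrative stand-ins; not etcd's actual types or flow.
    import (
        "fmt"
        "sync"
        "time"
    )

    type entry struct {
        index uint64
        data  string
    }

    func saveWAL(e entry, wg *sync.WaitGroup) {
        defer wg.Done()
        time.Sleep(10 * time.Millisecond) // simulated fsync latency
        fmt.Printf("WAL: entry %d durable\n", e.index)
    }

    func apply(e entry) {
        fmt.Printf("apply: entry %d applied (boltDB commit is periodic, not per request)\n", e.index)
    }

    func main() {
        e := entry{index: 1, data: "put k v"}

        var wg sync.WaitGroup
        wg.Add(1)
        go saveWAL(e, &wg) // WAL save runs in parallel with apply

        apply(e)
        fmt.Println("respond to client") // hazard: the WAL fsync may not have completed yet

        wg.Wait()
    }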
Note that this issue can only happen for clusters with only one member.
For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it has been replicated to a majority of members.
When the client receives the response, it means the data must have been applied.
It further means the data must have been committed.
Note: for clusters with multiple members, raft will never send identical
unstable entries and committed entries to etcdserver.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Problem: We pass the gRPC context down to the applier for read-only serializable txns.
This context can be cancelled, for example due to a timeout,
which triggers a panic inside applyTxn.
Solution: Only panic for transactions with write operations.
fixes https://github.com/etcd-io/etcd/issues/14110
main PR https://github.com/etcd-io/etcd/pull/14149
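A hedged sketch of the solution (the txn type and isWrite field here are illustrative, not etcd's exact types): a cancelled context is only fatal when the transaction contains writes; read-only serializable txns just return the error.
    package main

    import (
        "context"
        "errors"
        "fmt"
    )

    // txn is a stand-in for etcd's TxnRequest; isWrite reports whether it
    // contains any put/delete operations.
    type txn struct {
        isWrite bool
    }

    // applyTxn mirrors the shape of the fix: a half-applied write txn is
    // unrecoverable, so only write txns panic on a context error.
    func applyTxn(ctx context.Context, t txn) error {
        if err := ctx.Err(); err != nil {
            if t.isWrite {
                panic(fmt.Sprintf("unexpected context error in write txn: %v", err))
            }
            return err // e.g. context.Canceled from a client-side timeout
        }
        // ... evaluate compares and apply operations ...
        return nil
    }

    func main() {
        ctx, cancel := context.WithCancel(context.Background())
        cancel() // simulate a client cancel/timeout

        if err := applyTxn(ctx, txn{isWrite: false}); errors.Is(err, context.Canceled) {
            fmt.Println("read-only txn: error returned, no panic")
        }
    }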
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
The first bug fix resolves a race condition between a goroutine and a
channel operating on the same slice of leases to be revoked. It's a classic
mistake when combining Go channels and goroutines. Please refer to
https://go.dev/doc/effective_go#channels
The second bug fix resolves the issue that the etcd lessor may
continue to schedule checkpoints after stepping down from the leader role.
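A simplified sketch of the first pattern (not the lessor's real code; the lease type and producer functions are made up): the buggy producer keeps mutating the slice it has already sent to the consumer goroutine over the channel, while the fixed one hands over a fresh copy per send.
    package main

    import "fmt"

    type lease struct{ id int64 }

    // buggy: the producer reuses the same backing array it already handed to
    // the consumer goroutine, so the consumer can observe leases from a
    // later iteration.
    func produceBuggy(expiredC chan<- []lease) {
        ls := make([]lease, 0, 4)
        for i := int64(0); i < 3; i++ {
            ls = append(ls[:0], lease{id: i})
            expiredC <- ls
        }
        close(expiredC)
    }

    // fixed: each send gets its own copy, so the consumer owns what it reads.
    func produceFixed(expiredC chan<- []lease) {
        for i := int64(0); i < 3; i++ {
            expiredC <- []lease{{id: i}}
        }
        close(expiredC)
    }

    func main() {
        c := make(chan []lease)
        go produceFixed(c)
        for ls := range c {
            fmt.Println("revoke lease", ls[0].id)
        }
    }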
When a client has no permission to perform an operation, the apply may
fail. We should also move consistent_index forward in this case;
otherwise the consistent_index may end up smaller than the
snapshot index.
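A minimal sketch of the idea with made-up types (consistentIndex, request, checkPermission are illustrative, not etcd's API): the consistent index is advanced for every consumed raft entry, even when the operation itself is rejected.
    package main

    import (
        "errors"
        "fmt"
    )

    // Illustrative stand-ins; not etcd's real types.
    type consistentIndex struct{ v uint64 }

    func (c *consistentIndex) Set(i uint64) { c.v = i }

    type request struct {
        user string
        key  string
    }

    func checkPermission(r request) error {
        if r.user != "root" {
            return errors.New("permission denied")
        }
        return nil
    }

    // applyEntry advances the consistent index for every raft entry that is
    // consumed, even when the permission check rejects the operation, so the
    // persisted consistent_index never lags behind the snapshot index.
    func applyEntry(ci *consistentIndex, index uint64, r request) {
        ci.Set(index) // advance regardless of the apply outcome

        if err := checkPermission(r); err != nil {
            fmt.Println("apply rejected:", err)
            return
        }
        fmt.Println("applied", r.key)
    }

    func main() {
        ci := &consistentIndex{}
        applyEntry(ci, 7, request{user: "alice", key: "foo"})
        fmt.Println("consistent index:", ci.v) // 7, even though the apply failed
    }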
When etcdserver receives a LeaseRenew request, it may still be in the
middle of processing the LeaseGrantRequest for exactly the same
leaseID. Accordingly it may return TTL=0 to the client due to a
"leaseID not found" error. So the leader should wait for the applied index
to be available before processing such client requests.
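A hedged sketch of the approach (server, waitAppliedIndex and the timeout are illustrative stand-ins, not etcdserver's real API): the renew is served only after the applied index has caught up with the committed index, so a just-granted lease is visible to the lessor.
    package main

    import (
        "errors"
        "fmt"
        "time"
    )

    // Illustrative stand-in for the parts of etcdserver the sketch needs.
    type server struct {
        appliedIndex   func() uint64
        committedIndex func() uint64
        renew          func(id int64) (ttl int64, err error)
    }

    // waitAppliedIndex blocks until the applied index catches up with the
    // committed index, so an in-flight LeaseGrant on the same lease ID has
    // been applied before the renew is served.
    func (s *server) waitAppliedIndex(timeout time.Duration) error {
        deadline := time.Now().Add(timeout)
        for s.appliedIndex() < s.committedIndex() {
            if time.Now().After(deadline) {
                return errors.New("timed out waiting for applied index")
            }
            time.Sleep(time.Millisecond)
        }
        return nil
    }

    func (s *server) leaseRenew(id int64) (int64, error) {
        if err := s.waitAppliedIndex(100 * time.Millisecond); err != nil {
            return 0, err
        }
        return s.renew(id)
    }

    func main() {
        s := &server{
            appliedIndex:   func() uint64 { return 10 },
            committedIndex: func() uint64 { return 10 },
            renew:          func(id int64) (int64, error) { return 60, nil },
        }
        fmt.Println(s.leaseRenew(42))
    }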
Previously SetConsistentIndex() was called during the apply workflow,
but outside the db transaction. If a commit happened between SetConsistentIndex
and the rest of the apply workflow, and etcd crashed for whatever reason right
after that commit, then etcd would have committed an incomplete transaction to
the db, and eventually run into the data inconsistency issue.
In this commit, we move SetConsistentIndex into a txPostLockHook, so
it is executed inside the transaction lock.
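A toy sketch of the mechanism (the backend type and hook wiring below are simplified stand-ins for the real txPostLockHook plumbing): because the hook runs while the batch-transaction lock is held, the consistent-index update and the apply's writes always land in the same boltDB commit.
    package main

    import (
        "fmt"
        "sync"
    )

    // backend is a toy stand-in: it owns the batch-transaction lock and runs
    // a hook right after acquiring it.
    type backend struct {
        mu           sync.Mutex
        postLockHook func()
        data         map[string]string
    }

    func (b *backend) LockTx() {
        b.mu.Lock()
        if b.postLockHook != nil {
            b.postLockHook() // runs inside the tx lock, so no commit can sneak in between
        }
    }

    func (b *backend) UnlockTx() { b.mu.Unlock() }

    func main() {
        var consistentIndex uint64
        b := &backend{data: map[string]string{}}

        applyIndex := uint64(8) // raft index of the entry currently being applied
        b.postLockHook = func() { consistentIndex = applyIndex }

        // Apply workflow for that entry: the index update and the writes are
        // covered by the same lock, hence the same commit.
        b.LockTx()
        b.data["foo"] = "bar"
        b.UnlockTx()

        fmt.Println("consistent index:", consistentIndex)
    }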
The reason to store CI and term in the backend was to make the db a fully
independent snapshot; it was never meant to interfere with the apply logic.
Skipping CI updates was introduced for the v2->v3 migration, where we wanted
to prevent it from decreasing when replaying the WAL, in
https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to the
apply flow during a refactor in
https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.
Consistent index and term should only be negotiated and used by raft to make
decisions. Their values should only be driven by the raft state machine, and
the backend should only be responsible for storing them.
To avoid inconsistent behavior during a cluster upgrade, we are feature-gating
the persistence behind the cluster version. This should ensure that
all cluster members are upgraded to v3.6 before the behavior changes.
To allow backporting this fix to v3.5, we are also introducing the flag
--experimental-enable-lease-checkpoint-persist, which allows a
smooth upgrade for v3.5 clusters with this feature enabled.
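A hedged sketch of the gating condition (the version type and helper are illustrative; only the flag name comes from this change): checkpoints are persisted once the whole cluster reports at least v3.6, or when the backport flag is explicitly enabled.
    package main

    import "fmt"

    // version is a simplified stand-in for the cluster's semantic version.
    type version struct{ major, minor int }

    func (v version) atLeast(o version) bool {
        return v.major > o.major || (v.major == o.major && v.minor >= o.minor)
    }

    var v3_6 = version{3, 6}

    // shouldPersistCheckpoints gates lease checkpoint persistence on the
    // cluster version, with the backport flag as an explicit opt-in.
    func shouldPersistCheckpoints(clusterVersion version, persistFlag bool) bool {
        return persistFlag || clusterVersion.atLeast(v3_6)
    }

    func main() {
        fmt.Println(shouldPersistCheckpoints(version{3, 5}, false)) // false: mixed/old cluster
        fmt.Println(shouldPersistCheckpoints(version{3, 5}, true))  // true: opt-in backport flag
        fmt.Println(shouldPersistCheckpoints(version{3, 6}, false)) // true: gated on by version
    }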
Prevent etcd from crashing when given a bad grant payload, e.g.:
$ curl -d '{"name": "foo"}' http://localhost:2379/v3/auth/role/add
{"header":{"cluster_id":"14841639068965178418", ...
$ curl -d '{"name": "foo"}' http://localhost:2379/v3/auth/role/grant
curl: (52) Empty reply from server
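A hedged sketch of the kind of defensive check involved (the request/permission types and error text are illustrative, not etcd's exact API): a grant request whose permission field is missing, as in the curl example above, is rejected instead of dereferencing a nil pointer and crashing the server.
    package main

    import (
        "errors"
        "fmt"
    )

    // Illustrative request shape, mirroring the JSON payload in the example.
    type permission struct {
        Key      string
        RangeEnd string
    }

    type roleGrantRequest struct {
        Name string
        Perm *permission // nil when the client omits the permission field
    }

    // roleGrantPermission turns a missing permission into an error response
    // instead of a nil-pointer panic.
    func roleGrantPermission(req *roleGrantRequest) error {
        if req == nil || req.Perm == nil {
            return errors.New("permission not given")
        }
        // ... grant req.Perm on req.Name ...
        return nil
    }

    func main() {
        fmt.Println(roleGrantPermission(&roleGrantRequest{Name: "foo"})) // error, no crash
    }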
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
Narrowly prevent etcd from crashing when given a bad ACTIVATE payload, e.g.:
$ curl -d "{\"action\":\"ACTIVATE\"}" ${ETCD}/v3/maintenance/alarm
curl: (52) Empty reply from server
Thanks to this change:
- all the bucket -> buffer maps are indexed by ints instead of strings, so
there is no need for a []byte -> string -> hash conversion on each
access;
- buckets are strongly typed in the backend/mvcc API (see the sketch below).
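A simplified sketch of the idea (the real bucket definitions live in etcd's backend packages and carry more methods; the names below are illustrative): each bucket gets a small integer ID, and per-bucket buffers are keyed by that ID.
    package main

    import "fmt"

    // Illustrative, stripped-down bucket typing.
    type BucketID int

    type Bucket interface {
        ID() BucketID
        Name() []byte
    }

    type bucket struct {
        id   BucketID
        name []byte
    }

    func (b bucket) ID() BucketID { return b.id }
    func (b bucket) Name() []byte { return b.name }

    var (
        Key  Bucket = bucket{id: 1, name: []byte("key")}
        Meta Bucket = bucket{id: 2, name: []byte("meta")}
    )

    func main() {
        // Buffers indexed by BucketID: no []byte -> string -> hash per access.
        buffers := map[BucketID][]byte{}
        buffers[Key.ID()] = append(buffers[Key.ID()], "pending write"...)
        fmt.Println(len(buffers[Key.ID()]), string(Meta.Name()))
    }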
During review of https://github.com/etcd-io/etcd/pull/12988 it was spotted
that PUT actually writes to the v3 backend.
If we are replaying the WAL, it might happen that the backend's
applied_index is greater than the index of the WAL entry. In such a situation
we should skip applying the entry to the v3 backend.
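A minimal sketch of the guard (the function and parameter names are made up): an entry is applied to the v3 backend only if its index is newer than what the backend already reflects.
    package main

    import "fmt"

    // shouldApplyV3 applies a WAL entry to the v3 backend only if its index
    // is newer than the index the backend has already seen; older entries
    // were applied before the backend was persisted and are skipped on replay.
    func shouldApplyV3(backendAppliedIndex, entryIndex uint64) bool {
        return entryIndex > backendAppliedIndex
    }

    func main() {
        fmt.Println(shouldApplyV3(100, 90))  // false: already reflected in the backend
        fmt.Println(shouldApplyV3(100, 101)) // true: new entry, apply it
    }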
I think both methods (setVersion, setMembersAttributes) are in
practice idempotent, so it's not that serious a problem, but for
formal correctness we add the proper checks.
This makes the (bbolt) backend a full snapshot in terms of WAL/raft,
i.e. it carries (see the sketch below):
- commit (applied_index)
- confState
Benefits:
- The backend becomes a point-in-time definition sufficient to
start replaying the WAL; we have applied_index & confState in a consistent
state.
- In case of emergency, the backend state can be used for recovery.
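A hedged sketch of what this looks like (the bucket and key names below are illustrative, not necessarily the real schema): alongside the applied index, the raft ConfState is serialized into the backend's meta bucket, so the backend alone pins a consistent point in time to replay the WAL from.
    package main

    import (
        "encoding/json"
        "fmt"
    )

    // confState is a simplified stand-in for raftpb.ConfState.
    type confState struct {
        Voters []uint64 `json:"voters"`
    }

    func main() {
        meta := map[string][]byte{} // stand-in for the backend's meta bucket

        // commit (applied_index)
        meta["consistent_index"] = []byte(fmt.Sprint(uint64(1234)))

        // confState, serialized next to it in the same backend
        cs, _ := json.Marshal(confState{Voters: []uint64{1, 2, 3}})
        meta["confState"] = cs

        fmt.Println(string(meta["confState"]))
    }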