For a cluster with only one member, raft always sends identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually only partially
finishes) the applying workflow.
When the client receives the response, it doesn't mean etcd has already
successfully saved the data to both BoltDB and the WAL, because:
1. etcd commits the boltDB transaction periodically instead of on each request;
2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, it may run into a situation of data loss when etcd crashes
immediately after responding to the client and before BoltDB and the WAL
have successfully saved the data to disk.
Note that this issue can only happen in clusters with only one member.
For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it has been replicated to a majority
of members. When the client receives the response, it means the data
must have been applied, which in turn means the data must have been
committed.
Note: for clusters with multiple members, raft will never send identical
unstable entries and committed entries to etcdserver.
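Below is a minimal sketch of the safe ordering in a raft Ready loop
(`processReady`, `walSaver`, and `apply` are illustrative names, not
etcd's actual identifiers): unstable entries are persisted to the WAL
before the committed entries are applied and the client is answered,
so a crash right after the response can no longer lose data.

    package main

    import (
        "go.etcd.io/etcd/raft/v3"
        "go.etcd.io/etcd/raft/v3/raftpb"
    )

    // walSaver abstracts the WAL; etcd's *wal.WAL has this method.
    type walSaver interface {
        Save(st raftpb.HardState, ents []raftpb.Entry) error
    }

    // processReady persists unstable entries before applying committed
    // ones; apply stands for the workflow that responds to the client.
    func processReady(n raft.Node, w walSaver, apply func(raftpb.Entry)) {
        for rd := range n.Ready() {
            // Persist HardState and unstable entries first. In a
            // single-member cluster these can be the very same entries
            // as rd.CommittedEntries, so they must hit disk before
            // they are applied.
            if err := w.Save(rd.HardState, rd.Entries); err != nil {
                panic(err)
            }
            // Only now apply and respond: the data is guaranteed to
            // be recoverable from the WAL after a crash.
            for _, ent := range rd.CommittedEntries {
                apply(ent)
            }
            n.Advance()
        }
    }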
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Only `net.TCPConn` supports `SetKeepAlive` and `SetKeepAlivePeriod`
by default, so if you want to wrap multiple layers of net.Listener,
the `keepaliveListener` should be the one closest to the original
`net.Listener` implementation, namely `TCPListener`.
Also refer to https://github.com/etcd-io/etcd/pull/14356
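Below is a minimal sketch of the intended wrapping order (the
`keepaliveListener` type and `listen` helper are illustrative):
keepalive sits directly on the `*net.TCPListener`, so `AcceptTCP`
still yields a `*net.TCPConn` at the point where keepalive is
configured, and TLS wraps the outermost layer.

    package main

    import (
        "crypto/tls"
        "net"
        "time"
    )

    // keepaliveListener must wrap *net.TCPListener directly so that
    // AcceptTCP returns a *net.TCPConn.
    type keepaliveListener struct {
        *net.TCPListener
    }

    func (l keepaliveListener) Accept() (net.Conn, error) {
        conn, err := l.AcceptTCP()
        if err != nil {
            return nil, err
        }
        // Only *net.TCPConn supports these two calls.
        if err := conn.SetKeepAlive(true); err != nil {
            return nil, err
        }
        if err := conn.SetKeepAlivePeriod(30 * time.Second); err != nil {
            return nil, err
        }
        return conn, nil
    }

    func listen(addr string, cfg *tls.Config) (net.Listener, error) {
        ln, err := net.Listen("tcp", addr)
        if err != nil {
            return nil, err
        }
        // TCPListener -> keepalive -> TLS: keepalive stays closest
        // to the original net.Listener implementation.
        return tls.NewListener(keepaliveListener{ln.(*net.TCPListener)}, cfg), nil
    }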
Signed-off-by: Benjamin Wang <wachao@vmware.com>
This changes the default parent-based trace sampling rate from
100% to 0%. Due to the high QPS etcd can handle, having 100% trace
sampling leads to very high resource usage. Defaulting to 0% means
that only already-sampled traces will be sampled in etcd.
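Below is a minimal sketch of what a 0% parent-based default looks like
with the OpenTelemetry Go SDK (the `newTracerProvider` helper is
illustrative): with a root ratio of 0, a span is recorded only when
its incoming parent is already sampled.

    package main

    import (
        sdktrace "go.opentelemetry.io/otel/sdk/trace"
    )

    func newTracerProvider() *sdktrace.TracerProvider {
        // Ratio 0 means: never start a new sampled trace here; only
        // continue traces whose parent span is already sampled.
        sampler := sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0))
        return sdktrace.NewTracerProvider(sdktrace.WithSampler(sampler))
    }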
Fixes #14310
Signed-off-by: Mike Dame <mikedame@google.com>
Currently the max size of each WAL entry is hard-coded as 10MB. If users
set a value > 10MB for the flag --max-request-bytes, then etcd may run
into a situation where it successfully processes a big request, but fails
to decode it when replaying the WAL file on startup.
On the other hand, we can't just remove the limitation, because if a
WAL entry is somehow corrupted, and its recByte is a huge value, then
etcd may run out of memory. So the solution is to restrict the max size
of each WAL entry to a dynamic value, which is the remaining size of
the WAL file.
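Below is a minimal sketch of the idea (function and parameter names
are illustrative, not etcd's actual WAL decoder): each record's
declared size is bounded by the bytes remaining in the WAL file
instead of a fixed 10MB constant, so a corrupted, huge length field
is rejected before any allocation happens.

    package main

    import "fmt"

    // checkRecordSize rejects a record whose declared size exceeds
    // the remaining file size; no legitimate record can be larger
    // than what is left of the file it was read from.
    func checkRecordSize(recBytes, fileSize, readOffset int64) error {
        if remaining := fileSize - readOffset; recBytes > remaining {
            return fmt.Errorf("wal: record size %d exceeds remaining file size %d",
                recBytes, remaining)
        }
        return nil
    }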
Signed-off-by: Benjamin Wang <wachao@vmware.com>
The first bug fix resolves the race condition between a goroutine
and a channel on the same leases to be revoked. It's a classic mistake
when using Go channels + goroutines. Please refer to
https://go.dev/doc/effective_go#channels
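Below is a minimal sketch of the classic mistake and its fix (the
`Lease` type and function names are illustrative, not the lessor's
actual code): a slice sent over a channel must not be mutated by the
sender afterwards, otherwise the receiving goroutine races with the
sender on the shared backing array.

    package main

    // Lease is a stand-in for the lessor's lease type.
    type Lease struct{ ID int64 }

    // Buggy: the sender hands the slice to the channel and then
    // reuses the same backing array, racing with the receiver.
    func sendExpiredBuggy(expiredC chan []*Lease, leases []*Lease) []*Lease {
        expiredC <- leases
        return leases[:0] // receiver may still be iterating this array
    }

    // Fixed: send a private copy so sender and receiver never share
    // mutable state.
    func sendExpiredFixed(expiredC chan []*Lease, leases []*Lease) []*Lease {
        cp := make([]*Lease, len(leases))
        copy(cp, leases)
        expiredC <- cp
        return leases[:0]
    }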
The second bug fix resolves the issue that the etcd lessor may
continue to schedule checkpoints after stepping down from the leader role.
We also found a lease leak issue: if a new member (added via member add)
is recovered from a snapshot and then becomes the leader, the leases will
never expire afterwards. The leader logs revoke failures caused by
"invalid auth token", since the token provider is not functional and
drops all tokens generated by the upper layer, which in this case is
the lease-revoking routine.
When clients have no permission to perform an operation, the applying
may fail. We should also move consistent_index forward in this case,
otherwise the consistent_index may be smaller than the snapshot index.
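Below is a minimal sketch of the idea (all types and names are
illustrative, not etcd's actual apply code): the consistent index is
advanced for every applied entry, whether or not the request itself
fails the permission check.

    package main

    import "errors"

    // Illustrative stand-ins for etcd's real types.
    type entry struct{ Index uint64 }
    type request struct{ authorized bool }
    type result struct{ err error }

    type applier struct{ consistentIndex uint64 }

    // applyEntry moves the consistent index forward for every entry,
    // even when the permission check fails; otherwise a later snapshot
    // could carry an index bigger than the stored consistent index.
    func (a *applier) applyEntry(e entry, r request) result {
        defer func() { a.consistentIndex = e.Index }()
        if !r.authorized {
            return result{err: errors.New("permission denied")}
        }
        // ... perform the actual mutation here ...
        return result{}
    }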
When etcdserver receives a LeaseRenew request, it may still be in
progress of processing the LeaseGrant request on exactly the same
leaseID. Accordingly it may return a TTL=0 to the client due to the
leaseID-not-found error. So the leader should wait for the applied
index to catch up before processing client requests.
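Below is a minimal sketch of the waiting step (`waitAppliedIndex` and
its parameters are illustrative): before serving the LeaseRenew, the
leader blocks until the applied index has caught up with the committed
index, so a LeaseGrant committed just before is guaranteed to be
visible.

    package main

    import (
        "context"
        "fmt"
        "time"
    )

    // waitAppliedIndex polls until everything committed so far has
    // been applied, so a LeaseGrant committed before this LeaseRenew
    // is visible to the lessor.
    func waitAppliedIndex(ctx context.Context, applied, committed func() uint64) error {
        for applied() < committed() {
            select {
            case <-ctx.Done():
                return fmt.Errorf("wait applied index: %w", ctx.Err())
            case <-time.After(10 * time.Millisecond):
            }
        }
        return nil
    }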
Previously, SetConsistentIndex() was called during the apply workflow,
but outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to the db.
Eventually etcd runs into the data inconsistency issue.
In this commit, we move SetConsistentIndex into a txPostLockHook, so
it is executed inside the transaction lock.
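Below is a minimal sketch of the hook mechanism (a simplified version
of the batch transaction; names are illustrative): the hook runs right
after the transaction lock is taken, so the consistent-index update
and the buffered writes commit atomically.

    package main

    import "sync"

    // batchTx is a simplified stand-in for the backend's batch
    // transaction.
    type batchTx struct {
        mu           sync.Mutex
        postLockHook func() // e.g. sets the consistent index
    }

    // LockInsideApply takes the tx lock and then runs the hook, so
    // the consistent index is updated inside the same transaction
    // that holds the applied data: a concurrent periodic commit can
    // no longer split the two apart.
    func (t *batchTx) LockInsideApply() {
        t.mu.Lock()
        if t.postLockHook != nil {
            t.postLockHook()
        }
    }

    func (t *batchTx) Unlock() { t.mu.Unlock() }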
The reason to store CI and term in the backend was to make the db a fully
independent snapshot; it was never meant to interfere with the apply logic.
Skipping of CI was introduced for the v2->v3 migration, where we wanted to
prevent it from decreasing when replaying the WAL, in
https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to the
apply flow during a refactor in
https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.
The consistent index and term should only be negotiated and used by raft to
make decisions. Their values should only be driven by the raft state machine,
and the backend should only be responsible for storing them.
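Below is a minimal sketch of the intended separation (a simplified
version of the consistent-index accessor; treat the exact shape as
illustrative): the backend exposes plain getters and setters, and all
decision-making stays in the raft apply path.

    package cindex

    // ConsistentIndexer only stores and returns values; it encodes
    // no apply-time policy. Whether an entry should be applied or
    // skipped is decided by the raft state machine, not here.
    type ConsistentIndexer interface {
        // ConsistentIndex returns the index of the last applied
        // entry persisted alongside the data in the backend.
        ConsistentIndex() uint64

        // SetConsistentIndex records the index and term; it performs
        // no comparison or skipping logic of its own.
        SetConsistentIndex(index uint64, term uint64)
    }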