658 Commits

Author SHA1 Message Date
Marek Siarkowicz
21fb173f76 server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
4a75e3d52d server: Refactor compaction checker
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
8d4ca10ece tests: Move CorruptBBolt to testutil
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
037a898ba0 tests: Unify TestCompactionHash and extend it to also Delete keys and Defrag
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
00bc8da0ef tests: Add tests for HashByRev HTTP API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
1200b1006d server: Cache compaction hash for HashByRev API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
7358362c99 server: Extract hasher to separate interface
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
631107285a server: Remove duplicated compaction revision
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
a3f609d742 server: Return revision range that hash was calcualted for
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
1ff59923d6 server: Store real rv range in hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
991b429336 server: Move adjusting revision to hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
2b8dd0de4e server: Pass revision as int
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
21e5d5d2b6 server: Calculate hash during compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
f1a759a2c8 server: Fix range in mock not returning same number of keys and values
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
ea684db535 server: Move reading KV index inside scheduleCompaction function
Makes it easier to test hash match between scheduleCompaction and
HashByRev.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
22d3e4ebd7 server: Return error from scheduleCompaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
679e327d5e server: Refactor hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
f5ed371885 server: Extract kvHash struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
3f26995f99 server: Move unsafeHashByRev to new hash.go file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
bc592c7b01 server: Extract unsafeHashByRev function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
336fef4ce2 server: Test HashByRev values to make sure they don't change
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
78a6f387cb server: Cover corruptionMonitor with tests
Get 100% coverage on InitialCheck and PeriodicCheck functions to avoid
any mistakes.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
35cbdf3961 server: Extract corruption detection to dedicated struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
d32de2c410 server: Extract triggerCorruptAlarm to function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Benjamin Wang
5c8aa08e2c move consistent_index forward when executing alarmList operation
Cherry pick https://github.com/etcd-io/etcd/pull/14419 to 3.5.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-06 12:48:06 +08:00
Benjamin Wang
7eb696dfcd fix the potential data loss for clusters with only one member
For a cluster with only one member, the raft always send identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually partially) the
applying workflow.

When the client receives the response, it doesn't mean etcd has already
successfully saved the data, including BoltDB and WAL, because:
   1. etcd commits the boltDB transaction periodically instead of on each request;
   2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, it may run into a situation of data loss when the etcd crashes
immediately after responding to the client and before the boltDB and WAL
successfully save the data to disk.
Note that this issue can only happen for clusters with only one member.

For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it being replicated to majority members.
When the client receives the response, it means the data must have been applied.
It further means the data must have been committed.
Note: for clusters with multiple members, the raft will never send identical
unstable entries and committed entries to etcdserver.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-05 14:26:24 +02:00
Vitalii Levitskii
67e4c59e01 Backport of pull/14354 to 3.5.5
Signed-off-by: Vitalii Levitskii <vitalii@uber.com>
2022-08-29 15:58:17 +03:00
Benjamin Wang
9ea5b1ba22 Refactor the keepAliveListener and keepAliveConn
Only `net.TCPConn` supports `SetKeepAlive` and `SetKeepAlivePeriod`
by default, so if you want to warp multiple layers of net.Listener,
the `keepaliveListener` should be the one which is closest to the
original `net.Listener` implementation, namely `TCPListener`.

Also refer to https://github.com/etcd-io/etcd/pull/14356

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-20 15:03:15 +08:00
Benjamin Wang
8fdca41cd8 Change default sampling rate from 100% to 0%
Refer to https://github.com/etcd-io/etcd/pull/14318

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:19:30 +08:00
Benjamin Wang
2751c61f24 update all related dependencies
Upgrade grpc to 1.41.0;
Run ./script/fix.sh to fix all related issue.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:17:27 +08:00
Benjamin Wang
5a86ae2c33 move setupTracing into a separate file config_tracing.go
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:17:27 +08:00
Benjamin Wang
2d7e49002c etcdserver: bump OpenTelemetry to 1.0.1
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-07 07:16:08 +08:00
Mike Dame
4c013c91e9
Change default sampling rate from 100% to 0%
This changes the default parent-based trace sampling rate from
100% to 0%. Due to the high QPS etcd can handle, having 100% trace
sampling leads to very high resource usage. Defaulting to 0% means
that only already-sampled traces will be sampled in etcd.

Fixes #14310

Signed-off-by: Mike Dame <mikedame@google.com>
2022-08-05 15:00:40 +00:00
Hitoshi Mitake
e15c005fef server/auth: protect rangePermCache with a RW lock
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-07-19 15:56:12 +09:00
Benjamin Wang
437f3778d0 Add flag --max-concurrent-streams to set the max concurrent stream each client can open at a time
Also refer to https://github.com/etcd-io/etcd/pull/14169#discussion_r917154243

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-13 14:18:15 +08:00
Benjamin Wang
621cd7b9e5 restrict the max size of each WAL entry to the remaining size of the file
Currently the max size of each WAL entry is hard coded as 10MB. If users
set a value > 10MB for the flag --max-request-bytes, then etcd may run
into a situation that it successfully processes a big request, but fails
to decode it when replaying the WAL file on startup.

On the other hand, we can't just remove the limitation, because if a
WAL entry is somehow corrupted, and its recByte is a huge value, then
etcd may run out of memory. So the solution is to restrict the max size
of each WAL entry as a dynamic value, which is the remaining size of
the WAL file.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-17 09:01:29 +08:00
Benjamin Wang
acb1ee993a Backport two lease related bug fixes to 3.5
The first bug fix is to resolve the race condition between goroutine
and channel on the same leases to be revoked. It's a classic mistake
in using Golang channel + goroutine. Please refer to
https://go.dev/doc/effective_go#channels

The second bug fix is to resolve the issue that etcd lessor may
continue to schedule checkpoint after stepping down the leader role.
2022-06-04 14:01:08 +08:00
cfz
cceb25d758
server/auth: enable tokenProvider if recoved store enables auth
we found a lease leak issue:
if a new member(by member add) is recovered by snapshot, and then
become leader, the lease will never expire afterwards. leader will
log the revoke failure caused by "invalid auth token", since the
token provider is not functional, and drops all generated token
from upper layer, which in this case, is the lease revoking
routine.
2022-05-06 12:24:28 +08:00
Colleen Murphy
5c44c3022b Update golang.org/x/crypto to latest
Update crypto to address CVE-2022-27191.

The CVE fix is added in 0.0.0-20220315160706-3147a52a75dd but this
change updates to latest.
2022-04-28 09:27:02 -07:00
Marek Siarkowicz
08407ff760 version: bump up to 3.5.4 2022-04-24 12:44:36 +02:00
ahrtr
5c68f2e510 Update conssitent_index when applying fails
When clients have no permission to perform whatever operation, then
the applying may fail. We should also move consistent_index forward
in this case, otherwise the consitent_index may smaller than the
snapshot index.
2022-04-20 22:17:49 +08:00
Marek Siarkowicz
0452feec71 version: bump up to 3.5.3 2022-04-13 17:17:51 +02:00
Marek Siarkowicz
003a310489
Merge pull request #13933 from ahrtr/fix_snapshot_recover_cindex_3.5
[3.5]Set backend to cindex before recovering the lessor in applySnapshot
2022-04-12 10:46:55 +02:00
ahrtr
4002aa51bd set backend to cindex before recovering the lessor in applySnapshot 2022-04-12 15:56:14 +08:00
ahrtr
bc5307de95 support linearizable renew lease
When etcdserver receives a LeaseRenew request, it may be still in
progress of processing the LeaseGrantRequest on exact the same
leaseID. Accordingly it may return a TTL=0 to client due to the
leaseID not found error. So the leader should wait for the appliedID
to be available before processing client requests.
2022-04-12 14:12:45 +08:00
Marek Siarkowicz
383eceb885
Merge pull request #13669 from maxsokolovsky/upgrade-server-dependency-golang.org/x/crypto
etcdserver: upgrade the golang.org/x/crypto dependency
2022-04-09 09:44:05 +02:00
ahrtr
66c7aab4d3 fix the data inconsistency issue by adding a txPostLockHook into the backend
Previously the SetConsistentIndex() is called during the apply workflow,
but it's outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to db.
Eventually etcd runs into the data inconsistency issue.

In this commit, we move the SetConsistentIndex into a txPostLockHook, so
it will be executed inside the transaction lock.
2022-04-08 20:37:34 +08:00
Marek Siarkowicz
780ec338f0 server: Save consistency index and term to backend even when they decrease
Reason to store CI and term in backend was to make db fully independent
snapshot, it was never meant to interfere with apply logic. Skip of CI
was introduced for v2->v3 migration where we wanted to prevent it from
decreasing when replaying wal in
https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to
apply flow during refactor in
https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.

Consistency index and term should only be negotiated and used by raft to make
decisions. Their values should only driven by raft state machine and
backend should only be responsible for storing them.
2022-04-07 21:22:18 +02:00
Marek Siarkowicz
238b18c110
Merge pull request #13895 from mrueg/rel3.5-client_golang
[release-3.5] go.mod: Upgrade to prometheus/client_golang v1.11.1
2022-04-07 09:38:43 +02:00
Marek Siarkowicz
83538f342d server: Add verification of whether lock was called within out outside of apply 2022-04-06 11:22:51 +02:00