122 Commits

Author SHA1 Message Date
Benjamin Wang
fff5d00ccf
Merge pull request #14149 from lavacat/main-txn-panic
server: don't panic in readonly serializable txn
2022-08-14 05:41:57 +08:00
Bogdan Kanivets
43bb9d5c22 server: don't panic in readonly serializable txn
Problem: We pass grpc context down to applier in readonly serializable txn.
This context can be cancelled for example due to timeout.
This will trigger panic inside applyTxn

Solution: Only panic for transactions with write operations

fixes https://github.com/etcd-io/etcd/issues/14110

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-08-09 00:46:50 -07:00
Benjamin Wang
3dd7d3f9af enhance the WAL file related error
The `ErrFileNotFound` was used for for three cases:
1. There is no any WAL files (probably due to no read permission);
2. There is no WAL files matching the snapshot index;
3. The WAL file seqs do not increase continuously.

It's not good for debug when users see the `ErrFileNotFound` error,
so in this PR, a different error is returned for each case above.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-03 05:37:30 +08:00
Marek Siarkowicz
6697fca97d server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-26 09:31:14 +02:00
Marek Siarkowicz
264498258b tests: Move CorruptBBolt to testutil
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-25 13:59:30 +02:00
Benjamin Wang
33347b6845
Merge pull request #13954 from mitake/perm-cache-lock
server/auth: protect rangePermCache with a mutex
2022-07-03 07:01:46 +08:00
Hitoshi Mitake
de09174a3f server/auth: protect rangePermCache with a RW lock
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-07-02 23:23:13 +09:00
Benjamin Wang
e0998f49e1
Merge pull request #14124 from wayblink/index-op
mvcc:add ut for Revisions/CountRevisions and remove RangeSince as it …
2022-06-21 12:41:25 +08:00
wayblink
428d21f5eb mvcc:add ut for Revisions/CountRevisions and remove RangeSince as it is not used
Signed-off-by: wayblink <wayasxxx@gmail.com>
2022-06-21 11:41:44 +08:00
Benjamin Wang
8038e876d3 replace ioutil with os package
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-17 10:11:30 +08:00
Benjamin Wang
ccf477d12b restrict the max size of each WAL entry to the remaining size of the file
Currently the max size of each WAL entry is hard coded as 10MB. If users
set a value > 10MB for the flag --max-request-bytes, then etcd may run
into a situation that it successfully processes a big request, but fails
to decode it when replaying the WAL file on startup.

On the other hand, we can't just remove the limitation, because if a
WAL entry is somehow corrupted, and its recByte is a huge value, then
etcd may run out of memory. So the solution is to restrict the max size
of each WAL entry as a dynamic value, which is the remaining size of
the WAL file.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-06-17 05:41:49 +08:00
SimFG
d83925e357 schedule: Provide logs when the fifo job panic happens
To make the fifo scheduler better debuggability.

Signed-off-by: SimFG <1142838399@qq.com>
2022-06-15 20:58:17 +08:00
Marek Siarkowicz
71bba3c761
Merge pull request #14106 from SimFG/remove_case
wal: remove the repeated test case
2022-06-14 12:13:00 +02:00
SimFG
402188a98c wal: remove the repeated test case
The TestFilePipelineFailLockFile case maybe test the line 77 in the alloc function. But the LockFile will not throw an error when the file lock is hold. And TestFilePipelineFailPreallocate and TestFilePipelineFailLockFile are exactly the same except for the comments.

Signed-off-by: SimFG <1142838399@qq.com>
2022-06-14 17:26:20 +08:00
Marek Siarkowicz
2c12954158
Merge pull request #14049 from serathius/compact-hash
Calculate hash during compaction
2022-06-14 11:10:19 +02:00
Marek Siarkowicz
9612fc1194 tests: Unify TestCompactionHash and extend it to also Delete keys and Defrag
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:19 +02:00
Marek Siarkowicz
0e739da9a4 server: Cache compaction hash for HashByRev API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:18 +02:00
Marek Siarkowicz
2b090e86a6 server: Extract hasher to separate interface
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:18 +02:00
Marek Siarkowicz
80828b593a server: Remove duplicated compaction revision
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:18 +02:00
Marek Siarkowicz
34a02ba621 server: Return revision range that hash was calcualted for
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:20:17 +02:00
Marek Siarkowicz
76d3c527ae server: Store real rv range in hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
8f7b053480 server: Move adjusting revision to hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
19941654fe server: Pass revision as int
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
887e53e61a server: Calculate hash during compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
7381c81a7e server: Fix range in mock not returning same number of keys and values
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
08a43178fa server: Move reading KV index inside scheduleCompaction function
Makes it easier to test hash match between scheduleCompaction and
HashByRev.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
638ca1006a server: Return error from scheduleCompaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:26 +02:00
Marek Siarkowicz
0f90359c4b server: Refactor hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
e62c358793 server: Extract kvHash struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
fcaf76dbca server: Move unsafeHashByRev to new hash.go file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
0984878ae7 server: Extract unsafeHashByRev function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
39c6935c65 server: Test HashByRev values to make sure they don't change
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:25 +02:00
Marek Siarkowicz
7c35dadc25 server: Extract corruption detection to dedicated struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:24 +02:00
Benjamin Wang
5f8cd5bd39 Replace all etcd versions with the centralized definitions
We have already defined all the constant etcd versions in the
centralized place api/version/version.go. So we should replace all
the versions with the centralized definitions.
2022-06-13 14:08:39 +08:00
SimFG
b295cebc05 mvcc: improve the use of locks in index.go
To make the meaning of the RangeSince function in treeIndex clearer.
2022-06-09 20:35:02 +08:00
Piotr Tabor
b073129d03 Applier does not depend on EtcdServer any longer.
All the depencies are explicily passed to the UberApplier factory method.
2022-05-20 14:32:04 +02:00
Piotr Tabor
d69e07dd3a Verification framework and check whether cindex is not decreasing. 2022-04-22 12:32:05 +02:00
ahrtr
a3650db574 use readTx in (*store).restore 2022-04-08 15:45:05 +08:00
Marek Siarkowicz
1ea53d527e server: Save consistency index and term to backend even when they decrease
Reason to store CI and term in backend was to make db fully independent
snapshot, it was never meant to interfere with apply logic. Skip of CI
was introduced for v2->v3 migration where we wanted to prevent it from
decreasing when replaying wal in
https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to
apply flow during refactor in
https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.

Consistency index and term should only be negotiated and used by raft to make
decisions. Their values should only driven by raft state machine and
backend should only be responsible for storing them.
2022-04-07 19:00:03 +02:00
ahrtr
4033f5c2b9 move the consistentIdx and consistentTerm from Etcdserver to cindex package
Removed the fields consistentIdx and consistentTerm from struct EtcdServer,
and added applyingIndex and applyingTerm into struct consistentIndex in
package cindex. We may remove the two fields completely if we decide to
remove the OnPreCommitUnsafe, and it will depend on the performance test
result.
2022-04-07 15:16:49 +08:00
ahrtr
e155e50886 rename LockWithoutHook to LockOutsideApply and add LockInsideApply 2022-04-07 05:35:13 +08:00
ahrtr
7ac995cdde enhanced authBackend to support authReadTx 2022-04-07 05:35:13 +08:00
ahrtr
a4c5da844d added detailed comment to explain the difference between Lock and LockWithoutHook 2022-04-07 05:35:13 +08:00
ahrtr
bfd5170f66 add a txPostLockHook into the backend
Previously the SetConsistentIndex() is called during the apply workflow,
but it's outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to db.
Eventually etcd runs into the data inconsistency issue.

In this commit, we move the SetConsistentIndex into a txPostLockHook, so
it will be executed inside the transaction lock.
2022-04-07 05:35:13 +08:00
Marek Siarkowicz
ad03f2076a
Merge pull request #13886 from serathius/backend-logger
tests: Pass logger to backend
2022-04-05 16:35:07 +02:00
Marek Siarkowicz
73fc864247 tests: Pass logger to backend 2022-04-05 15:53:38 +02:00
Marek Siarkowicz
1d3517020b server: Add verification of whether lock was called within out outside of apply 2022-04-05 15:34:45 +02:00
Piotr Tabor
6c974a3e31
Merge pull request #13867 from serathius/logs-test
tests: Use zaptest.NewLogger in tests
2022-04-04 14:47:04 +02:00
Marek Siarkowicz
804fddf921 tests: Use zaptest.NewLogger in tests 2022-04-04 13:03:15 +02:00
ahrtr
836bd6bc3a fix WARNING: DATA RACE issue when multiple goroutines access the backend concurrently 2022-04-03 06:13:09 +08:00