86 Commits

Author SHA1 Message Date
Ivan Valdes
9e1dadd748
etcdutl: Fix snapshot restore memory alloc issue
When running the snapshot command, allow receiving an initial memory map
allocation for the database, avoiding future memory allocation issues.

Backports commit: be2883321240340479f9e334bc4a3b2959ad7639 / PR: #17277

Co-authored-by: Fatih USTA <fatihusta86@gmail.com>
Signed-off-by: Ivan Valdes <ivan@vald.es>
2024-05-09 21:41:23 -04:00
Ivan Valdes
6abc349dc9
server: Implement WithMmapSize option for backend config
Accept a third argument for NewDefaultBackend for overrides to the
BackendConfig.
Add a new function, WithMmapSize, which modifies the backend config to
provide a custom InitiamMmapSize.

Backports commit: d69adf45f9e95661966204460f019b8164aaef12 / PR: #17277

Signed-off-by: Ivan Valdes <ivan@vald.es>
2024-05-09 21:41:16 -04:00
Wei Fu
0af22abc6c server/mvcc: should update currentRev in revMu
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-04-24 12:28:20 +08:00
Wei Fu
c06b17b9ff server/storage: update currentRev if scheduledCompact > currentRev
Signed-off-by: Wei Fu <fuweid89@gmail.com>
(cherry picked from commit 9ea234913a99670d18b66aa23915781f89713177)
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-04-24 12:25:19 +08:00
Wei Fu
6b034466aa server/mvcc: introduce compactBeforeSetFinishedCompact failpoint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-04-24 12:14:27 +08:00
Benjamin Wang
adf1c3f291 Update the compaction log when bootstrap and update compact's signature
Actually the compact() never return an error, so remove the second return
parameter.

Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2024-04-21 13:17:07 +01:00
Chao Chen
46d2caae1e [release-3.5] backport fix watch event loss after compaction
Signed-off-by: Chao Chen <chaochn@amazon.com>
2024-03-18 18:41:17 -07:00
Allen Ray
3d64877dc2 [3.5] Update to go1.21
Signed-off-by: Allen Ray <alray@redhat.com>
2024-02-02 14:25:53 -05:00
Benjamin Wang
0f494e02fa test: fix TestHashKVWhenCompacting: ensure all goroutine finished
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2024-01-25 13:32:03 +00:00
Rahul More
2c3b614090 mvcc: Printing etcd backend database related metrics inside scheduleCompaction function
Backporting commit 21bbc82 in etcd 3.5
To improve traceability of backend database usage, Added below parameter
related to backend database usage metrics inside scheduledCompaction
function.
current-db-size-bytes
current-db-size
current-db-size-in-use-bytes
current-db-size-in-use

Signed-off-by: Rahul More <rahulbapumore@gmail.com>
2024-01-22 19:10:20 +05:30
Siyuan Zhang
2d531a300a commit bbolt transaction if there is any pending deleting operations
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-01-11 13:34:14 -08:00
Siyuan Zhang
f219ab445e add tests to test tx delete consistency.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2024-01-11 13:29:54 -08:00
Siyuan Zhang
b8d5e79fc1 [3.5] backport health check e2e tests.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-12-07 09:51:39 -08:00
caojiamingalan
eb9bfaa983 Follow up https://github.com/etcd-io/etcd/pull/16068#discussion_r1263667496
Add a UnsafeReadScheduledCompact and UnsafeReadFinishedCompact

Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>
2023-07-18 10:54:16 -05:00
caojiamingalan
6ac9d94d67 etcdserver: backport check scheduledCompactKeyName and finishedCompactKeyName before writing hash to release-3.5.
Fix #15919.
Check ScheduledCompactKeyName and FinishedCompactKeyName
before writing hash to hashstore.
If they do not match, then it means this compaction has once been interrupted and its hash value is invalid. In such cases, we won't write the hash values to the hashstore, and avoids the incorrect corruption alarm.

Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>
2023-07-14 19:22:38 -05:00
Thomas Jungblut
4425ef572e Adding optional revision bump and mark compacted to snapshot restore
Signed-off-by: Allen Ray <alray@redhat.com>
Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2023-07-03 12:57:12 +02:00
kkkkun
8cffdbafba etcdserver: fix corruption check when server has just been compacted
Signed-off-by: kkkkun <scuzk373x@gmail.com>
2023-06-11 22:27:02 +08:00
Benjamin Wang
cd019255ba etcdserver: Guarantee order of requested progress notifications
Progress notifications requested using ProgressRequest were sent
directly using the ctrlStream, which means that they could race
against watch responses in the watchStream.

This would especially happen when the stream was not synced - e.g. if
you requested a progress notification on a freshly created unsynced
watcher, the notification would typically arrive indicating a revision
for which not all watch responses had been sent.

This changes the behaviour so that v3rpc always goes through the watch
stream, using a new RequestProgressAll function that closely matches
the behaviour of the v3rpc code - i.e.

1. Generate a message with WatchId -1, indicating the revision for
   *all* watchers in the stream

2. Guarantee that a response is (eventually) sent

The latter might require us to defer the response until all watchers
are synced, which is likely as it should be. Note that we do *not*
guarantee that the number of progress notifications matches the number
of requests, only that eventually at least one gets sent.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-11 09:51:48 +08:00
Marek Siarkowicz
92e56ab61e server: Test watch restore
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-20 12:04:43 +01:00
Bogdan Kanivets
dafdaaedf2 mvcc: update minRev when watcher stays synced
Problem: during restore in watchableStore.Restore, synced watchers are moved to unsynced.
minRev will be behind since it's not updated when watcher stays synced.

Solution: update minRev

fixes: https://github.com/etcd-io/etcd/issues/15271
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-03-20 12:02:49 +01:00
James Blair
1ea808b5ba
Backport go_srcs_in_module changes and fix goword failures.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-24 22:01:41 +13:00
James Blair
183af509f6
Formatted source code for go 1.19.6.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-20 21:33:59 +13:00
Benjamin Wang
563713e128 etcdserver: call the OnPreCommitUnsafe in unsafeCommit
`unsafeCommit` is called by both `(*batchTxBuffered) commit` and
`(*backend) defrag`. When users perform the defragmentation
operation, etcd doesn't update the consistent index. If etcd
crashes(e.g. panicking) in the process for whatever reason, then
etcd replays the WAL entries starting from the latest snapshot,
accordingly it may re-apply entries which might have already been
applied, eventually the revision isn't consistent with other members.

Refer to discussion in https://github.com/etcd-io/etcd/pull/14685

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-11 17:35:26 +08:00
Cenk Alti
be4adc0c55
server: add more context to panic message
Signed-off-by: Cenk Alti <cenkalti@gmail.com>
2022-11-01 19:02:32 -04:00
Kafuu Chino
dd983c662b *: avoid closing a watch with ID 0 incorrectly
Signed-off-by: Kafuu Chino <KafuuChinoQ@gmail.com>

add test

1

1

1

1

1

1
2022-10-08 20:06:19 +08:00
Benjamin Wang
6c26693ebe
Merge pull request #14178 from lavacat/release-3.5-txn-panic
[3.5] server: don't panic in readonly serializable txn
2022-09-13 14:44:38 +08:00
Marek Siarkowicz
21fb173f76 server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
037a898ba0 tests: Unify TestCompactionHash and extend it to also Delete keys and Defrag
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
1200b1006d server: Cache compaction hash for HashByRev API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
7358362c99 server: Extract hasher to separate interface
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
631107285a server: Remove duplicated compaction revision
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
a3f609d742 server: Return revision range that hash was calcualted for
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
1ff59923d6 server: Store real rv range in hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
991b429336 server: Move adjusting revision to hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
2b8dd0de4e server: Pass revision as int
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
21e5d5d2b6 server: Calculate hash during compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
f1a759a2c8 server: Fix range in mock not returning same number of keys and values
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
ea684db535 server: Move reading KV index inside scheduleCompaction function
Makes it easier to test hash match between scheduleCompaction and
HashByRev.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
22d3e4ebd7 server: Return error from scheduleCompaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
679e327d5e server: Refactor hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
f5ed371885 server: Extract kvHash struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
3f26995f99 server: Move unsafeHashByRev to new hash.go file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
bc592c7b01 server: Extract unsafeHashByRev function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
336fef4ce2 server: Test HashByRev values to make sure they don't change
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
35cbdf3961 server: Extract corruption detection to dedicated struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Bogdan Kanivets
204d883904 [backport 3.5] server: don't panic in readonly serializable txn
Problem: We pass grpc context down to applier in readonly serializable txn.
This context can be cancelled for example due to timeout.
This will trigger panic inside applyTxn

Solution: Only panic for transactions with write operations

fixes https://github.com/etcd-io/etcd/issues/14110
main PR https://github.com/etcd-io/etcd/pull/14149

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-09-01 01:01:50 -07:00
ahrtr
66c7aab4d3 fix the data inconsistency issue by adding a txPostLockHook into the backend
Previously the SetConsistentIndex() is called during the apply workflow,
but it's outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to db.
Eventually etcd runs into the data inconsistency issue.

In this commit, we move the SetConsistentIndex into a txPostLockHook, so
it will be executed inside the transaction lock.
2022-04-08 20:37:34 +08:00
Marek Siarkowicz
83538f342d server: Add verification of whether lock was called within out outside of apply 2022-04-06 11:22:51 +02:00
Bogdan Kanivets
631fa6fd65 server/storage/backend: restore original bolt db options after defrag
Problem: Defrag was implemented before custom bolt options were added.
Currently defrag doesn't restore backend options.
For example BackendFreelistType will be unset after defrag.

Solution: save bolt db options and use them in defrag.
2022-02-15 10:56:07 -08:00
leoyang.yl
55c16df997 fix runlock bug 2021-12-16 15:58:41 +00:00