17239 Commits

Author SHA1 Message Date
Marek Siarkowicz
19002cfc68 version: bump up to 3.5.5 v3.5.5 tests/v3.5.5 etcdctl/v3.5.5 etcdutl/v3.5.5 server/v3.5.5 client/v3.5.5 client/v2.305.5 client/pkg/v3.5.5 raft/v3.5.5 pkg/v3.5.5 api/v3.5.5 2022-09-15 14:02:30 +02:00
Benjamin Wang
2ba1bab410
Merge pull request #14454 from ahrtr/fix_TestV3AuthRestartMember_20220913_3.5
[release-3.5] fix the flaky test TestV3AuthRestartMember
2022-09-13 19:40:00 +08:00
Benjamin Wang
2f1171ff66 fix the flaky test fix_TestV3AuthRestartMember_20220913 for 3.5
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-13 16:50:30 +08:00
Benjamin Wang
6c26693ebe
Merge pull request #14178 from lavacat/release-3.5-txn-panic
[3.5] server: don't panic in readonly serializable txn
2022-09-13 14:44:38 +08:00
Benjamin Wang
646ba66c5e
Merge pull request #14434 from tjungblu/bz_1918413_3.5
etcdctl: allow move-leader to connect to multiple endpoints
2022-09-08 17:58:03 +08:00
Thomas Jungblut
243b7a125b etcdctl: fix move-leader for multiple endpoints
Due to a duplicate call of clientConfigFromCmd, the move-leader command
would fail with "conflicting environment variable is shadowed by corresponding command-line flag".
Also in scenarios where no command-line flag was supplied.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2022-09-08 11:20:15 +02:00
Benjamin Wang
16d72c0b9b
Merge pull request #14440 from vsvastey/usr/vsvastey/open-with-max-index-test-fix-3.5
[release-3.5] testing: fix TestOpenWithMaxIndex cleanup
2022-09-08 16:59:46 +08:00
Vladimir Sokolov
eef5e220a6 testing: fix TestOpenWithMaxIndex cleanup
A WAL object was closed by defer, however the WAL was rewritten afterwards,
so defer closed already closed WAL but not the new one. It caused a data
race between writing file and cleaning up a temporary test directory,
which led to a non-deterministic bug.

Fixes #14332

Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
2022-09-08 11:26:10 +03:00
Benjamin Wang
a5a33cbe4c
Merge pull request #14436 from serathius/arm64
[release-3.5] server/etcdmain: add build support for Apple M1
2022-09-08 13:18:34 +08:00
Hitoshi Mitake
bb3fae4f83
Merge pull request #14409 from vivekpatani/release-3.5
[release-3.5] server,test: refresh cache on each NewAuthStore
2022-09-08 14:08:35 +09:00
Vivek Patani
7639d93f15 server,test: refresh cache on each NewAuthStore
- permissions were incorrectly loaded on restarts.
- #14355
- Backport of https://github.com/etcd-io/etcd/pull/14358

Signed-off-by: vivekpatani <9080894+vivekpatani@users.noreply.github.com>
2022-09-07 10:22:05 -07:00
Dirkjan Bussink
c79f96d6ff server/etcdmain: add build support for Apple M1
This has been additionally verified by running the tests locally as a
basic smoke test. GitHub Actions doesn't provide MacOS M1 (arm64) yet,
so there's no good way to automate testing.

Ran `TMPDIR=/tmp make test` locally. The `TMPDIR` bit is needed so
there's no really long path used that breaks Unix socket setup in one of
the tests.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 17:25:34 +02:00
Marek Siarkowicz
ba52d5a063
Merge pull request #14282 from serathius/fix-checks-v3.5
Fix corruption checks v3.5
2022-09-07 16:29:46 +02:00
Marek Siarkowicz
2ddb9e0883 tests: Fix member id in CORRUPT alarm
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
5660bf0e7f server: Make corrtuption check optional and period configurable
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
21fb173f76 server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
a56ec0be4b tests: Cover periodic check in tests
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
4a75e3d52d server: Refactor compaction checker
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
8d4ca10ece tests: Move CorruptBBolt to testutil
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
a8020a0320 tests: Rename corruptHash to CorruptBBolt
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
037a898ba0 tests: Unify TestCompactionHash and extend it to also Delete keys and Defrag
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
00bc8da0ef tests: Add tests for HashByRev HTTP API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
d3db3bc454 tests: Add integration tests for compact hash
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
1200b1006d server: Cache compaction hash for HashByRev API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
7358362c99 server: Extract hasher to separate interface
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
631107285a server: Remove duplicated compaction revision
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
a3f609d742 server: Return revision range that hash was calcualted for
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
1ff59923d6 server: Store real rv range in hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
991b429336 server: Move adjusting revision to hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
2b8dd0de4e server: Pass revision as int
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
21e5d5d2b6 server: Calculate hash during compaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
f1a759a2c8 server: Fix range in mock not returning same number of keys and values
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
ea684db535 server: Move reading KV index inside scheduleCompaction function
Makes it easier to test hash match between scheduleCompaction and
HashByRev.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
22d3e4ebd7 server: Return error from scheduleCompaction
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
679e327d5e server: Refactor hasher
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
f5ed371885 server: Extract kvHash struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
3f26995f99 server: Move unsafeHashByRev to new hash.go file
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
bc592c7b01 server: Extract unsafeHashByRev function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
336fef4ce2 server: Test HashByRev values to make sure they don't change
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
78a6f387cb server: Cover corruptionMonitor with tests
Get 100% coverage on InitialCheck and PeriodicCheck functions to avoid
any mistakes.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
35cbdf3961 server: Extract corruption detection to dedicated struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Marek Siarkowicz
d32de2c410 server: Extract triggerCorruptAlarm to function
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00
Benjamin Wang
204c0319a1
Merge pull request #14429 from ahrtr/alarm_list_ci_3.5
[3.5] Move consistent_index forward when executing alarmList operation
2022-09-06 15:17:13 +08:00
Benjamin Wang
5c8aa08e2c move consistent_index forward when executing alarmList operation
Cherry pick https://github.com/etcd-io/etcd/pull/14419 to 3.5.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-06 12:48:06 +08:00
Benjamin Wang
747bf5ceff
Merge pull request #14424 from serathius/one_member_data_loss_raft_3_5
[release-3.5] fix the potential data loss for clusters with only one member
2022-09-06 03:28:24 +08:00
Benjamin Wang
7eb696dfcd fix the potential data loss for clusters with only one member
For a cluster with only one member, the raft always send identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually partially) the
applying workflow.

When the client receives the response, it doesn't mean etcd has already
successfully saved the data, including BoltDB and WAL, because:
   1. etcd commits the boltDB transaction periodically instead of on each request;
   2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, it may run into a situation of data loss when the etcd crashes
immediately after responding to the client and before the boltDB and WAL
successfully save the data to disk.
Note that this issue can only happen for clusters with only one member.

For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it being replicated to majority members.
When the client receives the response, it means the data must have been applied.
It further means the data must have been committed.
Note: for clusters with multiple members, the raft will never send identical
unstable entries and committed entries to etcdserver.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-05 14:26:24 +02:00
Marek Siarkowicz
fbb14f91bf
Merge pull request #14397 from biosvs/backport-grpc-proxy-endpoints-autosync
Backport of pull/14354 to release-3.5
2022-09-01 10:14:22 +02:00
Bogdan Kanivets
204d883904 [backport 3.5] server: don't panic in readonly serializable txn
Problem: We pass grpc context down to applier in readonly serializable txn.
This context can be cancelled for example due to timeout.
This will trigger panic inside applyTxn

Solution: Only panic for transactions with write operations

fixes https://github.com/etcd-io/etcd/issues/14110
main PR https://github.com/etcd-io/etcd/pull/14149

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-09-01 01:01:50 -07:00
Vitalii Levitskii
67e4c59e01 Backport of pull/14354 to 3.5.5
Signed-off-by: Vitalii Levitskii <vitalii@uber.com>
2022-08-29 15:58:17 +03:00
Benjamin Wang
74aa38ec10
Merge pull request #14366 from ahrtr/keepalive_3.5_20220820
[3.5] Refactor the keepAliveListener and keepAliveConn
2022-08-24 10:14:26 +08:00