149 Commits

Author SHA1 Message Date
Siyuan Zhang
3565a822de Add VerifyTxConsistency to backend.
Signed-off-by: Siyuan Zhang <sizhang@google.com>

Update server/storage/backend/verify.go

Co-authored-by: Benjamin Wang <benjamin.wang@broadcom.com>

Update server/storage/backend/verify.go

Co-authored-by: Benjamin Wang <benjamin.wang@broadcom.com>
2024-02-22 11:31:16 -08:00
Ishan Tyagi
16a5e1da71 Added an error log when the learner is not in sync with the etcd leader.
Signed-off-by: ishan16696 <ishan.tyagi@sap.com>
2024-01-30 15:42:11 +05:30
YaoC
f7ab7adf29 server: fix incorrect learner metric
Signed-off-by: YaoC <chengyao09@hotmail.com>
2024-01-12 09:36:33 +00:00
Marek Siarkowicz
a2eb17c809 Merge pull request #17199 from serathius/dont-flock
Don't flock snapshot files
2024-01-08 15:03:29 +01:00
Marek Siarkowicz
3471ef133d Add an e2e test and robustness failpoint around recovering from snapshot backend
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-04 15:25:24 +01:00
Marek Siarkowicz
7f8346b3f2 Don't flock snapshot files
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-04 14:53:44 +01:00
Marek Siarkowicz
1e8d66ef95 Add beforeOpenSnapshotBackend failpoint
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-20 15:36:54 +01:00
Benjamin Wang
67f17166bf Safeguard lease operations by double checking the leadership
1. ignore the old leader's lease revoking requests
2. double check the current member's leadership before performing a lease renew request
3. etcdserver: ensure the current member's leadership before performing a lease checkpoint request

Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-15 17:53:36 +00:00
Benjamin Wang
36b2523669 added some log messages for better diagnosis
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2023-12-13 18:43:22 +00:00
Neil Shen
fb769c4306 server: ignore raft messages if member id mismatch
Ignore Raft messages when the `To` field mismatches the local member ID.
In cases where incorrect Raft messages are dispatched, potentially due
to a malfunctioning switch, this proactive check prevents panics,
such as "tocommit is out of range".

Signed-off-by: Neil Shen <overvenus@gmail.com>
2023-12-07 11:57:45 +08:00
Marek Siarkowicz
bc697bc26e Revert "Switch to validating v3 when v2 and v3 are synchronized"
This reverts commit 4fe46f92030e4381e6f9bf95adbb22a08282d297.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-03 18:12:09 +01:00
Marek Siarkowicz
03d551243b Merge pull request #17015 from serathius/extract-membership-applier
Extract membership applier
2023-11-27 19:59:21 +01:00
Marek Siarkowicz
4fe46f9203 Switch to validating v3 when v2 and v3 are synchronized
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 17:46:33 +01:00
Marek Siarkowicz
2ad21558ac Remove shouldApplyV3 from the v3 applier
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 16:13:25 +01:00
Marek Siarkowicz
d22c00ccee Extract membership applier
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 15:57:15 +01:00
Marek Siarkowicz
7fdb33065d Move duplicated shouldApplyV3 logic up into apply method
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-24 10:21:14 +01:00
Marek Siarkowicz
093666f450 Cleanup v2 applier
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-23 15:41:13 +01:00
Marek Siarkowicz
c72ff1e69c Remove syncing the v2 store TTLs
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-23 14:55:01 +01:00
Marek Siarkowicz
dd7a4d28a8 Remove code used to make v2 proposals
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-19 22:39:33 +01:00
Marek Siarkowicz
b4fd31f254 Remove code for setting cluster version via V2 API
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-11-19 15:28:52 +01:00
Chao Chen
1324f03254 add existing http health check handler e2e test
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-10-18 12:42:23 -07:00
Benjamin Wang
628b45c099 test: add a test case to verify consistent memberlist on bootstrap
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-09-28 20:04:47 +01:00
Wei Fu
aa97484166 *: enable goimports in verify-lint
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-09-21 21:14:09 +08:00
chenyahui
c0aa3b613b Use any instead of interface{}
Signed-off-by: chenyahui <cyhone@qq.com>
2023-09-17 17:41:58 +08:00
Geeta Gharpure
8729417cee Preserve the order of steps done for snapshot
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-08-22 19:12:37 +00:00
Geeta Gharpure
59332dc194 Update to generate v2 snapshot from v3 state
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-08-21 19:18:11 +00:00
Jes Cok
52748f60f3 all: stop using math/rand.Seed
Fixes #16428.

Signed-off-by: Jes Cok <xigua67damn@gmail.com>
2023-08-20 16:34:44 +08:00
Chao Chen
6cdc9ae4fe server/etcdserver/raft.go:
1. rename confChangeCh to raftAdvancedC
2. rename waitApply to confChanged
3. add comments and test assertion

Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-06-26 22:42:44 -07:00
Benjamin Wang
ad3b6ee4c6 etcdserver: wait until raft is notified of confChange before responding to client
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-06-26 13:40:51 -07:00
Geeta Gharpure
550aa152a7 Verify consistent index is latest at the time of snapshot
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-06-19 16:00:04 +00:00
Chao Chen
f31d0eafb9 tests/e2e: add graceful shutdown test
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-05-09 17:08:53 -07:00
Chao Chen
caed563e08 fix flaking auth member remove test
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-04-03 17:41:08 -07:00
Wei Fu
22bdc91302 server/etcdserver: add log for terminating monitors
Adding a log for terminating monitors makes debugging easier.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-03-11 15:07:17 +08:00
James Blair
275e10bcf7 Return default snapshot count to 10,000.
The huge (100k+) value was justified when storev2 was being dumped completely with every snapshot.

With storev2 being decommissioned we can checkpoint more frequently for faster recovery.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-03-06 20:21:03 +13:00
guozhao
de8d6b3792 etcdserver: use time.Ticker instead of time.After
Using time.After creates a new Timer in each cycle. In these cases,
it is better to use time.Ticker.

Signed-off-by: guozhao <guozhao@360.cn>
2023-01-17 16:58:13 +08:00
Benjamin Wang
8ed20e85d2 etcdserver: return membership.ErrIDNotFound when the memberID is not found
When promoting a learner, we need to wait until the leader's applied ID
catches up to the commitId. Afterwards, check whether the learner ID
exists, and return `membership.ErrIDNotFound` directly in the API
if the member ID is not found, to avoid the request being unnecessarily
delivered to raft.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-01-17 06:18:15 +08:00
Piotr Tabor
6f899a7b40 Merge pull request #15052 from ptabor/20221228-goimports-fix
./scripts/fix.sh: Takes care of goimports across the whole project.
2022-12-29 11:31:22 +01:00
Piotr Tabor
9e1abbab6e Fix goimports in all existing files. Execution of ./scripts/fix.sh
Signed-off-by: Piotr Tabor <ptab@google.com>
2022-12-29 09:41:31 +01:00
KiloG
101a2a61ea etcdserver: fix typo in comment
2022-12-28 18:41:08 +08:00
Benjamin Wang
faff80a2b3 etcdserver: format the source code
gofmt -w ./server

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-02 13:00:59 +08:00
Benjamin Wang
e9aa275b36 etcdserver: update etcdserver to use the new raft module go.etcd.io/raft/v3
Just replaced all go.etcd.io/etcd/raft/v3 with go.etcd.io/raft/v3
under directory server.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-02 09:33:45 +08:00
Marek Siarkowicz
2b178fdd96 server: Handle cluster version equal downgrade version
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-10-17 12:05:57 +02:00
Benjamin Wang
cc840336f0 move consistent_index forward when executing alarmList operation
The alarm list is the only exception that doesn't move consistent_index
forward. The reproduction steps are as simple as:

```
etcd --snapshot-count=5 &
for i in {1..6}; do etcdctl  alarm list; done
kill -9 <etcd_pid>
etcd
```

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-05 10:05:55 +08:00
Marek Siarkowicz
d44bbff278 server: Make corruption check optional and period configurable
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-26 09:31:15 +02:00
Marek Siarkowicz
6697fca97d server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-26 09:31:14 +02:00
Marek Siarkowicz
c58ec9fe13 server: Refactor compaction checker
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-25 13:59:30 +02:00
Tsonglew
e5a80f5049 fix: typo gouroutine
2022-06-16 16:35:06 +08:00
SimFG
d83925e357 schedule: Provide logs when a fifo job panics
To make the fifo scheduler easier to debug.

Signed-off-by: SimFG <1142838399@qq.com>
2022-06-15 20:58:17 +08:00
Marek Siarkowicz
7c35dadc25 server: Extract corruption detection to dedicated struct
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-06-13 18:19:24 +02:00
ahrtr
25deb436af fix the race condition between goroutine and channel on the same leases to be revoked
2022-05-25 16:44:41 +08:00