180 Commits

Author SHA1 Message Date
Benjamin Wang
b00646cb6e print error log when validation on conf change failed
Signed-off-by: Benjamin Wang <benjamin.ahrtr@gmail.com>
2024-06-06 19:28:59 +01:00
Thomas Jungblut
cee181d1ab v3rpc: run health notifier to listen on online defrag state change
Backport from 3.6 in #16836

Co-authored-by: Chao Chen <chaochn@amazon.com>
Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2024-05-06 10:03:08 +02:00
Thomas Jungblut
750bc0b1e4 gRPC health server sets serving status to NOT_SERVING on defrag
Backport from 3.6 in #16278

Co-authored-by: Chao Chen <chaochn@amazon.com>
Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2024-04-30 15:09:51 +02:00
Chun-Hung Tseng
9331ee32e1
[backport-3.5] server: ignore raft messages if member id mismatch #17078
Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com>
2024-04-17 13:50:23 +02:00
Wei Fu
94a1d0c1b5 *: LeaseTimeToLive returns error if leader changed
The old leader demotes its lessor, and the expiry times of all leases are
updated. Instead of returning an incorrect remaining TTL, we should return
an error to force the client to retry.

Cherry-pick: d3bb6f688b4643155b4a9924cec726bdc76a1306

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-04-04 22:33:05 +08:00
Marek Siarkowicz
579b22cf3a Fix progress notification for watch that doesn't get any events
When implementing the fix for progress notifications
(https://github.com/etcd-io/etcd/pull/15237) we made an incorrect
assumption that unsynched watches will always get at least one event.

Unsynched watches include not only slow watchers, but also newly created
watches that requested the current or an older revision. If none of the events
match the watch filter, those newly created watches might become synched
without any event going through.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-03-11 20:18:26 +01:00
Benjamin Wang
9ffba74e66
Merge pull request #17425 from ivanvc/release-3.5-backport-ignore-old-leaders-leases-revoking-request
[3.5] backport ignore old leaders leases revoking requests
2024-02-18 14:40:20 +00:00
Ivan Valdes
4a90575ab2
Backport ignore old leader's leases revoking request
Backported PR #16822, commits f7e488dc9262685d6624755e0d3bb0a655863248,
67f17166bf2ba337dafb8e0ea8eea5f74a990767,
and f7ff898fd6c2d6dbb54278343073aa4fa5f46a03

Signed-off-by: Ivan Valdes <ivan@vald.es>
2024-02-17 22:16:53 -08:00
vivekpatani
b9b4f1bd1b server: fix comment to match function name
- goword checks fail if function name mismatches with comment
- https://github.com/etcd-io/etcd/issues/17400

Signed-off-by: vivekpatani <9080894+vivekpatani@users.noreply.github.com>
2024-02-15 19:46:22 -08:00
Wei Fu
a965801b6e etcdserver: drain leaky goroutines before test completed
Signed-off-by: Wei Fu <fuweid89@gmail.com>
2024-02-06 12:11:33 +08:00
Marek Siarkowicz
f3a27b3745 Don't flock snapshot files
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2024-01-08 15:06:12 +01:00
Marek Siarkowicz
d6d263ac8d Check if be is nil to avoid panic when be is overridden with nil by recoverSnapshotBackend on line 517
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-20 11:41:54 +01:00
Marek Siarkowicz
a2e9dc8cc0 Don't redeclare err and snapshot variable, fixing validation of consistent index and closing database on defer
The `err` variable is shared throughout the NewServer function and is used
on line 396 to defer the decision whether the backend should be closed when
starting the server failed.

The `snapshot` variable is first defined on line 407, redeclared locally on
line 496, and used again on line 625. Creating the local variable is a bug
introduced in https://github.com/etcd-io/etcd/pull/11888.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-12-20 11:31:47 +01:00
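The redeclaration bug described in the commit above can be illustrated with a minimal Go sketch. It is not the code from NewServer; `open`, `buggy`, and `fixed` are hypothetical names chosen only to show how `:=` inside a nested block hides an outer `err` that a deferred function relies on.

```go
package main

import (
	"errors"
	"fmt"
)

// open is a hypothetical stand-in for a startup step that can fail.
func open(fail bool) (string, error) {
	if fail {
		return "", errors.New("open failed")
	}
	return "snapshot", nil
}

// buggy redeclares err inside the inner block with :=. The deferred
// function inspects the outer err, which stays nil, so cleanup is skipped.
func buggy() (cleaned bool) {
	var err error
	defer func() {
		if err != nil {
			cleaned = true // cleanup would happen here
		}
	}()
	{
		_, err := open(true) // shadows the outer err
		if err != nil {
			return
		}
	}
	return
}

// fixed assigns to the outer err with =, so the deferred function sees it.
func fixed() (cleaned bool) {
	var err error
	defer func() {
		if err != nil {
			cleaned = true
		}
	}()
	{
		_, err = open(true) // assigns to the outer err
		if err != nil {
			return
		}
	}
	return
}

func main() {
	fmt.Println("buggy cleaned up:", buggy())
	fmt.Println("fixed cleaned up:", fixed())
}
```

In this sketch only `fixed` reports that cleanup ran, because the deferred closure reads the same variable that the failing call wrote to.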
Siyuan Zhang
b8d5e79fc1 [3.5] backport health check e2e tests.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-12-07 09:51:39 -08:00
Marek Siarkowicz
6f125ce33b
Merge pull request #17039 from siyuanfoundation/release-3.5-step2
[3.5] Backport livez/readyz
2023-12-07 09:53:18 +01:00
Siyuan Zhang
ebb7e796c3 etcdserver: add linearizable_read check to readyz.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-12-06 11:12:14 -08:00
Ivan Valdes
98aa466905
server: disable redirects in peer communication
Disable following redirects from peer HTTP communication on the client's side.
Etcd server may run into SSRF (Server-side request forgery) when adding a new
member. If users provide a malicious peer URL, the existing etcd members may be
redirected to another unexpected internal URL when getting the new member's
version.

Signed-off-by: Ivan Valdes <ivan@vald.es>
2023-12-05 10:59:25 -08:00
Marek Siarkowicz
ce4ae2beb6
Merge pull request #17024 from jmhbnz/backport-ssrf-fix
[3.5] Backport disable following redirects when checking peer urls
2023-11-28 21:22:32 +01:00
Siyuan Zhang
293fc21cd8 etcdserver: add metric counters for livez/readyz health checks.
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 12:52:15 -08:00
Siyuan Zhang
f5d7f997d6 etcdserver: add livez and ready http endpoints for etcd.
Add two separate probes, one for liveness and one for readiness. The liveness probe would check that the local individual node is up and running, or else restart the node, while the readiness probe would check that the cluster is ready to serve traffic. This would make the etcd health check fully compliant with the Kubernetes API.

Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 12:52:15 -08:00
Chao Chen
2b54660a04 http health check bug fixes
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-11-27 12:52:15 -08:00
Marek Siarkowicz
46e394242f server: Split metrics and health code
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
Marek Siarkowicz
8ab1c0f25b server: Cover V3 health with tests
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
Marek Siarkowicz
9db8ddbb8c server: Refactor health checks
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
Marek Siarkowicz
eed94f6f94 server: Run health check tests in subtests
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
Marek Siarkowicz
2f6c84e91d server: Rename test case expect fields
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
Marek Siarkowicz
c6784a7e82 server: Use named struct initialization in healthcheck test
Signed-off-by: Siyuan Zhang <sizhang@google.com>
2023-11-27 09:31:00 -08:00
James Blair
9e21048c4b
Backport server: Don't follow redirects when checking peer urls.
It's possible that the etcd server may run into an SSRF situation when adding a new member. If users provide a malicious peer URL, the existing etcd members may be redirected to another unexpected internal URL when getting the new member's version.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-11-27 21:48:50 +13:00
caojiamingalan
04cfb4c660 etcdserver: add cluster id check for hashKVHandler
Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-10-17 13:27:47 +02:00
James Blair
f62a894ae7
Fix goword failure in rafthttp/transport.go.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-08-11 21:47:30 +12:00
Thomas Jungblut
423f951409 Add first unit test for authApplierV3
This contains a slight refactoring to expose enough information
to write meaningful tests for auth applier v3.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2023-06-16 09:42:09 +02:00
Thomas Jungblut
b2fb75d147 Early exit auth check on lease puts
Mitigates etcd-io#15993 by not checking each key individually for permission
when auth is entirely disabled or an admin user is calling the method.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2023-06-16 09:14:41 +02:00
Hitoshi Mitake
d1b1aa9dbe etcdserver: protect lease timetilive with auth
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
Co-authored-by: Benjamin Wang <wachao@vmware.com>
2023-05-08 22:45:38 +09:00
Benjamin Wang
cd019255ba etcdserver: Guarantee order of requested progress notifications
Progress notifications requested using ProgressRequest were sent
directly using the ctrlStream, which means that they could race
against watch responses in the watchStream.

This would especially happen when the stream was not synced - e.g. if
you requested a progress notification on a freshly created unsynced
watcher, the notification would typically arrive indicating a revision
for which not all watch responses had been sent.

This changes the behaviour so that v3rpc always goes through the watch
stream, using a new RequestProgressAll function that closely matches
the behaviour of the v3rpc code - i.e.

1. Generate a message with WatchId -1, indicating the revision for
   *all* watchers in the stream

2. Guarantee that a response is (eventually) sent

The latter might require us to defer the response until all watchers
are synced, which is likely as it should be. Note that we do *not*
guarantee that the number of progress notifications matches the number
of requests, only that eventually at least one gets sent.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-11 09:51:48 +08:00
Benjamin Wang
e6c2e380a9 security: remove password after authenticating the user
fix https://nvd.nist.gov/vuln/detail/CVE-2021-28235

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-04-06 20:12:02 +09:00
James Blair
1ea808b5ba
Backport go_srcs_in_module changes and fix goword failures.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-24 22:01:41 +13:00
James Blair
183af509f6
Formatted source code for go 1.19.6.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-02-20 21:33:59 +13:00
Benjamin Wang
53300ece3b etcdserver: return membership.ErrIDNotFound when the memberID not found
Backport https://github.com/etcd-io/etcd/pull/15095.

When promoting a learner, we need to wait until the leader's applied ID
catches up to the commitId. Afterwards, check whether the learner ID
exists, and return `membership.ErrIDNotFound` directly in the API
if the member ID is not found, to avoid the request being unnecessarily
delivered to raft.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-01-17 06:24:27 +08:00
Benjamin Wang
e1fc545d8a etcdserver: process the scenario of the last WAL record being partially synced to disk
We need to return io.ErrUnexpectedEOF in the error chain, so that
etcdserver can repair it automatically.

Backport https://github.com/etcd-io/etcd/pull/15068

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-01-08 05:30:01 +08:00
Benjamin Wang
c1a89973f0 etcdserver: fix nil pointer panic for readonly txn
Backporting https://github.com/etcd-io/etcd/pull/14895

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-12-06 18:16:49 +08:00
Chao Chen
378ad6b517 [3.5] Backport: non mutating requests pass through quotaKVServer when NOSPACE
Signed-off-by: Vaibhav Mehta <mehvaibh@amazon.com>
2022-12-05 21:04:09 +00:00
Benjamin Wang
ba122c9d56 etcdserver: intentionally set the memberID as 0 in corruption alarm
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-11-25 16:23:37 +08:00
Aleksandr Razumov
c91978077b client/pkg/fileutil: add missing logger to {Create,Touch}DirAll
Also populate it to every invocation.

Signed-off-by: WangXiaoxiao <1141195807@qq.com>
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-11-17 14:08:30 +01:00
Kafuu Chino
dd983c662b *: avoid closing a watch with ID 0 incorrectly
Signed-off-by: Kafuu Chino <KafuuChinoQ@gmail.com>

add test

2022-10-08 20:06:19 +08:00
Hitoshi Mitake
7b568f23ab *: handle auth invalid token and old revision errors in watch
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-03 23:00:13 +09:00
Benjamin Wang
6c26693ebe
Merge pull request #14178 from lavacat/release-3.5-txn-panic
[3.5] server: don't panic in readonly serializable txn
2022-09-13 14:44:38 +08:00
Marek Siarkowicz
2ddb9e0883 tests: Fix member id in CORRUPT alarm
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
5660bf0e7f server: Make corruption check optional and period configurable
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
21fb173f76 server: Implement compaction hash checking
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:56 +02:00
Marek Siarkowicz
4a75e3d52d server: Refactor compaction checker
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-09-07 15:11:55 +02:00