15851 Commits

Author SHA1 Message Date
Cenk Alti
7a4a3ad8db
server: add more context to panic message
Signed-off-by: Cenk Alti <cenkalti@gmail.com>
2022-11-01 18:59:17 -04:00
Benjamin Wang
7c1499d3bb
Merge pull request #14649 from mitake/test-authrecover-3.4
[3.4] server: add a unit test case for authStore.Recover() with empty rangePermCache
2022-10-29 13:11:36 +08:00
Hitoshi Mitake
b7a23311e6 etcdserver: call refreshRangePermCache on Recover() in AuthStore
Signed-off-by: Oleg Guba <oleg@dropbox.com>
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-29 13:55:06 +09:00
Hitoshi Mitake
0b3ff06868 server: add a unit test case for authStore.Recover() with empty rangePermCache
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-29 13:27:53 +09:00
Benjamin Wang
ce1630f68f
Merge pull request #14601 from dusk125/release-3.4
Backport #14500 to 3.4
2022-10-27 14:21:22 +08:00
Allen Ray
9254f8f05b Release-3.4: server/etcdmain: add configurable cipher list to gRPC proxy listener
Signed-off-by: Allen Ray <alray@redhat.com>
2022-10-19 16:02:13 -04:00
Benjamin Wang
b058374fbd
Merge pull request #14594 from ZoeShaw101/fix-watch-test-issue-3.4
Backport #14591 to 3.4.
2022-10-17 05:25:50 +08:00
王霄霄
dcebdf7958 Backport #14591 to 3.4.
Signed-off-by: 王霄霄 <1141195807@qq.com>
2022-10-16 21:18:53 +08:00
Benjamin Wang
5b764d8771
Merge pull request #14581 from tomari/tomari/watch-backoff-for-3.4
[3.4] client/v3: Add backoff before retry when watch stream returns unavailable
2022-10-13 07:23:02 +08:00
Hisanobu Tomari
7b7fbbf8b8 client/v3: Add backoff before retry when watch stream returns unavailable
The client retries the connection without backoff when the server goes away
after the watch stream is established. This results in high CPU usage
in the client process. This change introduces backoff when the stream
fails with an unavailable error.

Signed-off-by: Hisanobu Tomari <posco.grubb@gmail.com>
2022-10-13 05:26:31 +09:00
Sahdev Zala
429fcb98ab
Merge pull request #14579 from ahrtr/wal_log_3.4
[3.4] etcdserver: added more debug log for the purgeFile goroutine
2022-10-12 11:34:33 -04:00
Benjamin Wang
1d7639f796 etcdserver: added more debug log for the purgeFile goroutine
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-10-12 19:39:20 +08:00
Benjamin Wang
5b3ac7da6b
Merge pull request #14577 from pchan/acp3.4
Cherry pick of #13224
2022-10-12 17:58:26 +08:00
Sergey Kacheev
5381dafaae netutil: make a raw URL comparison part of the urlsEqual function
Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2022-10-12 15:07:46 +05:30
Sergey Kacheev
90e7e254ae Apply suggestions from code review
Co-authored-by: Lili Cosic <cosiclili@gmail.com>
Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2022-10-12 15:07:46 +05:30
Sergey Kacheev
abb019a51e netutil: add url comparison without resolver to URLStringsEqual
If one of the nodes in the cluster has lost a DNS record,
restarting the second node will break it.
This PR adds a comparison that does not use a resolver,
which protects the cluster from DNS errors and does not break
the current logic of comparing URLs in the URLStringsEqual function.
You can read more in issue #7798.

Fixes #7798

Signed-off-by: Prasad Chandrasekaran <prasadc@vmware.com>
2022-10-12 15:07:46 +05:30
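A rough sketch of the approach described in commit abb019a51e: compare the URL lists as plain strings first, and only fall back to a resolver-based comparison when that fails. The function names and the injected `resolved` callback below are hypothetical and do not reflect the actual netutil signatures.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
)

// normalize parses each URL, re-serializes it, and sorts the result so
// that two lists can be compared regardless of order. No DNS is used.
func normalize(urls []string) ([]string, error) {
	out := make([]string, 0, len(urls))
	for _, s := range urls {
		u, err := url.Parse(s)
		if err != nil {
			return nil, err
		}
		out = append(out, u.String())
	}
	sort.Strings(out)
	return out, nil
}

// urlsEqualNoResolve reports whether both lists contain the same URLs,
// compared as strings only.
func urlsEqualNoResolve(a, b []string) (bool, error) {
	if len(a) != len(b) {
		return false, nil
	}
	na, err := normalize(a)
	if err != nil {
		return false, err
	}
	nb, err := normalize(b)
	if err != nil {
		return false, err
	}
	for i := range na {
		if na[i] != nb[i] {
			return false, nil
		}
	}
	return true, nil
}

// equalWithFallback tries the raw comparison first and consults the
// resolver-based comparison (injected here as a function) only if needed.
func equalWithFallback(a, b []string, resolved func(a, b []string) (bool, error)) (bool, error) {
	if ok, err := urlsEqualNoResolve(a, b); err == nil && ok {
		return true, nil
	}
	return resolved(a, b)
}

func main() {
	a := []string{"http://node1.example:2380", "http://node2.example:2380"}
	b := []string{"http://node2.example:2380", "http://node1.example:2380"}
	ok, err := equalWithFallback(a, b, func(_, _ []string) (bool, error) {
		return false, fmt.Errorf("resolver unavailable")
	})
	fmt.Println(ok, err)
}
```

With this ordering, a lost DNS record only matters when the raw strings actually differ.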
Hitoshi Mitake
57a27de189
Merge pull request #14562 from kafuu-chino/3.4-backport-14296
*: avoid closing a watch with ID 0 incorrectly
2022-10-10 22:48:53 +09:00
Kafuu Chino
ed10ca13f4 *: avoid closing a watch with ID 0 incorrectly
Signed-off-by: Kafuu Chino <KafuuChinoQ@gmail.com>

add test
2022-10-10 19:54:58 +08:00
Benjamin Wang
de11726a8a
Merge pull request #14548 from mitake/3.4-backport-14322
Backport PR 14322 to release-3.4
2022-10-05 05:50:43 +08:00
Hitoshi Mitake
91365174b3 tests: a test case for watch with auth token expiration
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-04 22:55:36 +09:00
Hitoshi Mitake
0c6e466024 *: handle auth invalid token and old revision errors in watch
Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>
2022-10-04 22:49:06 +09:00
Marek Siarkowicz
d0a732f96d
Merge pull request #14530 from ahrtr/memberid_alarm
etcdserver: fix memberID equals to zero in corruption alarm
2022-09-28 09:30:10 +02:00
Benjamin Wang
29911e9a5b etcdserver: fix memberID equals to zero in corruption alarm
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-28 11:01:26 +08:00
Benjamin Wang
85b640cee7 Bump version to 3.4.21
Signed-off-by: Benjamin Wang <wachao@vmware.com>
v3.4.21
2022-09-15 08:46:22 +08:00
Marek Siarkowicz
1a05326fae
Merge pull request #14442 from ahrtr/fix_TestV3AuthRestartMember
[release-3.4] Fix the flaky test TestV3AuthRestartMember
2022-09-09 09:57:24 +02:00
Benjamin Wang
b8bea91f22 fix the flaky test TestV3AuthRestartMember
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-09 09:37:25 +08:00
Benjamin Wang
6730ed8477
Merge pull request #14410 from vivekpatani/release-3.4
[release-3.4] server,test: refresh cache on each NewAuthStore
2022-09-09 09:34:32 +08:00
Benjamin Wang
a55a9f5e07
Merge pull request #14441 from tjungblu/bz_1918413_3.4_upstream
[release-3.4] etcdctl: fix move-leader for multiple endpoints
2022-09-09 09:26:40 +08:00
Thomas Jungblut
86bc0a25c4 etcdctl: fix move-leader for multiple endpoints
Due to a duplicate call of clientConfigFromCmd, the move-leader command
would fail with "conflicting environment variable is shadowed by corresponding command-line flag",
even in scenarios where no command-line flag was supplied.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
2022-09-08 15:51:19 +02:00
Benjamin Wang
dd743eea81
Merge pull request #14439 from vsvastey/usr/vsvastey/open-with-max-index-test-fix-3.4
[release-3.4] testing: fix TestOpenWithMaxIndex cleanup
2022-09-08 17:00:20 +08:00
Vladimir Sokolov
1ed5dfc20e testing: fix TestOpenWithMaxIndex cleanup
A WAL object was closed by defer; however, the WAL was rewritten afterwards,
so the deferred call closed the already-closed WAL but not the new one. This caused a data
race between writing the file and cleaning up the temporary test directory,
which led to a non-deterministic bug.

Fixes #14332

Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
2022-09-08 10:49:47 +03:00
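The pitfall described in commit 1ed5dfc20e follows from how Go evaluates `defer`: the receiver of a deferred method call is fixed when the `defer` statement runs, so reassigning the variable later does not change which object gets closed. The sketch below uses a hypothetical `fakeWAL` type rather than the real WAL, and the closure-based fix is one possible remedy, not necessarily the one used in the commit.

```go
package main

import "fmt"

// fakeWAL is a stand-in for a closable resource (hypothetical type).
type fakeWAL struct {
	name   string
	closed bool
}

func (w *fakeWAL) Close() { w.closed = true }

// buggy defers Close on the first object, then replaces w.
// The deferred call still targets the first object.
func buggy() (first, second *fakeWAL) {
	w := &fakeWAL{name: "first"}
	first = w
	defer w.Close() // receiver is evaluated here, not at function exit

	w.Close() // explicit close before reopening
	w = &fakeWAL{name: "second"}
	second = w
	return
}

// fixed defers a closure, which reads w when the function exits
// and therefore closes whichever object w points to at that time.
func fixed() (first, second *fakeWAL) {
	w := &fakeWAL{name: "first"}
	first = w
	defer func() { w.Close() }()

	w.Close()
	w = &fakeWAL{name: "second"}
	second = w
	return
}

func main() {
	_, s := buggy()
	fmt.Println("buggy: second closed =", s.closed)
	_, s = fixed()
	fmt.Println("fixed: second closed =", s.closed)
}
```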
Benjamin Wang
b2b7b9d535
Merge pull request #14423 from serathius/one_member_data_loss_raft_3_4
[release-3.4] fix the potential data loss for clusters with only one member
2022-09-06 03:29:45 +08:00
Benjamin Wang
119e4dda19 fix the potential data loss for clusters with only one member
For a cluster with only one member, raft always sends identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually partially) the
applying workflow.

When the client receives the response, it doesn't mean etcd has already
successfully saved the data, including BoltDB and WAL, because:
   1. etcd commits the boltDB transaction periodically instead of on each request;
   2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, data loss may occur when etcd crashes
immediately after responding to the client and before the boltDB and WAL
successfully save the data to disk.
Note that this issue can only happen for clusters with only one member.

For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it is replicated to a majority of members.
When the client receives the response, it means the data must have been applied,
which further means the data must have been committed.
Note: for clusters with multiple members, raft will never send identical
unstable entries and committed entries to etcdserver.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-09-05 14:15:47 +02:00
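The condition described in commit 119e4dda19 (committed entries that are still unstable, i.e. not yet saved to the WAL) can be detected with a simple overlap check. The sketch below is an illustration under simplifying assumptions: `entry`, `ready`, and `mustSaveBeforeApply` are hypothetical names, only indexes are compared, and all entries are assumed to belong to a single term. It is not the code from the commit.

```go
package main

import "fmt"

// entry is a simplified log entry identified by its index (hypothetical).
type entry struct{ index uint64 }

// ready is a simplified batch handed from raft to the server:
// unstable entries still need to be written to the WAL, committed
// entries are ready to be applied.
type ready struct {
	unstable  []entry
	committed []entry
}

// mustSaveBeforeApply reports whether some committed entry is also
// unstable. In that case applying and answering the client before the
// WAL write finishes could lose data on a crash.
func mustSaveBeforeApply(rd ready) bool {
	if len(rd.unstable) == 0 || len(rd.committed) == 0 {
		return false
	}
	lastCommitted := rd.committed[len(rd.committed)-1]
	firstUnstable := rd.unstable[0]
	return lastCommitted.index >= firstUnstable.index
}

func main() {
	// Single-member cluster: the same entry is both unstable and committed.
	single := ready{unstable: []entry{{5}}, committed: []entry{{5}}}
	// Multi-member cluster: committed entries were persisted earlier.
	multi := ready{unstable: []entry{{6}}, committed: []entry{{4}, {5}}}
	fmt.Println(mustSaveBeforeApply(single), mustSaveBeforeApply(multi))
}
```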
Benjamin Wang
9d5ae56764
Merge pull request #14420 from vsvastey/usr/vsvastey/nil-logger
etcdserver: nil-logger issue fix for version 3.4
2022-09-05 14:53:08 +08:00
Vladimir Sokolov
38342e88da etcdserver: nil-logger issue fix for version 3.4
In v3.5 it is assumed that the logger is never nil; however, it can
still be nil in v3.4. A PR targeting v3.5 was backported to 3.4, which
is why a panic on a nil logger is possible in 3.4. This commit
fixes the issue.

Fixes #14402

Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
2022-09-03 04:34:03 +03:00
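One common way to guard against the nil-logger panic described in commit 38342e88da is to substitute a no-op logger at construction time. The sketch below uses the standard library `log` package and a hypothetical `server` type purely for illustration; it makes no claim about how the commit itself resolves the problem.

```go
package main

import (
	"io"
	"log"
	"os"
)

// server is a hypothetical component that logs through lg.
type server struct {
	lg *log.Logger
}

// newServer accepts a possibly nil logger and replaces it with one
// that discards output, so later calls never dereference nil.
func newServer(lg *log.Logger) *server {
	if lg == nil {
		lg = log.New(io.Discard, "", 0)
	}
	return &server{lg: lg}
}

func (s *server) start() {
	s.lg.Println("server starting")
}

func main() {
	// With an explicit logger the message is printed.
	newServer(log.New(os.Stderr, "demo: ", 0)).start()
	// With a nil logger the call is safe and silent.
	newServer(nil).start()
}
```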
vivekpatani
c0ef7d52e0 server,test: refresh cache on each NewAuthStore
- permissions were incorrectly loaded on restarts.
- #14355
- Backport of https://github.com/etcd-io/etcd/pull/14358

Signed-off-by: vivekpatani <9080894+vivekpatani@users.noreply.github.com>
2022-08-31 13:08:11 -07:00
Benjamin Wang
1e2682301c Bump version to 3.4.20
Signed-off-by: Benjamin Wang <wachao@vmware.com>
v3.4.20
2022-08-06 05:27:01 +08:00
Sahdev Zala
ee366151c6
Merge pull request #14290 from ahrtr/3.4_no_prevkv_for_create
[3.4] Do not get previous K/V for create event
2022-08-01 08:39:19 -04:00
Benjamin Wang
095bbfc4ed lock down the version of shadow to v0.1.11
The latest version, v0.1.12, was released on Jul 27, 2022,
and it is causing an issue (see below) in the govet check:

```
govet_shadow' started at Sun Jul 31 23:23:27 PDT 2022
go get: upgraded golang.org/x/net v0.0.0-20211112202133-69e39bad7dc2 => v0.0.0-20220722155237-a158d28d115b
go get: upgraded golang.org/x/sys v0.0.0-20211019181941-9d821ace8654 => v0.0.0-20220722155257-8c9f86f7a55f
go get: upgraded golang.org/x/tools v0.0.0-20190524140312-2c0ae7006135 => v0.1.12
/root/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/client_metrics.go:7:2: missing go.sum entry for module providing package golang.org/x/net/context (imported by go.etcd.io/etcd/etcdserver/etcdserverpb); to add:
	go get go.etcd.io/etcd/etcdserver/etcdserverpb
/root/go/pkg/mod/google.golang.org/grpc@v1.26.0/internal/transport/controlbuf.go:28:2: missing go.sum entry for module providing package golang.org/x/net/http2 (imported by go.etcd.io/etcd/embed); to add:
	go get go.etcd.io/etcd/embed
/root/go/pkg/mod/google.golang.org/grpc@v1.26.0/internal/transport/controlbuf.go:29:2: missing go.sum entry for module providing package golang.org/x/net/http2/hpack (imported by github.com/soheilhy/cmux); to add:
	go get github.com/soheilhy/cmux@v0.1.4
/root/go/pkg/mod/google.golang.org/grpc@v1.26.0/server.go:36:2: missing go.sum entry for module providing package golang.org/x/net/trace (imported by go.etcd.io/etcd/embed); to add:
	go get go.etcd.io/etcd/embed
```

It isn't good to always use the latest version. Instead, we should
lock down the version; v0.1.11 was confirmed to be working.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-01 15:11:49 +08:00
Benjamin Wang
cc1b0e6a44 do not get previous K/V for create event
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-08-01 13:11:46 +08:00
Benjamin Wang
314dcbf6f5
Merge pull request #14274 from lavacat/release-3.4-fix-TestRoundRobinBalancedResolvableFailoverFromServerFail
[3.4] clientv3/balancer: fixed flaky TestRoundRobinBalancedResolvableFailoverFromServerFail
2022-07-27 04:59:38 +08:00
Bogdan Kanivets
6f483a649e clientv3/balancer: fixed flaky TestRoundRobinBalancedResolvableFailoverFromServerFail
- ignore "transport is closing" error during connections warmup after stopping one peer.

Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
2022-07-26 08:06:59 -07:00
Benjamin Wang
ce539a960c
Merge pull request #14279 from SimFG/mvcc-race
[3.4] clientv3/mvcc: fixed DATA RACE
2022-07-26 23:01:34 +08:00
SimFG
04e5e5516e [3.4] clientv3/mvcc: fixed DATA RACE between mvcc.(*store).setupMetricsReporter and mvcc.(*store).restore
Signed-off-by: SimFG <1142838399@qq.com>
2022-07-26 21:38:23 +08:00
Benjamin Wang
2c778eebf7
Merge pull request #14269 from ahrtr/3.4_resend_readindex
[3.4] etcdserver: resend ReadIndex request on empty apply request
2022-07-25 16:53:06 +08:00
Benjamin Wang
f53db9b246 etcdserver: resend ReadIndex request on empty apply request
Backport https://github.com/etcd-io/etcd/pull/12795 to 3.4

Signed-off-by: Benjamin Wang <wachao@vmware.com>
2022-07-25 09:21:31 +08:00
Benjamin Wang
e2b36f8879
Merge pull request #14253 from serathius/checkpoints-fix-3.4
[3.4] Checkpoints fix 3.4
2022-07-22 16:56:17 +08:00
Benjamin Wang
de2e8ccc78
Merge pull request #14258 from ahrtr/3.4_postphone_read_index
[3.4] raft: postpone MsgReadIndex until first commit in the term
2022-07-22 16:46:32 +08:00
Marek Siarkowicz
783e99cbfe Fix lease checkpointing tests by forcing a snapshot
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-22 10:28:44 +02:00
Marek Siarkowicz
8f4735dfd4 server: Require either cluster version v3.6 or --experimental-enable-lease-checkpoint-persist to persist lease remainingTTL
To avoid inconsistent behavior during cluster upgrade we are feature
gating persistence behind the cluster version. This should ensure that
all cluster members are upgraded to v3.6 before changing behavior.

To allow backporting this fix to v3.5 we are also introducing the flag
--experimental-enable-lease-checkpoint-persist, which allows for a
smooth upgrade in v3.5 clusters with this feature enabled.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2022-07-22 10:28:29 +02:00
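The gating rule stated in commit 8f4735dfd4 (persist only if the cluster version is at least v3.6 or the experimental flag is set) reduces to a small predicate. The `version` type and `shouldPersistRemainingTTL` function below are hypothetical names used for illustration only.

```go
package main

import "fmt"

// version is a minimal major.minor pair (hypothetical type).
type version struct{ major, minor int }

// atLeast reports whether v is the same as or newer than o.
func (v version) atLeast(o version) bool {
	if v.major != o.major {
		return v.major > o.major
	}
	return v.minor >= o.minor
}

// shouldPersistRemainingTTL mirrors the rule from the commit message:
// persist when the whole cluster runs v3.6 or later, or when the
// operator opted in via the experimental flag.
func shouldPersistRemainingTTL(cluster version, flagEnabled bool) bool {
	return flagEnabled || cluster.atLeast(version{3, 6})
}

func main() {
	fmt.Println(shouldPersistRemainingTTL(version{3, 5}, false))
	fmt.Println(shouldPersistRemainingTTL(version{3, 5}, true))
	fmt.Println(shouldPersistRemainingTTL(version{3, 6}, false))
}
```

Because the cluster version only advances once every member has been upgraded, gating on it keeps mixed-version clusters on the old behavior.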