17506 Commits

Author SHA1 Message Date
Piotr Tabor
68fa5dcf99
Merge pull request #13549 from songlh-psu/main
fixing the goroutine leaks in TestHashKVWhenCompacting
2022-01-14 13:58:16 +01:00
Piotr Tabor
f2e49b5771
Merge pull request #13562 from timmyyuan/main
Fix goroutine leaks in TestNodeProposeAddDuplicateNode
2022-01-14 13:52:18 +01:00
Piotr Tabor
f72688e248
Merge pull request #13563 from timmyyuan/ting/fix-goroutine-leaks
Fix goroutine leaks in TestCommitPagination
2022-01-14 13:51:55 +01:00
Piotr Tabor
8b91b8296b
Merge pull request #13584 from serathius/monotonic
tests: Add integration test for revision monotonic under failure injection
2022-01-14 13:36:13 +01:00
Piotr Tabor
3c77c7fd3c
Merge pull request #13591 from serathius/codeql
Remove CodeQL errors
2022-01-13 15:21:05 +01:00
Marek Siarkowicz
4032d4f66a Remove CodeQL errors 2022-01-13 14:29:09 +01:00
Piotr Tabor
e433d12656
Merge pull request #13594 from ahrtr/update_changelog_3.5_for_pull_13501
update CHANGELOG-3.5.md to cover the fix for issue 13494
2022-01-13 09:04:53 +01:00
ahrtr
6ef154e548 update CHANGELOG-3.5.md to cover the fix for issue 13494 2022-01-13 14:55:08 +08:00
Piotr Tabor
f184dfd9dc
Merge pull request #13590 from serathius/recordings
README: Cleanup community meetings video recordings
2022-01-12 18:43:59 +01:00
Marek Siarkowicz
5e06fd40da README: Cleanup community meetings video recordings 2022-01-12 13:39:30 +01:00
Marek Siarkowicz
eac6d71352 tests: Add integration test for revision monotonic under failure injection 2022-01-12 11:51:12 +01:00
Piotr Tabor
e0a0fdc984
Merge pull request #13572 from microyahoo/update_lease_tools
update dump db tool
2022-01-12 10:33:28 +01:00
Piotr Tabor
868c51b95a
Merge pull request #13581 from spzala/versionsupport
Update supported versions and ref to the policy
2022-01-12 10:21:38 +01:00
Sahdev Zala
1e5bd39571 Update supported versions and ref to the policy
We support the current release and the two previous minor versions, so
this change updates the list accordingly. It also adds a link to the
details of the versioning policy.
2022-01-05 21:36:27 -05:00
songlh
a9652b4b4e fixing the leaks in TestStressWatchCancelClose 2022-01-04 17:57:19 -05:00
Sahdev Zala
96a9fd0a1e
Merge pull request #13574 from cunnie/defer_cancel
Golang Client docs: defer `cancel()`, avoid erroring
2022-01-03 20:03:22 -05:00
Sahdev Zala
a96f5ee8a1
Merge pull request #13577 from sayap/auth-graceful-disable
Disable auth gracefully without impacting existing watchers
2022-01-03 20:02:21 -05:00
Liang Zheng
0cc789d81d update dump db tool
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
2022-01-01 00:13:33 +08:00
Yap Sok Ann
17fd2e7282 Disable auth gracefully without impacting existing watchers
This attempts to fix a special case of the problem described in #12385,
where trying to do `clientv3.Watch` with an expired token would result
in `ErrGRPCPermissionDenied`, due to the failing authorization check in
`isWatchPermitted`. Furthermore, the client can't auto recover, since
`shouldRefreshToken` rightly returns false for the permission denied
error.

In this case, we would like to have a runbook to dynamically disable
auth, without causing any disruption. Doing so would immediately expire
all existing tokens, which would then cause the behavior described
above. This means existing watchers would still work for a period of
time after disabling auth, until they have to reconnect, e.g. due to a
rolling restart of server nodes.

This commit adds a client-side fix and a server-side fix, either of
which is sufficient to get the added test case to pass. Note that it is
an e2e test case instead of an integration one, as the reconnect only
happens if the server node is stopped via SIGINT or SIGTERM.

A generic fix for the problem described in #12385 would be better, as
it would also fix this special case. However, that fix would likely be
a lot more involved, as some untangling of authn/authz is required.
2021-12-31 14:39:46 +07:00
Brian Cunnie
5620a9c227
Golang Client docs: defer cancel(), avoid erroring
In the sample code demonstrating how to specify a client request
timeout, the `cancel()` is called immediately after the Put, but it
should be deferred instead, giving the Put enough time to complete.

In the canonical Golang context
[docs](https://pkg.go.dev/context#WithTimeout), the sample code sets a
`defer cancel()` immediately after context creation, and with this
commit we adhere to that convention.

fixes:
```json
{
  "level": "warn",
  "ts": "2021-12-29T09:56:42.439-0800",
  "logger": "etcd-client",
  "caller": "v3@v3.5.1/retry_interceptor.go:62",
  "msg": "retrying of unary invoker failed",
  "target": "etcd-endpoints://0xc000213340/localhost:2379",
  "attempt": 0,
  "error": "rpc error: code = Canceled desc = context canceled"
}
```
2021-12-29 14:10:36 -08:00
Ting Yuan
df8efd3853
Fix goroutine leaks in TestCommitPagination
raft: fix goroutine leaks in TestCommitPagination

The goroutine created with n.run() will leak if we forget to call n.Stop().

We can reproduce the goroutine leaks by using [goleak](https://github.com/uber-go/goleak):

```
$ cd raft &&  env go test -short -v -timeout=3m --race -run=TestCommitPagination.
... ...
raft2021/12/27 20:47:15 INFO: raft.node: 1 elected leader 1 at term 1
    leaks.go:78: found unexpected goroutines:
        [Goroutine 20 in state select, with go.etcd.io/etcd/raft/v3.(*node).run on top of the stack:
        goroutine 20 [select]:
        go.etcd.io/etcd/raft/v3.(*node).run(0xc00036f260)
                /home/yuanting/work/dev/goprojects/etcd/raft/node.go:344 +0xc1d
        created by go.etcd.io/etcd/raft/v3.TestCommitPagination
                /home/yuanting/work/dev/goprojects/etcd/raft/node_test.go:920 +0x554
        ]
--- FAIL: TestCommitPagination (0.45s)
FAIL
FAIL    go.etcd.io/etcd/raft/v3 0.508s
FAIL
```
2021-12-27 20:55:02 +08:00
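The leak pattern described in the commit above can be illustrated with a small sketch. It assumes a hypothetical, stripped-down `node` type with a `run` loop and a `Stop` method; it is not the raft package's node, and it counts goroutines with `runtime.NumGoroutine` from the standard library rather than using goleak.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// node is a hypothetical stand-in for a type whose run loop keeps a
// goroutine alive until Stop is called.
type node struct {
	stopc chan struct{}
	donec chan struct{}
}

func newNode() *node {
	return &node{stopc: make(chan struct{}), donec: make(chan struct{})}
}

// run blocks until Stop closes stopc.
func (n *node) run() {
	defer close(n.donec)
	<-n.stopc
}

// Stop signals run to exit and waits until it has returned.
func (n *node) Stop() {
	close(n.stopc)
	<-n.donec
}

// leaked reports how many more goroutines exist after fn than before it.
func leaked(fn func()) int {
	before := runtime.NumGoroutine()
	fn()
	time.Sleep(50 * time.Millisecond) // let finished goroutines exit
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("without Stop:", leaked(func() {
		n := newNode()
		go n.run() // never stopped, so the goroutine stays blocked
	}))
	fmt.Println("with Stop:", leaked(func() {
		n := newNode()
		go n.run()
		defer n.Stop() // the kind of cleanup the commit adds to the test
	}))
}
```

In a test, pairing the `go n.run()` call with a deferred `n.Stop()` keeps the cleanup next to the code that starts the goroutine.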
Ting Yuan
e6f28dbeb2
Fix goroutine leaks in TestNodeProposeAddDuplicateNode
raft: fix goroutine leaks in TestNodeProposeAddDuplicateNode

The goroutine created with `n.run()` will leak if we forget to call `n.Stop()`.
2021-12-27 20:36:26 +08:00
Linhai
a45c73d9b1 resolve the conflict 2021-12-21 17:49:47 -05:00
Piotr Tabor
69279532f4
Merge pull request #13540 from songlh-psu/fixing-3
fixing one panic and two goroutine leaks
2021-12-21 11:03:59 +01:00
Piotr Tabor
1e4a345706
Merge pull request #13545 from dbussink/build-apple-m1
server/etcdmain: add build support for Apple M1
2021-12-21 11:00:58 +01:00
Piotr Tabor
5b0bb07cb0
Merge pull request #13500 from ahrtr/reset_ci_after_reload_db
Set the backend again after recovering v3 backend from snapshot
2021-12-21 10:50:30 +01:00
Piotr Tabor
0bdc660ec2
Merge pull request #13537 from songlh/main
fix potential goroutine leaks
2021-12-21 10:47:54 +01:00
Linhai
246e7eba09 fixing the goroutine in two unit tests 2021-12-21 04:46:39 -05:00
Piotr Tabor
7ff2c7714e
Merge pull request #13546 from justaugustus/debian-base-bullseye
images: Use Kubernetes debian-base:bullseye-v1.1.0 as base image
2021-12-21 10:44:03 +01:00
Linhai Song
5e8f50bb09 remove the extra stop 2021-12-17 20:03:19 -05:00
Stephen Augustus
bbb187dcc0
images: Use Kubernetes debian-base:bullseye-v1.1.0 as base image
Signed-off-by: Stephen Augustus <foo@auggie.dev>
2021-12-17 16:06:37 -05:00
Dirkjan Bussink
ddb9554eec
server/etcdmain: add build support for Apple M1
This has been additionally verified by running the tests locally as a
basic smoke test. GitHub Actions doesn't provide macOS M1 (arm64) runners
yet, so there's no good way to automate testing.

Ran `TMPDIR=/tmp make test` locally. Setting `TMPDIR` is needed to avoid
a very long path that breaks Unix socket setup in one of
the tests.
2021-12-17 17:35:36 +01:00
Piotr Tabor
42840d0fda
Merge pull request #13528 from ahrtr/update_test_remove_redundant_line
Remove the redundant line from test.sh
2021-12-16 11:47:39 +01:00
Linhai Song
0098dbf350 fixing two goroutine leaks and one panic 2021-12-15 22:38:25 -05:00
Linhai
0213b8baed fixing goroutine leaks in testServer 2021-12-15 02:43:49 -05:00
Linhai
3ebd0a7d00 fixing the goroutine leak in TestBackendClose 2021-12-15 01:54:51 -05:00
Linhai
d1194977eb fix potential goroutine leaks in TestTxnPanics 2021-12-15 01:22:56 -05:00
ahrtr
793e081a5b remove the redundant line from test.sh 2021-12-10 05:05:48 +08:00
Piotr Tabor
29292aa7bd
Merge pull request #13505 from LeoYang90/fix_watchable_runlock
fix watchablestore runlock bug
2021-12-03 12:21:30 +01:00
ahrtr
7be1464ef1 set the backend again after recovering v3 backend from snapshot 2021-12-03 05:52:12 +08:00
Piotr Tabor
170d9b9d73
Merge pull request #13508 from serathius/checkpoints-fix
Lease Checkpoints fix
2021-12-02 16:08:40 +01:00
Piotr Tabor
3e391f4fba
Merge pull request #13513 from ahrtr/enhance_etcdctl_make_mirror_log
etcdctl: enhance the make-mirror command to return error asap when invalid flags are provided
2021-12-02 16:05:22 +01:00
Marek Siarkowicz
48a7aab2bc server: Add lease checkpointing fix information to CHANGELOG 2021-12-02 14:36:57 +01:00
Marek Siarkowicz
7d10899d7f server: Require either cluster version v3.6 or --experimental-enable-lease-checkpoint-persist to persist lease remainingTTL
To avoid inconsistent behavior during a cluster upgrade, we are feature
gating persistence behind the cluster version. This should ensure that
all cluster members are upgraded to v3.6 before the behavior changes.

To allow backporting this fix to v3.5, we are also introducing the flag
--experimental-enable-lease-checkpoint-persist, which allows for a
smooth upgrade in v3.5 clusters with this feature enabled.
2021-12-02 12:26:47 +01:00
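The gating rule described in the commit above amounts to a simple either/or check. The sketch below is an assumption-laden illustration, not the server's code: the `version` type and the `shouldPersistRemainingTTL` function are hypothetical names, and the boolean parameter stands in for the `--experimental-enable-lease-checkpoint-persist` flag.

```go
package main

import "fmt"

// version is a hypothetical major.minor pair used only for this sketch.
type version struct{ major, minor int }

// atLeast reports whether v is the same as or newer than other.
func (v version) atLeast(other version) bool {
	if v.major != other.major {
		return v.major > other.major
	}
	return v.minor >= other.minor
}

// shouldPersistRemainingTTL returns true when either the cluster version
// is v3.6 or newer, or the (hypothetical) persistFlag is enabled.
func shouldPersistRemainingTTL(cluster version, persistFlag bool) bool {
	return persistFlag || cluster.atLeast(version{3, 6})
}

func main() {
	fmt.Println(shouldPersistRemainingTTL(version{3, 5}, false)) // false
	fmt.Println(shouldPersistRemainingTTL(version{3, 5}, true))  // true
	fmt.Println(shouldPersistRemainingTTL(version{3, 6}, false)) // true
}
```

Gating on the cluster version rather than on each member's own version means the behavior only changes once every member has been upgraded, which is the consistency property the commit message describes.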
ahrtr
8b3405bdb8 etcdctl: enhance the make-mirror command to return error asap with wrong command line parameters 2021-11-30 06:26:11 +08:00
Michał Jasionowski
fd77b2700c etcdserver,integration: Store remaining TTL on checkpoint
This extends the lease checkpointing mechanism to cases where the whole
etcd cluster is restarted.
2021-11-26 15:17:22 +01:00
Michał Jasionowski
48a360aad0 lease,integration: add checkpoint scheduling after leader change
The current checkpointing mechanism is buggy: new checkpoints for any
lease are scheduled only until the first leader change. This commit adds
a fix for that and a test that checks it.
2021-11-26 14:34:19 +01:00
leoyang.yl
7e6c29c198 fix runlock bug 2021-11-26 11:05:36 +08:00
Sam Batschelet
7572a61a39
Merge pull request #13498 from KushalP/upgrade-otel-version
*: Upgrade to use go.opentelemetry.io/otel@v1.2.0
2021-11-24 21:08:29 -05:00
Kushal Pisavadia
71493bde3e *: Upgrade to use go.opentelemetry.io/otel@v1.2.0
Upgrading from v1.0.1.

Upgrading related dependencies
------------------------------

The following dependencies also had to be upgraded:

- go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.26.1
  From v0.25.0. This gets rid of a transitive dependency on go.opentelemetry.io/otel@v1.0.1.
- google.golang.org/genproto@v0.0.0-20211118181313-81c1377c94b1
2021-11-24 16:03:33 +00:00