17841 Commits

Author SHA1 Message Date
Manuel Rüger
f0f77fc14e go.mod: Bump prometheus/client_golang to v1.12.1
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-04-06 19:03:24 +02:00
Marek Siarkowicz
c4d055fe7b
Merge pull request #13819 from endocrimes/dani/auth_test.go
migrate e2e/users tests to common framework
2022-04-06 16:02:46 +02:00
Piotr Tabor
d24ef3ac20
Merge pull request #13893 from ls-2018/todo
fix unexpose todo
2022-04-06 14:31:26 +02:00
ls-2018
5b84b30fce fix unexpose todo
Signed-off-by: ls-2018 <acejilam@gmail.com>
2022-04-06 17:38:46 +08:00
Piotr Tabor
047e61df7a
Merge pull request #13880 from ahrtr/fix_dump_logs_panic
etcd-dump-logs will panic if there is no WAL entry after the snapshot
2022-04-06 09:25:17 +02:00
Marek Siarkowicz
ad03f2076a
Merge pull request #13886 from serathius/backend-logger
tests: Pass logger to backend
2022-04-05 16:35:07 +02:00
Marek Siarkowicz
ae57fe5d30
Merge pull request #13885 from serathius/verify
server: Add verification of whether lock was called within or outside of apply
2022-04-05 16:22:52 +02:00
Marek Siarkowicz
73fc864247 tests: Pass logger to backend 2022-04-05 15:53:38 +02:00
Marek Siarkowicz
1d3517020b server: Add verification of whether lock was called within or outside of apply 2022-04-05 15:34:45 +02:00
Marek Siarkowicz
8d8271f6d1
Merge pull request #13175 from karuppiah7890/issue-13167-measure-flakyness
scripts: add script to measure percentage of commits with failed status
2022-04-05 15:25:47 +02:00
Marek Siarkowicz
a08d479463
Merge pull request #13868 from endocrimes/dani/leasefix
tests/common/lease: Wait for correct lease list response
2022-04-04 17:51:57 +02:00
Danielle Lancashire
f71196d113 tests/common/lease: Wait for correct lease list response
We don't consistently reach the same etcd server during the lifetime of
a test and in some cases, this means that this test will flake if an
etcd server was slow to update its state and the test hits the outdated
server.

Here we switch to using an `Eventually` case which will wait up to a
second for the expected result before failing - with a 10ms gap between
invocations.

```
[tests(dani/leasefix)] $ gotestsum -- ./common -tags integration -count 100 -timeout 15m -run TestLeaseGrantAndList
✓  common (2m26.71s)

DONE 1600 tests in 147.258s
```
2022-04-04 15:43:17 +02:00
Piotr Tabor
6c974a3e31
Merge pull request #13867 from serathius/logs-test
tests: Use zaptest.NewLogger in tests
2022-04-04 14:47:04 +02:00
Piotr Tabor
5b84d3934e
Merge pull request #13876 from ptabor/20220403-integration-test-fixes
Integration tests flake fixes
2022-04-04 14:46:29 +02:00
Marek Siarkowicz
9dc8bbb7cf
Merge pull request #13875 from ahrtr/be_race
fix WARNING: DATA RACE issue when multiple goroutines access the backend
2022-04-04 13:31:19 +02:00
Marek Siarkowicz
804fddf921 tests: Use zaptest.NewLogger in tests 2022-04-04 13:03:15 +02:00
ahrtr
543c87cc38 etcd-dump-logs will panic if there is no WAL entry after the snapshot 2022-04-04 18:58:18 +08:00
Piotr Tabor
d4dcd3061d Fix flakes in TestV3LeaseCheckpoint/Checkpointing_disabled,_lease_TTL_is_reset
I think the strict (not-equal) comparison was too restrictive when expressed with 1s granularity.

```
        logger.go:130: 2022-04-03T22:15:15.242+0200	WARN	m1	leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk	{"member": "m1", "to": "cb785755eb80ac1", "heartbeat-interval": "10ms", "expected-duration": "20ms", "exceeded-duration": "24.666613ms"}
        logger.go:130: 2022-04-03T22:15:15.262+0200	INFO	m-1	published local member to cluster through raft	{"member": "m-1", "local-member-id": "e2dd9f523aa7be87", "local-member-attributes": "{Name:m-1 ClientURLs:[unix://127.0.0.1:2196386040]}", "cluster-id": "b4b8e7e41c23c8b5", "publish-timeout": "5.2s"}
        v3_lease_test.go:415: Expected lease ttl (4m58s) to be greather than (4m58s)
```
2022-04-03 23:13:01 +02:00
Piotr Tabor
90796720c1 Reduce integration test parallelism to 2 packages at once.
Especially with 'race' detection, running O(cpu) integration tests was causing CPU overload and timeouts.
2022-04-03 14:48:36 +02:00
Piotr Tabor
ed1bc447c7 Flakes: Additional logging and timeouts to understand common flakes. 2022-04-03 14:48:36 +02:00
Piotr Tabor
68f2cb8c77 Fix ExampleAuth from integration/clientv3/examples (on OsX)
The code now ensures that each test runs in its own directory instead of the shared os.TempDir.
```
$  (cd tests && env go test -timeout=15m --race go.etcd.io/etcd/tests/v3/integration/clientv3/examples -run ExampleAuth)
2022/04/03 10:24:59 Running tests (examples): ...
2022/04/03 10:24:59 the function can be called only in the test context. Was integration.BeforeTest() called ?
2022/04/03 10:24:59 2022-04-03T10:24:59.462+0200	INFO	m0	LISTEN GRPC	{"member": "m0", "grpcAddr": "localhost:m0", "m.Name": "m0"}
```
2022-04-03 14:16:45 +02:00
Piotr Tabor
d57f8dba62 Deflaking: Make WaitLeader (and WaitMembersForLeader) aggressively (30s) wait for leader being established.
Almost none of the tests checked the returned value; they just assumed WaitLeader succeeded.

```
    maintenance_test.go:277: Waiting for leader...
    logger.go:130: 2022-04-03T08:01:09.914+0200	INFO	m0	cluster version differs from storage version.	{"member": "m0", "cluster-version": "3.6.0", "storage-version": "3.5.0"}
    logger.go:130: 2022-04-03T08:01:09.915+0200	WARN	m0	leader failed to send out heartbeat on time; took too long, leader is overloaded likely from slow disk	{"member": "m0", "to": "2acc3d3b521981", "heartbeat-interval": "10ms", "expected-duration": "20ms", "exceeded-duration": "103.756219ms"}
    logger.go:130: 2022-04-03T08:01:09.916+0200	INFO	m0	updated storage version	{"member": "m0", "new-storage-version": "3.6.0"}
    ...
    logger.go:130: 2022-04-03T08:01:09.926+0200	INFO	grpc	[[roundrobin] roundrobinPicker: Build called with info: {map[0xc002630ac0:{{unix:localhost:m0 localhost <nil> 0 <nil>}} 0xc002630af0:{{unix:localhost:m1 localhost <nil> 0 <nil>}} 0xc002630b20:{{unix:localhost:m2 localhost <nil> 0 <nil>}}]}]
    logger.go:130: 2022-04-03T08:01:09.926+0200	WARN	m0	apply request took too long	{"member": "m0", "took": "114.661766ms", "expected-duration": "100ms", "prefix": "", "request": "header:<ID:12658633312866157316 > cluster_version_set:<ver:\"3.6.0\" > ", "response": ""}
    logger.go:130: 2022-04-03T08:01:09.927+0200	INFO	m0	cluster version is updated	{"member": "m0", "cluster-version": "3.6"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m2.raft	9f96af25a04e2ec3 [logterm: 2, index: 8, vote: 9903a56eaf96afac] ignored MsgVote from 2acc3d3b521981 [logterm: 2, index: 8] at term 2: lease is not expired (remaining ticks: 10)	{"member": "m2"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	9903a56eaf96afac [logterm: 2, index: 8, vote: 9903a56eaf96afac] ignored MsgVote from 2acc3d3b521981 [logterm: 2, index: 8] at term 2: lease is not expired (remaining ticks: 5)	{"member": "m0"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	9903a56eaf96afac [term: 2] received a MsgAppResp message with higher term from 2acc3d3b521981 [term: 3]	{"member": "m0"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	9903a56eaf96afac became follower at term 3	{"member": "m0"}
    logger.go:130: 2022-04-03T08:01:09.955+0200	INFO	m0.raft	raft.node: 9903a56eaf96afac lost leader 9903a56eaf96afac at term 3	{"member": "m0"}
    maintenance_test.go:279: Leader established.
```

Tmp
2022-04-03 12:23:09 +02:00
Piotr Tabor
2fab3f3ae5 Make naming of test-nodes consistent and positive: m0, m1, m2
The nodes used to be named m-1, m0, m1, which generated very confusing logs
in integration tests.
2022-04-03 09:16:55 +02:00
ahrtr
836bd6bc3a fix WARNING: DATA RACE issue when multiple goroutines access the backend concurrently 2022-04-03 06:13:09 +08:00
Sahdev Zala
3d3c4373e3
Merge pull request #13860 from mrueg/fix-make2
Makefile: Additional logic fix
2022-04-02 14:43:19 -04:00
Piotr Tabor
f85cd0296f
Merge pull request #13872 from ptabor/20220402-osx-unit-test-pass
Fix TestauthTokenBundleOnOverwrite on OsX:
2022-04-02 20:03:38 +02:00
Piotr Tabor
3bb2d0c716
Merge pull request #13870 from howz97/main
fix comment in raft.go
2022-04-02 16:50:26 +02:00
Piotr Tabor
8cd8a1ea10 Flakes in integration/clientv3/examples/...
The tests sometimes flaked due to already existing socket-files.
Now each execution works in a temporary directory.
2022-04-02 16:16:25 +02:00
Piotr Tabor
3b589fb3b2 Fix TestauthTokenBundleOnOverwrite on OsX:
```
% (cd client/v3 && env go test -short -timeout=3m --race ./...)
--- FAIL: TestAuthTokenBundleNoOverwrite (0.00s)
    client_test.go:210: listen unix /var/folders/t1/3m8z9xz93t9c3vpt7zyzjm6w00374n/T/TestAuthTokenBundleNoOverwrite3197524989/001/etcd-auth-test:0: bind: invalid argument
FAIL
FAIL	go.etcd.io/etcd/client/v3	4.270s
```

The reason was that the path exceeded 108 chars (which is too long for a socket path).
As a mitigation we first change the working directory (chdir) to the tempDir, so that the socket path is 'local' (relative).
2022-04-02 16:12:02 +02:00
howz97
f9c9bfa44c fix comment in raft.go 2022-04-02 14:27:33 +08:00
Marek Siarkowicz
b1610934e3
Merge pull request #13864 from serathius/logs
Fix inconsistent log format
2022-04-01 11:00:48 +02:00
Marek Siarkowicz
63346bfead server: Use default logging configuration instead of zap production one
This fixes a problem where the JSON log output changes the timestamp format.
2022-04-01 10:23:42 +02:00
Piotr Tabor
e4d34f21bc
Merge pull request #13856 from ahrtr/cleanup_unused_code
The file server/storage/mvcc/util.go isn't used at all, so removing it
2022-03-31 21:16:02 +02:00
Marek Siarkowicz
e5bf23037a tests: Keeps log in expect to allow their analysis 2022-03-31 21:02:36 +02:00
Manuel Rüger
29905029f6 Makefile: Additional logic fix
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-03-31 11:18:36 +02:00
Piotr Tabor
0d5c1dce49
Merge pull request #13857 from mrueg/fix-make
Makefile: Fix wrong target
2022-03-31 11:04:52 +02:00
Manuel Rüger
ec29b9ee36 Makefile: Fix wrong target
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-03-31 09:48:21 +02:00
ahrtr
9b3b383366 the file server/storage/mvcc/util.go isn't used at all, so removing it 2022-03-31 10:14:46 +08:00
Marek Siarkowicz
0e83f62e0c
Merge pull request #13852 from serathius/recommend
changelog: Update and deduplicate production recommendations
2022-03-29 19:12:46 +02:00
Marek Siarkowicz
88a39d780f changelog: Update and deduplicate production recommendations 2022-03-29 19:09:01 +02:00
Marek Siarkowicz
27e222e2d7
Merge pull request #13802 from yankay/fix-the-api-dependency-in-pkg-and-update-cobra-to-1.4.0
Fix the etcd api dependency in pkg and update Cobra version to 1.4.0
2022-03-28 10:40:24 +02:00
Sahdev Zala
be2929568f
Merge pull request #13834 from ahrtr/tool_decode_meta
enhance etcd-dump-db to display keys in meta more friendly
2022-03-26 13:38:06 -04:00
Sahdev Zala
dcc226491f
Merge pull request #13836 from kkkkun/set-etcdutl-default
test: set etcdutl to default
2022-03-25 20:13:21 -04:00
Kay Yan
afecd3139c fix the api dependency in pkg, and update cobra to 1.4.0
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2022-03-25 17:18:56 +08:00
Sahdev Zala
4019c592ea
Merge pull request #13831 from mrueg/go-1.17.8
Update go to 1.17.8
2022-03-24 16:23:33 -04:00
Marek Siarkowicz
0d55a1ca2a
Merge pull request #13821 from ahrtr/configspec_config
Move the newClientCfg into clientv3 package so as to be reused by both etcdctl and v3discovery
2022-03-24 10:12:55 +01:00
Marek Siarkowicz
cc33b7cee1
Merge pull request #13824 from eval-exec/patch-1
Fix panic in etcd validate secure endpoints #13810
2022-03-24 10:05:58 +01:00
Manuel Rüger
b8c1ac8efd Add Changelog entry 2022-03-24 10:00:09 +01:00
Kun Zhang
62641d3385 set etcdutl to default 2022-03-24 16:20:28 +08:00
ahrtr
49e9a14580 migrate unit test to cover the logic of converting ConfigSpec to Config 2022-03-24 07:24:22 +08:00