17025 Commits

Author SHA1 Message Date
Gyuho Lee
c4f7d578d8
Merge pull request #12898 from tangcong/add-disk-io-failure-case
functional: add disk io failure case
2021-04-26 14:37:53 -07:00
tangcong
4fb22093a6 functional: add SHORT_TTL_LEASE_EXPIRE checker 2021-04-26 20:00:45 +08:00
tangcong
16e38e49a9 functional: add FAILPOINTS_WITH_DISK_IO_LATENCY case 2021-04-26 12:07:05 +08:00
tangcong
370f9cf3b9 fix: failed to get failpoints from member 2021-04-26 11:44:53 +08:00
Makdon
ddcb463822
etcdserver/mvcc: update trace.Step condition 2021-04-25 23:07:45 +08:00
Piotr Tabor
9a3aff6d8f
Merge pull request #12889 from ptabor/20210423-deflake-TestFirstCommitNotification
Deflake: TestFirstCommitNotification
2021-04-23 14:55:44 +02:00
Piotr Tabor
9f559775b8 Deflake: TestFirstCommitNotification
Infrequently the test flaked. Reproducable with:

```
  go test go.etcd.io/etcd/tests/v3/integration --run TestFirstCommitNotification --count=500
```

The moveLeader finishes when configchange is commited by quorum.
It doesn't guarantee that the 'empty' record was committed by the new leader.
From time to time happened that appliedLeaderIndex was returning 9
(without empty entry) and the test flaked. In healthy case the
appliedIndex returned 10.

Fixed by putting kv pair after leader change. The pair is guaranteed
to be stored on index when put finishes (so the empty entry as well).
2021-04-23 13:22:15 +02:00
Piotr Tabor
bd4f8e2b6c
Merge pull request #12885 from ptabor/20210422-error-codes-context
Errors: `context cancelled` or `context deadline exceeded` are exposed as codes.Canceled, codes.DeadlineExceeded instead of 'codes.Unknown'
2021-04-23 00:45:59 +02:00
Piotr Tabor
9a4b2bdccc Errors: context cancelled or context deadline exceeded are exposed as codes.Canceled, codes.DeadlineExceeded instead of 'codes.Unknown' 2021-04-22 14:35:24 +02:00
Piotr Tabor
cc52d994b7
Merge pull request #12883 from ptabor/20210421-backend-refactor-testing
mvcc/backend tests: Refactor: Do not mix testing&prod code.
2021-04-21 12:29:01 +02:00
Piotr Tabor
d7d110b5a8 mvcc/backend tests: Refactor: Do not mix testing&prod code. 2021-04-21 09:43:13 +02:00
Piotr Tabor
ea287dd9f8
Merge pull request #12854 from ptabor/20210410-shouldApplyV3
(no)StoreV2 (Part 3): Applying consistency fix: ClusterVersionSet (and co) might get not applied on v2store
2021-04-21 09:31:38 +02:00
Piotr Tabor
ce3dae6f83
Merge pull request #12873 from ptabor/20210417-test-docker-gcloud
Makefile: Use `gcloud auth configure-docker` instead of `gcloud docker ...` for test-images
2021-04-21 09:30:02 +02:00
Sam Batschelet
0f2c940f64
Merge pull request #12880 from chaochn47/exclude_alarms_from_health_check
etcdhttp/metrics.go: exclude alarms from health check conditionally with `?exclude=NOSPACE`
2021-04-20 21:18:15 -04:00
Chao Chen
b321c48b2d Update CHANGELOG-3.5.md 2021-04-20 13:17:26 -07:00
Chao Chen
140ea4fa29 etcdhttp/metrics.go: exclude alarms from health check conditionally with ?exclude=NOSPACE 2021-04-20 13:17:09 -07:00
Piotr Tabor
8162d9cbdf
Merge pull request #12876 from qsyqian/branch_management_link
doc: fix branch management link
2021-04-19 23:08:42 +02:00
Gyuho Lee
334e696f21
Merge pull request #12878 from lilic/fix-docker-release
Makefile, build.sh: Fix build process
2021-04-19 10:12:42 -07:00
Lili Cosic
51c28fc475 build.sh: Adjust building etcdctl to be same as etcd binary
This fixes so that the ENV vars are taken in the same way as for etcd
binary.
2021-04-19 17:51:46 +02:00
Lili Cosic
81652d16ef Makefile: Fix build-docker-release-master
Since the Dockerfile files are now per arch, this adjusts to detect ARCH
and builds docker release from the Dockerfile.<ARCH> file.
2021-04-19 17:47:03 +02:00
Piotr Tabor
11249fdee9
Merge pull request #12874 from ptabor/20210417-go1.16.3
Update go for 3.5: 1.15.x -> 1.16.3
2021-04-19 16:51:49 +02:00
Piotr Tabor
3423a949c0 Update go for 3.5: 1.15 -> 1.16.(3).
https://github.com/etcd-io/etcd/issues/12732
2021-04-19 16:50:54 +02:00
qsyqian
2cbd86b102 doc: fix branch management link
Fixes #12875
2021-04-19 15:25:19 +08:00
Piotr Tabor
2f77a1ac67
Merge pull request #12864 from ssbostan/master
client: fix check datascale command for https endpoints
2021-04-18 17:44:07 +02:00
Piotr Tabor
06ba0fc5a2
Merge pull request #12846 from pyiyun/fix-snaptmpfile-bug
etcdserver: remove temp files in snap dir when etcdserver starting
2021-04-17 12:58:46 +02:00
Piotr Tabor
5ad8880d77 Makefile: Use gcloud auth configure-docker instead of gcloud docker. 2021-04-17 10:11:09 +00:00
Saeid Bostandoust
97a8affdd3 fix util.go file 2021-04-17 14:24:56 +04:30
chao
80586c5b47
etcdserver/util.go: reduce memory when logging range requests (#12871)
* etcdserver/util.go: reduce memory when logging range requests

Fixes #12835

* Update CHANGELOG-3.5.md
2021-04-16 16:39:34 -07:00
Piotr Tabor
5744cdf199
Merge pull request #12870 from ptabor/20210416-fix-flake-TestSnapshotV3RestoreMultiMemberAdd-master
Fix TestSnapshotV3RestoreMultiMemberAdd flakes (leaks)
2021-04-16 21:49:46 +02:00
Piotr Tabor
17b982382e Fix TestSnapshotV3RestoreMultiMemberAdd flakes (leaks)
- most important: unix's socket transport should not keep idle
connections. For top-level Transport we close them using:
    f3c518025e/server/etcdserver/api/rafthttp/transport.go (L226)
    but currently we don't have access to close them witing the nest (unix) transport. Short idle deadline is good enough.

  - Use dialContext (instead of dial) to make sure context is passed down the stack
  - Make sure Context is cancelled as soon as the operation is done in pipeline
  - nit: use dedicated method to yeld goroutines.

Tested with:
```
d=$(date +"%Y%m%d_%H%M")
(cd tests && go test --timeout=60m ./integration/snapshot -run TestSnapshotV3RestoreMultiMemberAdd -v --count=180 2>&1 | tee log_${d}.log)
```

There were transports & cmux leaked:

```
   leak.go:118: Test appears to have leaked a Transport:
        internal/poll.runtime_pollWait(0x7f6c5c3784c8, 0x72, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
        internal/poll.(*pollDesc).wait(0xc003296298, 0x72, 0x0, 0x18, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
        internal/poll.(*pollDesc).waitRead(...)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
        internal/poll.(*FD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
        net.(*netFD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x18, 0xc0009056e2, 0x203000)
        	/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
        net.(*conn).Read(0xc000010258, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/net/net.go:183 +0x91
        github.com/soheilhy/cmux.(*bufferedReader).Read(0xc0003d24e0, 0xc0031f60a8, 0x18, 0x18, 0xc0003d24d0, 0xc0009056e2, 0xc000278400)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/buffer.go:53 +0x12d
        github.com/soheilhy/cmux.hasHTTP2Preface(0x1367e20, 0xc0003d24e0, 0x7f6c5c699f40)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/matchers.go:195 +0x8a
        github.com/soheilhy/cmux.matchersToMatchWriters.func1(0x7f6c5c699f40, 0xc000010258, 0x1367e20, 0xc0003d24e0, 0xc000010258)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:128 +0x39
        github.com/soheilhy/cmux.(*cMux).serve(0xc003228690, 0x138c410, 0xc000010258, 0xc00327f740, 0xc0059ba860)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:192 +0x1e7
        created by github.com/soheilhy/cmux.(*cMux).Serve
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:179 +0x191

        internal/poll.runtime_pollWait(0x7f6c5c60f3f0, 0x72, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
        internal/poll.(*pollDesc).wait(0xc000d53018, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
        internal/poll.(*pollDesc).waitRead(...)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
        internal/poll.(*FD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
        net.(*netFD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x3, 0x3, 0x1000000000001)
        	/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
        net.(*conn).Read(0xc00031a570, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/net/net.go:183 +0x91
        net/http.(*persistConn).Read(0xc00093b320, 0xc000cfd000, 0x1000, 0x1000, 0x577750, 0x60, 0x0)
        	/usr/lib/google-golang/src/net/http/transport.go:1933 +0x77
        bufio.(*Reader).fill(0xc005702fc0)
        	/usr/lib/google-golang/src/bufio/bufio.go:101 +0x108
        bufio.(*Reader).Peek(0xc005702fc0, 0x1, 0xc00077c660, 0xc003b082a0, 0xc000d08de0, 0x5ae586, 0x11dd6c0)
        	/usr/lib/google-golang/src/bufio/bufio.go:139 +0x4f
        net/http.(*persistConn).readLoop(0xc00093b320)
        	/usr/lib/google-golang/src/net/http/transport.go:2094 +0x1a8
        created by net/http.(*Transport).dialConn
        	/usr/lib/google-golang/src/net/http/transport.go:1754 +0xdaa

        net/http.(*persistConn).writeLoop(0xc00093b320)
        	/usr/lib/google-golang/src/net/http/transport.go:2393 +0xf7
        created by net/http.(*Transport).dialConn
        	/usr/lib/google-golang/src/net/http/transport.go:1755 +0xdcf

        sync.runtime_Semacquire(0xc0059ba868)
        	/usr/lib/google-golang/src/runtime/sema.go:56 +0x45
        sync.(*WaitGroup).Wait(0xc0059ba860)
        	/usr/lib/google-golang/src/sync/waitgroup.go:130 +0x65
        github.com/soheilhy/cmux.(*cMux).Serve.func1(0xc003228690, 0xc0059ba860)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:158 +0x56
        github.com/soheilhy/cmux.(*cMux).Serve(0xc003228690, 0x13698c0, 0xc00377a0f0)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:173 +0x115
        go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func1(0xc0007cc360, 0x122b75f)
        	/home/ptab/corp/etcd/server/embed/etcd.go:518 +0x2b9
        go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func3(0xc00036d080, 0xc0059330a0)
        	/home/ptab/corp/etcd/server/embed/etcd.go:549 +0x182
        created by go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers
        	/home/ptab/corp/etcd/server/embed/etcd.go:543 +0x73a
--- FAIL: TestSnapshotV3RestoreMultiMemberAdd (17.74s)
```
2021-04-16 20:17:28 +02:00
pyiyun
28a490b09c etcdserver: remove temp files in snap dir when etcdServer starting
When etcd exits abnormally, tmp files will remain in snap dir, so clean up tmp files in snap dir when etcdserver starting.

Fixes #12837
2021-04-16 20:30:04 +08:00
Saeid Bostandoust
a9c4301c1e fix check datascale command for https endpoints 2021-04-16 03:51:04 +04:30
Piotr Tabor
f3c518025e
Merge pull request #12861 from ptabor/20210413-test-logging-fixes
Embedded server should not mess global loggers (by default)
2021-04-16 00:07:35 +02:00
Piotr Tabor
b47c5fcc12 Address review comments a.d. logging. 2021-04-15 17:54:37 +02:00
Piotr Tabor
c73da740fa
Merge pull request #12862 from lilic/enable-race
.travis.yaml: Enables race in the tests
2021-04-15 17:37:03 +02:00
Lili Cosic
2843fded06 .travis.yaml: Enables race in the tests
As the TestV3WatchRestoreSnapshotUnsync flakiness was fixed, this
removes the TODO and enables the race again.
2021-04-15 14:32:52 +02:00
Piotr Tabor
fad6391745 Configure proper logging for grpc integration tests. 2021-04-14 12:47:38 +02:00
Piotr Tabor
d72f7ef5cc Give control to Embedded servers whether they override global loggers
So far each instance of embed server was overriding the grpc loggers and zap.global loggers.
It's counter intutitive that last created Embedded server was 'wining' and more-over it was breaking grpc expectation to change it "only" before the grpc stack is being used.

This PR introduces explicit call: `embed.Config::SetupGlobalLoggers()`, that changes the loggers where requested. The call is used by etcd main binary.

The immediate benefit from this change is reduction of  test flakiness, as there were flakes due to not a proper logger being used across tests.
2021-04-14 12:47:38 +02:00
Piotr Tabor
eafbc8c57e Update zap logging dependency.
In particular bring up zapgrpc V2 code:
89e382035d
https://pkg.go.dev/google.golang.org/grpc/grpclog#LoggerV2
2021-04-14 12:15:48 +02:00
Piotr Tabor
57a092b45d
Merge pull request #12859 from tomwilkie/fix-mixin
Fix the mixin.
2021-04-13 23:05:52 +02:00
Piotr Tabor
d69e46ea47 Make ShouldApplyV3 an enum - not bool 2021-04-13 23:01:03 +02:00
Tom Wilkie
562d645ac9
Fix the mixin.
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-04-13 19:38:55 +01:00
Piotr Tabor
4388bfc925
Merge pull request #12858 from hnlq715/patch-1
client: fix doc typo
2021-04-13 17:33:43 +02:00
大可
5db0070150
Update doc.go 2021-04-13 22:22:42 +08:00
大可
d7e971e8d8
client: fix doc typo 2021-04-13 16:33:24 +08:00
Piotr Tabor
b1c04ce043 Applying consistency fix: ClusterVersionSet (and co) might get no applied on v2store
ClusterVersionSet, ClusterMemberAttrSet, DowngradeInfoSet functions are
writing both to V2store and backend. Prior this CL there were
in a branch not executed if shouldApplyV3 was false,
e.g. during restore when Backend is up-to-date (has high
consistency-index) while v2store requires replay from WAL log.

The most serious consequence of this bug was that v2store after restore
could have different index (revision) than the same exact store before restore,
so potentially different content between replicas.

Also this change is supressing double-applying of Membership
(ClusterConfig) changes on Backend (store v3) - that lackilly are not
part of MVCC/KeyValue store, so they didn't caused Revisions to be
bumped.

Inspired by jingyih@ comment:
https://github.com/etcd-io/etcd/pull/12820#issuecomment-815299406
2021-04-12 09:43:48 +02:00
Piotr Tabor
7f97dfd45a
Merge pull request #12795 from wpedrak/resend-read-index-on-first-commit-in-term
etcdserver: resend ReadIndex request on empty apply request
2021-04-09 16:41:54 +02:00
wpedrak
08ea9cb756 etcdserver: integration test covering first commit in current term notification 2021-04-09 16:05:02 +02:00
wpedrak
3991a8c9fa etcdserver: replace forceVersionC with FirstCommitInTermNotify 2021-04-09 11:30:42 +02:00
wpedrak
3d485faac5 etcdserver: resend ReadIndex request on empty apply request
Empty apply indicates first commit in current term. It is first time when follower is sure, that it's ReadIndex request can be processed.
2021-04-09 11:30:42 +02:00