Piotr Tabor
911204cd76
Fix ETCDCTL_API=2 etcdctl backup --with-v3
consistent index consistency
...
Prior to this CL, `ETCDCTL_API=2 etcdctl backup --with-v3` was readacting WAL log
(by removal of some entries), but was NOT updating consistent_index in the backend.
Also the WAL editing logic was buggy, as it didn't took in consideration the fact
that when TERM changes, there can be entries with duplicated indexes in
the log. So its NOT sufficient to subtract number of removed entries to
get accurate log indexes.
The PR replaces removing and shifting of WAL entries with replacing them with an no-op entries.
Thanks to this consistent-index references are staying up to date.
The PR also:
- updates 'verification' logic to check whether consistent_index does not lag befor last snapshot
- env-gated execution of verification framework in `etcdctl backup`.
Tested with:
```
(./build.sh && cd tests && EXPECT_DEBUG=TRUE 'env' 'go' 'test' '-timeout=300m' 'go.etcd.io/etcd/tests/v3/e2e' -run=TestCtlV2Backup --count=1000 2>&1 | tee TestCtlV2BackupV3.log)
```
2021-04-29 11:51:24 +02:00
Piotr Tabor
adc365e14f
etcdctl2: backup command logging cleanup (zap)
2021-04-28 23:26:17 +02:00
Wilson Wang
8d8d0377a2
server: applier uses ReadTx instead of ConcurrentTx
2021-04-28 11:06:24 -07:00
Piotr Tabor
ed4a87d541
Merge pull request #12901 from ptabor/20210427-verification
...
Verification of persisted data
2021-04-28 08:45:15 +02:00
Piotr Tabor
7107cb9f86
fixup! Create 'datadir' package responsible for paths.
2021-04-28 08:44:06 +02:00
Piotr Tabor
c4b13a5c83
Integrate verification framework
...
Verification framework is integrated with:
- integration tests (by default)
- `ETCD_VERIFY=all etcdctl snapshot restore` command
- etcd shutdown when running with `ETCD_VERIFY=all` env.
2021-04-28 07:56:16 +02:00
Piotr Tabor
47b28b600a
Verification package: Verified given data-dir.
...
For now verifies whete Backend.cindex is consistent with WAL log,
but should get expanded to cover memberships & revisions.
2021-04-28 07:56:15 +02:00
Piotr Tabor
6f8f506cf4
Create 'datadir' package responsible for paths.
2021-04-28 07:56:13 +02:00
Piotr Tabor
d2722ff955
Merge pull request #12820 from ptabor/20210326-membership-from-be
...
(no)StoreV2 (Part 2): Prepare to read membership information from backend
2021-04-28 07:53:42 +02:00
Piotr Tabor
067521981e
v2 etcdctl backup: producing consistent state of membership
2021-04-27 19:34:34 +02:00
Piotr Tabor
a70386a1a4
Simplify membership interface: Does not pass the 'unused' token.
2021-04-27 17:17:31 +02:00
Piotr Tabor
4725567d5e
e2e tests: More logging and expect adopted to 3.4.
2021-04-27 17:17:31 +02:00
Piotr Tabor
7ae3d25f91
Membership: Add additional methods to trim/manage membership data in backend.
2021-04-27 17:17:31 +02:00
Piotr Tabor
aa6597384b
etcd-dump-logs: Print full confState as json for debugging purposes.
2021-04-27 17:17:31 +02:00
Piotr Tabor
768da490ed
sever: v2store deprecation: Fix etcdctl snapshot restore
to restore
...
correct 'backend' (bbolt) context in aspect of membership.
Prior to this change the 'restored' backend used to still contain:
- old memberid (mvcc deletion used, why the membership is in bolt
bucket, but not mvcc part):
```
mvs := mvcc.NewStore(s.lg, be, lessor, ci, mvcc.StoreConfig{CompactionBatchLimit: math.MaxInt32})
defer mvs.Close()
txn := mvs.Write(traceutil.TODO())
btx := be.BatchTx()
del := func(k, v []byte) error {
txn.DeleteRange(k, nil)
return nil
}
// delete stored members from old cluster since using new members
btx.UnsafeForEach([]byte("members"), del)
```
- didn't get new members added.
2021-04-27 17:17:30 +02:00
Gyuho Lee
06d6f09a8a
Merge pull request #12894 from MakDon/patch-1
...
etcdserver/mvcc: update tw.trace.Step condition
2021-04-27 00:19:10 -07:00
Gyuho Lee
c4f7d578d8
Merge pull request #12898 from tangcong/add-disk-io-failure-case
...
functional: add disk io failure case
2021-04-26 14:37:53 -07:00
tangcong
4fb22093a6
functional: add SHORT_TTL_LEASE_EXPIRE checker
2021-04-26 20:00:45 +08:00
tangcong
16e38e49a9
functional: add FAILPOINTS_WITH_DISK_IO_LATENCY case
2021-04-26 12:07:05 +08:00
tangcong
370f9cf3b9
fix: failed to get failpoints from member
2021-04-26 11:44:53 +08:00
Makdon
ddcb463822
etcdserver/mvcc: update trace.Step condition
2021-04-25 23:07:45 +08:00
Piotr Tabor
9a3aff6d8f
Merge pull request #12889 from ptabor/20210423-deflake-TestFirstCommitNotification
...
Deflake: TestFirstCommitNotification
2021-04-23 14:55:44 +02:00
Piotr Tabor
9f559775b8
Deflake: TestFirstCommitNotification
...
Infrequently the test flaked. Reproducable with:
```
go test go.etcd.io/etcd/tests/v3/integration --run TestFirstCommitNotification --count=500
```
The moveLeader finishes when configchange is commited by quorum.
It doesn't guarantee that the 'empty' record was committed by the new leader.
From time to time happened that appliedLeaderIndex was returning 9
(without empty entry) and the test flaked. In healthy case the
appliedIndex returned 10.
Fixed by putting kv pair after leader change. The pair is guaranteed
to be stored on index when put finishes (so the empty entry as well).
2021-04-23 13:22:15 +02:00
Piotr Tabor
bd4f8e2b6c
Merge pull request #12885 from ptabor/20210422-error-codes-context
...
Errors: `context cancelled` or `context deadline exceeded` are exposed as codes.Canceled, codes.DeadlineExceeded instead of 'codes.Unknown'
2021-04-23 00:45:59 +02:00
Piotr Tabor
9a4b2bdccc
Errors: context cancelled
or context deadline exceeded
are exposed as codes.Canceled, codes.DeadlineExceeded instead of 'codes.Unknown'
2021-04-22 14:35:24 +02:00
Piotr Tabor
cc52d994b7
Merge pull request #12883 from ptabor/20210421-backend-refactor-testing
...
mvcc/backend tests: Refactor: Do not mix testing&prod code.
2021-04-21 12:29:01 +02:00
Piotr Tabor
d7d110b5a8
mvcc/backend tests: Refactor: Do not mix testing&prod code.
2021-04-21 09:43:13 +02:00
Piotr Tabor
ea287dd9f8
Merge pull request #12854 from ptabor/20210410-shouldApplyV3
...
(no)StoreV2 (Part 3): Applying consistency fix: ClusterVersionSet (and co) might get not applied on v2store
2021-04-21 09:31:38 +02:00
Piotr Tabor
ce3dae6f83
Merge pull request #12873 from ptabor/20210417-test-docker-gcloud
...
Makefile: Use `gcloud auth configure-docker` instead of `gcloud docker ...` for test-images
2021-04-21 09:30:02 +02:00
Sam Batschelet
0f2c940f64
Merge pull request #12880 from chaochn47/exclude_alarms_from_health_check
...
etcdhttp/metrics.go: exclude alarms from health check conditionally with `?exclude=NOSPACE`
2021-04-20 21:18:15 -04:00
Chao Chen
b321c48b2d
Update CHANGELOG-3.5.md
2021-04-20 13:17:26 -07:00
Chao Chen
140ea4fa29
etcdhttp/metrics.go: exclude alarms from health check conditionally with ?exclude=NOSPACE
2021-04-20 13:17:09 -07:00
Piotr Tabor
8162d9cbdf
Merge pull request #12876 from qsyqian/branch_management_link
...
doc: fix branch management link
2021-04-19 23:08:42 +02:00
Gyuho Lee
334e696f21
Merge pull request #12878 from lilic/fix-docker-release
...
Makefile, build.sh: Fix build process
2021-04-19 10:12:42 -07:00
Lili Cosic
51c28fc475
build.sh: Adjust building etcdctl to be same as etcd binary
...
This fixes so that the ENV vars are taken in the same way as for etcd
binary.
2021-04-19 17:51:46 +02:00
Lili Cosic
81652d16ef
Makefile: Fix build-docker-release-master
...
Since the Dockerfile files are now per arch, this adjusts to detect ARCH
and builds docker release from the Dockerfile.<ARCH> file.
2021-04-19 17:47:03 +02:00
Piotr Tabor
11249fdee9
Merge pull request #12874 from ptabor/20210417-go1.16.3
...
Update go for 3.5: 1.15.x -> 1.16.3
2021-04-19 16:51:49 +02:00
Piotr Tabor
3423a949c0
Update go for 3.5: 1.15 -> 1.16.(3).
...
https://github.com/etcd-io/etcd/issues/12732
2021-04-19 16:50:54 +02:00
qsyqian
2cbd86b102
doc: fix branch management link
...
Fixes #12875
2021-04-19 15:25:19 +08:00
Piotr Tabor
2f77a1ac67
Merge pull request #12864 from ssbostan/master
...
client: fix check datascale command for https endpoints
2021-04-18 17:44:07 +02:00
Piotr Tabor
06ba0fc5a2
Merge pull request #12846 from pyiyun/fix-snaptmpfile-bug
...
etcdserver: remove temp files in snap dir when etcdserver starting
2021-04-17 12:58:46 +02:00
Piotr Tabor
5ad8880d77
Makefile: Use gcloud auth configure-docker instead of gcloud docker.
2021-04-17 10:11:09 +00:00
Saeid Bostandoust
97a8affdd3
fix util.go file
2021-04-17 14:24:56 +04:30
chao
80586c5b47
etcdserver/util.go: reduce memory when logging range requests ( #12871 )
...
* etcdserver/util.go: reduce memory when logging range requests
Fixes #12835
* Update CHANGELOG-3.5.md
2021-04-16 16:39:34 -07:00
Piotr Tabor
5744cdf199
Merge pull request #12870 from ptabor/20210416-fix-flake-TestSnapshotV3RestoreMultiMemberAdd-master
...
Fix TestSnapshotV3RestoreMultiMemberAdd flakes (leaks)
2021-04-16 21:49:46 +02:00
Piotr Tabor
17b982382e
Fix TestSnapshotV3RestoreMultiMemberAdd flakes (leaks)
...
- most important: unix's socket transport should not keep idle
connections. For top-level Transport we close them using:
f3c518025e/server/etcdserver/api/rafthttp/transport.go (L226)
but currently we don't have access to close them witing the nest (unix) transport. Short idle deadline is good enough.
- Use dialContext (instead of dial) to make sure context is passed down the stack
- Make sure Context is cancelled as soon as the operation is done in pipeline
- nit: use dedicated method to yeld goroutines.
Tested with:
```
d=$(date +"%Y%m%d_%H%M")
(cd tests && go test --timeout=60m ./integration/snapshot -run TestSnapshotV3RestoreMultiMemberAdd -v --count=180 2>&1 | tee log_${d}.log)
```
There were transports & cmux leaked:
```
leak.go:118: Test appears to have leaked a Transport:
internal/poll.runtime_pollWait(0x7f6c5c3784c8, 0x72, 0xffffffffffffffff)
/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
internal/poll.(*pollDesc).wait(0xc003296298, 0x72, 0x0, 0x18, 0xffffffffffffffff)
/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
net.(*netFD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x18, 0xc0009056e2, 0x203000)
/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
net.(*conn).Read(0xc000010258, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
/usr/lib/google-golang/src/net/net.go:183 +0x91
github.com/soheilhy/cmux.(*bufferedReader).Read(0xc0003d24e0, 0xc0031f60a8, 0x18, 0x18, 0xc0003d24d0, 0xc0009056e2, 0xc000278400)
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/buffer.go:53 +0x12d
github.com/soheilhy/cmux.hasHTTP2Preface(0x1367e20, 0xc0003d24e0, 0x7f6c5c699f40)
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/matchers.go:195 +0x8a
github.com/soheilhy/cmux.matchersToMatchWriters.func1(0x7f6c5c699f40, 0xc000010258, 0x1367e20, 0xc0003d24e0, 0xc000010258)
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:128 +0x39
github.com/soheilhy/cmux.(*cMux).serve(0xc003228690, 0x138c410, 0xc000010258, 0xc00327f740, 0xc0059ba860)
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:192 +0x1e7
created by github.com/soheilhy/cmux.(*cMux).Serve
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:179 +0x191
internal/poll.runtime_pollWait(0x7f6c5c60f3f0, 0x72, 0xffffffffffffffff)
/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
internal/poll.(*pollDesc).wait(0xc000d53018, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
net.(*netFD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x3, 0x3, 0x1000000000001)
/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
net.(*conn).Read(0xc00031a570, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/lib/google-golang/src/net/net.go:183 +0x91
net/http.(*persistConn).Read(0xc00093b320, 0xc000cfd000, 0x1000, 0x1000, 0x577750, 0x60, 0x0)
/usr/lib/google-golang/src/net/http/transport.go:1933 +0x77
bufio.(*Reader).fill(0xc005702fc0)
/usr/lib/google-golang/src/bufio/bufio.go:101 +0x108
bufio.(*Reader).Peek(0xc005702fc0, 0x1, 0xc00077c660, 0xc003b082a0, 0xc000d08de0, 0x5ae586, 0x11dd6c0)
/usr/lib/google-golang/src/bufio/bufio.go:139 +0x4f
net/http.(*persistConn).readLoop(0xc00093b320)
/usr/lib/google-golang/src/net/http/transport.go:2094 +0x1a8
created by net/http.(*Transport).dialConn
/usr/lib/google-golang/src/net/http/transport.go:1754 +0xdaa
net/http.(*persistConn).writeLoop(0xc00093b320)
/usr/lib/google-golang/src/net/http/transport.go:2393 +0xf7
created by net/http.(*Transport).dialConn
/usr/lib/google-golang/src/net/http/transport.go:1755 +0xdcf
sync.runtime_Semacquire(0xc0059ba868)
/usr/lib/google-golang/src/runtime/sema.go:56 +0x45
sync.(*WaitGroup).Wait(0xc0059ba860)
/usr/lib/google-golang/src/sync/waitgroup.go:130 +0x65
github.com/soheilhy/cmux.(*cMux).Serve.func1(0xc003228690, 0xc0059ba860)
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:158 +0x56
github.com/soheilhy/cmux.(*cMux).Serve(0xc003228690, 0x13698c0, 0xc00377a0f0)
/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:173 +0x115
go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func1(0xc0007cc360, 0x122b75f)
/home/ptab/corp/etcd/server/embed/etcd.go:518 +0x2b9
go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func3(0xc00036d080, 0xc0059330a0)
/home/ptab/corp/etcd/server/embed/etcd.go:549 +0x182
created by go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers
/home/ptab/corp/etcd/server/embed/etcd.go:543 +0x73a
--- FAIL: TestSnapshotV3RestoreMultiMemberAdd (17.74s)
```
2021-04-16 20:17:28 +02:00
pyiyun
28a490b09c
etcdserver: remove temp files in snap dir when etcdServer starting
...
When etcd exits abnormally, tmp files will remain in snap dir, so clean up tmp files in snap dir when etcdserver starting.
Fixes #12837
2021-04-16 20:30:04 +08:00
Saeid Bostandoust
a9c4301c1e
fix check datascale command for https endpoints
2021-04-16 03:51:04 +04:30
Piotr Tabor
f3c518025e
Merge pull request #12861 from ptabor/20210413-test-logging-fixes
...
Embedded server should not mess global loggers (by default)
2021-04-16 00:07:35 +02:00
Piotr Tabor
b47c5fcc12
Address review comments a.d. logging.
2021-04-15 17:54:37 +02:00