75 Commits

Author SHA1 Message Date
Hitoshi Mitake
2a750a8dba *: implement a retry logic for auth old revision in the client 2021-09-05 01:13:52 +09:00
Marek Siarkowicz
83a325ac46 server: Move all functions needed for storage bootstrap to storage package
This is prerequestite to move storage bootstrap, splitted to separate PR
to make it easier to review.
2021-08-03 13:09:15 +02:00
Marek Siarkowicz
23b742cfd3 server: Remove Quota direct dependency on EtcdServer 2021-08-03 12:48:41 +02:00
Marek Siarkowicz
44b8ae145b etcdserver: Move datadir and wal to storage package 2021-08-03 12:47:37 +02:00
Sahdev Zala
2526463e44
Merge pull request #13236 from roytman/expensiveRequest
etcdserver: configure "expensive" requests duration
2021-08-02 09:33:43 -04:00
Alexey Roytman
2a26f7ae4c
etcdserver: configure "expensive" requests duration
When a unary request takes more than predefined duration, this request
is defined as "expensive" and a warning is printed. The expensive request
duration is hard-coded to 300 ms. It can be not enough for example
for transactions with a lot of operations. The warnings just blow up
the log files and reduce throughput.

This fix allows user to configure the "expensive" request duration.

Signed-off-by: Alexey Roytman <roytman@il.ibm.com>
2021-07-27 08:33:44 +03:00
Marek Siarkowicz
2f31cc3fbc etcdserver: Create AlarmBackend interface 2021-07-20 17:53:44 +02:00
Marek Siarkowicz
5b6f4579fb server: Rename buckets to schema 2021-07-12 15:37:21 +02:00
Marek Siarkowicz
5e40a8b00c server: Create storage package and move mvcc files to it 2021-07-12 15:37:21 +02:00
ahrtr
d38c383c0d etcdserver: skip empty alarm from the query parameter 2021-07-05 23:54:49 +08:00
Marek Siarkowicz
bf3e7033e9 etcdserver: Move Read/Update methods on Meta bucket to one place
There are still some left like compact keys, but they will require more
work to avoid circular dependency.
2021-07-05 13:23:53 +02:00
Piotr Tabor
1208505290
Merge pull request #13161 from serathius/membership
etcdserver: Membership uses MembershipStorage interface instead of directly accessing Backend
2021-07-03 11:33:38 +02:00
Marek Siarkowicz
e5a026822b etcdserver: Move put/read/delete on Alarm bucket to bucket package 2021-07-01 13:35:10 +02:00
Marek Siarkowicz
50507d5f3c etcdserver: Membership uses MembershipStorage interface instead of directly accessing Backend 2021-06-29 16:14:06 +02:00
Marek Siarkowicz
f79d09d48b etcdserver: Move all named keys to buckets module 2021-06-28 16:40:50 +02:00
Gyuho Lee
2a0f8f0738
Merge pull request #13145 from tangcong/fix-endpoint-health
fix health endpoint not usable when authentication is enabled
2021-06-25 18:33:46 -07:00
Marek Siarkowicz
e2740b4afa server,etcdutl: Preserve etcd version in backend allowing etcdutl to read it from snapshot 2021-06-25 14:06:56 +02:00
tangcong
dd62aebfb5 fix health endpoint not usable when authentication is enabled 2021-06-25 14:02:45 +08:00
Lili Cosic
b9d837183a server/etcdserver/api: Add 3.6 to supported version 2021-06-22 12:25:39 +02:00
Marek Siarkowicz
e1b1d93548 *: Snapshot returns local etcd version
Co-authored-by: Lili Cosic <cosiclili@gmail.com>
2021-06-14 16:36:50 +02:00
J. David Lowe
115c694af6 etcdserver: don't attempt to grant nil permission to a role
Prevent etcd from crashing when given a bad grant payload, e.g.:

$ curl -d '{"name": "foo"}' http://localhost:2379/v3/auth/role/add
{"header":{"cluster_id":"14841639068965178418", ...
$ curl -d '{"name": "foo"}' http://localhost:2379/v3/auth/role/grant
curl: (52) Empty reply from server
2021-06-04 14:20:02 -07:00
Piotr Tabor
f15e0b8237 integration.BeforeTest can be run without leak-detection. 2021-05-27 20:55:11 +02:00
Pavan BG
d9c5e1f0a2 *: Replace internal testutil AssertEqual function in favour of github.com/stretchr/testify/assert.Equals
To maintain uniformity in testing methodology, assert.Equals has been used everywhere while replacing testutil.AssertEqual

Fixes #13015
2021-05-20 19:55:45 +05:30
Piotr Tabor
66752fef2f Represent bucket as object instead of []byte name.
Thanks to this change:
  - all the maps bucket -> buffer are indexed by int's instead of
string. No need to do: byte[] -> string -> hash conversion on each
access.
  - buckets are strongly typed in backend/mvcc API.
2021-05-18 18:58:53 +02:00
Piotr Tabor
e44fb40be5
Merge pull request #12962 from ptabor/20210513-write-conf-state
Save raftpb.ConfState in the backend.
2021-05-13 19:22:28 +02:00
Gyuho Lee
e2d67f2e3b
Merge pull request #12956 from gyuho/rename-to-main
*: rename "master" branch references to "main" in source code
2021-05-13 08:26:33 -07:00
Piotr Tabor
865df75714 Save raftpb.ConfState in the backend.
This makes (bbolt) backend a full feature snapshot in term of WAL/raft,
i.e. carries:
  - commit : (applied_index)
  - confState

Benefits:
  - Backend will be a sufficient point in time definition sufficient to
start replaying WAL. We have applied_index & confState in consistent
state.
  - In case of emergency a backend state can be used for recovery
2021-05-13 14:29:36 +02:00
Piotr Tabor
3cb1ba4b2b
Merge pull request #12954 from serathius/logger-new-ctx-client
client: Add logger argument to NewCtxClient
2021-05-13 09:03:38 +02:00
Gyuho Lee
77c8033739 server: rename "master" branch references
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
2021-05-12 10:37:35 -07:00
Piotr Tabor
f3b4a3e578 Detecting whether v2store is "empty" (metadata only). 2021-05-12 18:09:34 +02:00
Marek Siarkowicz
1189ee3f3d client: Add logger argument to NewCtxClient 2021-05-12 16:40:55 +02:00
Lili Cosic
1a718a958e Add initial Tracing with OpenTelemetry 2021-05-10 10:44:40 +02:00
Piotr Tabor
aeb9b5fc73
Merge pull request #12855 from ptabor/20210409-backend-hooks
(no)StoreV2 (Part 4): Backend hooks:  precommit updates consistency_index
2021-05-08 09:34:31 +02:00
tangcong
b32bc914ff
learner support snapshot RPC (#12890)
* learner support snapshot RPC

* CHANGELOG: update for 12890
2021-05-04 13:26:33 -07:00
Piotr Tabor
2dbecea5b2 Simplify KVStore interaction with cindex thanks to hooks. 2021-05-04 18:21:23 +02:00
Piotr Tabor
a3fc535a5a
Merge pull request #12914 from ptabor/20210430-read-membership
No-storeV2: Read membership information from the backend (Part5)
2021-05-04 10:15:01 +02:00
Piotr Tabor
ffea1537d4 ClientV3 tests use integration.NewClient that configures proper logger. 2021-04-29 18:18:34 +02:00
Piotr Tabor
205a1a442a Read membership information from the backend. 2021-04-29 13:45:45 +02:00
Piotr Tabor
067521981e v2 etcdctl backup: producing consistent state of membership 2021-04-27 19:34:34 +02:00
Piotr Tabor
a70386a1a4 Simplify membership interface: Does not pass the 'unused' token. 2021-04-27 17:17:31 +02:00
Piotr Tabor
7ae3d25f91 Membership: Add additional methods to trim/manage membership data in backend. 2021-04-27 17:17:31 +02:00
Piotr Tabor
768da490ed sever: v2store deprecation: Fix etcdctl snapshot restore to restore
correct 'backend' (bbolt) context in aspect of membership.

Prior to this change the 'restored' backend used to still contain:
  - old memberid (mvcc deletion used, why the membership is in bolt
bucket, but not mvcc part):
    ```
	mvs := mvcc.NewStore(s.lg, be, lessor, ci, mvcc.StoreConfig{CompactionBatchLimit: math.MaxInt32})
	defer mvs.Close()
	txn := mvs.Write(traceutil.TODO())
	btx := be.BatchTx()
	del := func(k, v []byte) error {
		txn.DeleteRange(k, nil)
		return nil
	}

	// delete stored members from old cluster since using new members
	btx.UnsafeForEach([]byte("members"), del)
    ```
  - didn't get new members added.
2021-04-27 17:17:30 +02:00
Piotr Tabor
9a4b2bdccc Errors: context cancelled or context deadline exceeded are exposed as codes.Canceled, codes.DeadlineExceeded instead of 'codes.Unknown' 2021-04-22 14:35:24 +02:00
Piotr Tabor
ea287dd9f8
Merge pull request #12854 from ptabor/20210410-shouldApplyV3
(no)StoreV2 (Part 3): Applying consistency fix: ClusterVersionSet (and co) might get not applied on v2store
2021-04-21 09:31:38 +02:00
Sam Batschelet
0f2c940f64
Merge pull request #12880 from chaochn47/exclude_alarms_from_health_check
etcdhttp/metrics.go: exclude alarms from health check conditionally with `?exclude=NOSPACE`
2021-04-20 21:18:15 -04:00
Chao Chen
140ea4fa29 etcdhttp/metrics.go: exclude alarms from health check conditionally with ?exclude=NOSPACE 2021-04-20 13:17:09 -07:00
Piotr Tabor
17b982382e Fix TestSnapshotV3RestoreMultiMemberAdd flakes (leaks)
- most important: unix's socket transport should not keep idle
connections. For top-level Transport we close them using:
    f3c518025e/server/etcdserver/api/rafthttp/transport.go (L226)
    but currently we don't have access to close them witing the nest (unix) transport. Short idle deadline is good enough.

  - Use dialContext (instead of dial) to make sure context is passed down the stack
  - Make sure Context is cancelled as soon as the operation is done in pipeline
  - nit: use dedicated method to yeld goroutines.

Tested with:
```
d=$(date +"%Y%m%d_%H%M")
(cd tests && go test --timeout=60m ./integration/snapshot -run TestSnapshotV3RestoreMultiMemberAdd -v --count=180 2>&1 | tee log_${d}.log)
```

There were transports & cmux leaked:

```
   leak.go:118: Test appears to have leaked a Transport:
        internal/poll.runtime_pollWait(0x7f6c5c3784c8, 0x72, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
        internal/poll.(*pollDesc).wait(0xc003296298, 0x72, 0x0, 0x18, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
        internal/poll.(*pollDesc).waitRead(...)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
        internal/poll.(*FD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
        net.(*netFD).Read(0xc003296280, 0xc0031f60a8, 0x18, 0x18, 0x18, 0xc0009056e2, 0x203000)
        	/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
        net.(*conn).Read(0xc000010258, 0xc0031f60a8, 0x18, 0x18, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/net/net.go:183 +0x91
        github.com/soheilhy/cmux.(*bufferedReader).Read(0xc0003d24e0, 0xc0031f60a8, 0x18, 0x18, 0xc0003d24d0, 0xc0009056e2, 0xc000278400)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/buffer.go:53 +0x12d
        github.com/soheilhy/cmux.hasHTTP2Preface(0x1367e20, 0xc0003d24e0, 0x7f6c5c699f40)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/matchers.go:195 +0x8a
        github.com/soheilhy/cmux.matchersToMatchWriters.func1(0x7f6c5c699f40, 0xc000010258, 0x1367e20, 0xc0003d24e0, 0xc000010258)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:128 +0x39
        github.com/soheilhy/cmux.(*cMux).serve(0xc003228690, 0x138c410, 0xc000010258, 0xc00327f740, 0xc0059ba860)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:192 +0x1e7
        created by github.com/soheilhy/cmux.(*cMux).Serve
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:179 +0x191

        internal/poll.runtime_pollWait(0x7f6c5c60f3f0, 0x72, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/runtime/netpoll.go:222 +0x55
        internal/poll.(*pollDesc).wait(0xc000d53018, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:87 +0x45
        internal/poll.(*pollDesc).waitRead(...)
        	/usr/lib/google-golang/src/internal/poll/fd_poll_runtime.go:92
        internal/poll.(*FD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/internal/poll/fd_unix.go:166 +0x1d5
        net.(*netFD).Read(0xc000d53000, 0xc000cfd000, 0x1000, 0x1000, 0x3, 0x3, 0x1000000000001)
        	/usr/lib/google-golang/src/net/fd_posix.go:55 +0x4f
        net.(*conn).Read(0xc00031a570, 0xc000cfd000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        	/usr/lib/google-golang/src/net/net.go:183 +0x91
        net/http.(*persistConn).Read(0xc00093b320, 0xc000cfd000, 0x1000, 0x1000, 0x577750, 0x60, 0x0)
        	/usr/lib/google-golang/src/net/http/transport.go:1933 +0x77
        bufio.(*Reader).fill(0xc005702fc0)
        	/usr/lib/google-golang/src/bufio/bufio.go:101 +0x108
        bufio.(*Reader).Peek(0xc005702fc0, 0x1, 0xc00077c660, 0xc003b082a0, 0xc000d08de0, 0x5ae586, 0x11dd6c0)
        	/usr/lib/google-golang/src/bufio/bufio.go:139 +0x4f
        net/http.(*persistConn).readLoop(0xc00093b320)
        	/usr/lib/google-golang/src/net/http/transport.go:2094 +0x1a8
        created by net/http.(*Transport).dialConn
        	/usr/lib/google-golang/src/net/http/transport.go:1754 +0xdaa

        net/http.(*persistConn).writeLoop(0xc00093b320)
        	/usr/lib/google-golang/src/net/http/transport.go:2393 +0xf7
        created by net/http.(*Transport).dialConn
        	/usr/lib/google-golang/src/net/http/transport.go:1755 +0xdcf

        sync.runtime_Semacquire(0xc0059ba868)
        	/usr/lib/google-golang/src/runtime/sema.go:56 +0x45
        sync.(*WaitGroup).Wait(0xc0059ba860)
        	/usr/lib/google-golang/src/sync/waitgroup.go:130 +0x65
        github.com/soheilhy/cmux.(*cMux).Serve.func1(0xc003228690, 0xc0059ba860)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:158 +0x56
        github.com/soheilhy/cmux.(*cMux).Serve(0xc003228690, 0x13698c0, 0xc00377a0f0)
        	/home/ptab/private/golang/pkg/mod/github.com/soheilhy/cmux@v0.1.5/cmux.go:173 +0x115
        go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func1(0xc0007cc360, 0x122b75f)
        	/home/ptab/corp/etcd/server/embed/etcd.go:518 +0x2b9
        go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers.func3(0xc00036d080, 0xc0059330a0)
        	/home/ptab/corp/etcd/server/embed/etcd.go:549 +0x182
        created by go.etcd.io/etcd/server/v3/embed.(*Etcd).servePeers
        	/home/ptab/corp/etcd/server/embed/etcd.go:543 +0x73a
--- FAIL: TestSnapshotV3RestoreMultiMemberAdd (17.74s)
```
2021-04-16 20:17:28 +02:00
Piotr Tabor
d69e46ea47 Make ShouldApplyV3 an enum - not bool 2021-04-13 23:01:03 +02:00
Piotr Tabor
b1c04ce043 Applying consistency fix: ClusterVersionSet (and co) might get no applied on v2store
ClusterVersionSet, ClusterMemberAttrSet, DowngradeInfoSet functions are
writing both to V2store and backend. Prior this CL there were
in a branch not executed if shouldApplyV3 was false,
e.g. during restore when Backend is up-to-date (has high
consistency-index) while v2store requires replay from WAL log.

The most serious consequence of this bug was that v2store after restore
could have different index (revision) than the same exact store before restore,
so potentially different content between replicas.

Also this change is supressing double-applying of Membership
(ClusterConfig) changes on Backend (store v3) - that lackilly are not
part of MVCC/KeyValue store, so they didn't caused Revisions to be
bumped.

Inspired by jingyih@ comment:
https://github.com/etcd-io/etcd/pull/12820#issuecomment-815299406
2021-04-12 09:43:48 +02:00
Piotr Tabor
3bb7acc8cf Migrate dependencies pkg/foo -> client/pkg/foo 2021-04-07 00:38:47 +02:00