1748 Commits

Author SHA1 Message Date
Benjamin Wang
e44afcfadd
Merge pull request #16460 from geetasg/pr10
Preserve the order of steps done for snapshot
2023-08-23 16:01:31 +08:00
Geeta Gharpure
8729417cee Preserve the order of steps done for snapshot
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-08-22 19:12:37 +00:00
James Blair
cb0df72b70
Use crypto/rand.Read instead of deprecated math/rand.Read.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-08-22 21:48:27 +12:00
Geeta Gharpure
59332dc194 Update to generate v2 snapshot from v3 state
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-08-21 19:18:11 +00:00
Benjamin Wang
e2e17c75fe
Merge pull request #16448 from testwill/pkg-import
chore: pkg import more than once
2023-08-21 18:11:06 +08:00
Marek Siarkowicz
9a6eab2d72
Merge pull request #16373 from serathius/unify-arguments
server: Unify arguments for mvcc methods
2023-08-21 10:09:10 +02:00
guoguangwu
f432c1cf20 chore: pkg import more than once
Signed-off-by: guoguangwu <guoguangwu@magic-shield.com>
2023-08-21 10:19:05 +08:00
Jes Cok
52748f60f3 all: stop using math/rand.Seed
Fixes #16428.

Signed-off-by: Jes Cok <xigua67damn@gmail.com>
2023-08-20 16:34:44 +08:00
Marek Siarkowicz
d9408473c5 server: Unify arguments for mvcc methods
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-08-18 13:41:13 +02:00
Wei Fu
4db8df677c feature: add new compactor based revision count
What would you like to be added?

Add a new compactor based on revision count instead of a fixed time interval.

To make this possible, the mvcc store needs to export a
`CompactNotify` function that notifies the compactor once the configured
number of write transactions has occurred since the previous compaction. The
new compactor can then observe the revision change and delete out-of-date data
promptly, instead of waiting for a fixed interval. The underlying bbolt db can
reuse the free pages as soon as possible.
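
A minimal sketch of the notification flow described above. Only the name
`CompactNotify` comes from this message; the `store` type, its fields, the
`runOnce` helper, and resetting the write counter on each notification are
assumptions made for illustration, not etcd's implementation.

```go
package main

import "fmt"

// store is a hypothetical stand-in for the mvcc store, not etcd's type.
type store struct {
	rev          int64         // current revision
	compactedRev int64         // last compacted revision
	writes       int64         // writes since the last notification (assumed bookkeeping)
	threshold    int64         // notify after this many writes
	notify       chan struct{} // buffered so Put never blocks
}

func newStore(threshold int64) *store {
	return &store{threshold: threshold, notify: make(chan struct{}, 1)}
}

// CompactNotify exposes the notification channel. The signature is assumed.
func (s *store) CompactNotify() <-chan struct{} { return s.notify }

// Put simulates one write transaction that creates one new revision.
func (s *store) Put() {
	s.rev++
	s.writes++
	if s.writes >= s.threshold {
		s.writes = 0
		select {
		case s.notify <- struct{}{}:
		default: // a notification is already pending
		}
	}
}

// Compact discards history up to and including rev.
func (s *store) Compact(rev int64) {
	if rev > s.compactedRev {
		s.compactedRev = rev
	}
}

// runOnce handles at most one pending notification and keeps the latest
// `retention` revisions. It reports whether a compaction happened.
func runOnce(s *store, retention int64) bool {
	select {
	case <-s.CompactNotify():
		target := s.rev - retention
		if target > s.compactedRev {
			s.Compact(target)
			return true
		}
		return false
	default:
		return false
	}
}

func main() {
	s := newStore(3)
	for i := 0; i < 3; i++ {
		s.Put()
	}
	compacted := runOnce(s, 2)
	fmt.Println(compacted, s.rev, s.compactedRev)
}
```

The compactor reacts to write volume rather than to a timer, so a burst of
writes triggers compaction sooner and a quiet period triggers none.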

Why is this needed?

In a Kubernetes cluster running, for instance, Argo Workflows, there are batch
requests to create pods, followed by many PATCH requests for pod status,
especially when a pod has more than 3 containers. If such a burst of
requests grows the db size in a short time, it can easily exceed the max
quota size. The cluster admin then has to get involved to defrag, which may
cause long downtime. So we hope etcd can delete out-of-date data as
soon as possible and slow the growth of the total db size.

Currently, both the revision and periodic compactors are driven by time. It is
not easy to pick a fixed interval that copes with unexpected bursts of update
requests. The new compactor based on revision count can make the admin's life
easier. For instance, say the average object size is 50 KiB and the new
compactor compacts every 10,000 revisions. etcd then compacts after
roughly 500 MiB of new data arrives, no matter how long it takes to
accumulate 10,000 new revisions. It can handle burst update requests well.
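
The figure above can be checked with a few lines of arithmetic. This sketch
assumes every revision writes exactly one object of the average size; the
helper name `newDataMiB` is made up for the example.

```go
package main

import "fmt"

// newDataMiB returns how much new data (in MiB) accumulates between two
// compactions, assuming one object of avgObjectKiB per revision.
func newDataMiB(avgObjectKiB, revisions float64) float64 {
	return avgObjectKiB * revisions / 1024
}

func main() {
	// 50 KiB * 10,000 revisions = 500,000 KiB
	fmt.Printf("%.1f MiB\n", newDataMiB(50, 10000)) // prints "488.3 MiB"
}
```

So "500 MiB" is a rounded figure: 500,000 KiB is about 488 MiB.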

There are some test results:

* Fixed value size: 10 KiB, Update Rate: 100/s, Total key space: 3,000

```
benchmark put --rate=100 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240
```

|                      Compactor | DB Total Size | DB InUse Size |
|                             -- | --            |            -- |
| Revision(5min,retention:10000) | 570 MiB       |       208 MiB |
|                   Periodic(1m) | 232 MiB       |       165 MiB |
|                  Periodic(30s) | 151 MiB       |       127 MiB |
|   NewRevision(retention:10000) | 195 MiB       |       187 MiB |

* Random value size: [9 KiB, 11 KiB], Update Rate: 150/s, Total key space: 3,000

```
benchmark put --rate=150 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240 \
  --delta-val-size=1024
```

|                      Compactor | DB Total Size | DB InUse Size |
|                             -- | --            |            -- |
| Revision(5min,retention:10000) | 718 MiB       |       554 MiB |
|                   Periodic(1m) | 297 MiB       |       246 MiB |
|                  Periodic(30s) | 185 MiB       |       146 MiB |
|   NewRevision(retention:10000) | 186 MiB       |       178 MiB |

* Random value size: [6 KiB, 14 KiB], Update Rate: 200/s, Total key space: 3,000

```
benchmark put --rate=200 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240 \
  --delta-val-size=4096
```

|                      Compactor | DB Total Size | DB InUse Size |
|                             -- | --            |            -- |
| Revision(5min,retention:10000) | 874 MiB       |       221 MiB |
|                   Periodic(1m) | 357 MiB       |       260 MiB |
|                  Periodic(30s) | 215 MiB       |       151 MiB |
|   NewRevision(retention:10000) | 182 MiB       |       176 MiB |

For burst requests, the periodic compactor needs a short interval;
otherwise, the total size grows large. I think the new compactor can
handle this case well.

Additional Change:

Currently, the quota system only checks the DB total size. However, there
could be a lot of free pages that can be reused for upcoming requests.
Based on this proposal, I also want to extend the current quota system with
the DB's InUse size.

If the InUse size is less than the max quota size, we should allow update
requests. Since bbolt might still grow the file if no contiguous free pages
are available, we should set a hard limit on the overflow, such as 1
GiB.

```diff
 // Quota represents an arbitrary quota against arbitrary requests. Each request
@@ -130,7 +134,17 @@ func (b *BackendQuota) Available(v interface{}) bool {
                return true
        }
        // TODO: maybe optimize Backend.Size()
-       return b.be.Size()+int64(cost) < b.maxBackendBytes
+
+       // Since the compact comes with allocatable pages, we should check the
+       // SizeInUse first. If there is no continuous pages for key/value and
+       // the boltdb continues to resize, it should not increase more than 1
+       // GiB. It's hard limitation.
+       //
+       // TODO: It should be enabled by flag.
+       if b.be.Size()+int64(cost)-b.maxBackendBytes >= maxAllowedOverflowBytes(b.maxBackendBytes) {
+               return false
+       }
+       return b.be.SizeInUse()+int64(cost) < b.maxBackendBytes
 }
```
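
The check in the diff can be exercised in isolation with a small sketch. The
name `maxAllowedOverflowBytes` appears in the diff, but its body is not shown
there; the fixed 1 GiB cap below is an assumption taken from the prose above.
The `backendSizes` struct and the `available` function are hypothetical
stand-ins for the backend and `BackendQuota.Available`.

```go
package main

import "fmt"

const gib = int64(1) << 30

// maxAllowedOverflowBytes: name taken from the diff; the body is assumed
// here to be a fixed 1 GiB cap regardless of the quota.
func maxAllowedOverflowBytes(maxBackendBytes int64) int64 {
	return gib
}

// backendSizes is a hypothetical stand-in for the backend's Size() and
// SizeInUse() values.
type backendSizes struct {
	size      int64 // total file size, including free pages
	sizeInUse int64 // bytes actually holding live data
}

// available mirrors the two-step check in the diff: a hard cap on how far
// the total size may overflow the quota, then the quota applied to the
// in-use size.
func available(be backendSizes, cost, maxBackendBytes int64) bool {
	if be.size+cost-maxBackendBytes >= maxAllowedOverflowBytes(maxBackendBytes) {
		return false
	}
	return be.sizeInUse+cost < maxBackendBytes
}

func main() {
	quota := 2 * gib
	// Total size is over quota, but most pages are free: request allowed.
	fmt.Println(available(backendSizes{size: 2*gib + gib/2, sizeInUse: gib}, 1<<20, quota))
	// Total size overflows the quota by 1 GiB or more: request rejected.
	fmt.Println(available(backendSizes{size: 3 * gib, sizeInUse: gib}, 1<<20, quota))
}
```

With this shape, a file that is large but mostly free pages keeps accepting
writes, while the hard cap still bounds how far the file can grow past the
quota.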

It is also likely that the NOSPACE alarm can be avoided if compaction
frees up many more pages, which can reduce downtime.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-16 23:35:08 +08:00
Benjamin Wang
4a3af340b7 dependency: bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc to v1.16.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-08-15 09:14:44 +01:00
Benjamin Wang
2684447d0d dependency: bump go.opentelemetry.io/otel/sdk to 1.16.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-08-15 09:11:50 +01:00
Benjamin Wang
38b2402971 dependency: bump go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc from 0.37.0 to 0.42.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-08-15 09:03:11 +01:00
James Blair
b6d123d08b
Update to golang 1.20 minor release.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-08-11 15:03:48 +12:00
chaochn47
6757c444c5 2023-08-10 bump up dependencies
Signed-off-by: chaochn47 <chaochn@amazon.com>
2023-08-10 09:13:34 +08:00
Benjamin Wang
f4b5b052b9
Merge pull request #16379 from jmhbnz/weekly-dependencies
Bump golang.org/x/sys from 0.10.0 to 0.11.0
2023-08-07 07:40:44 +01:00
Marek Siarkowicz
bf2170bb99
Merge pull request #16371 from serathius/txn-read
server: Separate txnRead from txnWrite
2023-08-06 20:46:51 +02:00
James Blair
f7126aa1c3
dependency: bump golang.org/x/sys from 0.10.0 to 0.11.0.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-08-06 19:02:15 +12:00
Marek Siarkowicz
81ecac11cb server: Separate txnRead from txnWrite
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-08-04 20:46:33 +02:00
Benjamin Wang
def3494a55
Merge pull request #16360 from iiamabby/weekly-dependencies
[2023-08-03] Bump dependencies identified by dependabot
2023-08-04 16:25:21 +01:00
Marek Siarkowicz
524fddc426
Merge pull request #16355 from serathius/txn-refactor
server: Separate internal txn functions for recursion and have public function create transaction and trace
2023-08-04 15:54:14 +02:00
Benjamin Wang
10c7e81cac
Merge pull request #16358 from ahrtr/remove_creds_bundle_20230802
clientv3: remove the experimental gRPC API grpccredentials.Bundle
2023-08-04 08:46:02 +01:00
Benjamin Wang
0021204c15
Merge pull request #16132 from geetasg/pr5
Add a method to export membership info to v2 store from RaftCluster
2023-08-04 08:43:27 +01:00
=
418bab0ed4 dependency: bump golang.org/x/net 0.12.0 to 0.13.0
Co-authored-by: James Blair <mail@jamesblair.net>
Signed-off-by: = <abby.crimlis@outlook.com>
2023-08-04 09:09:16 +12:00
Marek Siarkowicz
fa21c07baa server: Separate internal functions for recursion and have public function create transaction and trace
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-08-03 17:47:03 +02:00
=
5896e40d23 dependency: bump go.uber.org/zap 1.24.0 to 1.25.0
Co-authored-by: James Blair <mail@jamesblair.net>
Signed-off-by: = <abby.crimlis@outlook.com>
2023-08-03 14:46:33 +12:00
Benjamin Wang
979102f895 clientv3: remove the experimental gRPC API grpccredentials.Bundle
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-08-02 19:35:51 +01:00
Chao Chen
24c6fb4b4d Fix 15877 and bump up gRPC from v1.52.0 to v1.57.0
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-07-31 13:57:24 -07:00
Marek Siarkowicz
9637b07f7b
Merge pull request #16325 from serathius/reader-writer
Separate Writer interface from BatchTx interfaces
2023-07-31 11:48:52 +02:00
Marek Siarkowicz
53cbd81009 Separate Writer interface from BatchTx interfaces
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-07-31 10:18:01 +02:00
Marek Siarkowicz
9126a0fe11
Merge pull request #16324 from chaochn47/bump-up-gRPC
Fix http2 authority header in multiple endpoints scenario and bump up grpc from `v1.51.0` to `v1.52.0`
2023-07-31 08:58:58 +02:00
Geeta Gharpure
e5b7dde17e Add a method to export membership info to v2 store from RaftCluster
Signed-off-by: Geeta Gharpure <geetagh@amazon.com>
2023-07-28 16:55:41 +00:00
Marek Siarkowicz
29769984e6 Remove RLock/RUnlock from BatchTx
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-07-28 11:39:50 +02:00
Chao Chen
e59e3d709c dependency: bump google.golang.org/grpc from 1.51.0 to 1.52.0
Signed-off-by: Chao Chen <chaochn@amazon.com>
2023-07-27 13:25:12 -07:00
Marek Siarkowicz
b4f8a7be51 server: Remove Lock/Unlock from ReadTx
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
2023-07-27 13:34:43 +02:00
Marek Siarkowicz
424ced9ff3
Merge pull request #16248 from CaojiamingAlan/replace_lock_with_rlock
Replace unnecessary Lock()/Unlock()s with RLock()/RUnlock()s
2023-07-27 12:15:46 +02:00
Benjamin Wang
21c4061d5c
Merge pull request #16288 from skitt/server-semconv-v1.17.0
server: switch to semconv v1.17.0
2023-07-26 13:30:55 +01:00
Stephen Kitt
1010115b8f
server: switch to semconv v1.17.0
This is the latest semconv package used in etcd's dependencies.
Switching to that version reduces the overall package dependencies of
the project (and helps downstream projects which track this,
e.g. Kubernetes).

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-07-24 15:53:04 +02:00
lan.tian
0f975acf2f
update typo in raft.go
Signed-off-by: lan.tian <lance5890@163.com>
2023-07-24 15:48:55 +08:00
caojiamingalan
bc97a94564 Follow up https://github.com/etcd-io/etcd/pull/16068#discussion_r1263664700
Replace unnecessary Lock()/Unlock()s with RLock()/RUnlock()s

Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>
2023-07-14 20:08:25 -05:00
iuriatan
abbfc2964a Fix goword issue
Fix `make verify` issues after updating golangci-lint

Signed-off-by: iuriatan <iuriatan@gmail.com>
2023-07-14 16:46:26 -03:00
cui fliter
6760dc9572 remove repetitive the
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-07-13 17:01:01 +08:00
Benjamin Wang
2c22ca7eba dependency: bump golang.org/x/net from v0.11.0 to v0.12.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-07-10 18:43:30 +01:00
Benjamin Wang
843ddb4b1e dependency: bump golang.org/x/crypto from v0.10.0 to v0.11.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-07-10 18:40:35 +01:00
Benjamin Wang
149256735d dependency: bump golang.org/x/sys from v0.9.0 to v0.10.0
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-07-10 18:38:16 +01:00
Benjamin Wang
f4444e8fb3
Merge pull request #16154 from CaojiamingAlan/uber_applier_test
add tests for uber applier
2023-07-06 20:00:11 +01:00
Benjamin Wang
e887e5291a
Merge pull request #16067 from geetasg/pr1
Adding test for updateClusterVersionV3
2023-07-06 08:36:19 +01:00
caojiamingalan
eff9517a90 etcdserver: add cluster id check for hashKVHandler
Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>
2023-07-05 14:09:40 -05:00
Tom Wieczorek
a8a9ebd281
auth: Support for EdDSA JWT algorithm
The golang-jwt library supports this already, so supporting it is just a
matter of wiring things up.

Signed-off-by: Tom Wieczorek <twieczorek@mirantis.com>
2023-07-05 11:33:08 +02:00
caojiamingalan
ffe73f9a15 add tests for uber applier
Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>
2023-06-30 22:03:29 -05:00