20023 Commits

Author SHA1 Message Date
Wei Fu
4db8df677c feature: add new compactor based revision count
What would you like to be added?

Add a new compactor based on revision count, instead of a fixed time interval.

To make this possible, the mvcc store needs to export a
`CompactNotify` function that notifies the compactor when the configured number of
write transactions have occurred since the previous compaction. The
new compactor can then track revision changes and delete out-of-date data promptly,
instead of waiting for a fixed time interval. The underlying bbolt db can
then reuse the free pages as soon as possible.
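
The notification flow described above can be sketched roughly as follows. This is a minimal, single-goroutine illustration, not the code from this commit: the `revisionCounter` type, `commitWrite`, `compactTarget`, and `simulate` are hypothetical names, and only the `CompactNotify` name comes from the text. It assumes the compactor keeps the latest `retention` revisions each time it is notified.

```go
package main

import "fmt"

// revisionCounter is a hypothetical stand-in for the mvcc store side of the
// proposal: it counts write transactions and signals once the configured
// number of writes has happened since the last notification.
type revisionCounter struct {
	threshold int64      // writes between notifications (assumed setting)
	pending   int64      // writes since the last notification
	rev       int64      // current revision
	notify    chan int64 // carries the revision at which the threshold was hit
}

func newRevisionCounter(threshold int64) *revisionCounter {
	return &revisionCounter{threshold: threshold, notify: make(chan int64, 1)}
}

// CompactNotify exposes the notification channel to the compactor.
func (c *revisionCounter) CompactNotify() <-chan int64 { return c.notify }

// commitWrite records one write transaction and notifies when due.
func (c *revisionCounter) commitWrite() {
	c.rev++
	c.pending++
	if c.pending >= c.threshold {
		c.pending = 0
		select {
		case c.notify <- c.rev:
		default: // a notification is already pending; skip this one
		}
	}
}

// compactTarget returns the revision to compact to while keeping the latest
// `retention` revisions. It returns 0 when there is nothing to compact yet.
func compactTarget(rev, retention int64) int64 {
	if rev <= retention {
		return 0
	}
	return rev - retention
}

// simulate applies `writes` write transactions and returns the revisions the
// hypothetical compactor would compact to.
func simulate(writes, threshold, retention int64) []int64 {
	c := newRevisionCounter(threshold)
	var compacted []int64
	for i := int64(0); i < writes; i++ {
		c.commitWrite()
		select {
		case rev := <-c.CompactNotify():
			if target := compactTarget(rev, retention); target > 0 {
				compacted = append(compacted, target)
			}
		default:
		}
	}
	return compacted
}

func main() {
	fmt.Println(simulate(35, 10, 10)) // prints [10 20]
}
```

The point of the sketch is that compaction is triggered by the number of writes rather than by elapsed time, so a burst of updates triggers compaction sooner.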

Why is this needed?

In a Kubernetes cluster running, for instance, Argo Workflows, there are batches of
requests to create pods, followed by many PATCH requests for pod status,
especially when a pod has more than 3 containers. If such a burst of
requests increases the db size in a short time, it can easily exceed the max
quota size. The cluster admin then has to step in and defragment, which may cause
long downtime. So we hope etcd can delete out-of-date data as
soon as possible and slow down the growth of the total db size.

Currently, both the revision and periodic compactors are driven by time. A fixed
time interval does not cope well with unexpected bursts of update requests.
The new compactor based on revision count can make the admin's life easier.
For instance, say the average object size is 50 KiB and the new
compactor compacts every 10,000 revisions. Then etcd
compacts after about 500 MiB of new data has come in, no matter how long it takes
to accumulate 10,000 new revisions. It can handle bursts of update requests well.
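
The estimate above can be checked with a few lines of arithmetic. This is only a sketch, assuming that every revision writes exactly one object of the average size; `bytesBetweenCompactions` is a hypothetical helper name.

```go
package main

import "fmt"

const kib = 1024

// bytesBetweenCompactions estimates how much new data arrives between two
// compactions, assuming one object of average size per revision.
func bytesBetweenCompactions(avgObjectBytes, revisions int64) int64 {
	return avgObjectBytes * revisions
}

func main() {
	b := bytesBetweenCompactions(50*kib, 10000)
	fmt.Printf("%.1f MiB\n", float64(b)/(1024*1024)) // prints 488.3 MiB
}
```

So 10,000 revisions of 50 KiB objects come to roughly 488 MiB, which the text rounds to 500 MiB.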

There are some test results:

* Fixed value size: 10 KiB, Update Rate: 100/s, Total key space: 3,000

```
benchmark put --rate=100 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240
```

|                      Compactor | DB Total Size | DB InUse Size |
|                             -- | --            |            -- |
| Revision(5min,retention:10000) | 570 MiB       |       208 MiB |
|                   Periodic(1m) | 232 MiB       |       165 MiB |
|                  Periodic(30s) | 151 MiB       |       127 MiB |
|   NewRevision(retention:10000) | 195 MiB       |       187 MiB |

* Random value size: [9 KiB, 11 KiB], Update Rate: 150/s, Total key space: 3,000

```
benchmark put --rate=150 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240 \
  --delta-val-size=1024
```

|                      Compactor | DB Total Size | DB InUse Size |
|                             -- | --            |            -- |
| Revision(5min,retention:10000) | 718 MiB       |       554 MiB |
|                   Periodic(1m) | 297 MiB       |       246 MiB |
|                  Periodic(30s) | 185 MiB       |       146 MiB |
|   NewRevision(retention:10000) | 186 MiB       |       178 MiB |

* Random value size: [6 KiB, 14 KiB], Update Rate: 200/s, Total key space: 3,000

```
benchmark put --rate=200 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240 \
  --delta-val-size=4096
```

|                      Compactor | DB Total Size | DB InUse Size |
|                             -- | --            |            -- |
| Revision(5min,retention:10000) | 874 MiB       |       221 MiB |
|                   Periodic(1m) | 357 MiB       |       260 MiB |
|                  Periodic(30s) | 215 MiB       |       151 MiB |
|   NewRevision(retention:10000) | 182 MiB       |       176 MiB |

For burst requests, we need to use a short periodic interval.
Otherwise, the total size will be large. I think the new compactor can
handle it well.

Additional Change:

Currently, the quota system only checks the DB total size. However, there
could be a lot of free pages which can be reused for upcoming requests.
Based on this proposal, I also want to extend the current quota system with
the DB's InUse size.

If the InUse size is less than the max quota size, we should allow update
requests. Since bbolt might still grow the file if there are no contiguous
free pages available, we should set a hard limit for the overflow, such as 1
GiB.

```diff
 // Quota represents an arbitrary quota against arbitrary requests. Each request
@@ -130,7 +134,17 @@ func (b *BackendQuota) Available(v interface{}) bool {
                return true
        }
        // TODO: maybe optimize Backend.Size()
-       return b.be.Size()+int64(cost) < b.maxBackendBytes
+
+       // Since the compact comes with allocatable pages, we should check the
+       // SizeInUse first. If there is no continuous pages for key/value and
+       // the boltdb continues to resize, it should not increase more than 1
+       // GiB. It's hard limitation.
+       //
+       // TODO: It should be enabled by flag.
+       if b.be.Size()+int64(cost)-b.maxBackendBytes >= maxAllowedOverflowBytes(b.maxBackendBytes) {
+               return false
+       }
+       return b.be.SizeInUse()+int64(cost) < b.maxBackendBytes
 }
```
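
The check in the diff can be tried out in isolation with a small stand-alone sketch. The `sizer` interface, `fakeBackend`, `available`, and the constant `maxOverflowBytes` are hypothetical names for illustration. In particular, the diff calls `maxAllowedOverflowBytes(b.maxBackendBytes)`, whose definition is not shown, so a fixed 1 GiB limit is assumed here.

```go
package main

import "fmt"

const gib = int64(1) << 30

// maxOverflowBytes is an assumed hard limit on how far the total DB size may
// exceed the quota; the text suggests something like 1 GiB.
const maxOverflowBytes = gib

// sizer is a hypothetical subset of the backend used by the quota check.
type sizer interface {
	Size() int64      // total size of the DB file
	SizeInUse() int64 // size of the pages actually in use
}

type fakeBackend struct{ size, inUse int64 }

func (f fakeBackend) Size() int64      { return f.size }
func (f fakeBackend) SizeInUse() int64 { return f.inUse }

// available mirrors the logic of the diff: reject when the file has already
// overflowed the quota by the hard limit, otherwise decide on in-use size.
func available(be sizer, cost, maxBytes int64) bool {
	if be.Size()+cost-maxBytes >= maxOverflowBytes {
		return false
	}
	return be.SizeInUse()+cost < maxBytes
}

func main() {
	quota := 2 * gib
	// File is over quota, but most pages are free after compaction.
	fmt.Println(available(fakeBackend{size: 2*gib + gib/2, inUse: gib}, 1024, quota))
	// File overflow has reached the hard limit.
	fmt.Println(available(fakeBackend{size: 3 * gib, inUse: gib}, 1024, quota))
}
```

With a 2 GiB quota, the first call is allowed because only 1 GiB is in use, while the second is rejected because the file has grown 1 GiB past the quota.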

And the NOSPACE alarm is likely to be avoided if compaction frees up many
more pages. That can reduce downtime.

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-08-16 23:35:08 +08:00
Benjamin Wang
21c4061d5c
Merge pull request #16288 from skitt/server-semconv-v1.17.0
server: switch to semconv v1.17.0
2023-07-26 13:30:55 +01:00
Benjamin Wang
a6bffb8565
Merge pull request #16306 from ArkaSaha30/main
Manual Dependency Bump
2023-07-26 13:00:59 +01:00
ArkaSaha30
da58ac9847
Bump github.com/mattn/go-runewidth to v0.0.15
Signed-off-by: ArkaSaha30 <arkasaha30@gmail.com>
2023-07-26 13:02:43 +05:30
Benjamin Wang
326dab9bd7
Merge pull request #16279 from ahrtr/roadmap_20230721
Documentation: add roadmap
2023-07-26 08:14:17 +01:00
Benjamin Wang
cb4d3a5697 Documentation: add a roadmap
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-07-25 15:54:22 +01:00
Benjamin Wang
0ba8b0fb16
Merge pull request #16294 from etcd-io/dependabot/github_actions/github/codeql-action-2.21.0
build(deps): bump github/codeql-action from 2.20.4 to 2.21.0
2023-07-25 07:29:03 +01:00
dependabot[bot]
0e8c52504e
build(deps): bump github/codeql-action from 2.20.4 to 2.21.0
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.20.4 to 2.21.0.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](489225d82a...1813ca74c3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-24 17:56:26 +00:00
Stephen Kitt
1010115b8f
server: switch to semconv v1.17.0
This is the latest semconv package used in etcd's dependencies.
Switching to that version reduces the overall package dependencies of
the project (and helps downstream projects which track this,
e.g. Kubernetes).

Signed-off-by: Stephen Kitt <skitt@redhat.com>
2023-07-24 15:53:04 +02:00
Benjamin Wang
d204487b6a
Merge pull request #16283 from lance5890/fix_typo_in_raft.go
update typo in raft.go
2023-07-24 12:42:37 +01:00
lan.tian
0f975acf2f
update typo in raft.go
Signed-off-by: lan.tian <lance5890@163.com>
2023-07-24 15:48:55 +08:00
Benjamin Wang
26b3ecf5aa
Merge pull request #16281 from eltociear/fix-typo
Fix typo in triage_issues.md
2023-07-23 14:51:46 +01:00
Ikko Eltociear Ashimine
0a314c9da3 Fix typo in triage_issues.md
Signed-off-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
2023-07-23 22:47:17 +09:00
Benjamin Wang
6979a06e6c
Merge pull request #16257 from etcd-io/dependabot/go_modules/github.com/cheggaaa/pb/v3-3.1.4
build(deps): bump github.com/cheggaaa/pb/v3 from 3.1.2 to 3.1.4
2023-07-22 06:14:57 +01:00
Benjamin Wang
8d85baec80 dependency: bump github.com/cheggaaa/pb/v3 from 3.1.2 to 3.1.4 for etcdctl and tests
Signed-off-by: Benjamin Wang <wachao@vmware.com>
2023-07-21 16:14:59 +01:00
dependabot[bot]
824337a272 build(deps): bump github.com/cheggaaa/pb/v3 from 3.1.2 to 3.1.4
Bumps [github.com/cheggaaa/pb/v3](https://github.com/cheggaaa/pb) from 3.1.2 to 3.1.4.
- [Commits](https://github.com/cheggaaa/pb/compare/v3.1.2...v3.1.4)

---
updated-dependencies:
- dependency-name: github.com/cheggaaa/pb/v3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-21 16:12:27 +01:00
Benjamin Wang
d1ade07ab0
Merge pull request #16261 from etcd-io/dependabot/go_modules/tools/mod/github.com/mikefarah/yq/v4-4.34.2
build(deps): bump github.com/mikefarah/yq/v4 from 4.34.1 to 4.34.2 in /tools/mod
2023-07-21 15:58:51 +01:00
Benjamin Wang
cd453b931f
Merge pull request #16271 from johnshajiang/cleanup-cluster
tests: cleanup unnecessary assignment in cluster.go
2023-07-20 08:16:36 +01:00
Benjamin Wang
92de641a22
Merge pull request #16268 from fuweid/fix-TestPageWriterRandom
pkg/ioutil: deflake TestPageWriterRandom
2023-07-19 16:07:47 +01:00
John Jiang
51a22c21ff tests: cleanup unnecessary assignment in cluster.go
Signed-off-by: John Jiang <john.sha.jiang@gmail.com>
2023-07-19 21:58:33 +08:00
Wei Fu
fddd1add52 pkg/ioutil: deflake TestPageWriterRandom
The PageWriter has a cache buffer so that it doesn't call the Writer until
the cache is almost full. Since the data's length is random, the pending
bytes should always be less than the cache buffer size, instead of the page
size.

Fix: #16255

Signed-off-by: Wei Fu <fuweid89@gmail.com>
2023-07-18 23:18:01 +08:00
Benjamin Wang
35628b9c78
Merge pull request #16230 from jmhbnz/align-arm64-commands
Ensure release is run for arm64 e2e nightly tests
2023-07-18 12:04:33 +01:00
James Blair
3ff0128842
Fix obtaining UPGRADE_VER in test.sh
Obtain tags from git ls-remote to avoid reliance on local repository state.

Signed-off-by: James Blair <mail@jamesblair.net>
2023-07-18 21:57:51 +12:00
Benjamin Wang
e209968dc3
Merge pull request #16263 from Rajalakshmi-Girish/flake-grpc-rr
Fix flaky integration/clientv3/naming TestEtcdGrpcResolverRoundRobin
2023-07-18 09:49:56 +01:00
Benjamin Wang
eb204f1d32
Merge pull request #16256 from gocurr/simplify_fmt_print
etcdctl/ctlv3/command: simplify code using fmt.Printf with '\n'
2023-07-18 09:47:36 +01:00
Benjamin Wang
7aad281317
Merge pull request #16252 from gocurr/avoid_hardcoding
pkg/expect: avoid hardcoding when checking ErrProcessDone
2023-07-18 09:47:03 +01:00
James Blair
2f65f56351
Ensure release is run for arm64 e2e nightly tests.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-07-18 20:16:55 +12:00
Rajalakshmi Girish
ea72194935 Fix flaky integration/clientv3/naming TestEtcdGrpcResolverRoundRobin
Signed-off-by: Rajalakshmi Girish <rajalakshmi.girish1@ibm.com>
2023-07-17 23:53:02 -07:00
Benjamin Wang
0c643dfb21
Merge pull request #16258 from etcd-io/dependabot/github_actions/github/codeql-action-2.20.4
build(deps): bump github/codeql-action from 2.20.3 to 2.20.4
2023-07-17 19:30:35 +01:00
dependabot[bot]
b71f335740
build(deps): bump github.com/mikefarah/yq/v4 in /tools/mod
Bumps [github.com/mikefarah/yq/v4](https://github.com/mikefarah/yq) from 4.34.1 to 4.34.2.
- [Release notes](https://github.com/mikefarah/yq/releases)
- [Changelog](https://github.com/mikefarah/yq/blob/master/release_notes.txt)
- [Commits](https://github.com/mikefarah/yq/compare/v4.34.1...v4.34.2)

---
updated-dependencies:
- dependency-name: github.com/mikefarah/yq/v4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-17 17:40:07 +00:00
dependabot[bot]
91215fb1ca
build(deps): bump github/codeql-action from 2.20.3 to 2.20.4
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2.20.3 to 2.20.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](46ed16ded9...489225d82a)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-17 17:24:31 +00:00
Jes Cok
223a75b399 etcdctl/ctlv3/command: simplify code using fmt.Printf with '\n'
The current printing code is complicated. This PR simplifies the
code and reduces the function calls.

Signed-off-by: Jes Cok <xigua67damn@gmail.com>
2023-07-17 19:37:15 +08:00
Benjamin Wang
bedd13298d
Merge pull request #16251 from liangyuanpeng/changelog_backport_13577
Add changelog for backport 13577 to 3.4&3.5.
2023-07-17 08:25:25 +01:00
Marek Siarkowicz
11d22abe2b
Merge pull request #16249 from iuriatan/update-linter
Update linter and protoc
2023-07-17 09:13:06 +02:00
Lan
6a9ea5ba6c Add changelog for backport 13577 to 3.4&3.5.
Signed-off-by: Lan Liang <gcslyp@gmail.com>
2023-07-17 13:39:15 +08:00
Jes Cok
5e65553d27 pkg/expect: avoid hardcoding when checking ErrProcessDone
ExpectProcess's Stop method uses 'strings.Contains' to check
the returned err, however, this can be avoided. os.ErrProcessDone's
error message is the same as the hardcoded string. So I think
this explicit error is what this method wants to compare.

Signed-off-by: Jes Cok <xigua67damn@gmail.com>
2023-07-17 13:14:15 +08:00
Benjamin Wang
ff411f517f
Merge pull request #16224 from CaojiamingAlan/expose_isOptsWithFromKey_and_isOptsWithPrefix
expose op.isOptsWithFromKey and op.isOptsWithPrefix
2023-07-15 18:35:39 +01:00
iuriatan
b424e60289 Update protoc from 3.14.0 to 3.20.3
Signed-off-by: iuriatan <iuriatan@gmail.com>
2023-07-14 16:46:26 -03:00
iuriatan
abbfc2964a Fix goword issue
Fix `make verify` issues after updating golangci-lint

Signed-off-by: iuriatan <iuriatan@gmail.com>
2023-07-14 16:46:26 -03:00
iuriatan
b798aae9c5 Update golangci-lint from 1.49.0 to 1.53.3
Signed-off-by: iuriatan <iuriatan@gmail.com>
2023-07-14 16:46:26 -03:00
Benjamin Wang
882edb3d63
Merge pull request #16231 from jmhbnz/robustness-arm64-release-35
Add new job for nightly release35 arm64 robustness
2023-07-14 14:56:38 +01:00
Benjamin Wang
c59bc52286
Merge pull request #16200 from kensou97/keepalive-ctx-closer
clientv3: create keepAliveCtxCloser goroutine only if ctx can be canc…
2023-07-14 13:46:15 +01:00
Benjamin Wang
dee90e19f1
Merge pull request #16229 from ahrtr/changelog_20230712
Changelog: add items to cover the fix of bumping go to 1.19.11
2023-07-14 12:08:07 +01:00
Marek Siarkowicz
2dc7891c7b
Merge pull request #16234 from jmhbnz/add-new-reviewer
Add jmhbnz as etcd reviewer
2023-07-13 12:01:48 +02:00
Marek Siarkowicz
f008184b0e
Merge pull request #16232 from cuishuang/main
remove repetitive the
2023-07-13 11:59:14 +02:00
James Blair
a35d24ab72
Add jmhbnz as etcd reviewer.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-07-13 21:46:04 +12:00
cui fliter
6760dc9572 remove repetitive the
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-07-13 17:01:01 +08:00
James Blair
5ffac59d88
Add new job for nightly release35 arm64 robustness.
Signed-off-by: James Blair <mail@jamesblair.net>
2023-07-13 20:58:16 +12:00
Benjamin Wang
1cf49e5ef0
Merge pull request #16173 from fuweid/fix-datarace-in-expect
pkg/expect: fix data race
2023-07-13 08:49:13 +01:00
Marek Siarkowicz
1ee6be793e
Merge pull request #16226 from ahrtr/go_20230712
Bump go version to 1.19.11 to fix CVE GO-2023-1878
2023-07-13 09:30:57 +02:00