What would you like to be added?
Add a new compactor based on revision count, instead of a fixed time interval.
To make this happen, the mvcc store needs to export a
`CompactNotify` function to notify the compactor that the configured number of
write transactions have occurred since the previous compaction. The
new compactor can pick up the revision change and delete out-of-date data
promptly, instead of waiting out a fixed interval. The underlying bbolt db
can then reuse the freed pages as soon as possible.
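For illustration, a compaction loop driven by such a notification might look
like the sketch below; the channel shape, the `retention` field, and the
interface names are assumptions for this sketch, not the final API:
```go
// Sketch of a revision-count-driven compactor. The mvcc store is assumed
// to fire notifyc once the configured number of write transactions has
// occurred since the previous compaction.
type RevGetter interface {
	Rev() int64 // current revision
}

type Compactable interface {
	Compact(rev int64) error
}

type revisionCountCompactor struct {
	retention int64           // number of revisions to retain
	notifyc   <-chan struct{} // fired by the mvcc store (CompactNotify)
	rg        RevGetter
	c         Compactable
	stopc     chan struct{}
}

func (rc *revisionCountCompactor) Run() {
	for {
		select {
		case <-rc.stopc:
			return
		case <-rc.notifyc:
			if rev := rc.rg.Rev() - rc.retention; rev > 0 {
				// Best effort: a failed compaction is retried on
				// the next notification.
				rc.c.Compact(rev)
			}
		}
	}
}
```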
Why is this needed?
In a Kubernetes cluster running, for instance, Argo Workflows, there will be
batch requests to create pods, followed by a lot of PATCH requests updating
pod status, especially when a pod has more than 3 containers. If such burst
requests increase the db size in a short time, it is easy to exceed the max
quota size. The cluster admin then has to step in to defrag, which may cause
long downtime. So, we hope that etcd can delete out-of-date data as soon as
possible and slow down the growth of the total db size.
Currently, both the revision and periodic compactors are driven by a fixed
time interval, and a fixed interval cannot easily absorb unexpected bursts
of update requests. A new compactor based on revision count can make the
admin's life easier.
For instance, say the average object size is 50 KiB and the new compactor
is configured to compact every 10,000 revisions. etcd then compacts after
roughly 500 MiB of new data has come in, no matter how long it takes to
accumulate those 10,000 revisions. That handles burst update requests well.
Some test results:
* Fixed value size: 10 KiB, Update Rate: 100/s, Total key space: 3,000
```
benchmark put --rate=100 --total=300000 --compact-interval=0 \
--key-space-size=3000 --key-size=256 --val-size=10240
```
| Compactor | DB Total Size | DB InUse Size |
| -- | -- | -- |
| Revision(5min,retention:10000) | 570 MiB | 208 MiB |
| Periodic(1m) | 232 MiB | 165 MiB |
| Periodic(30s) | 151 MiB | 127 MiB |
| NewRevision(retention:10000) | 195 MiB | 187 MiB |
* Random value size: [9 KiB, 11 KiB], Update Rate: 150/s, Total key space: 3,000
```
benchmark put --rate=150 --total=300000 --compact-interval=0 \
--key-space-size=3000 --key-size=256 --val-size=10240 \
--delta-val-size=1024
```
| Compactor | DB Total Size | DB InUse Size |
| -- | -- | -- |
| Revision(5min,retention:10000) | 718 MiB | 554 MiB |
| Periodic(1m) | 297 MiB | 246 MiB |
| Periodic(30s) | 185 MiB | 146 MiB |
| NewRevision(retention:10000) | 186 MiB | 178 MiB |
* Random value size: [6 KiB, 14 KiB], Update Rate: 200/s, Total key space: 3,000
```
benchmark put --rate=200 --total=300000 --compact-interval=0 \
--key-space-size=3000 --key-size=256 --val-size=10240 \
--delta-val-size=4096
```
| Compactor | DB Total Size | DB InUse Size |
| -- | -- | -- |
| Revision(5min,retention:10000) | 874 MiB | 221 MiB |
| Periodic(1m) | 357 MiB | 260 MiB |
| Periodic(30s) | 215 MiB | 151 MiB |
| NewRevision(retention:10000) | 182 MiB | 176 MiB |
For burst requests, the periodic compactor needs a short interval;
otherwise, the total size grows large. I think the new compactor can
handle this well.
Additional Change:
Currently, the quota system only checks the DB total size. However, there
could be a lot of free pages which can be reused by upcoming requests.
Based on this proposal, I also want to extend the current quota system with
the DB's InUse size.
If the InUse size is less than the max quota size, we should allow requests
to update. Since bbolt might be resized when there are no contiguous free
pages available, we should set a hard limit on the overflow, like 1 GiB.
```diff
// Quota represents an arbitrary quota against arbitrary requests. Each request
@@ -130,7 +134,17 @@ func (b *BackendQuota) Available(v interface{}) bool {
return true
}
// TODO: maybe optimize Backend.Size()
- return b.be.Size()+int64(cost) < b.maxBackendBytes
+
+ // Since compaction frees pages that can be reallocated, check
+ // SizeInUse first. If there are no contiguous free pages for a
+ // key/value pair and boltdb keeps resizing, the physical size must
+ // not grow more than 1 GiB past the quota. That is a hard limit.
+ //
+ // TODO: This should be gated behind a flag.
+ if b.be.Size()+int64(cost)-b.maxBackendBytes >= maxAllowedOverflowBytes(b.maxBackendBytes) {
+ return false
+ }
+ return b.be.SizeInUse()+int64(cost) < b.maxBackendBytes
}
```
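The diff references a `maxAllowedOverflowBytes` helper that is not shown; a
plausible definition matching the 1 GiB hard limit described above (tying the
overflow to a fraction of the quota is an assumption of this sketch) could be:
```go
// maxAllowedOverflowBytes bounds how far the physical DB size may grow
// past the configured quota while waiting for compaction to free pages.
// Capping at 1 GiB matches the hard limit described above; the 10%
// fraction is illustrative, not a settled policy.
func maxAllowedOverflowBytes(maxBackendBytes int64) int64 {
	const hardLimit = int64(1 << 30) // 1 GiB
	overflow := maxBackendBytes / 10 // allow up to 10% of the quota
	if overflow > hardLimit {
		return hardLimit
	}
	return overflow
}
```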
It also becomes possible to avoid raising the NOSPACE alarm when compaction
can reclaim enough free pages, which reduces downtime.
Signed-off-by: Wei Fu <fuweid89@gmail.com>
etcdctl/ctlv3: migrate cheggaaa/pb.v1 to cheggaaa/pb/v3
This commit also changes the format of the progress bar, from a custom
progress bar to the default provided by the library.
Old behaviour:
```
./benchmarkv1 put
0 / 10000 B ! 0.00%
3987 / 10000 Boooooooooooooom ! 39.87%
10000 / 10000 Boooooooooooooooooooooooooooooooooooooooooooo! 100.00% 1s
```
New behaviour:
```
./benchmark put
6536 / 10000 [----------------------->________________] 65.36% 7053 p/s
10000 / 10000 [---------------------------------------] 100.00% 7581 p/s
```
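For reference, driving the v3 default bar follows roughly this pattern (a
minimal sketch of the cheggaaa/pb/v3 API, not the actual benchmark code):
```go
package main

import "github.com/cheggaaa/pb/v3"

func main() {
	total := 10000
	bar := pb.StartNew(total) // renders the library's default template
	for i := 0; i < total; i++ {
		// ... issue one request ...
		bar.Increment()
	}
	bar.Finish()
}
```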
Signed-off-by: Mikel Olasagasti Uranga <mikel@olasagasti.info>
This change makes the etcd package compatible with the existing Go
ecosystem for module versioning.
Used this tool to update package imports:
https://github.com/KSubedi/gomove
The current benchmark picks destinations of RPCs at random. However, this
results in divergent benchmark results, because RPCs other than serializable
ranges must be forwarded to the leader node when a follower receives them.
This commit adds a new flag, --target-leader, to avoid the problem. If the
flag is passed, the benchmark always picks an endpoint of the leader node.
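A leader endpoint can be discovered through the maintenance Status API; a
minimal sketch (the helper name and the post-migration import path are
assumptions, and the actual benchmark wiring may differ):
```go
package main

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// leaderEndpoint returns the endpoint whose member ID matches the leader
// ID reported by Status.
func leaderEndpoint(ctx context.Context, cli *clientv3.Client, eps []string) (string, error) {
	for _, ep := range eps {
		resp, err := cli.Status(ctx, ep)
		if err != nil {
			continue
		}
		if resp.Header.MemberId == resp.Leader {
			return ep, nil
		}
	}
	return "", fmt.Errorf("no leader endpoint found")
}
```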
The current benchmark doesn't have an option for configuring the dial
timeout of gRPC. This commit adds --dial-timeout for this purpose. It is
useful for stopping benchmarks that would otherwise hang for a long time.
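The flag maps directly onto clientv3.Config; a minimal sketch:
```go
package main

import (
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// With DialTimeout set, client construction fails promptly instead
	// of blocking indefinitely on an unreachable endpoint.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second, // value of the new --dial-timeout flag
	})
	if err != nil {
		log.Fatalf("dial failed: %v", err)
	}
	defer cli.Close()
}
```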
This commit adds --user for auth in benchmarks. Its purpose is to measure
the overhead of v3 API authentication. Of course, the given user must be
granted permission on the target keys before benchmarking.
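The credentials are again plain clientv3.Config fields; a minimal sketch,
with an illustrative helper for splitting the --user value:
```go
package main

import (
	"strings"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// newAuthedClient parses a --user value of the form "name:password" and
// hands the credentials to clientv3. The helper name is illustrative.
func newAuthedClient(user string, eps []string) (*clientv3.Client, error) {
	cfg := clientv3.Config{Endpoints: eps}
	parts := strings.SplitN(user, ":", 2)
	cfg.Username = parts[0]
	if len(parts) == 2 {
		cfg.Password = parts[1]
	}
	return clientv3.New(cfg)
}
```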
Example of a case with no authentication:
```
% ./benchmark range k1
bench with linearizable range
10000 / 10000 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00%2m10s

Summary:
  Total:        130.1850 secs.
  Slowest:      0.4071 secs.
  Fastest:      0.0064 secs.
  Average:      0.0130 secs.
  Stddev:       0.0079 secs.
  Requests/sec: 76.8138

Response time histogram:
  0.006 [1]    |
  0.046 [9990] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.087 [3]    |
  0.127 [0]    |
  0.167 [3]    |
  0.207 [2]    |
  0.247 [0]    |
  0.287 [0]    |
  0.327 [0]    |
  0.367 [0]    |
  0.407 [1]    |

Latency distribution:
  10% in 0.0076 secs.
  25% in 0.0086 secs.
  50% in 0.0113 secs.
  75% in 0.0146 secs.
  90% in 0.0209 secs.
  95% in 0.0272 secs.
  99% in 0.0344 secs.
```
Example of a case with authentication:
```
% ./benchmark --user=u1:p range k1
bench with linearizable range
10000 / 10000 Booooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo! 100.00%2m11s

Summary:
  Total:        131.4923 secs.
  Slowest:      0.1637 secs.
  Fastest:      0.0065 secs.
  Average:      0.0131 secs.
  Stddev:       0.0070 secs.
  Requests/sec: 76.0501

Response time histogram:
  0.006 [1]    |
  0.022 [9075] |∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.038 [875]  |∎∎∎
  0.054 [36]   |
  0.069 [5]    |
  0.085 [1]    |
  0.101 [1]    |
  0.117 [0]    |
  0.132 [0]    |
  0.148 [5]    |
  0.164 [1]    |

Latency distribution:
  10% in 0.0076 secs.
  25% in 0.0087 secs.
  50% in 0.0114 secs.
  75% in 0.0150 secs.
  90% in 0.0215 secs.
  95% in 0.0272 secs.
  99% in 0.0347 secs.
```
It seems that the current auth mechanism does not introduce visible overhead.
This commit adds flags for profiling with runtime/pprof to storage put:
- --cpuprofile: specify a path for the CPU profiling result; if it is not
empty, profiling is activated
- --memprofile: specify a path for the heap profiling result; if it is not
empty, profiling is activated
Ideally, the flags should be added to RootCmd. However, adding common flags
shared by child commands requires the ongoing PR:
https://github.com/spf13/cobra/pull/220 . Therefore this commit adds the
flags to storage put only.
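The standard runtime/pprof wiring behind the two flags looks roughly like
this (flag plumbing omitted; the variable names are illustrative):
```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

func main() {
	cpuprofile := "cpu.prof"  // value of --cpuprofile
	memprofile := "heap.prof" // value of --memprofile

	if cpuprofile != "" {
		f, err := os.Create(cpuprofile)
		if err != nil {
			log.Fatal(err)
		}
		if err := pprof.StartCPUProfile(f); err != nil {
			log.Fatal(err)
		}
		defer pprof.StopCPUProfile()
	}

	// ... run the storage put benchmark ...

	if memprofile != "" {
		f, err := os.Create(memprofile)
		if err != nil {
			log.Fatal(err)
		}
		defer f.Close()
		if err := pprof.WriteHeapProfile(f); err != nil {
			log.Fatal(err)
		}
	}
}
```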
Reports depended on writing all results to a large buffered channel and
reading from that channel synchronously. Similarly, requests were buffered
the same way, which can take significant memory with big request strings.
Instead, have reports stream in results as they are produced, then print
when the results channel closes.
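The streaming pattern looks roughly like this (names are illustrative, not
the actual report code):
```go
package main

import (
	"fmt"
	"time"
)

// Result is one measured request.
type Result struct {
	Latency time.Duration
	Err     error
}

// report aggregates results as they arrive instead of buffering them all;
// it finishes when the producers close the results channel.
func report(results <-chan Result) <-chan string {
	donec := make(chan string, 1)
	go func() {
		var n int
		var total time.Duration
		for r := range results {
			if r.Err != nil {
				continue
			}
			n++
			total += r.Latency
		}
		avg := time.Duration(0)
		if n > 0 {
			avg = total / time.Duration(n)
		}
		donec <- fmt.Sprintf("requests: %d, avg latency: %v", n, avg)
	}()
	return donec
}
```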