Mirroristas/etcd

mirror of https://github.com/etcd-io/etcd.git synced 2024-09-27 06:25:44 +00:00

Author	SHA1	Message	Date
Wei Fu	4db8df677c	feature: add new compactor based revision count What would you like to be added? Add a new compactor based on revision count, instead of a fixed time interval. To make it happen, the mvcc store needs to export a `CompactNotify` function to notify the compactor that the configured number of write transactions have occurred since the previous compaction. The new compactor can get the revision change and delete out-of-date data in time, instead of waiting for a fixed time interval. The underlying bbolt db can reuse the free pages as soon as possible. Why is this needed? In a Kubernetes cluster running, for instance, Argo Workflows, there will be batch requests to create pods, and then there are also a lot of PATCH requests for pod status, especially when the pod has more than 3 containers. If the burst requests increase the db size in a short time, it will be easy to exceed the max quota size. Then the cluster admin gets involved to defrag, which may cause long downtime. So, we hope that etcd can delete the out-of-date data as soon as possible and slow down the growth of the total db size. Currently, both revision and periodic are based on time. It's not easy to use a fixed time interval to face unexpected burst update requests. The new compactor based on revision count can make the admin's life easier. For instance, let's say that the average object size is 50 KiB. The new compactor will compact based on 10,000 revisions. That means etcd can compact after about 500 MiB of new data comes in, no matter how long etcd takes to get 10,000 new revisions. It can handle the burst update requests well. 
There are some test results:

* Fixed value size: 10 KiB, Update Rate: 100/s, Total key space: 3,000

```
benchmark put --rate=100 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240
```

| Compactor | DB Total Size | DB InUse Size |
| -- | -- | -- |
| Revision(5min,retention:10000) | 570 MiB | 208 MiB |
| Periodic(1m) | 232 MiB | 165 MiB |
| Periodic(30s) | 151 MiB | 127 MiB |
| NewRevision(retention:10000) | 195 MiB | 187 MiB |

* Random value size: [9 KiB, 11 KiB], Update Rate: 150/s, Total key space: 3,000

```
benchmark put --rate=150 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240 \
  --delta-val-size=1024
```

| Compactor | DB Total Size | DB InUse Size |
| -- | -- | -- |
| Revision(5min,retention:10000) | 718 MiB | 554 MiB |
| Periodic(1m) | 297 MiB | 246 MiB |
| Periodic(30s) | 185 MiB | 146 MiB |
| NewRevision(retention:10000) | 186 MiB | 178 MiB |

* Random value size: [6 KiB, 14 KiB], Update Rate: 200/s, Total key space: 3,000

```
benchmark put --rate=200 --total=300000 --compact-interval=0 \
  --key-space-size=3000 --key-size=256 --val-size=10240 \
  --delta-val-size=4096
```

| Compactor | DB Total Size | DB InUse Size |
| -- | -- | -- |
| Revision(5min,retention:10000) | 874 MiB | 221 MiB |
| Periodic(1m) | 357 MiB | 260 MiB |
| Periodic(30s) | 215 MiB | 151 MiB |
| NewRevision(retention:10000) | 182 MiB | 176 MiB |

For the burst requests, we need to use a short periodic interval. Otherwise, the total size will be large. I think the new compactor can handle it well.

Additional Change: Currently, the quota system only checks the DB total size. However, there could be a lot of free pages which can be reused for upcoming requests. Based on this proposal, I also want to extend the current quota system with the DB's InUse size. If the InUse size is less than the max quota size, we should allow requests to update.
Since bbolt might resize the file if there are no available contiguous pages, we should set up a hard limit for the overflow, like 1 GiB.

```diff
 // Quota represents an arbitrary quota against arbitrary requests. Each request
@@ -130,7 +134,17 @@ func (b *BackendQuota) Available(v interface{}) bool {
 		return true
 	}
 	// TODO: maybe optimize Backend.Size()
-	return b.be.Size()+int64(cost) < b.maxBackendBytes
+
+	// Since the compact comes with allocatable pages, we should check the
+	// SizeInUse first. If there is no continuous pages for key/value and
+	// the boltdb continues to resize, it should not increase more than 1
+	// GiB. It's hard limitation.
+	//
+	// TODO: It should be enabled by flag.
+	if b.be.Size()+int64(cost)-b.maxBackendBytes >= maxAllowedOverflowBytes(b.maxBackendBytes) {
+		return false
+	}
+	return b.be.SizeInUse()+int64(cost) < b.maxBackendBytes
 }
```

And it is likely to disable the NOSPACE alarm if compaction frees up many more pages. It can reduce downtime. Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-08-16 23:35:08 +08:00
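The quota check proposed in the diff above can be tried in isolation. The following is a minimal sketch, not the etcd implementation: `quotaAvailable` and the constant `maxOverflowBytes` are hypothetical names, and the fixed 1 GiB cap is an assumption standing in for `maxAllowedOverflowBytes`, whose definition is not shown in the commit message.

```go
package main

import "fmt"

// maxOverflowBytes is an ASSUMED hard cap (1 GiB) on how far the database
// file may grow past the quota. The commit's maxAllowedOverflowBytes helper
// is not shown, so this constant only stands in for it.
const maxOverflowBytes int64 = 1 << 30

// quotaAvailable mirrors the two checks in the diff: reject when the file
// would overflow the quota by the hard cap or more, otherwise admit the
// request as long as the in-use size stays below the quota.
func quotaAvailable(size, sizeInUse, cost, maxBytes int64) bool {
	if size+cost-maxBytes >= maxOverflowBytes {
		return false
	}
	return sizeInUse+cost < maxBytes
}

func main() {
	const gib = int64(1) << 30
	const mib = int64(1) << 20

	// File is 2.5 GiB (over a 2 GiB quota) but only 1 GiB is in use.
	fmt.Println(quotaAvailable(5*gib/2, gib, mib, 2*gib))
	// File has overflowed the quota by 1 GiB or more.
	fmt.Println(quotaAvailable(3*gib, gib, mib, 2*gib))
	// In-use size has reached the quota.
	fmt.Println(quotaAvailable(2*gib, 2*gib, mib, 2*gib))
}
```

Under these assumptions the first request is admitted because compaction has left reusable free pages, while the other two are rejected by the overflow cap and by the in-use check respectively.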
Benjamin Wang	10c7e81cac	Merge pull request #16358 from ahrtr/remove_creds_bundle_20230802 clientv3: remove the experimental gRPC API grpccredentials.Bundle	2023-08-04 08:46:02 +01:00
Benjamin Wang	979102f895	clientv3: remove the experimental gRPC API grpccredentials.Bundle Signed-off-by: Benjamin Wang <wachao@vmware.com>	2023-08-02 19:35:51 +01:00
Geeta Gharpure	e5b7dde17e	Add a method to export membership info to v2 store from RaftCluster Signed-off-by: Geeta Gharpure <geetagh@amazon.com>	2023-07-28 16:55:41 +00:00
iuriatan	abbfc2964a	Fix goword issue Fix `make verify` issues after updating golangci-lint Signed-off-by: iuriatan <iuriatan@gmail.com>	2023-07-14 16:46:26 -03:00
caojiamingalan	eff9517a90	etcdserver: add cluster id check for hashKVHandler Signed-off-by: caojiamingalan <alan.c.19971111@gmail.com>	2023-07-05 14:09:40 -05:00
Geeta Gharpure	e9fa3d30d7	Enable test to verify membership recovery from backend Signed-off-by: Geeta Gharpure <geetagh@amazon.com>	2023-06-13 18:33:03 +00:00
cui fliter	0c919dc212	use the more efficient strings.Builder Signed-off-by: cui fliter <imcusg@gmail.com>	2023-05-19 10:44:58 +08:00
Wei Fu	1ba577e499	server/etcdserver: togRPCError for maintenance API It's to deflake TestAuthMemberRemove. When the client has multiple endpoints, the client might send a request with a valid token to a follower member which hasn't received the replicated token log yet. The member will reject the request. For instance, the maintenance.Status API will return "auth: invalid auth token". But the client doesn't identify the error, so it won't retry to refresh the auth token. The maintenance.Status should togRPCError before returning so that the client can refresh the token. This aligns with the existing API. Since the maintenance client always creates one connection to the target member, the member will have the token after the auth refresh. Maybe we can introduce a sync to wait until the member is ready with the token, instead of refreshing. Fixes: #15758 Signed-off-by: Wei Fu <fuweid89@gmail.com>	2023-04-22 18:35:53 +08:00
Benjamin Wang	dae1d70189	test: workaround the breaking change in jonboulle/clockwork See - https://github.com/jonboulle/clockwork/pull/55 - https://github.com/jonboulle/clockwork/blob/v0.3.0/clockwork.go#L42 Signed-off-by: Benjamin Wang <wachao@vmware.com>	2023-04-11 12:01:09 +08:00
Peter Wortmann	74feb229c7	etcdserver: Guarantee order of requested progress notifications Progress notifications requested using ProgressRequest were sent directly using the ctrlStream, which means that they could race against watch responses in the watchStream. This would especially happen when the stream was not synced - e.g. if you requested a progress notification on a freshly created unsynced watcher, the notification would typically arrive indicating a revision for which not all watch responses had been sent. This changes the behaviour so that v3rpc always goes through the watch stream, using a new RequestProgressAll function that closely matches the behaviour of the v3rpc code - i.e. 1. Generate a message with WatchId -1, indicating the revision for all watchers in the stream 2. Guarantee that a response is (eventually) sent The latter might require us to defer the response until all watchers are synced, which is likely as it should be. Note that we do not guarantee that the number of progress notifications matches the number of requests, only that eventually at least one gets sent. Signed-off-by: Peter Wortmann <peter.wortmann@skao.int>	2023-04-05 11:54:10 +01:00
xakdwch	c767f429f0	rafthttp: replace inline code with existing function The isMsgApp function already checks whether a message is a MsgApp message; use it instead of the inline check. Signed-off-by: xakdwch <xakdwch5@gmail.com>	2023-03-03 09:50:14 +08:00
xin.li	b17b9c1428	chore: Use http constants to replace numbers as parameters Signed-off-by: xin.li <xin.li@daocloud.io>	2023-02-20 11:53:41 +08:00
Piotr Tabor	9abc895122	Goimports: Apply automated fixing to test files as well. Signed-off-by: Piotr Tabor <ptab@google.com>	2022-12-29 13:04:45 +01:00
Piotr Tabor	9e1abbab6e	Fix goimports in all existing files. Execution of ./scripts/fix.sh Signed-off-by: Piotr Tabor <ptab@google.com>	2022-12-29 09:41:31 +01:00
Benjamin Wang	394956ca4e	doc: cleanup etcd/raft in all documents TODO: 1. Update Documentation/contributor-guide/modules.svg; 2. Update bill-of-materials.json when raft and raftexample are removed in future; Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-12-02 14:13:18 +08:00
Benjamin Wang	faff80a2b3	etcdserver: format the source code gofmt -w ./server Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-12-02 13:00:59 +08:00
Benjamin Wang	e9aa275b36	etcdserver: update etcdserver to use the new raft module go.etcd.io/raft/v3 Just replaced all go.etcd.io/etcd/raft/v3 with go.etcd.io/raft/v3 under directory server. Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-12-02 09:33:45 +08:00
Bhargav Ravuri	2feec4fe68	comments: fix comments as per goword in go test files Comments fixed as per goword in go test files that shell function go_srcs_in_module lists as per changes on #14827 Helps in #14827 Signed-off-by: Bhargav Ravuri <bhargav.ravuri@infracloud.io>	2022-11-23 23:05:42 +05:30
Andrew Sims	f656fa0f49	add missing copyright headers Signed-off-by: Andrew Sims <andrew.cameron.sims@gmail.com>	2022-11-23 19:13:43 +11:00
Sasha Melentyev	c3b6cbdb73	all: goimports -w . Signed-off-by: Sasha Melentyev <sasha@melentyev.io>	2022-11-17 19:07:04 +03:00
Sasha Melentyev	2c9c209eb6	all: Changing Printf and friends to Print if there is no formatting Signed-off-by: Sasha Melentyev <sasha@melentyev.io>	2022-11-15 22:11:23 +03:00
Sasha Melentyev	006e747a44	all: Change time unit Signed-off-by: Sasha Melentyev <sasha@melentyev.io>	2022-11-15 01:15:01 +03:00
Benjamin Wang	f77b8a735f	etcdserver: populate HashRevision when responding to leader or client's HashKV request Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-11-14 08:33:44 +08:00
Nathan VanBenschoten	0f9d7a4f95	raft: make Message.Snapshot nullable, halve struct size This commit makes the rarely used `raftpb.Message.Snapshot` field nullable. In doing so, it reduces the memory size of a `raftpb.Message` message from 264 bytes to 128 bytes, a 52% reduction in size. While this commit does not change the protobuf encoding, it does change how that encoding is used. `(gogoproto.nullable) = false` instructs the generated proto marshaling logic to always encode a value for the field, even if that value is empty. `(gogoproto.nullable) = true` instructs the generated proto marshaling logic to omit an encoded value for the field if the field is nil. This raises compatibility concerns in both directions. Messages encoded by new binary versions without a `Snapshot` field will be decoded as an empty field by old binary versions. In other words, old binary versions can't tell the difference. However, messages encoded by old binary versions with an empty Snapshot field will be decoded as a non-nil, empty field by new binary versions. As a result, new binary versions need to be prepared to handle such messages. While Message.Snapshot is not intentionally part of the external interface of this library, it was possible for users of the library to access it and manipulate it. As such, this change may be considered a breaking change. Signed-off-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>	2022-11-09 17:35:52 +00:00
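The compatibility point in the commit above, that new binaries must treat a nil snapshot and a non-nil but empty snapshot alike, can be illustrated with a small sketch. The `Snapshot` and `Message` types and the `hasSnapshot` helper below are simplified, hypothetical stand-ins and not the real `raftpb` definitions.

```go
package main

import "fmt"

// Snapshot and Message are simplified stand-ins for illustration only;
// they are NOT the generated raftpb types.
type Snapshot struct {
	Index uint64
	Data  []byte
}

type Message struct {
	// A pointer field models a nullable protobuf field.
	Snapshot *Snapshot
}

// hasSnapshot reports whether the message carries a meaningful snapshot.
// A nil pointer (field omitted by a new encoder) and a non-nil zero value
// (empty field written by an old encoder) are both treated as "none".
func hasSnapshot(m Message) bool {
	if m.Snapshot == nil {
		return false
	}
	return m.Snapshot.Index != 0 || len(m.Snapshot.Data) != 0
}

func main() {
	fmt.Println(hasSnapshot(Message{}))                                // field omitted
	fmt.Println(hasSnapshot(Message{Snapshot: &Snapshot{}}))           // empty field from an old sender
	fmt.Println(hasSnapshot(Message{Snapshot: &Snapshot{Index: 7}}))   // real snapshot
}
```

Checking through one helper, rather than comparing the pointer to nil at each call site, keeps both encodings on the same code path.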
Benjamin Wang	c967715d93	auth: protect all maintenance APIs when auth is enabled All maintenance APIs require admin privilege when auth is enabled; otherwise, the request will be rejected. If auth isn't enabled, there is no such requirement. Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-11-03 04:39:42 +08:00
Marek Siarkowicz	f215cd89d2	Merge pull request #14612 from spacewander/azq chore: commit the change generated by scripts/genproto.sh	2022-10-24 20:00:52 +02:00
spacewander	3a63a0d5e3	chore: commit the change generated by scripts/genproto.sh TODO: ensure the generated code is up-to-date in the CI. Signed-off-by: spacewander <spacewanderlzx@gmail.com>	2022-10-23 21:13:55 +08:00
Samuele Resca	b58f9c27e4	Refactoring code to remove duplicate code test. Signed-off-by: Samuele Resca <sr7@ad.datcon.co.uk> Signed-off-by: Samuele Resca <samuele.resca@gmail.com>	2022-10-23 13:46:10 +01:00
Samuele Resca	3d9c5c6166	Adding fuzz test on v3rpc interfaces. Signed-off-by: Samuele Resca <sr7@ad.datcon.co.uk> Signed-off-by: Samuele Resca <samuele.resca@gmail.com>	2022-10-23 13:46:10 +01:00
Benjamin Wang	1c20ed2cc5	Merge pull request #14521 from lovehhf/remove_pick_peer_url membership: Remove PickPeerURL Method	2022-09-27 02:10:35 +08:00
Hongfei Huang	f6d808736c	membership: Remove PickPeerURL Method PickPeerURL only used by unit test Signed-off-by: Hongfei Huang <853885165@qq.com>	2022-09-26 23:21:10 +08:00
Kafuu Chino	f1d4935e91	*: avoid closing a watch with ID 0 incorrectly Signed-off-by: Kafuu Chino <KafuuChinoQ@gmail.com> add test	2022-09-26 20:30:33 +08:00
Benjamin Wang	7f10dccbaf	Bump go 1.19: update all the dependencies and go.sum files 1. run ./scripts/fix.sh; 2. cd tools/mod; gofmt -w . & go mod tidy; Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-09-22 08:47:46 +08:00
Marek Siarkowicz	026794495f	Merge pull request #14494 from demoManito/remove/redundant-type-conversion etcd: remove redundant type conversion	2022-09-21 11:34:19 +02:00
Benjamin Wang	2441a24cee	Merge pull request #14493 from demoManito/style/format-import-order etcd: format import order	2022-09-21 06:03:31 +08:00
demoManito	f67ec10779	etcd: format import order golang CodeReviewComments: https://github.com/golang/go/wiki/CodeReviewComments#imports Signed-off-by: demoManito <1430482733@qq.com>	2022-09-20 18:41:39 +08:00
demoManito	a9c3d56508	etcd: remove redundant type conversion Signed-off-by: demoManito <1430482733@qq.com>	2022-09-20 11:26:02 +08:00
Benjamin Wang	159ed15afc	Merge pull request #14479 from demoManito/fix/declaring-empty-slice etcd: modify declaring empty slices	2022-09-20 05:22:59 +08:00
Hitoshi Mitake	2dcfa83094	*: handle auth invalid token and old revision errors in watch Signed-off-by: Hitoshi Mitake <h.mitake@gmail.com>	2022-09-17 21:51:36 +09:00
demoManito	72cf0cc04a	etcd: modify declaring empty slices Declare an empty slice as `var s []int` instead of `s := []int{}`, see https://github.com/golang/go/wiki/CodeReviewComments#declaring-empty-slices Signed-off-by: demoManito <1430482733@qq.com>	2022-09-16 14:41:14 +08:00
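The difference between the two declarations mentioned in the commit above is small but observable. The sketch below is illustrative only; the `encode` helper is a hypothetical name introduced for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode returns the JSON form of a slice, to show where the nil and the
// empty non-nil slice differ.
func encode(s []int) string {
	b, err := json.Marshal(s)
	if err != nil {
		return err.Error()
	}
	return string(b)
}

func main() {
	var nilSlice []int      // preferred form: nil slice, no allocation
	emptySlice := []int{}   // non-nil slice of length zero

	// Both have length 0 and both can be appended to and ranged over.
	fmt.Println(len(nilSlice), len(emptySlice))
	fmt.Println(nilSlice == nil, emptySlice == nil)

	// They differ when encoded as JSON: null versus [].
	fmt.Println(encode(nilSlice), encode(emptySlice))

	nilSlice = append(nilSlice, 1)
	fmt.Println(nilSlice)
}
```

The non-nil form remains the right choice in the limited cases where the distinction matters, such as when a JSON consumer expects `[]` rather than `null`.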
spacewander	508ce517e0	update according to the review Signed-off-by: spacewander <spacewanderlzx@gmail.com>	2022-08-17 09:25:37 +08:00
spacewander	bebefd8b80	chore: log when an invalid watch request is received As protobuf doesn't have required fields, a user may send an empty WatchRequest by mistake. Currently, etcd will ignore the invalid request and keep the stream open. If we don't reject the invalid request by closing the stream, it would be better to leave a log there. This commit also fixes a typo in the comment. Signed-off-by: spacewander <spacewanderlzx@gmail.com>	2022-08-16 11:33:01 +08:00
Austin Benoit	ff56da7745	rafthttp: test transport multiple transport removes Unit test to verify that multiple transport removes do not create an issue. Signed-off-by: Austin Benoit <22805659+AustinBenoit@users.noreply.github.com>	2022-07-28 18:23:17 -04:00
杨金珏	6220174687	support custom `grpc.MaxConcurrentStreams` There has been no update on the original PR (see below) for more than 2 weeks, so Benjamin (@ahrtr) continues to work on the PR. The first step is to rebase the PR, because there are lots of conflicts with the main branch. The changes to go.mod and go.sum are reverted, because they are not needed. The e2e test cases are also reverted, because they are not correct. ``` https://github.com/etcd-io/etcd/pull/14081 ``` Signed-off-by: nic-chen <chenjunxu6@gmail.com> Signed-off-by: Benjamin Wang <wachao@vmware.com>	2022-07-06 03:43:46 +08:00
SimFG	107b7c06ab	snap: Delete the nil check of the log object Move some methods into the `Snapshotter` object to remove the `lg == nil` check Signed-off-by: SimFG <1142838399@qq.com>	2022-06-28 19:35:33 +08:00
chavacava	756d77663b	removes empty option in JSON tag option can not be empty in JSON tag Signed-off-by: chavacava <salvadorcavadini+github@gmail.com>	2022-06-26 12:13:20 +02:00
L2ncE	637afd359b	Fix a syntax error in a code comment Signed-off-by: L2ncE <llance_24@foxmail.com>	2022-06-15 23:19:12 +08:00
Marek Siarkowicz	2b090e86a6	server: Extract hasher to separate interface Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2022-06-13 18:20:18 +02:00
Marek Siarkowicz	80828b593a	server: Remove duplicated compaction revision Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>	2022-06-13 18:20:18 +02:00


239 Commits