Piotr Tabor
fe35b5130e
Fix code scanning alert: This log write receives unsanitized user input
2022-04-19 13:49:08 +02:00
David Wyrobnik
3152dc8174
contrib/raftexample: Save snapshot and WAL before hard state
...
Update raftexample to save the snapshot file and WAL snapshot entry
before hardstate to ensure the snapshot exists during recovery.
Otherwise if there is a failure after storing the hard state there may
be reference to a non-existent snapshot.
This PR introduces the fix from #10219 to the raftexample.
2022-04-11 23:44:54 +00:00
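The write ordering described in commit 3152dc8174 above can be sketched as follows. This is a minimal illustration, not the raftexample code: the `persister` type, its methods, and `persistReady` are hypothetical names that only record the order in which writes would happen.

```go
package main

import "fmt"

// persister is a hypothetical stand-in for the node's storage.
// It only records the order in which writes are issued.
type persister struct {
	steps []string
}

func (p *persister) saveSnapshotFile()     { p.steps = append(p.steps, "snapshot file") }
func (p *persister) saveWALSnapshotEntry() { p.steps = append(p.steps, "WAL snapshot entry") }
func (p *persister) saveHardState()        { p.steps = append(p.steps, "hard state") }

// persistReady writes the snapshot file and the WAL snapshot entry before
// the hard state, so a crash after the hard state is stored can never leave
// a reference to a snapshot that does not exist.
func persistReady(p *persister, hasSnapshot bool) {
	if hasSnapshot {
		p.saveSnapshotFile()
		p.saveWALSnapshotEntry()
	}
	p.saveHardState()
}

func main() {
	p := &persister{}
	persistReady(p, true)
	fmt.Println(p.steps)
}
```

With this ordering, a failure between any two steps leaves at worst an unreferenced snapshot on disk, which recovery can ignore.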
ahrtr
5cf6ba48de
added a unit test for the method processMessages
2022-03-08 09:38:23 +08:00
ahrtr
793218ed2b
update the confstate before sending snapshot
...
When there is a `raftpb.EntryConfChange` after creating the snapshot,
the confState included in the snapshot is out of date, so we need
to update the confState before sending a snapshot to a follower.
2022-03-07 12:18:29 +08:00
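The idea behind commit 793218ed2b above can be sketched with simplified types. The `confState` and `snapshot` structs and the `refreshConfState` helper are hypothetical stand-ins chosen for illustration; they are not the `raftpb` types or the actual fix.

```go
package main

import "fmt"

// confState is a simplified, hypothetical membership record.
type confState struct {
	Voters []uint64
}

// snapshot is a simplified, hypothetical snapshot carrying the membership
// that was current when the snapshot was created.
type snapshot struct {
	Index     uint64
	ConfState confState
}

// refreshConfState returns a copy of the snapshot that carries the latest
// membership, so a follower does not receive a stale confState.
func refreshConfState(s snapshot, current confState) snapshot {
	s.ConfState = confState{Voters: append([]uint64(nil), current.Voters...)}
	return s
}

func main() {
	// Snapshot taken while the cluster had three voters.
	snap := snapshot{Index: 10, ConfState: confState{Voters: []uint64{1, 2, 3}}}

	// A conf change applied after the snapshot adds a fourth voter.
	current := confState{Voters: []uint64{1, 2, 3, 4}}

	// Refresh the membership right before sending.
	toSend := refreshConfState(snap, current)
	fmt.Println(toSend.ConfState.Voters)
}
```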
Manuel Rüger
72c33d8b05
contrib/mixin: Generate rules, fix tests
...
* Add Makefile
* Make tests runnable
* Add generated rule manifest file
Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-02-10 16:17:03 +01:00
Matthias Lisin
7460379bad
contrib/mixin: add missing summary to alerts
...
To avoid alert messages being templated with undefined values, let's
set a summary for alerts that are currently missing one.
2022-01-19 19:55:40 +01:00
Eng Zer Jun
2a151c8982
*: move from io/ioutil to io and os packages
...
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil . This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-28 00:05:28 +08:00
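The replacements made in commit 2a151c8982 above follow the mapping given in the Go 1.16 release notes. A minimal sketch, using a hypothetical `roundTrip` helper and file name, shows the new calls next to the `io/ioutil` functions they replace:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back using
// only the io and os packages (Go 1.16 or newer).
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "ioutil-demo") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "example.txt")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}

	data, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}

	all, err := io.ReadAll(strings.NewReader(string(data))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}
	return string(all), nil
}

func main() {
	got, err := roundTrip("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```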
Sam Batschelet
5991da1534
Merge pull request #13388 from grafana/mixin-rate-interval
...
contrib/mixin: Update dashboard promql to use $__rate_interval.
2021-10-21 08:14:25 -04:00
Tom Wilkie
fead3be933
Grafana datasource template should be labelled 'Data Source'.
...
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-10-20 13:42:43 +01:00
Lili Cosic
aef9131c81
contrib/mixin/mixin.libsonnet: Include gRPC method in alert description
...
This makes it easier for admin to determine the alert issue.
2021-10-15 15:10:52 +02:00
Sam Batschelet
0eb72bde2c
contrib/mixin: omit Defragment method from etcdGRPCRequestsSlow
...
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-10-08 16:21:46 -04:00
Ryan J. Geyer
98427d2bed
contrib/mixin: Update dashboard queries to use $__rate_interval
...
A global query variable was introduced in Grafana 7.2 which is "almost always right" for `rate`, `irate`, and `increase` function calls in promql.
2021-10-04 15:02:11 -07:00
Sam Batschelet
b448daa698
Merge pull request #13275 from lilic/add-peer-dashboard
...
contrib/mixin/mixin.libsonnet: Add dashboard for peer round trip time
2021-08-05 08:27:38 -04:00
Lili Cosic
55b697c528
contrib/mixin/mixin.libsonnet: Add dashboard for peer round trip time
...
This helps users debug firing alerts.
2021-08-05 13:15:34 +02:00
Marek Siarkowicz
44b8ae145b
etcdserver: Move datadir and wal to storage package
2021-08-03 12:47:37 +02:00
Johannes 'fish' Ziemke
7885f2a951
Mixin: Support configuring cluster label
2021-07-29 17:54:14 +02:00
Lili Cosic
85f7b3c406
contrib/mixin/mixin.libsonnet: Unify alerting description
2021-07-16 15:25:53 +02:00
Haines Chan
36bb8d293c
Use method const in package http instead of literal
2021-07-08 20:00:03 +08:00
Lili Cosic
f00231951d
contrib/mixin/mixin.libsonnet: Adjust gRPC failed requests
...
OK is not the only allowed code; previously this also captured
context canceled, NotFound, and other non-error requests.
2021-06-21 11:47:53 +02:00
Piotr Tabor
ffea1537d4
ClientV3 tests use integration.NewClient that configures proper logger.
2021-04-29 18:18:34 +02:00
Tom Wilkie
562d645ac9
Fix the mixin.
...
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-04-13 19:38:55 +01:00
Piotr Tabor
bad0b4d513
Merge pull request #12823 from mtulio/chore/dash-var-refresh
...
chore/dash-var-refresh: change default refresh to 2(time range)
2021-04-08 15:14:53 +02:00
Marco Tulio R Braga
aeeecc06cf
fix/dash-var-refresh: add const and description
2021-04-08 10:12:41 -03:00
Piotr Tabor
816d332d81
Merge pull request #12830 from ptabor/20210405-split-pkg
...
Split client/pkg as dedicated low-dependencies module for client
2021-04-08 00:48:41 +02:00
Piotr Tabor
3bb7acc8cf
Migrate dependencies pkg/foo -> client/pkg/foo
2021-04-07 00:38:47 +02:00
Patrice Chalin
2ba69de281
Contrib lock example
2021-04-06 15:21:01 -04:00
Marco Braga
d2bc5343fb
chore/dash-var-refresh: change default refresh to 2(time range)
2021-04-01 00:06:57 -03:00
Piotr Tabor
fce0c192eb
Regenerate protos.
2021-03-25 00:31:44 +01:00
Piotr Tabor
3976d68ed3
raftExample: Allow closing raftexample node when snapshotting.
...
Fix race that made the raftExample test fail.
2021-02-26 08:56:12 +01:00
Shintaro Murakami
5ae3f879c9
raftexample: Return an appropriate applyDoneC
2021-02-24 21:28:18 +09:00
Shintaro Murakami
cb0d256a18
raftexample: Add test for adding new node to existing cluster
2021-02-22 13:44:33 +09:00
Shintaro Murakami
1b1be43d65
raftexample: Newly joined node has to start with RestartNode
2021-02-22 09:45:44 +09:00
Shintaro Murakami
cc2b039817
raftexample: Explicitly notify all committed entries are applied
2021-02-19 19:26:36 +09:00
Shintaro Murakami
2d25f7f3da
raftexample: Implement ReportUnreachable and ReportSnapshot
2021-02-17 11:59:32 +09:00
Shintaro Murakami
1302e1edb2
raftexample: Save snapshot file before writing to wal
2021-02-16 13:30:15 +09:00
Piotr Tabor
1395a1a795
Migrate back mixin to contrib/
...
The mixin was moved out together with documentation.
This broke kube-prometheus: https://github.com/etcd-io/etcd/issues/12685#issuecomment-777264143
2021-02-11 09:28:30 +01:00
Piotr Tabor
e8ba375032
Merge pull request #11889 from mrkm4ntr/example-recover-from-snap
...
raftexample: Fix recovery from snapshot
2021-02-10 12:18:30 +01:00
Shintaro Murakami
be2167ebab
Wait until all committed entries are applied
...
To take a snapshot
2021-02-05 19:05:41 +09:00
Shintaro Murakami
cb14cdd774
raftexample: Fix recovery from snapshot
...
* If there is a snapshot, HTTP server won't start.
* Restoring from snapshot occurs after replaying WAL.
* When taking a snapshot, the last change is not applied to the state machine yet.
2021-02-05 09:34:34 +09:00
Piotr Tabor
b5d11723d1
Merge pull request #12393 from viviyww/contrib-doc
...
contrib: del systemd/etcd2-backup-coreos in docs
2021-02-01 15:33:44 +01:00
Piotr Tabor
4af159a30a
Merge pull request #12259 from alvistack/master-aio_graceful_reboot
...
`etcd.service`: Define explicit dependencies of systemd etcd service
2021-01-31 23:58:12 +01:00
Luca BRUNO
b0e2c70c71
contrib/systemd: add a sysusers entry
...
This adds a sysusers.d file, in order to create a system user/group
which matches the one used by the service unit.
Ref: https://www.freedesktop.org/software/systemd/man/sysusers.d.html
2020-12-09 13:59:46 +00:00
Piotr Tabor
aaf423e962
server: Update imports.
...
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
Piotr Tabor
45b007b8b4
contrib,clientv3: Move contrib/recipes to clientv3/experimental/recipes/...
...
Recipes is a set of pattern / primitive implementations on top of clientv3.
It's used by integration tests. It shouldn't be considered "server" code.
2020-10-22 11:10:07 +02:00
Piotr Tabor
e33c6dd9df
client/v3: Rename of imports
2020-10-20 10:13:06 +02:00
Piotr Tabor
e62417297d
*: Rename of imports of raft (as its now a module)
...
% find -name '*.go' -o -name '*.md' -o -name '*.sh' | xargs sed -i --follow-symlinks 's|etcd/v3/raft|etcd/raft/v3|g'
2020-10-16 13:58:18 +02:00
yangweiwei
5d3609e3cf
contrib: del systemd/etcd2-backup-coreos in docs
...
del systemd/etcd2-back-coreos in docs
2020-10-14 09:49:56 +08:00
Piotr Tabor
de55bb6331
pkg: Rename imports after making 'pkg' a module
...
find -name '*.go' | xargs sed --follow-symlinks -i 's|go.etcd.io/etcd/v3/pkg/|go.etcd.io/etcd/pkg/v3/|g'
go fmt ./...
2020-10-13 00:09:27 +02:00
Piotr Tabor
28f2b07623
*: Update references to code moved to the api/ dir.
...
Follow up to file-moves done in the previous commit.
The commit contains purely mechanical consequences of execution (apart
from scripts/genproto.sh):
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/etcdserver/api/v3rpc/rpctypes|v3/api/v3rpc/rpctypes|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/version|v3/api/version|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/mvcc/mvccpb|v3/api/mvccpb|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/etcdserver/etcdserverpb|v3/api/etcdserverpb|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/etcdserver/api/membership/membershippb|v3/api/membershippb|g'
% find ./ -name '*.go' | xargs sed --follow-symlinks -i 's|v3/auth/authpb|v3/api/authpb|g'
% find ./ -name '*.proto' -o -name '*.md' | xargs -L 1 sed --follow-symlinks -i 's|/mvcc/mvccpb/kv.proto|/api/mvccpb/kv.proto|g'
% find ./ -name '*.proto' -o -name '*.md' | xargs -L 1 sed --follow-symlinks -i 's|/auth/authpb/auth.proto|/api/authpb/auth.proto|g'
% find ./ -name '*.proto' -o -name '*.md' | xargs -L 1 sed --follow-symlinks -i 's|/etcdserver/api/membership/membershippb/membership.proto|/api/membershippb/membership.proto|g'
I also modified manually paths in scripts/genproto.sh.
% go fmt ./...
2020-10-06 11:56:16 +02:00
Wong Hoi Sing Edison
17ceed9b47
`etcd.service`: Support Graceful Reboot for AIO Node
...
Currently our sample systemd service file `contrib/systemd/etcd.service`
has the following startup/shutdown dependency:
[Unit]
After=network.target
In some rare conditions, e.g. a bare metal deployment with slow network
startup, an IP may not be assigned early enough, so the etcd defaults
`ETCD_HEARTBEAT_INTERVAL="100"` and `ETCD_ELECTION_TIMEOUT="1000"` time
out after a graceful system reboot.
This causes etcd to falsely classify itself as unhealthy and therefore
stop rejoining the remaining online cluster members.
This PR introduces:
- `etcd.service`: Ensure startup after `network-online.target` and
`time-sync.target`, so effective network connectivity and synced
time are available.
The logic is proven in concept by
<https://github.com/alvistack/ansible-role-etcd/tree/develop>; it also
works as expected with a Ceph + Kubernetes deployment by
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlocks happened during graceful system reboot, for both AIO
single and multiple node setups with loopback mount.
Also see:
- <https://github.com/ceph/ceph/pull/36776>
- <https://github.com/etcd-io/etcd/pull/12259>
- <https://github.com/cri-o/cri-o/pull/4128>
- <https://github.com/kubernetes/release/pull/1504>
Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
2020-09-17 16:59:12 +08:00