154 Commits

Author SHA1 Message Date
ahrtr
5cf6ba48de added a unit test for the method processMessages 2022-03-08 09:38:23 +08:00
ahrtr
793218ed2b update the confstate before sending snapshot
When there is a `raftpb.EntryConfChange` after creating the snapshot,
then the confState included in the snapshot is out of date. so We need
to update the confState before sending a snapshot to a follower.
2022-03-07 12:18:29 +08:00
Manuel Rüger
72c33d8b05 contrib/mixin: Generate rules, fix tests
* Add Makefile
* Make tests runnable
* Add generated rule manifest file

Signed-off-by: Manuel Rüger <manuel@rueg.eu>
2022-02-10 16:17:03 +01:00
Matthias Lisin
7460379bad contrib/mixin: add missing summary to alerts
to avoid alert messages being templated with undefined values lets
set summary for alerts that are currently missing one
2022-01-19 19:55:40 +01:00
Eng Zer Jun
2a151c8982
*: move from io/ioutil to io and os packages
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-28 00:05:28 +08:00
Sam Batschelet
5991da1534
Merge pull request #13388 from grafana/mixin-rate-interval
contrib/mixin: Update dashboard promql to use $__rate_interval.
2021-10-21 08:14:25 -04:00
Tom Wilkie
fead3be933
Grafana datasource template should be labelled 'Data Source'.
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-10-20 13:42:43 +01:00
Lili Cosic
aef9131c81 contrib/mixin/mixin.libsonnet: Include gRPC method in alert description
This makes it easier for admin to determine the alert issue.
2021-10-15 15:10:52 +02:00
Sam Batschelet
0eb72bde2c contrib/mixin: omit Defragment method from etcdGRPCRequestsSlow
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-10-08 16:21:46 -04:00
Ryan J. Geyer
98427d2bed contrib/mixin: Update dashboard queries to use $__rate_interval
A global query variable was introduced in Grafana 7.2 which is "almost always right" for `rate`, `irate`, and `increase` function calls in promql.
2021-10-04 15:02:11 -07:00
Sam Batschelet
b448daa698
Merge pull request #13275 from lilic/add-peer-dashboard
contrib/mixin/mixin.libsonnet: Add dashboard for peer round trip time
2021-08-05 08:27:38 -04:00
Lili Cosic
55b697c528 contrib/mixin/mixin.libsonnet: Add dashboard for peer round trip time
This helps users debug firing alerts.
2021-08-05 13:15:34 +02:00
Marek Siarkowicz
44b8ae145b etcdserver: Move datadir and wal to storage package 2021-08-03 12:47:37 +02:00
Johannes 'fish' Ziemke
7885f2a951 Mixin: Support configuring cluster label 2021-07-29 17:54:14 +02:00
Lili Cosic
85f7b3c406 contrib/mixin/mixin.libsonnet: Unify alerting description 2021-07-16 15:25:53 +02:00
Haines Chan
36bb8d293c Use method const in package http instead of literal 2021-07-08 20:00:03 +08:00
Lili Cosic
f00231951d contrib/mixin/mixin.libsonnet: Adjust gRPC failed requests
OK is not the only one that is allowed, this before also captured
context canceled, NotFound, and other non error requests.
2021-06-21 11:47:53 +02:00
Piotr Tabor
ffea1537d4 ClientV3 tests use integration.NewClient that configures proper logger. 2021-04-29 18:18:34 +02:00
Tom Wilkie
562d645ac9
Fix the mixin.
Signed-off-by: Tom Wilkie <tom@grafana.com>
2021-04-13 19:38:55 +01:00
Piotr Tabor
bad0b4d513
Merge pull request #12823 from mtulio/chore/dash-var-refresh
chore/dash-var-refresh: change default refresh to 2(time range)
2021-04-08 15:14:53 +02:00
Marco Tulio R Braga
aeeecc06cf
fix/dash-var-refresh: add const and description 2021-04-08 10:12:41 -03:00
Piotr Tabor
816d332d81
Merge pull request #12830 from ptabor/20210405-split-pkg
Split client/pkg as dedicated low-dependencies module for client
2021-04-08 00:48:41 +02:00
Piotr Tabor
3bb7acc8cf Migrate dependencies pkg/foo -> client/pkg/foo 2021-04-07 00:38:47 +02:00
Patrice Chalin
2ba69de281 Contrib lock example 2021-04-06 15:21:01 -04:00
Marco Braga
d2bc5343fb chore/dash-var-refresh: change default refresh to 2(time range) 2021-04-01 00:06:57 -03:00
Piotr Tabor
fce0c192eb Regenerate protos. 2021-03-25 00:31:44 +01:00
Piotr Tabor
3976d68ed3 raftExample: Allow closing raftexample node when snapshotting.
Fix race that made the raftExample test fail.
2021-02-26 08:56:12 +01:00
Shintaro Murakami
5ae3f879c9 raftexample: Return an appropriate applyDoneC 2021-02-24 21:28:18 +09:00
Shintaro Murakami
cb0d256a18 raftexample: Add test for adding new node to existing cluster 2021-02-22 13:44:33 +09:00
Shintaro Murakami
1b1be43d65 raftexample: New joined node have to start with RestartNode 2021-02-22 09:45:44 +09:00
Shintaro Murakami
cc2b039817 raftexample: Explicitly notify all committed entries are applied 2021-02-19 19:26:36 +09:00
Shintaro Murakami
2d25f7f3da raftexample: Implement ReportUnreachable and ReportSnapshot 2021-02-17 11:59:32 +09:00
Shintaro Murakami
1302e1edb2 raftexample: Save snapshot file before writing to wal 2021-02-16 13:30:15 +09:00
Piotr Tabor
1395a1a795 Migrate back mixin to contrib/
The mixin was moved out together with documentation.
This broke kube-prometheous: https://github.com/etcd-io/etcd/issues/12685#issuecomment-777264143
2021-02-11 09:28:30 +01:00
Piotr Tabor
e8ba375032
Merge pull request #11889 from mrkm4ntr/example-recover-from-snap
raftexample: Fix recovery from snapshot
2021-02-10 12:18:30 +01:00
Shintaro Murakami
be2167ebab Wait until all committed entries are applied
To take a snapshot
2021-02-05 19:05:41 +09:00
Shintaro Murakami
cb14cdd774 raftexample: Fix recovery from snapshot
* If there is a snapshot, HTTP server won't start.
* Resotring form snapshot occurs after replaying WAL.
* When taking a snapshot, the last change is not applied to the state machine yet.
2021-02-05 09:34:34 +09:00
Piotr Tabor
b5d11723d1
Merge pull request #12393 from viviyww/contrib-doc
contrib: del systemd/etcd2-backup-coreos in docs
2021-02-01 15:33:44 +01:00
Piotr Tabor
4af159a30a
Merge pull request #12259 from alvistack/master-aio_graceful_reboot
`etcd.service`: Define explicit dependencies of systemd etcd service
2021-01-31 23:58:12 +01:00
Luca BRUNO
b0e2c70c71
contrib/systemd: add a sysusers entry
This adds a sysusers.d file, in order to create a system user/group
which matches the one used by the service unit.

Ref: https://www.freedesktop.org/software/systemd/man/sysusers.d.html
2020-12-09 13:59:46 +00:00
Piotr Tabor
aaf423e962 server: Update imports.
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
Piotr Tabor
45b007b8b4 contrib,clientv3: Move contrib/recipies to clientv3/experimental/recipies/...
Recipies is set of patterns / primitives implementation on top of clientv3.
It's used by integration tests. It shouldn't be considered "server" code.
2020-10-22 11:10:07 +02:00
Piotr Tabor
e33c6dd9df client/v3: Rename of imports 2020-10-20 10:13:06 +02:00
Piotr Tabor
e62417297d *: Rename of imports of raft (as its now a module)
% find -name '*.go' -o -name '*.md' -o -name '*.sh' | xargs sed -i --follow-symlinks 's|etcd/v3/raft|etcd/raft/v3|g'
2020-10-16 13:58:18 +02:00
yangweiwei
5d3609e3cf contrib: del systemd/etcd2-backup-coreos in docs
del systemd/etcd2-back-coreos in docs
2020-10-14 09:49:56 +08:00
Piotr Tabor
de55bb6331 pkg: Rename imports after making 'pkg' a module
find -name '*.go' | xargs sed --follow-symlinks -i 's|go.etcd.io/etcd/v3/pkg/|go.etcd.io/etcd/pkg/v3/|g'
go fmt ./...
2020-10-13 00:09:27 +02:00
Piotr Tabor
28f2b07623 *: Update references to code moved to the api/ dir.
Follow up to file-moves done in the previous commit.

The commit contains purely mechanical consequences of execution (apart
of scripts/genproto.sh):

  % find ./ -name '*.go'  | xargs sed --follow-symlinks -i 's|v3/etcdserver/api/v3rpc/rpctypes|v3/api/v3rpc/rpctypes|g'
  % find ./ -name '*.go'  | xargs sed --follow-symlinks -i 's|v3/version|v3/api/version|g'
  % find ./ -name '*.go'  | xargs sed --follow-symlinks -i 's|v3/mvcc/mvccpb|v3/api/mvccpb|g'
  % find ./ -name '*.go'  | xargs sed --follow-symlinks -i 's|v3/etcdserver/etcdserverpb|v3/api/etcdserverpb|g'
  % find ./ -name '*.go'  | xargs sed --follow-symlinks -i 's|v3/etcdserver/api/membership/membershippb|v3/api/membershippb|g'
  % find ./ -name '*.go'  | xargs sed --follow-symlinks -i 's|v3/auth/authpb|v3/api/authpb|g'

  % find ./ -name '*.proto' -o -name '*.md'  | xargs -L 1 sed --follow-symlinks -i 's|/mvcc/mvccpb/kv.proto|/api/mvccpb/kv.proto|g'
  % find ./ -name '*.proto' -o -name '*.md'  | xargs -L 1 sed --follow-symlinks -i 's|/auth/authpb/auth.proto|/api/authpb/auth.proto|g'
  % find ./ -name '*.proto' -o -name '*.md'  | xargs -L 1 sed --follow-symlinks -i 's|/etcdserver/api/membership/membershippb/membership.proto|/api/membershippb/membership.proto|g'

  I also modified manually paths in scripts/genproto.sh.

  % go fmt ./...
2020-10-06 11:56:16 +02:00
Wong Hoi Sing Edison
17ceed9b47
etcd.service: Support Graceful Reboot for AIO Node
Currently our sample systemd service file `contrib/systemd/etcd.service`
have startup/shutdown dependency as below:

    [Unit]
    After=network.target

For some rare condition, e.g. bare matel deployment with slow network
startup, IP could not be assigned e arly enough before etcd default
`ETCD_HEARTBEAT_INTERVAL="100"` and `ETCD_ELECTION_TIMEOUT="1000"` get
timeouted, after graceful system reboot.

This cause etcd false negative classify itself use unhealthy, therefore
stop rejoining the remaining online cluster members.

This PR introduce:

  - `etcd.service`: Ensure startup after `network-online.target` and
    `time-sync.target`, so effective network connectivity and synced
    time is available.

The logic is concept proof by
<https://github.com/alvistack/ansible-role-etcd/tree/develop>; also
works as expected with Ceph + Kubernetes deployment by
<https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>.
No more deadlock happened during graceful system reboot, both AIO
single/multiple node with loopback mount.

Also see:

  - <https://github.com/ceph/ceph/pull/36776>
  - <https://github.com/etcd-io/etcd/pull/12259>
  - <https://github.com/cri-o/cri-o/pull/4128>
  - <https://github.com/kubernetes/release/pull/1504>

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
2020-09-17 16:59:12 +08:00
Brandon Philips
96cce208c2 go.mod: use go.etcd.io/etcd/v3 versioning
This change makes the etcd package compatible with the existing Go
ecosystem for module versioning.

Used this tool to update package imports:
  https://github.com/KSubedi/gomove
2020-04-28 00:57:35 +00:00
Jingyi Hu
61f279454e
etcdserver/api: remove capnslog (#11606)
* etcdserver/api/rafthttp: remove capnslog

* etcdserver/api/membership: remove capnslog

* etcdserver/api/v2auth: remove capnslog

* etcdserver/api/v2discovery: remove capnslog

* etdserver/api/v2stats: remove capnslog

* etcdserver/api/v2http: remove capnslog

* etcdserver/api/v3rpc: remove capnslog

* etcdserver/api: remove capnslog

Remove capnslog from etcdserver/api. Note that capnslog was
already removed in some packages under etcdserver/api in
previous commits.
2020-02-11 13:51:25 -08:00