
The TestV3WatchRestoreSnapshotUnsync sets up a three-member cluster. Before serving any update requests from the client, and after a leader is elected, each member's log is at index 8: 3 x ConfChange + 3 x ClusterMemberAttrSet + 1 x ClusterVersionSet.

Based on the config (SnapshotCount: 10, CatchUpCount: 5), the test needs to issue enough update requests to trigger a snapshot at least twice.

T1: L(snapshot-index: 11, compacted-index: 6) F_m0(index: 8)
T2: L(snapshot-index: 22, compacted-index: 17) F_m0(index: 8, out of date)

After member0 recovers from the network partition, it will reject the leader's request and return the hint (index:8, term:x). If this happens after the second snapshot, the leader will find that index:8 is out of date and be forced to transfer a snapshot.

However, the client only issues 15 update requests, and the leader doesn't finish the snapshot process in time. Since the last compacted-index is still 6, the leader can replicate index:9 to member0 directly instead of sending a snapshot.

```bash
cd tests/integration
CLUSTER_DEBUG=true go test -v -count=1 -run TestV3WatchRestoreSnapshotUnsync ./
...
INFO m2.raft 3da8ba707f1a21a4 became leader at term 2 {"member": "m2"}
...
INFO m2 triggering snapshot {"member": "m2", "local-member-id": "3da8ba707f1a21a4", "local-member-applied-index": 22, "local-member-snapshot-index": 11, "local-member-snapshot-count": 10, "snapshot-forced": false}
...
cluster.go:1359: network partition between: 99626fe5001fde8b <-> 1c964119da6db036
cluster.go:1359: network partition between: 99626fe5001fde8b <-> 3da8ba707f1a21a4
cluster.go:416: WaitMembersForLeader
INFO m0.raft 99626fe5001fde8b became follower at term 2 {"member": "m0"}
INFO m0.raft raft.node: 99626fe5001fde8b elected leader 3da8ba707f1a21a4 at term 2 {"member": "m0"}
DEBUG m2.raft 3da8ba707f1a21a4 received MsgAppResp(rejected, hint: (index 8, term 2)) from 99626fe5001fde8b for index 23 {"member": "m2"}
DEBUG m2.raft 3da8ba707f1a21a4 decreased progress of 99626fe5001fde8b to [StateReplicate match=8 next=9 inflight=15] {"member": "m2"}
DEBUG m0 Applying entries {"member": "m0", "num-entries": 15}
DEBUG m0 Applying entry {"member": "m0", "index": 9, "term": 2, "type": "EntryNormal"}
....
INFO m2 saved snapshot {"member": "m2", "snapshot-index": 22}
INFO m2 compacted Raft logs {"member": "m2", "compact-index": 17}
```

To fix this issue, the patch uses a log monitor to watch for "compacted Raft log" and expects two members to compact their logs twice.

Fixes: #15545

Signed-off-by: Wei Fu <fuweid89@gmail.com>
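The T1/T2 arithmetic in the commit message above can be sketched in a few lines of Go. This is a simplified model, not etcd's actual code: `compactIndex` and `needsSnapshot` are hypothetical helper names. The sketch assumes the compacted index is the snapshot index minus CatchUpCount, and that a follower must receive a snapshot once the entry it needs next has been compacted away.

```go
package main

import "fmt"

// compactIndex models the compaction point from the commit message:
// the leader keeps catchUpCount entries behind the snapshot index.
// Clamping to 0 is an illustrative choice to avoid unsigned underflow.
func compactIndex(snapshotIndex, catchUpCount uint64) uint64 {
	if snapshotIndex <= catchUpCount {
		return 0
	}
	return snapshotIndex - catchUpCount
}

// needsSnapshot reports whether a follower whose next expected entry is
// nextIndex can no longer be served from the leader's log (simplified rule:
// entries at or below the compacted index are gone).
func needsSnapshot(nextIndex, compacted uint64) bool {
	return nextIndex <= compacted
}

func main() {
	const catchUpCount = 5 // CatchUpCount from the test config
	const followerNext = 9 // member0 matched index 8, so it needs index 9 next

	// Snapshot indexes at T1 and T2 in the commit message.
	for _, snap := range []uint64{11, 22} {
		c := compactIndex(snap, catchUpCount)
		fmt.Printf("snapshot-index=%d compacted-index=%d needs-snapshot=%v\n",
			snap, c, needsSnapshot(followerNext, c))
	}
}
```

Under these assumptions the T1 state (compacted-index 6) still lets the leader append index 9, while the T2 state (compacted-index 17) forces a snapshot, which is the condition the test wants to reach before healing the partition.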
etcd
Note: The main branch may be in an unstable or even broken state during development. For stable versions, see releases.
etcd is a distributed reliable key-value store for the most critical data of a distributed system, with a focus on being:
- Simple: well-defined, user-facing API (gRPC)
- Secure: automatic TLS with optional client cert authentication
- Fast: benchmarked 10,000 writes/sec
- Reliable: properly distributed using Raft
etcd is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log.
etcd is used in production by many companies, and the development team stands behind it in critical deployment scenarios, where etcd is frequently teamed with applications such as Kubernetes, locksmith, vulcand, Doorman, and many others. Reliability is further ensured by rigorous robustness testing.
See etcdctl for a simple command line client.
Maintainers
MAINTAINERS strive to shape an inclusive open source project culture where users are heard and contributors feel respected and empowered. MAINTAINERS maintain productive relationships across different companies and disciplines. Read more about MAINTAINERS role and responsibilities.
Getting started
Getting etcd
The easiest way to get etcd is to use one of the pre-built release binaries which are available for OSX, Linux, Windows, and Docker on the release page.
For more installation guides, please check out play.etcd.io and operating etcd.
Running etcd
First start a single-member cluster of etcd.
If etcd is installed using the pre-built release binaries, run it from the installation location as below:
/tmp/etcd-download-test/etcd
Once the binary has been moved onto the system path, the etcd command can be run directly:
mv /tmp/etcd-download-test/etcd /usr/local/bin/
etcd
This will bring up etcd listening on port 2379 for client communication and on port 2380 for server-to-server communication.
Next, let's set a single key, and then retrieve it:
etcdctl put mykey "this is awesome"
etcdctl get mykey
etcd is now running and serving client requests.
etcd TCP ports
The official etcd ports are 2379 for client requests, and 2380 for peer communication.
Running a local etcd cluster
First install goreman, which manages Procfile-based applications.
Our Procfile script will set up a local example cluster. Start it with:
goreman start
This will bring up 3 etcd members infra1, infra2 and infra3 and optionally etcd grpc-proxy, which runs locally and composes a cluster.
Every cluster member and proxy accepts key value reads and key value writes.
Follow the steps in Procfile.learner to add a learner node to the cluster. Start the learner node with:
goreman -f ./Procfile.learner start
Install etcd client v3
go get go.etcd.io/etcd/client/v3
Next steps
Now it's time to dig into the full etcd API and other guides.
- Read the full documentation.
- Explore the full gRPC API.
- Set up a multi-machine cluster.
- Learn the config format, env variables and flags.
- Find language bindings and tools.
- Use TLS to secure an etcd cluster.
- Tune etcd.
Contact
- Email: etcd-dev
- Slack: #etcd channel on Kubernetes (get an invite)
- Community meetings
Community meetings
etcd contributors and maintainers have monthly (every four weeks) meetings at 11:00 AM (USA Pacific) on Thursday.
An initial agenda will be posted to the shared Google docs a day before each meeting, and everyone is welcome to suggest additional topics or other agendas.
Meeting recordings are uploaded to the official etcd YouTube channel.
Get a calendar invitation by joining the etcd-dev mailing group.
Join Hangouts Meet: meet.google.com/umg-nrxn-qvs
Join by phone: +1 405-792-0633 PIN: 299 906#
Contributing
See CONTRIBUTING for details on setting up your development environment, submitting patches and the contribution workflow.
Reporting bugs
See reporting bugs for details about reporting any issues.
Reporting a security vulnerability
See security disclosure and release process for details on how to report a security vulnerability and how the etcd team manages it.
Issue and PR management
See issue triage guidelines for details on how issues are managed.
See PR management for guidelines on how pull requests are managed.
etcd Emeritus Maintainers
These emeritus maintainers dedicated a part of their career to etcd and reviewed code, triaged bugs and pushed the project forward over a substantial period of time. Their contribution is greatly appreciated.
- Fanmin Shi
- Anthony Romano
- Brandon Philips
- Joe Betz
- Gyuho Lee
- Jingyi Hu
- Wenjia Zhang
- Xiang Li
- Ben Darnell
- Sam Batschelet
License
etcd is under the Apache 2.0 license. See the LICENSE file for details.