Mirror of https://github.com/etcd-io/etcd.git, synced 2024-09-27 06:25:44 +00:00

Currently our sample systemd service file `contrib/systemd/etcd.service` has the following startup/shutdown dependency:

    [Unit]
    After=network.target

Under some rare conditions, e.g. a bare metal deployment with slow network startup, after a graceful system reboot an IP address may not be assigned early enough, so the etcd defaults `ETCD_HEARTBEAT_INTERVAL="100"` and `ETCD_ELECTION_TIMEOUT="1000"` time out first. This causes etcd to falsely classify itself as unhealthy and therefore stop rejoining the remaining online cluster members.

This PR introduces:

- `etcd.service`: Ensure startup after `network-online.target` and `time-sync.target`, so that effective network connectivity and synced time are available.

The logic is proven as a concept by <https://github.com/alvistack/ansible-role-etcd/tree/develop>; it also works as expected with a Ceph + Kubernetes deployment by <https://github.com/alvistack/ansible-collection-kubernetes/tree/develop>. No more deadlocks happen during graceful system reboot, for both AIO single-node and multi-node setups with loopback mount.

Also see:

- <https://github.com/ceph/ceph/pull/36776>
- <https://github.com/etcd-io/etcd/pull/12259>
- <https://github.com/cri-o/cri-o/pull/4128>
- <https://github.com/kubernetes/release/pull/1504>

Signed-off-by: Wong Hoi Sing Edison <hswong3i@gmail.com>
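For illustration, the ordering change described in the commit message could look roughly like the `[Unit]` fragment below. This is a minimal sketch, not the exact contents of `contrib/systemd/etcd.service`: the `After=` targets come from the commit message, while the `Wants=` line is an assumption added here, following the general systemd guidance that `network-online.target` must be pulled in explicitly because `After=` only orders units.

```ini
[Unit]
# Start etcd only after the network is actually usable and the clock is synced
# (targets taken from the commit message).
After=network-online.target time-sync.target
# Assumption, not stated in the commit message: After= alone does not activate
# network-online.target, so Wants= pulls it into the boot transaction.
Wants=network-online.target
```

Whether `network-online.target` really waits for an assigned IP address depends on the network manager's wait-online service being enabled on the host.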
Contrib
Scripts and files which may be useful but aren't part of the core etcd project.
- systemd - an example unit file for deploying etcd on systemd-based distributions
- raftexample - an example distributed key-value store using raft
- systemd/etcd2-backup-coreos - remote backup and restore procedures for etcd2 clusters on CoreOS Linux
- systemd/etcd3-multinode - multi-node cluster setup with systemd