From 17ceed9b47d4020e141dc3e9e3dd45828dd6d1a2 Mon Sep 17 00:00:00 2001
From: Wong Hoi Sing Edison
Date: Wed, 26 Aug 2020 11:38:04 +0800
Subject: [PATCH] `etcd.service`: Support Graceful Reboot for AIO Node

Currently our sample systemd service file
`contrib/systemd/etcd.service` has the following startup/shutdown
dependency:

    [Unit]
    After=network.target

In some rare conditions, e.g. a bare metal deployment with slow network
startup, an IP address may not be assigned early enough after a graceful
system reboot, so etcd's default `ETCD_HEARTBEAT_INTERVAL="100"` and
`ETCD_ELECTION_TIMEOUT="1000"` time out first. This causes etcd to
wrongly classify itself as unhealthy and therefore stop rejoining the
remaining online cluster members.

This PR introduces:

  - `etcd.service`: Ensure startup after `network-online.target` and
    `time-sync.target`, so effective network connectivity and synced
    time are available.

The logic is proven in concept by ; it also works as expected with a
Ceph + Kubernetes deployment by . No more deadlocks happened during
graceful system reboot, for both single-node and multi-node AIO with
loopback mount.

Also see:

  -
  -
  -
  -

Signed-off-by: Wong Hoi Sing Edison
---
 contrib/systemd/etcd.service | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/systemd/etcd.service b/contrib/systemd/etcd.service
index 787413c75..8fc0570c6 100644
--- a/contrib/systemd/etcd.service
+++ b/contrib/systemd/etcd.service
@@ -1,7 +1,8 @@
 [Unit]
 Description=etcd key-value store
 Documentation=https://github.com/etcd-io/etcd
-After=network.target
+After=network-online.target local-fs.target remote-fs.target time-sync.target
+Wants=network-online.target local-fs.target remote-fs.target time-sync.target
 
 [Service]
 User=etcd