Mirror of https://github.com/etcd-io/etcd.git (synced 2024-09-27 06:25:44 +00:00)
chore: rename 'heartbeat timeout' to 'heartbeat interval'
A heartbeat timeout is the length of time after which a missing heartbeat indicates that a node is out of service, which is different from the heartbeat interval. etcd should therefore use '-peer-heartbeat-interval' instead of '-peer-heartbeat-timeout'. '-peer-heartbeat-timeout' is deprecated but can still be used.
parent a72f913a60
commit f434177a9a
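The backward compatibility described in the commit message comes from registering the deprecated flag name against the same field as the new one (see the `LoadFlags` hunk further down). A minimal sketch of that pattern with the standard `flag` package follows; the `parseHeartbeat` helper and the flag set name are hypothetical and are not part of etcd.

```go
package main

import (
	"flag"
	"fmt"
)

// parseHeartbeat is a hypothetical helper: it parses args and returns the
// heartbeat interval in milliseconds. Both flag names write to the same
// variable, so the deprecated spelling keeps working.
func parseHeartbeat(args []string) (int, error) {
	interval := 50 // default, matching the 50ms default in the diff
	f := flag.NewFlagSet("etcd-sketch", flag.ContinueOnError)
	f.IntVar(&interval, "peer-heartbeat-interval", interval, "heartbeat interval in ms")
	f.IntVar(&interval, "peer-heartbeat-timeout", interval, "(deprecated)")
	if err := f.Parse(args); err != nil {
		return 0, err
	}
	return interval, nil
}

func main() {
	// The old flag name still sets the interval.
	v, err := parseHeartbeat([]string{"-peer-heartbeat-timeout=100"})
	fmt.Println(v, err)
}
```

If both names are passed, the last one on the command line wins in this sketch, because each simply overwrites the shared variable.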
@ -1,46 +1,46 @@
|
|||||||
## Tuning
|
## Tuning
|
||||||
|
|
||||||
The default settings in etcd should work well for installations on a local network where the average network latency is low.
|
The default settings in etcd should work well for installations on a local network where the average network latency is low.
|
||||||
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat and election timeout settings.
|
However, when using etcd across multiple data centers or over networks with high latency you may need to tweak the heartbeat interval and election timeout settings.
|
||||||
|
|
||||||
### Timeouts
|
### Time Parameters
|
||||||
|
|
||||||
The underlying distributed consensus protocol relies on two separate timeouts to ensure that nodes can handoff leadership if one stalls or goes offline.
|
The underlying distributed consensus protocol relies on two separate time parameters to ensure that nodes can handoff leadership if one stalls or goes offline.
|
||||||
The first timeout is called the *Heartbeat Timeout*.
|
The first parameter is called the *Heartbeat Interval*.
|
||||||
This is the frequency with which the leader will notify followers that it is still the leader.
|
This is the frequency with which the leader will notify followers that it is still the leader.
|
||||||
etcd batches commands together for higher throughput so this heartbeat timeout is also a delay for how long it takes for commands to be committed.
|
etcd batches commands together for higher throughput so this heartbeat interval is also a delay for how long it takes for commands to be committed.
|
||||||
By default, etcd uses a `50ms` heartbeat timeout.
|
By default, etcd uses a `50ms` heartbeat interval.
|
||||||
|
|
||||||
The second timeout is the *Election Timeout*.
|
The second parameter is the *Election Timeout*.
|
||||||
This timeout is how long a follower node will go without hearing a heartbeat before attempting to become leader itself.
|
This timeout is how long a follower node will go without hearing a heartbeat before attempting to become leader itself.
|
||||||
By default, etcd uses a `200ms` election timeout.
|
By default, etcd uses a `200ms` election timeout.
|
||||||
|
|
||||||
Adjusting these values is a trade off.
|
Adjusting these values is a trade off.
|
||||||
Lowering the heartbeat timeout will cause individual commands to be committed faster but it will lower the overall throughput of etcd.
|
Lowering the heartbeat interval will cause individual commands to be committed faster but it will lower the overall throughput of etcd.
|
||||||
If your etcd instances have low utilization then lowering the heartbeat timeout can improve your command response time.
|
If your etcd instances have low utilization then lowering the heartbeat interval can improve your command response time.
|
||||||
|
|
||||||
The election timeout should be set based on the heartbeat timeout and your network ping time between nodes.
|
The election timeout should be set based on the heartbeat interval and your network ping time between nodes.
|
||||||
Election timeouts should be at least 10 times your ping time so it can account for variance in your network.
|
Election timeouts should be at least 10 times your ping time so it can account for variance in your network.
|
||||||
For example, if the ping time between your nodes is 10ms then you should have at least a 100ms election timeout.
|
For example, if the ping time between your nodes is 10ms then you should have at least a 100ms election timeout.
|
||||||
|
|
||||||
You should also set your election timeout to at least 4 to 5 times your heartbeat timeout to account for variance in leader replication.
|
You should also set your election timeout to at least 4 to 5 times your heartbeat interval to account for variance in leader replication.
|
||||||
For a heartbeat timeout of 50ms you should set your election timeout to at least 200ms - 250ms.
|
For a heartbeat interval of 50ms you should set your election timeout to at least 200ms - 250ms.
|
||||||
|
|
||||||
You can override the default values on the command line:
|
You can override the default values on the command line:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
# Command line arguments:
|
# Command line arguments:
|
||||||
$ etcd -peer-heartbeat-timeout=100 -peer-election-timeout=500
|
$ etcd -peer-heartbeat-interval=100 -peer-election-timeout=500
|
||||||
|
|
||||||
# Environment variables:
|
# Environment variables:
|
||||||
$ ETCD_PEER_HEARTBEAT_TIMEOUT=100 ETCD_PEER_ELECTION_TIMEOUT=500 etcd
|
$ ETCD_PEER_HEARTBEAT_INTERVAL=100 ETCD_PEER_ELECTION_TIMEOUT=500 etcd
|
||||||
```
|
```
|
||||||
|
|
||||||
Or you can set the values within the configuration file:
|
Or you can set the values within the configuration file:
|
||||||
|
|
||||||
```toml
|
```toml
|
||||||
[peer]
|
[peer]
|
||||||
heartbeat_timeout = 100
|
heartbeat_interval = 100
|
||||||
election_timeout = 500
|
election_timeout = 500
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -25,24 +25,25 @@ const DefaultSystemConfigPath = "/etc/etcd/etcd.conf"
|
|||||||
|
|
||||||
// A lookup of deprecated flags to their new flag name.
|
// A lookup of deprecated flags to their new flag name.
|
||||||
var newFlagNameLookup = map[string]string{
|
var newFlagNameLookup = map[string]string{
|
||||||
"C": "peers",
|
"C": "peers",
|
||||||
"CF": "peers-file",
|
"CF": "peers-file",
|
||||||
"n": "name",
|
"n": "name",
|
||||||
"c": "addr",
|
"c": "addr",
|
||||||
"cl": "bind-addr",
|
"cl": "bind-addr",
|
||||||
"s": "peer-addr",
|
"s": "peer-addr",
|
||||||
"sl": "peer-bind-addr",
|
"sl": "peer-bind-addr",
|
||||||
"d": "data-dir",
|
"d": "data-dir",
|
||||||
"m": "max-result-buffer",
|
"m": "max-result-buffer",
|
||||||
"r": "max-retry-attempts",
|
"r": "max-retry-attempts",
|
||||||
"maxsize": "max-cluster-size",
|
"maxsize": "max-cluster-size",
|
||||||
"clientCAFile": "ca-file",
|
"clientCAFile": "ca-file",
|
||||||
"clientCert": "cert-file",
|
"clientCert": "cert-file",
|
||||||
"clientKey": "key-file",
|
"clientKey": "key-file",
|
||||||
"serverCAFile": "peer-ca-file",
|
"serverCAFile": "peer-ca-file",
|
||||||
"serverCert": "peer-cert-file",
|
"serverCert": "peer-cert-file",
|
||||||
"serverKey": "peer-key-file",
|
"serverKey": "peer-key-file",
|
||||||
"snapshotCount": "snapshot-count",
|
"snapshotCount": "snapshot-count",
|
||||||
|
"peer-heartbeat-timeout": "peer-heartbeat-interval",
|
||||||
}
|
}
|
||||||
|
|
||||||
// Config represents the server configuration.
|
// Config represents the server configuration.
|
||||||
@ -74,13 +75,13 @@ type Config struct {
|
|||||||
VeryVerbose bool `toml:"very_verbose" env:"ETCD_VERY_VERBOSE"`
|
VeryVerbose bool `toml:"very_verbose" env:"ETCD_VERY_VERBOSE"`
|
||||||
VeryVeryVerbose bool `toml:"very_very_verbose" env:"ETCD_VERY_VERY_VERBOSE"`
|
VeryVeryVerbose bool `toml:"very_very_verbose" env:"ETCD_VERY_VERY_VERBOSE"`
|
||||||
Peer struct {
|
Peer struct {
|
||||||
Addr string `toml:"addr" env:"ETCD_PEER_ADDR"`
|
Addr string `toml:"addr" env:"ETCD_PEER_ADDR"`
|
||||||
BindAddr string `toml:"bind_addr" env:"ETCD_PEER_BIND_ADDR"`
|
BindAddr string `toml:"bind_addr" env:"ETCD_PEER_BIND_ADDR"`
|
||||||
CAFile string `toml:"ca_file" env:"ETCD_PEER_CA_FILE"`
|
CAFile string `toml:"ca_file" env:"ETCD_PEER_CA_FILE"`
|
||||||
CertFile string `toml:"cert_file" env:"ETCD_PEER_CERT_FILE"`
|
CertFile string `toml:"cert_file" env:"ETCD_PEER_CERT_FILE"`
|
||||||
KeyFile string `toml:"key_file" env:"ETCD_PEER_KEY_FILE"`
|
KeyFile string `toml:"key_file" env:"ETCD_PEER_KEY_FILE"`
|
||||||
HeartbeatTimeout int `toml:"heartbeat_timeout" env:"ETCD_PEER_HEARTBEAT_TIMEOUT"`
|
HeartbeatInterval int `toml:"heartbeat_interval" env:"ETCD_PEER_HEARTBEAT_INTERVAL"`
|
||||||
ElectionTimeout int `toml:"election_timeout" env:"ETCD_PEER_ELECTION_TIMEOUT"`
|
ElectionTimeout int `toml:"election_timeout" env:"ETCD_PEER_ELECTION_TIMEOUT"`
|
||||||
}
|
}
|
||||||
strTrace string `toml:"trace" env:"ETCD_TRACE"`
|
strTrace string `toml:"trace" env:"ETCD_TRACE"`
|
||||||
GraphiteHost string `toml:"graphite_host" env:"ETCD_GRAPHITE_HOST"`
|
GraphiteHost string `toml:"graphite_host" env:"ETCD_GRAPHITE_HOST"`
|
||||||
@ -98,7 +99,7 @@ func New() *Config {
|
|||||||
c.Snapshot = true
|
c.Snapshot = true
|
||||||
c.SnapshotCount = 10000
|
c.SnapshotCount = 10000
|
||||||
c.Peer.Addr = "127.0.0.1:7001"
|
c.Peer.Addr = "127.0.0.1:7001"
|
||||||
c.Peer.HeartbeatTimeout = defaultHeartbeatTimeout
|
c.Peer.HeartbeatInterval = defaultHeartbeatInterval
|
||||||
c.Peer.ElectionTimeout = defaultElectionTimeout
|
c.Peer.ElectionTimeout = defaultElectionTimeout
|
||||||
return c
|
return c
|
||||||
}
|
}
|
||||||
@ -286,7 +287,7 @@ func (c *Config) LoadFlags(arguments []string) error {
|
|||||||
f.IntVar(&c.MaxRetryAttempts, "max-retry-attempts", c.MaxRetryAttempts, "")
|
f.IntVar(&c.MaxRetryAttempts, "max-retry-attempts", c.MaxRetryAttempts, "")
|
||||||
f.Float64Var(&c.RetryInterval, "retry-interval", c.RetryInterval, "")
|
f.Float64Var(&c.RetryInterval, "retry-interval", c.RetryInterval, "")
|
||||||
f.IntVar(&c.MaxClusterSize, "max-cluster-size", c.MaxClusterSize, "")
|
f.IntVar(&c.MaxClusterSize, "max-cluster-size", c.MaxClusterSize, "")
|
||||||
f.IntVar(&c.Peer.HeartbeatTimeout, "peer-heartbeat-timeout", c.Peer.HeartbeatTimeout, "")
|
f.IntVar(&c.Peer.HeartbeatInterval, "peer-heartbeat-interval", c.Peer.HeartbeatInterval, "")
|
||||||
f.IntVar(&c.Peer.ElectionTimeout, "peer-election-timeout", c.Peer.ElectionTimeout, "")
|
f.IntVar(&c.Peer.ElectionTimeout, "peer-election-timeout", c.Peer.ElectionTimeout, "")
|
||||||
|
|
||||||
f.StringVar(&cors, "cors", "", "")
|
f.StringVar(&cors, "cors", "", "")
|
||||||
@ -321,6 +322,7 @@ func (c *Config) LoadFlags(arguments []string) error {
|
|||||||
f.IntVar(&c.MaxRetryAttempts, "r", c.MaxRetryAttempts, "(deprecated)")
|
f.IntVar(&c.MaxRetryAttempts, "r", c.MaxRetryAttempts, "(deprecated)")
|
||||||
f.IntVar(&c.MaxClusterSize, "maxsize", c.MaxClusterSize, "(deprecated)")
|
f.IntVar(&c.MaxClusterSize, "maxsize", c.MaxClusterSize, "(deprecated)")
|
||||||
f.IntVar(&c.SnapshotCount, "snapshotCount", c.SnapshotCount, "(deprecated)")
|
f.IntVar(&c.SnapshotCount, "snapshotCount", c.SnapshotCount, "(deprecated)")
|
||||||
|
f.IntVar(&c.Peer.HeartbeatInterval, "peer-heartbeat-timeout", c.Peer.HeartbeatInterval, "(deprecated)")
|
||||||
// END DEPRECATED FLAGS
|
// END DEPRECATED FLAGS
|
||||||
|
|
||||||
if err := f.Parse(arguments); err != nil {
|
if err := f.Parse(arguments); err != nil {
|
||||||
@ -428,18 +430,18 @@ func (c *Config) Sanitize() error {
|
|||||||
// EtcdTLSInfo retrieves a TLSInfo object for the etcd server
|
// EtcdTLSInfo retrieves a TLSInfo object for the etcd server
|
||||||
func (c *Config) EtcdTLSInfo() server.TLSInfo {
|
func (c *Config) EtcdTLSInfo() server.TLSInfo {
|
||||||
return server.TLSInfo{
|
return server.TLSInfo{
|
||||||
CAFile: c.CAFile,
|
CAFile: c.CAFile,
|
||||||
CertFile: c.CertFile,
|
CertFile: c.CertFile,
|
||||||
KeyFile: c.KeyFile,
|
KeyFile: c.KeyFile,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
// PeerTLSInfo retrieves a TLSInfo object for the peer server.
|
// PeerTLSInfo retrieves a TLSInfo object for the peer server.
|
||||||
func (c *Config) PeerTLSInfo() server.TLSInfo {
|
func (c *Config) PeerTLSInfo() server.TLSInfo {
|
||||||
return server.TLSInfo{
|
return server.TLSInfo{
|
||||||
CAFile: c.Peer.CAFile,
|
CAFile: c.Peer.CAFile,
|
||||||
CertFile: c.Peer.CertFile,
|
CertFile: c.Peer.CertFile,
|
||||||
KeyFile: c.Peer.KeyFile,
|
KeyFile: c.Peer.KeyFile,
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
@ -5,5 +5,5 @@ const (
|
|||||||
defaultElectionTimeout = 200
|
defaultElectionTimeout = 200
|
||||||
|
|
||||||
// The frequency (in ms) by which heartbeats are sent to followers.
|
// The frequency (in ms) by which heartbeats are sent to followers.
|
||||||
defaultHeartbeatTimeout = 50
|
defaultHeartbeatInterval = 50
|
||||||
)
|
)
|
||||||
|
10
etcd.go
@ -109,10 +109,10 @@ func main() {
|
|||||||
serverStats := server.NewRaftServerStats(config.Name)
|
serverStats := server.NewRaftServerStats(config.Name)
|
||||||
|
|
||||||
// Calculate all of our timeouts
|
// Calculate all of our timeouts
|
||||||
heartbeatTimeout := time.Duration(config.Peer.HeartbeatTimeout) * time.Millisecond
|
heartbeatInterval := time.Duration(config.Peer.HeartbeatInterval) * time.Millisecond
|
||||||
electionTimeout := time.Duration(config.Peer.ElectionTimeout) * time.Millisecond
|
electionTimeout := time.Duration(config.Peer.ElectionTimeout) * time.Millisecond
|
||||||
dialTimeout := (3 * heartbeatTimeout) + electionTimeout
|
dialTimeout := (3 * heartbeatInterval) + electionTimeout
|
||||||
responseHeaderTimeout := (3 * heartbeatTimeout) + electionTimeout
|
responseHeaderTimeout := (3 * heartbeatInterval) + electionTimeout
|
||||||
|
|
||||||
// Create peer server
|
// Create peer server
|
||||||
psConfig := server.PeerServerConfig{
|
psConfig := server.PeerServerConfig{
|
||||||
@ -145,7 +145,7 @@ func main() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Create raft transporter and server
|
// Create raft transporter and server
|
||||||
raftTransporter := server.NewTransporter(followersStats, serverStats, registry, heartbeatTimeout, dialTimeout, responseHeaderTimeout)
|
raftTransporter := server.NewTransporter(followersStats, serverStats, registry, heartbeatInterval, dialTimeout, responseHeaderTimeout)
|
||||||
if psConfig.Scheme == "https" {
|
if psConfig.Scheme == "https" {
|
||||||
raftClientTLSConfig, err := config.PeerTLSInfo().ClientConfig()
|
raftClientTLSConfig, err := config.PeerTLSInfo().ClientConfig()
|
||||||
if err != nil {
|
if err != nil {
|
||||||
@ -158,7 +158,7 @@ func main() {
|
|||||||
log.Fatal(err)
|
log.Fatal(err)
|
||||||
}
|
}
|
||||||
raftServer.SetElectionTimeout(electionTimeout)
|
raftServer.SetElectionTimeout(electionTimeout)
|
||||||
raftServer.SetHeartbeatInterval(heartbeatTimeout)
|
raftServer.SetHeartbeatInterval(heartbeatInterval)
|
||||||
ps.SetRaftServer(raftServer)
|
ps.SetRaftServer(raftServer)
|
||||||
|
|
||||||
// Create etcd server
|
// Create etcd server
|
||||||
|
@ -44,8 +44,8 @@ Peer Communication Options:
|
|||||||
-peer-ca-file=<path> Path to the peer CA file.
|
-peer-ca-file=<path> Path to the peer CA file.
|
||||||
-peer-cert-file=<path> Path to the peer cert file.
|
-peer-cert-file=<path> Path to the peer cert file.
|
||||||
-peer-key-file=<path> Path to the peer key file.
|
-peer-key-file=<path> Path to the peer key file.
|
||||||
-peer-heartbeat-timeout=<time>
|
-peer-heartbeat-interval=<time>
|
||||||
Time (in milliseconds) for a heartbeat to timeout.
|
Time (in milliseconds) of a heartbeat interval.
|
||||||
-peer-election-timeout=<time>
|
-peer-election-timeout=<time>
|
||||||
Time (in milliseconds) for an election to timeout.
|
Time (in milliseconds) for an election to timeout.
|
||||||
|
|
||||||
|
@ -19,7 +19,7 @@ const (
|
|||||||
testClientURL = "localhost:4401"
|
testClientURL = "localhost:4401"
|
||||||
testRaftURL = "localhost:7701"
|
testRaftURL = "localhost:7701"
|
||||||
testSnapshotCount = 10000
|
testSnapshotCount = 10000
|
||||||
testHeartbeatTimeout = time.Duration(50) * time.Millisecond
|
testHeartbeatInterval = time.Duration(50) * time.Millisecond
|
||||||
testElectionTimeout = time.Duration(200) * time.Millisecond
|
testElectionTimeout = time.Duration(200) * time.Millisecond
|
||||||
)
|
)
|
||||||
|
|
||||||
@ -51,15 +51,15 @@ func RunServer(f func(*server.Server)) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Create Raft transporter and server
|
// Create Raft transporter and server
|
||||||
dialTimeout := (3 * testHeartbeatTimeout) + testElectionTimeout
|
dialTimeout := (3 * testHeartbeatInterval) + testElectionTimeout
|
||||||
responseHeaderTimeout := (3 * testHeartbeatTimeout) + testElectionTimeout
|
responseHeaderTimeout := (3 * testHeartbeatInterval) + testElectionTimeout
|
||||||
raftTransporter := server.NewTransporter(followersStats, serverStats, registry, testHeartbeatTimeout, dialTimeout, responseHeaderTimeout)
|
raftTransporter := server.NewTransporter(followersStats, serverStats, registry, testHeartbeatInterval, dialTimeout, responseHeaderTimeout)
|
||||||
raftServer, err := raft.NewServer(testName, path, raftTransporter, store, ps, "")
|
raftServer, err := raft.NewServer(testName, path, raftTransporter, store, ps, "")
|
||||||
if err != nil {
|
if err != nil {
|
||||||
panic(err)
|
panic(err)
|
||||||
}
|
}
|
||||||
raftServer.SetElectionTimeout(testElectionTimeout)
|
raftServer.SetElectionTimeout(testElectionTimeout)
|
||||||
raftServer.SetHeartbeatInterval(testHeartbeatTimeout)
|
raftServer.SetHeartbeatInterval(testHeartbeatInterval)
|
||||||
ps.SetRaftServer(raftServer)
|
ps.SetRaftServer(raftServer)
|
||||||
|
|
||||||
s := server.New(testName, "http://"+testClientURL, ps, registry, store, nil)
|
s := server.New(testName, "http://"+testClientURL, ps, registry, store, nil)
|
||||||
|