It specifies request timeout error possibly caused by connection lost,
and print out better log for user to understand.
It handles two cases:
1. the leader cannot connect to majority of cluster.
2. the connection between follower and leader is down for a while,
and it losts proposals.
log format:
```
20:04:19 etcd3 | 2015-08-25 20:04:19.368126 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
20:04:19 etcd3 | 2015-08-25 20:04:19.368227 E | etcdhttp: etcdserver:
request timed out, possibly due to connection lost
```
The request can still time out because we have set dial timeout and
read/write timeout. It increases timeout expectation from 1s to 5s,
but it makes it workable in globally-deployer cluster.
Before this PR, the timeout caused by leader election returns:
```
14:45:37 etcd2 | 2015-08-12 14:45:37.786349 E | etcdhttp: got unexpected
response error (etcdserver: request timed out)
```
After this PR:
```
15:52:54 etcd1 | 2015-08-12 15:52:54.389523 E | etcdhttp: etcdserver:
request timed out, possibly due to leader down
```
It uses heartbeat interval and election timeout to estimate the
expected request timeout.
This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
It uses heartbeat interval and election timeout to estimate the
commit timeout for internal requests.
This PR helps etcd survive under high roundtrip-time environment,
e.g., globally-deployed cluster.
Follow the simple rule in the atomic package:
"On both ARM and x86-32, it is the caller's responsibility to arrange
for 64-bit alignment of 64-bit words accessed atomically. The first word
in a global variable or in an allocated struct or slice can be relied
upon to be 64-bit aligned."
Tested on a system with /proc/cpuinfo reporting:
processor : 0
model name : ARMv7 Processor rev 1 (v7l)
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3
tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc0d
CPU revision : 1
The behavior accelarates the happen of the first-time leader election,
so the cluster could elect its leader fast. Technically, it could
help to reduce `electionMs - heartbeatMs` wait time for the first leader election.
Main usage:
1. Quick start for the local cluster when setting a little longer
election timeout
2. Quick start for the global cluster, which sets election timeout to
its maximum 50s.