Current V2 watch waits by encoding URL with wait=true.
When a client sets 'no-sync', it requests directly to
proxy and the proxy redirects it by cloning the request
object, which leads to cancel the original request when
it times out and the cloned request gets closed prematurely.
This fixes coreos#3894 by querying
the original client request in order to not use context timeout
when 'wait=true'.
Current etcd client library chooses a default destination node from
every member of a cluster in a random manner. However, requests of
write and read (for consistent results) need to be forwarded to the
leader node as the nature of Raft algorithm. If the chosen node is a
follower, additional network traffic will be caused by the forwarding
from follower to leader.
Mainly for reducing the forward traffic, this commit adds a new
mechanism for various endpoint selection mode to the client library
which can be configured with client.Config.SelectionMode.
Currently, two modes are provided:
- EndpointSelectionRandom: default, same to existing behavior (pick
a node in a random manner)
- EndpointSelectionPrioritizeLeader: prioritize leader, for the above
purpose
I evaluated the effectiveness of the EndpointSelectionPrioritizeLeader
with 4 t1.micro instances of AWS (3 nodes for etcd cluster and 1 node
for etcd client). Client executes this simple benchmark
(https://github.com/mitake/etcd-things/tree/master/prioritize-leader-bench),
just writes 10000 keys. When SelectionMode == EndpointSelectionRandom
(default), the benchmark needed 1 min and 32.102 sec to finish. When
SelectionMode == EndpointSelectionPrioritizeLeader, the benchmark
needed 1 min 4.760 sec.
If headerTimeout is not zero then two context are created but only one is released.
cancelCtx in this case is never released which leads to goroutine leak inside it.
Add HeaderTimeout field in Config, so users could set timeout for each request.
Before this, one hanged request may block the call for long time. After
this, if the network is good, the user could set short timeout and expect
that API call can attempt next available endpoint quickly.
If the body is closed to stop watching, it will ignore the error from
reading body and return context error.
Before this PR, the cancel when watching always returns error `read tcp
127.0.0.1:57824: use of closed network connection`. After this PR, it
will return expected context canceled error.