etcd/tools/functional-tester
Yicheng Qin dfc7cc7a62 tools/etcd-tester: extend timeout for stresser
Extend the timeout from 1s to defaultRequestTimeout 5s.

A 1s timeout may put an unwanted burden on the target member. If the
member is busy recovering, it has limited bandwidth for client requests.
With a short client-side timeout, the stresser retries quickly while the
earlier connections are still open. As a result, etcd queues a large
number of requests and connections and takes a long time to clear them,
which eventually causes the member health check to time out.

This is a general problem of how etcd should handle a large number of
simultaneous requests gracefully. We don't plan to address it at the
current stage.
2015-11-16 11:47:08 -08:00

etcd functional test suite

etcd functional test suite tests the functionality of an etcd cluster with a focus on failure resistance under high pressure. It sets up an etcd cluster and injects failures into the cluster by killing a process or isolating its network. It expects the etcd cluster to recover within a short amount of time after the fault is fixed.
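
The inject, fix, and wait-for-recovery cycle described above can be sketched roughly as follows. This is only an illustration, not the suite's actual code: the names Failure, runRound, waitHealthy, and ErrNotRecovered are hypothetical, the "cluster" is a plain counter, and the 200ms deadline is an arbitrary value chosen for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrNotRecovered reports that the cluster stayed unhealthy past the deadline.
var ErrNotRecovered = errors.New("cluster did not recover before the deadline")

// Failure is a hypothetical description of one fault the suite can inject.
type Failure struct {
	Name    string
	Inject  func() error // e.g. kill the process or isolate its network
	Recover func() error // undo the fault
}

// waitHealthy polls healthy until it returns true or timeout elapses.
func waitHealthy(healthy func() bool, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for !healthy() {
		if time.Now().After(deadline) {
			return ErrNotRecovered
		}
		time.Sleep(interval)
	}
	return nil
}

// runRound injects one failure, fixes it, and then requires the cluster
// to report healthy again within timeout.
func runRound(f Failure, healthy func() bool, timeout time.Duration) error {
	if err := f.Inject(); err != nil {
		return fmt.Errorf("inject %s: %w", f.Name, err)
	}
	if err := f.Recover(); err != nil {
		return fmt.Errorf("recover %s: %w", f.Name, err)
	}
	return waitHealthy(healthy, timeout, timeout/10)
}

// demo runs one round against a fake three-member cluster. When recovers
// is false the fix is a no-op, so the round is expected to time out.
func demo(recovers bool) error {
	alive := 3
	kill := Failure{
		Name:   "kill one member",
		Inject: func() error { alive--; return nil },
		Recover: func() error {
			if recovers {
				alive++
			}
			return nil
		},
	}
	healthy := func() bool { return alive == 3 }
	return runRound(kill, healthy, 200*time.Millisecond)
}

func main() {
	fmt.Println("recovering cluster:", demo(true))
	fmt.Println("stuck cluster:", demo(false))
}
```

Modeling each fault as an Inject/Recover pair keeps the round logic independent of how a particular failure is produced, so new failure cases only need to supply the two functions.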

The etcd functional test suite has two components: etcd-agent and etcd-tester. etcd-agent runs on every test machine, and etcd-tester is the single controller of the test. etcd-tester directs all the etcd-agents to start etcd clusters and simulate various failure cases.

requirements

The cluster environment must be stable enough that the etcd test suite can assume most failures are generated by the suite itself.

etcd agent

etcd agent is a daemon that runs on each machine. It can start, stop, restart, isolate, and terminate an etcd process. The agent exposes this functionality via HTTP RPC.
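
As a rough illustration of the five operations, the sketch below models an agent as a small state machine and registers it with Go's standard net/rpc package. Everything here is an assumption made for the example: the Agent type, its method names, the state names, and the allowed transitions are hypothetical, and no etcd process is actually spawned. In particular, treating "terminate" as a final state is a guess at the intended semantics.

```go
package main

import (
	"fmt"
	"net/rpc"
	"sync"
)

// Agent is a hypothetical, in-memory stand-in for etcd-agent. It tracks the
// state of one etcd process instead of actually spawning it.
type Agent struct {
	mu    sync.Mutex
	state string // "stopped", "running", "isolated", or "terminated"
}

// NewAgent returns an agent whose etcd process has not been started yet.
func NewAgent() *Agent { return &Agent{state: "stopped"} }

// transition moves the agent to next if its current state is listed in from.
func (a *Agent) transition(next string, reply *string, from ...string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	for _, s := range from {
		if a.state == s {
			a.state = next
			*reply = next
			return nil
		}
	}
	return fmt.Errorf("cannot go from %q to %q", a.state, next)
}

// The exported methods below have the shape net/rpc requires:
// func (t *T) Name(args A, reply *R) error.
// A real agent would launch etcd with new flags in Start and reuse the
// previous flags in Restart; this model treats them the same.

func (a *Agent) Start(_ string, reply *string) error {
	return a.transition("running", reply, "stopped")
}

func (a *Agent) Stop(_ string, reply *string) error {
	return a.transition("stopped", reply, "running", "isolated")
}

func (a *Agent) Restart(_ string, reply *string) error {
	return a.transition("running", reply, "stopped")
}

func (a *Agent) Isolate(_ string, reply *string) error {
	return a.transition("isolated", reply, "running")
}

func (a *Agent) Terminate(_ string, reply *string) error {
	return a.transition("terminated", reply, "stopped", "running", "isolated")
}

// newServer registers the agent's methods under the service name "Agent".
func newServer(a *Agent) (*rpc.Server, error) {
	srv := rpc.NewServer()
	if err := srv.Register(a); err != nil {
		return nil, err
	}
	return srv, nil
}

func main() {
	agent := NewAgent()
	if _, err := newServer(agent); err != nil {
		fmt.Println("register:", err)
		return
	}
	ops := []func(string, *string) error{
		agent.Start, agent.Isolate, agent.Stop, agent.Restart, agent.Terminate,
	}
	var state string
	for _, op := range ops {
		if err := op("", &state); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(state)
	}
}
```

Because *rpc.Server implements http.Handler, a daemon built along these lines could mount it with http.Handle(rpc.DefaultRPCPath, srv) and serve it with http.ListenAndServe, which is one way to get RPC over HTTP in Go.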

etcd tester

etcd functional tester controls the progress of the functional tests. It calls the RPCs of the etcd agents to simulate various test cases. For example, it can start a three-member cluster by sending three start RPC calls to three different etcd agents. It can make one of the members fail by sending a stop RPC call to one etcd agent.
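
The three-member example above might look something like the sketch below. It is a self-contained toy, not the tester's real code: the Agent stub, the "Agent.Start" and "Agent.Stop" method names, and the argument strings are all hypothetical. The agents run in-process behind local test HTTP servers (net/http/httptest) so the example needs no real machines, and the RPC transport is Go's standard net/rpc over HTTP.

```go
package main

import (
	"fmt"
	"net/http/httptest"
	"net/rpc"
	"sync"
)

// Agent is a hypothetical stub standing in for etcd-agent. It only records
// whether its etcd process would be running.
type Agent struct {
	mu    sync.Mutex
	state string
}

// Start and Stop use the method shape net/rpc requires:
// func (t *T) Name(args A, reply *R) error.
func (a *Agent) Start(flags string, reply *string) error { return a.set("running", reply) }
func (a *Agent) Stop(reason string, reply *string) error { return a.set("stopped", reply) }

func (a *Agent) set(state string, reply *string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.state = state
	*reply = state
	return nil
}

// State is a local helper for the demo; it is not exposed over RPC.
func (a *Agent) State() string {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.state
}

// demo starts a three-member cluster through three agents, stops member 0,
// and returns the state each agent ends up in.
func demo() ([]string, error) {
	var agents []*Agent
	var clients []*rpc.Client
	for i := 0; i < 3; i++ {
		agent := &Agent{state: "stopped"}
		srv := rpc.NewServer()
		if err := srv.Register(agent); err != nil {
			return nil, err
		}
		// Each agent listens on its own local HTTP endpoint.
		ts := httptest.NewServer(srv)
		defer ts.Close()

		client, err := rpc.DialHTTP("tcp", ts.Listener.Addr().String())
		if err != nil {
			return nil, err
		}
		defer client.Close()
		agents = append(agents, agent)
		clients = append(clients, client)
	}

	// The tester sends one start RPC per agent to bring up the cluster.
	for i, c := range clients {
		var reply string
		flags := fmt.Sprintf("--name=member%d", i) // illustrative only; the stub ignores it
		if err := c.Call("Agent.Start", flags, &reply); err != nil {
			return nil, fmt.Errorf("start member %d: %w", i, err)
		}
	}

	// A single stop RPC to one agent makes that member fail.
	var reply string
	if err := clients[0].Call("Agent.Stop", "simulated failure", &reply); err != nil {
		return nil, fmt.Errorf("stop member 0: %w", err)
	}

	states := make([]string, len(agents))
	for i, a := range agents {
		states[i] = a.State()
	}
	return states, nil
}

func main() {
	states, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(states)
}
```

After the run, member 0 should report "stopped" while the other two remain "running", which is the situation the tester then expects the cluster to tolerate.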