5520 Commits

Author SHA1 Message Date
Xiang Li
7314310aed Merge pull request #3233 from xiang90/srv_discovery
better dns discovery error and doc
2015-08-06 14:35:22 -07:00
Xiang Li
e9f05e8959 doc: explain srv error 2015-08-06 14:24:58 -07:00
Yicheng Qin
2c2249dadc Merge pull request #3219 from yichengq/limit-listener
etcdmain: stop accepting client conns when it reachs limit
2015-08-06 12:17:49 -07:00
Yicheng Qin
97923ca3fc etcdmain: close client conns when it exceeds limit
This solves the problem that etcd may fatal because its critical path
cannot get file descriptor resource when the number of clients is too
big. The PR lets the client listener close client connections
immediately after they are accepted when
the file descriptor usage in the process reaches some pre-set limit, so
it ensures that the internal critical path could always get file
descriptor when it needs.

When there are tons to clients connecting to the server, the original
behavior is like this:

```
2015/08/4 16:42:08 etcdserver: cannot monitor file descriptor usage
(open /proc/self/fd: too many open files)
2015/08/4 16:42:33 etcdserver: failed to purge snap file open
default2.etcd/member/snap: too many open files
[halted]
```

Current behavior is like this:

```
2015/08/6 19:05:25 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:25 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:26 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:27 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:28 transport: accept error: closing connection,
exceed file descriptor usage limitation (fd limit=874)
2015/08/6 19:05:28 etcdserver: 80% of the file descriptor limit is
used [used = 873, limit = 1024]
```

It is available at linux system today because pkg/runtime only has linux
support.
2015-08-06 12:03:20 -07:00
Xiang Li
203e0f178b etcdmian: better error for srv discovery failure 2015-08-06 11:38:53 -07:00
Xiang Li
01c286ccb6 Merge pull request #3231 from xiang90/fallocate
pkg/fileutil: support perallocate
2015-08-06 10:25:28 -07:00
Xiang Li
39a4b6a5e5 pkg/fileutil: support perallocate 2015-08-06 10:10:58 -07:00
Xiang Li
9a8607fce1 Merge pull request #3187 from yichengq/client-keep-sync
client: add KeepSync function
2015-08-06 00:16:28 -07:00
Yicheng Qin
c53b3016ae client: add AutoSync function
AutoSync provides the way for client to syncing member list from
etcd cluster automatically.
2015-08-05 13:22:56 -07:00
Xiang Li
ff0b8723c7 Merge pull request #2688 from xiang90/versioning
etcdserver: internal request union
2015-08-05 09:27:32 -07:00
Xiang Li
58503817ec etcdserver: internal request union 2015-08-05 07:47:10 -07:00
Xiang Li
487639b2d8 Merge pull request #3222 from mitake/wal-log-error
wal: log errors in wal.Close()
2015-08-04 23:19:45 -07:00
Xiang Li
9cbeffc720 Merge pull request #3224 from xiang90/fix_ls
etcdctl: ls takes / as default key arg
2015-08-04 23:15:29 -07:00
Hitoshi Mitake
ba76e27875 wal: log errors in wal.Close()
This patch adds error logging in wal.Close() if unlocking and
destroying fail. Though it is hard to handling the errors, logging
would be helpful for trouble shooting.
2015-08-05 15:03:45 +09:00
Xiang Li
9527a97720 etcdctl: ls takes / as default key arg 2015-08-04 22:56:55 -07:00
Xiang Li
718a42f408 Merge pull request #3210 from xiang90/probing
monitoring connectivity between peers
2015-08-04 16:56:31 -07:00
Yicheng Qin
0650170a1b Merge pull request #3196 from eyakubovich/fix-watch-timeout
client: handle watch timing out elegantly
2015-08-04 13:52:42 -07:00
Xiang Li
1e048b5c24 rafthttp: cleanup prober when stopping the transport 2015-08-04 17:42:51 +08:00
Xiang Li
709718ed97 godeps: update probing pkg 2015-08-04 17:40:39 +08:00
Xiang Li
0fc764200d rafthttp: monitor connection 2015-08-04 17:39:40 +08:00
Xiang Li
ff5c3469c1 Merge pull request #3197 from xiang90/health
etcdctl: cluster-health supports forever flag
2015-08-03 20:48:06 -07:00
Eugene Yakubovich
6312e22b1d client: handle empty watch responses elegantly
Even though current etcd does not time out
watches, the client could be running against
an old etcd version or the server may close
polling connection for other reasons.
This patch ignores successful (as in 200)
responses with emtpy bodies instead
of producing JSON errors.
2015-08-03 11:47:21 -07:00
Xiang Li
306085db5f Godeps: add probing dependency 2015-08-03 09:07:43 +08:00
Xiang Li
f7f00b0af6 etcdctl: cluster-health supports forever flag
cluster-health command supports checking the cluster health
forever.
2015-08-01 22:29:08 +08:00
Xiang Li
3da1df2648 Merge pull request #3207 from xiang90/rm_migration
*: remove migration related stuff from 2.2
2015-08-01 19:47:17 +08:00
Xiang Li
2b8abeb093 *: remove migration related stuff from 2.2 2015-08-01 19:37:20 +08:00
Xiang Li
eee1c8b8ee Merge pull request #3200 from xiang90/d_doc
doc: unique names must be specified when using public discovery service
2015-08-01 07:34:25 +08:00
Yicheng Qin
8bd9554338 Merge pull request #3202 from yichengq/fix-etcdctl-watch
etcdctl: fix watch -after-index parsing
2015-07-31 14:41:45 -07:00
Yicheng Qin
4a89b3f8f3 Merge pull request #3116 from offscale/master
build: implemented build shell-script for Windows
2015-07-31 11:55:42 -07:00
Xiang Li
05b2d06788 Merge pull request #3199 from xiang90/sdnotify
etcdmain: support sdnotify for readiness
2015-07-31 19:04:35 +08:00
Samuel Marks
4a0d8ee4bd build: implemented build shell-script for Windows 2015-07-31 17:43:47 +10:00
Xiang Li
0cbac56fa2 etcdmain: support sdnotify for readiness 2015-07-31 13:33:18 +08:00
Xiang Li
beeecc32b0 doc: unique names must be specified when using public discovery service 2015-07-31 09:12:44 +08:00
Barak Michener
c1c5c7c99c Merge pull request #3091 from barakmich/client_auth_cov
etcdhttp: Improve test coverage surrounding auth
2015-07-30 17:00:49 -04:00
Barak Michener
dd1a8fe330 etcdhttp: Improve test coverage surrounding auth 2015-07-30 14:21:08 -04:00
Yicheng Qin
147885078c etcdctl: fix watch -after-index parsing
It uses -after-index incorrectly now:

```
$ ./bin/etcdctl --debug watch -after-index 31 foo
Cluster-Endpoints: http://localhost:2379, http://localhost:4001
cURL Command: curl -X GET
http://localhost:2379/v2/keys/foo?recursive=false&wait=true&waitIndex=33
```

After this PR:

```
$ ./bin/etcdctl --debug watch -after-index 31 foo
Cluster-Endpoints: http://localhost:2379, http://localhost:4001
cURL Command: curl -X GET
http://localhost:2379/v2/keys/foo?recursive=false&wait=true&waitIndex=32
```
2015-07-30 11:15:43 -07:00
Yicheng Qin
219ed1695b Merge pull request #3178 from yichengq/refactor-cluster-health
etcdctl: refactor the way to check cluster health
2015-07-29 18:16:26 -07:00
Xiang Li
80b794dccc Merge pull request #3185 from xiang90/add_debug_endpoint
etcdhttp: add config/local/debug endpoint
2015-07-30 08:46:07 +08:00
Xiang Li
4e31df2c2b etcdhttp: add config/local/log endpoint
PUT on the endpoint sets the GlobalDebugLevel to json level value.
The action overwrites the origianl log level setting from
users. We need to write doc to warn this.
2015-07-30 08:35:01 +08:00
Yicheng Qin
e62a3b8a62 Merge pull request #2891 from glensc/patch-1
build: use posix shell
2015-07-29 17:15:57 -07:00
Xiang Li
ff945c7404 Merge pull request #3181 from xiang90/2.2-client-error
client: return cluster error if the etcd cluster is not avaliable
2015-07-30 08:08:09 +08:00
Yicheng Qin
f1aaa7a9e3 etcdctl: refactor the way to check cluster health
This method uses raft status exposed at /debug/varz to determine the
health of the cluster. It uses whether commit index increases to
determine the cluster health, and uses whether match index increases to
determine the member health.

This could fix the bug #2711 that fails to detect follower is unhealthy
because it doesn't rely on whether message in long-polling connection is sent.

This health check is stricter than the old one, and reflects the
situation that whether followers are healthy in the view of the leader. One
example is that if the follower is receiving the snapshot, it will turns
out to be unhealthy because it doesn't move forward.

`etcdctl cluster-health` will reflect the healthy view in the raft level,
while connectivity checks reflects the healthy view in transport level.
2015-07-29 17:06:55 -07:00
Xiang Li
a47e661fff discovery: print out detailed cluster error 2015-07-29 23:06:57 +08:00
Xiang Li
5fa8652241 client: return cluster error if the etcd cluster is not avaliable
Add a new ClusterError type. It contians all encountered errors and
return ClusterNotAvailable as the error string.
2015-07-29 22:55:15 +08:00
Yicheng Qin
6b8b507312 Merge pull request #3176 from yichengq/reject-high-election
etcdmain: reject unreasonably high values of -election-timeout
2015-07-28 10:33:58 -07:00
Yicheng Qin
ec214030d0 etcdmain: reject unreasonably high values of -election-timeout
This helps users to detect setting problem early.
2015-07-28 10:07:57 -07:00
Yicheng Qin
7831a30e46 Merge pull request #3180 from shafreeck/master
Update libraries-and-tools.md
2015-07-27 14:45:31 -07:00
Yicheng Qin
6184e271a4 Merge pull request #3164 from yichengq/pin-endpoint
client: pin itself to an endpoint that given
2015-07-27 14:35:51 -07:00
Yicheng Qin
6fc9dbfe56 Merge pull request #3114 from yichengq/clean-raft-init
etcdserver: clean up start and stop logic of raft
2015-07-27 14:19:25 -07:00
Yicheng Qin
ea2347a40f client: pin itself to an endpoint that given
1. When reset endpoints, client will choose a random endpoint to pin.
2. If the pinned endpoint is healthy, client will keep using it.
3. If the pinned endpoint becomes unhealthy, client will attempt other
endpoints and update its pin.
2015-07-27 13:36:53 -07:00