16699 Commits

Author SHA1 Message Date
K. Alex Mills
3f6e0ec94b fix: pass argument url in defer to avoid loopclosure
Because of the well-known range loop closure issue, the value of u may
have changed by the time the anonymous function mentioned in the defer
is run. To address this, the simplest fix is to pass the url used in the
loop as an argument to the function run in defer.
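
A minimal sketch of the fix described above, not the actual patched code. The function name `visit` and the cleanup body are hypothetical. Arguments to a deferred call are evaluated at the `defer` statement, so each cleanup receives its own iteration's URL. Go 1.22 and later scope the loop variable per iteration, but the explicit argument behaves the same on every Go version.

```go
package main

import "fmt"

// visit defers one cleanup per URL and returns the URLs in the order
// the deferred cleanups actually ran (hypothetical helper).
func visit(urls []string) (closed []string) {
	for _, u := range urls {
		// u is passed as an argument, so it is evaluated here, at the
		// defer statement, and not later when the function returns.
		defer func(url string) {
			closed = append(closed, url)
		}(u)
	}
	return closed
}

func main() {
	// Deferred calls run last-in, first-out.
	fmt.Println(visit([]string{"http://a", "http://b", "http://c"}))
	// prints [http://c http://b http://a]
}
```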
2020-11-19 15:29:26 -06:00
Gyuho Lee
b5cefb5b3d
Merge pull request #12392 from ironcladlou/fixture-mutations
tests: prevent cross-test contamination via shared state
2020-11-19 10:05:42 -08:00
Dan Mace
9571325fe8 etcdserver: fix incorrect metrics generated when clients cancel watches
Before this patch, a client canceling the context for a watch caused the
server to generate a `rpctypes.ErrGRPCNoLeader` error, which led to the recording of
a gRPC `Unavailable` metric in association with the client watch cancellation.
The metric looks like this:

    grpc_server_handled_total{grpc_code="Unavailable",grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}

So, the watch server has misidentified the error as a server error and then
propagates the mistake to metrics, leading to a false indicator that the leader
has been lost. This false signal then leads to false alerting.

The commit 9c103dd0dedfc723cd4f33b6a5e81343d8a6bae7 introduced an interceptor which wraps
watch streams requiring a leader, causing those streams to be actively canceled
when leader loss is detected.

However, the error handling code assumes all stream context cancellations are
from the interceptor. This assumption is broken when the context was canceled
because of a client stream cancellation.

The core challenge is lack of information conveyed via `context.Context` which
is shared by both the send and receive sides of the stream handling and is
subject to cancellation by all paths (including the gRPC library itself). If any
piece of the system cancels the shared context, there's no way for a context
consumer to understand who cancelled the context or why.

To solve the ambiguity of the stream interceptor code specifically, this patch
introduces a custom context struct which the interceptor uses to expose a custom
error through the context when the interceptor decides to actively cancel a
stream. Now the consuming side can more safely assume a generic context
cancellation can be propagated as a cancellation, and the server generated
leader error is preserved and propagated normally without any special inference.

When a client cancels the stream, there remains a race in the error handling
code between the send and receive goroutines whereby the underlying gRPC error
is lost in the case where the send path returns and is handled first, but this
issue can be addressed separately because, no matter which path wins, we can detect a
generic cancellation.

This is a replacement of https://github.com/etcd-io/etcd/pull/11375.

Fixes #10289, #9725, #9576, #9166
2020-11-18 17:02:09 -05:00
Jingyi Hu
c11ddc65ce
Merge pull request #12448 from agargi/introduce_config_parameter
server: Added config parameter experimental-apply-warning-duration
2020-11-19 02:29:08 +08:00
Ankur Gargi
c1c681adc3 server: Added config parameter experimental-warning-apply-duration 2020-11-17 17:33:19 -05:00
Sam Batschelet
06e48f0486
Merge pull request #12476 from hexfusion/mixin-typo
Documentation/etcd-mixin: fix typo
2020-11-15 19:19:35 -05:00
Sam Batschelet
07c15890ab Documentation/etcd-mixin: fix typo
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2020-11-15 12:26:18 -05:00
Gyuho Lee
1b8d2b1a47
Merge pull request #12452 from ptabor/20201104-release-mod-scripts
Release scripts for modules
2020-11-14 03:42:42 -08:00
Gyuho Lee
dc586a5ad2
Merge pull request #12459 from jingyih/proper_request_cancellation
server: proper cancellation for range request
2020-11-13 12:20:41 -08:00
spacewander
67f040f921 Update other Documentation/v2 links 2020-11-11 09:57:01 +08:00
spacewander
f2eb15a81b chore: update the documentation link in the comment
Close #12462.
2020-11-11 09:53:18 +08:00
Jingyi Hu
01844fd285
Merge pull request #12455 from mborsz/metrics
Add etcd_server_apply_duration_seconds
2020-11-10 00:47:11 +08:00
Jingyi Hu
718e1a7d89
Merge pull request #12451 from jingyih/update_metrics_doc
Documentation: add generated metrics docs
2020-11-09 23:23:23 +08:00
Maciej Borsz
0bea7df7c1 Add metric tracking apply method duration:
* etcd_server_apply_duration_seconds

It can be used to understand which operations are slow,
in addition to the warning log message.
2020-11-06 11:11:16 +01:00
jingyih
0558e379c3 server: proper request cancellation for range 2020-11-05 21:30:02 -08:00
Piotr Tabor
eeafcef0d2 Use "v3.5.0-pre" to reference within-etcd modules
instead of v3.0.0-000101010000000-00000000000,
that might be misleading as we don't develop etcd v3.0.0 any longer.

This version is a virtual version and is not supposed to be tagged
within the repository. We should tag real versions like: 3.5.0-alpha.0.

Please note that go.etcd.io/etcd/client/v2 will be versioned as `v2.305.0-pre`.
The reason is that client v2 must have a v2 version. I propose a
convention to encode the major version as 100x in the minor version to make
the association with the underlying repository clear while staying within the v2
version family.
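
As a worked illustration of the proposed convention (a sketch only; the helper name `clientV2Version` is hypothetical and the release scripts may compute this differently), the etcd major version is folded into the minor as `major*100 + minor`, which assumes the etcd minor version stays below 100:

```go
package main

import "fmt"

// clientV2Version maps an etcd release version onto the v2 module
// family by folding the major version into the minor version.
func clientV2Version(major, minor, patch int, pre string) string {
	v := fmt.Sprintf("v2.%d.%d", major*100+minor, patch)
	if pre != "" {
		v += "-" + pre
	}
	return v
}

func main() {
	fmt.Println(clientV2Version(3, 5, 0, "pre")) // v2.305.0-pre
}
```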

The change was generated using:
```
DRY_RUN=false TARGET_VERSION="v3.5.0-pre" ./scripts/release_mod.sh update_versions
```
2020-11-04 18:28:43 +01:00
jingyih
b33c6c088e Documentation: add metrics docs 2020-11-04 22:20:37 +08:00
Piotr Tabor
fd2f34fd13 Release: Scripts to change versions in all go.mod files and push tags upstream.
Example invocations:

Edit go.mod files so that all etcd modules point to the given version:

```
% DRY_RUN=false TARGET_VERSION="v3.5.13" ./scripts/release_mod.sh update_versions
```

Tag the latest commit with the current version number for all modules and push upstream:
```
% DRY_RUN=true REMOTE_REPO="origin" ./scripts/release_mod.sh push_mod_tags
```
2020-11-04 15:16:36 +01:00
Piotr Tabor
6e800b9b01
20201103 no commit title check (#12447)
* Turn off checking of format of commit message.

* scripts/fix.sh: Fix fixing whitespaces in *.sh scripts

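Apparently there is a difference between:
  find ./ -print0 -name *.sh and
  find ./ -name *.sh -print0
find evaluates its expression left to right, so the first form prints
every path before the -name test is applied.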

* etcdserver unit tests: Do not call .Fatalf(...) from a non-test goroutine.

Fixes the following test failures:
https://travis-ci.com/github/etcd-io/etcd/jobs/425920416
```
% (cd server && go vet ./...)
stderr: # go.etcd.io/etcd/server/v3/etcdserver
stderr: etcdserver/server_test.go:1002:4: call to (*T).Fatalf from a non-test goroutine
stderr: etcdserver/server_test.go:1166:4: call to (*T).Fatalf from a non-test goroutine
FAIL: (code:2):
  % (cd server && go vet ./...)
FAIL: 'run go vet ./...' checking failed (!=0 return code)
FAIL: 'govet' failed at Tue Nov  3 04:07:47 UTC 2020
```
2020-11-03 07:59:42 -08:00
Jingyi Hu
64e048bea9
Merge pull request #12444 from kolyshkin/fix-lock
pkg/fileutil: fix F_OFD_ constants
2020-11-03 23:31:22 +08:00
Jingyi Hu
8c3c398676
Merge pull request #12437 from cfc4n/down_gobin_noexist
scripts: install github.com/myitcv/gobin while gobin doesn't exist.
2020-11-03 23:23:14 +08:00
Jingyi Hu
cd09726ae0
Merge pull request #12430 from meadlai/master
Fix go get cmd
2020-11-03 22:52:24 +08:00
Jingyi Hu
f224fa4e42
Merge pull request #12425 from viviyww/cluster-set-version
etcdserver: updated cluster version
2020-11-03 22:41:24 +08:00
Kir Kolyshkin
4eb4250e6d pkg/fileutil: fix F_OFD_ constants
Use golang.org/x/sys/unix for F_OFD_* constants.

This fixes the issue that F_OFD_GETLK was defined incorrectly,
resulting in bugs such as https://github.com/moby/moby/issues/31182

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2020-11-02 19:37:25 -08:00
Gyuho Lee
86185ba20f
Merge pull request #12443 from cfc4n/makefile_xargs_r
Makefile: -r is only necessary on GNU xargs.
2020-11-02 10:51:09 -08:00
CFC4N
72ebd50d8a
Makefile: -r is only necessary on GNU xargs. 2020-11-02 16:48:48 +08:00
Gyuho Lee
170af891d6
Merge pull request #12429 from tangcong/fix-cert-exp
*: add self-signed-cert-validity flag to fix cert expire issue
2020-10-30 11:41:23 -07:00
CFC4N
2e55875cc7
scripts: install github.com/myitcv/gobin while gobin doesn't exist. 2020-10-30 20:58:00 +08:00
tangcong
0b4b5d84c6 CHANGELOG: update for 12429 2020-10-30 10:10:30 +08:00
tangcong
8fd24f51c3 documentation: add certificates expired note 2020-10-30 10:10:30 +08:00
tangcong
a960d6b1c7 *: add self-signed-cert-validity flag 2020-10-30 10:10:26 +08:00
meadlai
ec37e15caf
Update README.md 2020-10-28 14:34:03 +08:00
yangweiwei
aa1024a16e etcdserver: updated cluster version
During a cluster version update in an etcd cluster, the log should report
the change from XX to XX.
2020-10-27 16:32:40 +08:00
luqi
ed81d2e2db client: replace dial with dialContext 2020-10-27 12:43:14 +08:00
Gyuho Lee
7da5182f1d
Merge pull request #12422 from tangcong/fix-realpath
scripts: fix realpath command not found in mac os
2020-10-26 10:42:26 -07:00
Jingyi Hu
ae7862e8bc
Merge pull request #12417 from ptabor/20201020-server-module
Modularization: Make ./etcd server a module
2020-10-27 01:27:08 +08:00
tangcong
8277395e1b scripts: use manual scripts to replace realpath 2020-10-27 00:36:28 +08:00
Piotr Tabor
aaf423e962 server: Update imports.
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
Piotr Tabor
6c1efd6ba5 server: Update go.mod 2020-10-26 13:02:32 +01:00
Piotr Tabor
4a5e9d1261 server: Move server files to 'server' directory.
git mv mvcc wal auth etcdserver etcdmain proxy embed/ lease/ server
git mv go.mod go.sum server
2020-10-26 12:57:19 +01:00
Gyuho Lee
eee8dec0c3
Merge pull request #12421 from ptabor/20201026-fix-ws-shell
Unify tabs vs. spaces in the shell scripts
2020-10-26 04:42:59 -07:00
Gyuho Lee
bc3a77d298
Merge pull request #12099 from YoyinZyc/downgrade-httphandler
[Etcd downgrade] Add http handler to enable downgrade info communication between each member
2020-10-26 04:42:24 -07:00
Piotr Tabor
0ba16d8ee1 *: Convert tabulators to whitespaces in bash scripts.
Produced by running `./scripts/fix.sh`, which executed:
```
find ./ -name '*.sh' | xargs sed --follow-symlinks -i 's|\t|  |g'
```
2020-10-26 10:59:40 +01:00
Piotr Tabor
c035df5317 test: Detect indention done using tab (\t) in *.sh 2020-10-26 10:59:40 +01:00
Gyuho Lee
8fc5ef4a03
Merge pull request #12418 from ptabor/20201023-fix-flaky
./pkg/testutil: wait for: (*watchGrpcStream).sendCloseSubstream(...) goroutines.
2020-10-24 11:53:10 -07:00
Ankur Gargi
8866d55b9b
command: Enhance health command to check if there are any active alarms (#12150) 2020-10-22 15:55:15 -07:00
Piotr Tabor
f2ee15a1e1 ./pkg/testutil: wait for: (*watchGrpcStream).sendCloseSubstream(...) goroutines.
Should solve the problem of flakes documented here:
https://github.com/etcd-io/etcd/issues/12372#issuecomment-706337969

```
% (cd tests && env go test -short -timeout=3m -cpu=4 --race=true ./... -p=2)

Unexpected goroutines running after all test(s).
1 instances of:
go.etcd.io/etcd/v3/clientv3.(*watchGrpcStream).sendCloseSubstream(...)
	/go/src/go.etcd.io/etcd/clientv3/watch.go:464 +0x204
created by go.etcd.io/etcd/v3/clientv3.(*watchGrpcStream).closeSubstream
	/go/src/go.etcd.io/etcd/clientv3/watch.go:480 +0x21f
FAIL	go.etcd.io/etcd/tests/v3/integration/clientv3/examples	2.111s
```

The goroutine finishes automatically after a timeout of 250ms. The change
makes the test wait for it if it still exists.

Examples:
  https://travis-ci.com/github/etcd-io/etcd/jobs/397449189
  https://travis-ci.com/github/etcd-io/etcd/jobs/397532784
  https://travis-ci.com/github/etcd-io/etcd/jobs/397696506
  https://travis-ci.com/github/etcd-io/etcd/jobs/403603526
2020-10-22 14:23:08 +02:00
Jingyi Hu
97354af44b
Merge pull request #12411 from ptabor/20201021-move-contrib-recipies
Modularization: Move contrib/recipies to clientv3/experimental/recipies/...
2020-10-22 17:40:05 +08:00
Piotr Tabor
45b007b8b4 contrib,clientv3: Move contrib/recipies to clientv3/experimental/recipies/...
Recipies is a set of pattern and primitive implementations built on top of clientv3.
It is used by integration tests and shouldn't be considered "server" code.
2020-10-22 11:10:07 +02:00
Jingyi Hu
41557d9330
Merge pull request #12404 from ptabor/20201020-etcdctl-module
Modularization: etcdctl as a module
2020-10-21 21:48:38 +08:00