573 Commits

Author SHA1 Message Date
Maksim Buldukyan
7e38cfcc8d raft: makes 'ConnReadTimeout/ConnWriteTimeout' customizable 2021-02-10 10:36:50 +07:00
Piotr Tabor
a836a8045b Get rid of legacy client/v3/naming API.
Update grpcproxy to use the new abstractions.
2021-02-09 11:56:28 +01:00
Chao Chen
2ae3e82f07 etcdserver/api/etcdhttp: log successful etcd server side health check in debug level
When we have an external component that checks /health periodically, the
etcd server logs can be quite verbose (e.g., DDOS-ing against insure
etcd health check can lead to disk space full due to large log files).

This change was introduced in #11704.

While we keep the warning logs for etcd health check failures, the
success (or OK) log level should be set to DEBUG.

Fixes #12676
2021-02-08 17:15:43 -08:00
Piotr Tabor
0b75fede64 Replace client/v3/balancer with standard components: resolver + round_robin LB
This commit significantly reduces volume of custom code
in etcd client v3, while preserving full existing functionality.
2021-02-08 18:50:31 +01:00
Brad Davidson
603d975599 Fix cluster peer HTTP SRV discovery
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2021-02-03 03:08:13 -08:00
Yanhao Mo
6d82778a4e etcdserver: export method EtcdServer.leaderChangedNotify (#12378) 2021-02-02 18:13:32 +08:00
Piotr Tabor
d6d03beaea
Merge pull request #12538 from lzhfromustc/12_9_GoroutineLeak
test: change channel operations to avoid potential goroutine leaks
2021-02-01 21:16:43 +01:00
Piotr Tabor
958f6f9878
Merge pull request #12481 from kalexmills/fix-defer-log
fix: pass argument url in defer to avoid loopclosure
2021-01-31 23:20:32 +01:00
Piotr Tabor
1a890a4659
Merge branch 'master' into update 2021-01-16 09:35:05 +01:00
Piotr Tabor
58f78df1de Raft: Expand raft documentation, in particular point on godocs. 2021-01-15 12:34:02 +01:00
Sahdev Zala
69e99e80fa
Merge pull request #12465 from spacewander/fdoc
chore: update the documentation link in the comment
2021-01-14 00:39:25 -05:00
Jingyi Hu
bfc6e2ff30
Merge pull request #12611 from ptabor/20210111-fix-flakes
e2e tests flakes & leaks fixes: In particular TestIssue6361
2021-01-12 21:26:54 +08:00
Piotr Tabor
0d9cfc11c8 Fix usage of reflect.SliceHeader: reported by vet on tip golang
Example:
https://travis-ci.com/github/etcd-io/etcd/jobs/470404938

```
% (cd server && go vet ./...)
stderr: # go.etcd.io/etcd/server/v3/etcdserver/api/v2store
stderr: etcdserver/api/v2store/node_extern_test.go:107:9: possible misuse of reflect.SliceHeader
stderr: etcdserver/api/v2store/node_extern_test.go:107:16: possible misuse of reflect.SliceHeader
```
2021-01-12 00:14:51 +01:00
Piotr Tabor
74274f4417 e2e: Adding better diagnostic and location for temporary files to Snapshot tests. 2021-01-12 00:14:51 +01:00
Piotr Tabor
8ccd4e1146 Fix flaky tests reported due to data race on grpc logging registration.
Example:
```
    ==================
    WARNING: DATA RACE
    Write at 0x000002178320 by goroutine 575:
      google.golang.org/grpc/grpclog.SetLoggerV2()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/grpclog/loggerv2.go:70 +0x444
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging.func1.1()
          /home/ptab/corp/etcd/server/embed/config_logging.go:119 +0x345
      sync.(*Once).doSlow()
          /usr/lib/google-golang/src/sync/once.go:66 +0x109
      sync.(*Once).Do()
          /usr/lib/google-golang/src/sync/once.go:57 +0x68
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging.func1()
          /home/ptab/corp/etcd/server/embed/config_logging.go:109 +0x3b1
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging()
          /home/ptab/corp/etcd/server/embed/config_logging.go:174 +0x6af
      go.etcd.io/etcd/server/v3/embed.(*Config).Validate()
          /home/ptab/corp/etcd/server/embed/config.go:553 +0x55
      go.etcd.io/etcd/server/v3/embed.StartEtcd()
          /home/ptab/corp/etcd/server/embed/etcd.go:93 +0x84
      go.etcd.io/etcd/tests/v3/integration.TestKVWithEmptyValue()
          /home/ptab/corp/etcd/tests/integration/v3_kv_test.go:33 +0x18c
      testing.tRunner()
          /usr/lib/google-golang/src/testing/testing.go:1123 +0x202

    Previous read at 0x000002178320 by goroutine 956:
      [failed to restore the stack]

    Goroutine 575 (running) created at:
      testing.(*T).Run()
          /usr/lib/google-golang/src/testing/testing.go:1168 +0x5bb
      testing.runTests.func1()
          /usr/lib/google-golang/src/testing/testing.go:1441 +0xa6
      testing.tRunner()
          /usr/lib/google-golang/src/testing/testing.go:1123 +0x202
      testing.runTests()
          /usr/lib/google-golang/src/testing/testing.go:1439 +0x612
      testing.(*M).Run()
          /usr/lib/google-golang/src/testing/testing.go:1347 +0x3c4
      go.etcd.io/etcd/pkg/v3/testutil.MustTestMainWithLeakDetection()
          /home/ptab/corp/etcd/pkg/testutil/leak.go:150 +0x38
      go.etcd.io/etcd/tests/v3/integration.TestMain()
          /home/ptab/corp/etcd/tests/integration/main_test.go:14 +0x272
      main.main()
          _testmain.go:349 +0x269

    Goroutine 956 (finished) created at:
      google.golang.org/grpc/internal/transport.newHTTP2Server()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/internal/transport/http2_server.go:288 +0x18a4
      google.golang.org/grpc/internal/transport.NewServerTransport()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/internal/transport/transport.go:534 +0x2f5
      google.golang.org/grpc.(*Server).newHTTP2Transport()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:726 +0x2ca
      google.golang.org/grpc.(*Server).handleRawConn()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:693 +0x60f
      google.golang.org/grpc.(*Server).Serve.func3()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:663 +0x4c
    ==================
    ...
    {"level":"info","ts":"2021-01-09T22:21:04.550+0100","caller":"embed/etcd.go:330","msg":"closed etcd server","name":"default","data-dir":"/tmp/etcd-017337431","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"]}
    --- FAIL: TestKVWithEmptyValue (1.08s)
        v3_kv_test.go:62: my-namespace/foobar = data
        v3_kv_test.go:62: my-namespace/foobar1 = data
        v3_kv_test.go:62: namespace/foobar1 = data
        v3_kv_test.go:72: foobar = data
        v3_kv_test.go:72: foobar1 = data
        v3_kv_test.go:87: delete keys:2
        testing.go:1038: race detected during execution of test
```
2021-01-11 10:06:31 +01:00
Dan Lorenc
5b90402082 Switch from dgrijalva/jwt-go to form3tech-oss/jwt-go.
dgrijalva/jwt-go has been abandoned and contains several serious
security issues. Most projects are now switching to the form3tech fork.

See https://snyk.io/vuln/SNYK-GOLANG-GITHUBCOMDGRIJALVAJWTGO-596515 for
info on the issues.

Signed-off-by: Dan Lorenc <dlorenc@google.com>
2021-01-10 08:04:20 -06:00
Piotr Tabor
23340bb62a Refresh proto generation script after moving modules files.
With modulatiozation server protos get moved into ./server directory,
but it was not reflected in scripts/genproto.sh.
2021-01-08 16:33:12 +01:00
mlmhl
ebf461a7de backend: fix buffer range bug 2020-12-30 16:05:42 +08:00
lzhfromustc
f2a912a4e6 test: change channel operations to avoid potential goroutine leaks
In these unit tests, goroutines may leak if certain branches are chosen. This commit edits channel operations and buffer sizes, so no matter what branch is chosen, the test will end correctly. This commit doesn't change the semantics of unit tests.
2020-12-09 22:23:21 -05:00
K. Alex Mills
3f6e0ec94b fix: pass argument url in defer to avoid loopclosure
Because of the well-known range loop closure issue, the value of u may
have changed by the time the anonymous function mentioned in the defer
is run. To address this, the simplest fix is to pass the url used in the
loop as an argument to the function run in defer.
2020-11-19 15:29:26 -06:00
Dan Mace
9571325fe8 etcdserver: fix incorrect metrics generated when clients cancel watches
Before this patch, a client which cancels the context for a watch results in the
server generating a `rpctypes.ErrGRPCNoLeader` error that leads the recording of
a gRPC `Unavailable` metric in association with the client watch cancellation.
The metric looks like this:

    grpc_server_handled_total{grpc_code="Unavailable",grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}

So, the watch server has misidentified the error as a server error and then
propagates the mistake to metrics, leading to a false indicator that the leader
has been lost. This false signal then leads to false alerting.

The commit 9c103dd0dedfc723cd4f33b6a5e81343d8a6bae7 introduced an interceptor which wraps
watch streams requiring a leader, causing those streams to be actively canceled
when leader loss is detected.

However, the error handling code assumes all stream context cancellations are
from the interceptor. This assumption is broken when the context was canceled
because of a client stream cancelation.

The core challenge is lack of information conveyed via `context.Context` which
is shared by both the send and receive sides of the stream handling and is
subject to cancellation by all paths (including the gRPC library itself). If any
piece of the system cancels the shared context, there's no way for a context
consumer to understand who cancelled the context or why.

To solve the ambiguity of the stream interceptor code specifically, this patch
introduces a custom context struct which the interceptor uses to expose a custom
error through the context when the interceptor decides to actively cancel a
stream. Now the consuming side can more safely assume a generic context
cancellation can be propagated as a cancellation, and the server generated
leader error is preserved and propagated normally without any special inference.

When a client cancels the stream, there remains a race in the error handling
code between the send and receive goroutines whereby the underlying gRPC error
is lost in the case where the send path returns and is handled first, but this
issue can be taken separately as no matter which paths wins, we can detect a
generic cancellation.

This is a replacement of https://github.com/etcd-io/etcd/pull/11375.

Fixes #10289, #9725, #9576, #9166
2020-11-18 17:02:09 -05:00
Ankur Gargi
c1c681adc3 server: Added config parameter experimental-warning-apply-duration 2020-11-17 17:33:19 -05:00
Gyuho Lee
1b8d2b1a47
Merge pull request #12452 from ptabor/20201104-release-mod-scripts
Release scripts for modules
2020-11-14 03:42:42 -08:00
Gyuho Lee
dc586a5ad2
Merge pull request #12459 from jingyih/proper_request_cancellation
server: proper cancellation for range request
2020-11-13 12:20:41 -08:00
spacewander
67f040f921 Update other Documentation/v2 links 2020-11-11 09:57:01 +08:00
spacewander
f2eb15a81b chore: update the documentation link in the comment
Close #12462.
2020-11-11 09:53:18 +08:00
Maciej Borsz
0bea7df7c1 Add metric tracking apply method duration:
* etcd_server_apply_duration_seconds

It can be used to understand which operations are slow,
in addition to the warning log message.
2020-11-06 11:11:16 +01:00
jingyih
0558e379c3 server: proper request cancellation for range 2020-11-05 21:30:02 -08:00
Piotr Tabor
eeafcef0d2 Use "v3.5.0-pre" to reference within-etcd modules
instead of v3.0.0-000101010000000-00000000000,
that might be misleading as we don't develop etcd v3.0.0 any longer.

This version is a virtual version and is not supposed to be tagged
within the repository. We should tag real versions like: 3.5.0-alpha.0.

Please notice that go.etcd.io/etcd/client/v2 will be versioned as `v2.305.0-pre`.
The reason is that client v2 must have v2 version. I propose a
convention to envode the major version as 100x in minor version to make
the association to the underlying repository clear, staying within v2
version family.

The change was generated using:
```
DRY_RUN=false TARGET_VERSION="v3.5.0-pre" ./scripts/release_mod.sh update_versions
```
2020-11-04 18:28:43 +01:00
Piotr Tabor
6e800b9b01
20201103 no commit title check (#12447)
* Turn off checking of format of commit message.

* scripts/fix.sh: Fix fixing whitespaces in *.sh scripts

Aparently there is a difference between:
  find ./ -print0 -name *.sh and
  find ./ -name *.sh -print0

* etcdserver unit tests: Do not call .Fatalf(...) from not test's goroutine.

Fixes following test failures:
https://travis-ci.com/github/etcd-io/etcd/jobs/425920416
```
% (cd server && go vet ./...)
stderr: # go.etcd.io/etcd/server/v3/etcdserver
stderr: etcdserver/server_test.go:1002:4: call to (*T).Fatalf from a non-test goroutine
stderr: etcdserver/server_test.go:1166:4: call to (*T).Fatalf from a non-test goroutine
FAIL: (code:2):
  % (cd server && go vet ./...)
FAIL: 'run go vet ./...' checking failed (!=0 return code)
FAIL: 'govet' failed at Tue Nov  3 04:07:47 UTC 2020
```
2020-11-03 07:59:42 -08:00
Jingyi Hu
f224fa4e42
Merge pull request #12425 from viviyww/cluster-set-version
etcdserver: updated cluster version
2020-11-03 22:41:24 +08:00
tangcong
a960d6b1c7 *: add self-signed-cert-validity flag 2020-10-30 10:10:26 +08:00
yangweiwei
aa1024a16e etcdserver: updated cluster version
during cluster version update in etcd cluster, the log should info from
XX to XX.
2020-10-27 16:32:40 +08:00
Piotr Tabor
aaf423e962 server: Update imports.
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
Piotr Tabor
6c1efd6ba5 server: Update go.mod 2020-10-26 13:02:32 +01:00
Piotr Tabor
4a5e9d1261 server: Move server files to 'server' directory.
26  git mv mvcc wal auth etcdserver etcdmain proxy embed/ lease/ server
   36  git mv go.mod go.sum server
2020-10-26 12:57:19 +01:00
Piotr Tabor
e62417297d *: Rename of imports of raft (as its now a module)
% find -name '*.go' -o -name '*.md' -o -name '*.sh' | xargs sed -i --follow-symlinks 's|etcd/v3/raft|etcd/raft/v3|g'
2020-10-16 13:58:18 +02:00
Piotr Tabor
de55bb6331 pkg: Rename imports after making 'pkg' a module
find -name '*.go' | xargs sed --follow-symlinks -i 's|go.etcd.io/etcd/v3/pkg/|go.etcd.io/etcd/pkg/v3/|g'
go fmt ./...
2020-10-13 00:09:27 +02:00
Piotr Tabor
28036db6f0 Mechanical: Move mock packages out of ./pkg
Mechanical execution of:

```
git mv ./pkg/mock/mockserver ./client/pkg/mock
git mv ./pkg/mock/{mockstorage,mockstore,mockwait} ./server/pkg/mock
```

The packages depend on etcd API / protos - so they are NOT etcd-dependencies.
In such situation thay should be placed in 'pkg' folder.
2020-10-12 23:58:09 +02:00
Xiang Li
4296cd3fa4 *: remove old server 2014-09-03 09:20:10 -07:00
Blake Mizerany
0881021e54 all config -> cfg 2014-09-03 09:20:07 -07:00
Blake Mizerany
d77773acb3 server: ignore server in build/tests 2014-09-03 09:20:06 -07:00
Xiang Li
44836d9099 etcd: move server/usage.go to etcd/v2_usage.go 2014-09-03 09:19:49 -07:00
Xiang Li
b8d71dfe70 v2: remove old tests 2014-09-03 09:19:49 -07:00
Yicheng Qin
02ced2c2d7 v1: deprecate v1 support
Etcd moves to 0.5 without the support of v1.
2014-09-03 09:19:49 -07:00
Xiang Li
8d758be3e4 server: remove unused file 2014-09-03 09:05:15 -07:00
Xiang Li
02c854717b config: make config a self-contained pkg 2014-09-03 09:05:13 -07:00
Yicheng Qin
c9db87a302 server: bump to 0.4.6+git 2014-07-29 10:33:44 -07:00
Yicheng Qin
49e0dff2b8 CHANGELOG: v0.4.6 2014-07-29 10:31:48 -07:00
Yicheng Qin
a884f2a18a Merge pull request #881 from unihorn/111
standby server: save Running info correctly
2014-07-24 17:29:25 -07:00