508 Commits

Author SHA1 Message Date
Piotr Tabor
8ccd4e1146 Fix flaky tests reported due to data race on grpc logging registration.
Example:
```
    ==================
    WARNING: DATA RACE
    Write at 0x000002178320 by goroutine 575:
      google.golang.org/grpc/grpclog.SetLoggerV2()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/grpclog/loggerv2.go:70 +0x444
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging.func1.1()
          /home/ptab/corp/etcd/server/embed/config_logging.go:119 +0x345
      sync.(*Once).doSlow()
          /usr/lib/google-golang/src/sync/once.go:66 +0x109
      sync.(*Once).Do()
          /usr/lib/google-golang/src/sync/once.go:57 +0x68
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging.func1()
          /home/ptab/corp/etcd/server/embed/config_logging.go:109 +0x3b1
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging()
          /home/ptab/corp/etcd/server/embed/config_logging.go:174 +0x6af
      go.etcd.io/etcd/server/v3/embed.(*Config).Validate()
          /home/ptab/corp/etcd/server/embed/config.go:553 +0x55
      go.etcd.io/etcd/server/v3/embed.StartEtcd()
          /home/ptab/corp/etcd/server/embed/etcd.go:93 +0x84
      go.etcd.io/etcd/tests/v3/integration.TestKVWithEmptyValue()
          /home/ptab/corp/etcd/tests/integration/v3_kv_test.go:33 +0x18c
      testing.tRunner()
          /usr/lib/google-golang/src/testing/testing.go:1123 +0x202

    Previous read at 0x000002178320 by goroutine 956:
      [failed to restore the stack]

    Goroutine 575 (running) created at:
      testing.(*T).Run()
          /usr/lib/google-golang/src/testing/testing.go:1168 +0x5bb
      testing.runTests.func1()
          /usr/lib/google-golang/src/testing/testing.go:1441 +0xa6
      testing.tRunner()
          /usr/lib/google-golang/src/testing/testing.go:1123 +0x202
      testing.runTests()
          /usr/lib/google-golang/src/testing/testing.go:1439 +0x612
      testing.(*M).Run()
          /usr/lib/google-golang/src/testing/testing.go:1347 +0x3c4
      go.etcd.io/etcd/pkg/v3/testutil.MustTestMainWithLeakDetection()
          /home/ptab/corp/etcd/pkg/testutil/leak.go:150 +0x38
      go.etcd.io/etcd/tests/v3/integration.TestMain()
          /home/ptab/corp/etcd/tests/integration/main_test.go:14 +0x272
      main.main()
          _testmain.go:349 +0x269

    Goroutine 956 (finished) created at:
      google.golang.org/grpc/internal/transport.newHTTP2Server()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/internal/transport/http2_server.go:288 +0x18a4
      google.golang.org/grpc/internal/transport.NewServerTransport()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/internal/transport/transport.go:534 +0x2f5
      google.golang.org/grpc.(*Server).newHTTP2Transport()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:726 +0x2ca
      google.golang.org/grpc.(*Server).handleRawConn()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:693 +0x60f
      google.golang.org/grpc.(*Server).Serve.func3()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:663 +0x4c
    ==================
    ...
    {"level":"info","ts":"2021-01-09T22:21:04.550+0100","caller":"embed/etcd.go:330","msg":"closed etcd server","name":"default","data-dir":"/tmp/etcd-017337431","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"]}
    --- FAIL: TestKVWithEmptyValue (1.08s)
        v3_kv_test.go:62: my-namespace/foobar = data
        v3_kv_test.go:62: my-namespace/foobar1 = data
        v3_kv_test.go:62: namespace/foobar1 = data
        v3_kv_test.go:72: foobar = data
        v3_kv_test.go:72: foobar1 = data
        v3_kv_test.go:87: delete keys:2
        testing.go:1038: race detected during execution of test
```
2021-01-11 10:06:31 +01:00
Dan Lorenc
5b90402082 Switch from dgrijalva/jwt-go to form3tech-oss/jwt-go.
dgrijalva/jwt-go has been abandoned and contains several serious
security issues. Most projects are now switching to the form3tech fork.

See https://snyk.io/vuln/SNYK-GOLANG-GITHUBCOMDGRIJALVAJWTGO-596515 for
info on the issues.

Signed-off-by: Dan Lorenc <dlorenc@google.com>
2021-01-10 08:04:20 -06:00
Piotr Tabor
23340bb62a Refresh proto generation script after moving modules files.
With modulatiozation server protos get moved into ./server directory,
but it was not reflected in scripts/genproto.sh.
2021-01-08 16:33:12 +01:00
lzhfromustc
f2a912a4e6 test: change channel operations to avoid potential goroutine leaks
In these unit tests, goroutines may leak if certain branches are chosen. This commit edits channel operations and buffer sizes, so no matter what branch is chosen, the test will end correctly. This commit doesn't change the semantics of unit tests.
2020-12-09 22:23:21 -05:00
K. Alex Mills
3f6e0ec94b fix: pass argument url in defer to avoid loopclosure
Because of the well-known range loop closure issue, the value of u may
have changed by the time the anonymous function mentioned in the defer
is run. To address this, the simplest fix is to pass the url used in the
loop as an argument to the function run in defer.
2020-11-19 15:29:26 -06:00
Dan Mace
9571325fe8 etcdserver: fix incorrect metrics generated when clients cancel watches
Before this patch, a client which cancels the context for a watch results in the
server generating a `rpctypes.ErrGRPCNoLeader` error that leads the recording of
a gRPC `Unavailable` metric in association with the client watch cancellation.
The metric looks like this:

    grpc_server_handled_total{grpc_code="Unavailable",grpc_method="Watch",grpc_service="etcdserverpb.Watch",grpc_type="bidi_stream"}

So, the watch server has misidentified the error as a server error and then
propagates the mistake to metrics, leading to a false indicator that the leader
has been lost. This false signal then leads to false alerting.

The commit 9c103dd0dedfc723cd4f33b6a5e81343d8a6bae7 introduced an interceptor which wraps
watch streams requiring a leader, causing those streams to be actively canceled
when leader loss is detected.

However, the error handling code assumes all stream context cancellations are
from the interceptor. This assumption is broken when the context was canceled
because of a client stream cancelation.

The core challenge is lack of information conveyed via `context.Context` which
is shared by both the send and receive sides of the stream handling and is
subject to cancellation by all paths (including the gRPC library itself). If any
piece of the system cancels the shared context, there's no way for a context
consumer to understand who cancelled the context or why.

To solve the ambiguity of the stream interceptor code specifically, this patch
introduces a custom context struct which the interceptor uses to expose a custom
error through the context when the interceptor decides to actively cancel a
stream. Now the consuming side can more safely assume a generic context
cancellation can be propagated as a cancellation, and the server generated
leader error is preserved and propagated normally without any special inference.

When a client cancels the stream, there remains a race in the error handling
code between the send and receive goroutines whereby the underlying gRPC error
is lost in the case where the send path returns and is handled first, but this
issue can be taken separately as no matter which paths wins, we can detect a
generic cancellation.

This is a replacement of https://github.com/etcd-io/etcd/pull/11375.

Fixes #10289, #9725, #9576, #9166
2020-11-18 17:02:09 -05:00
Ankur Gargi
c1c681adc3 server: Added config parameter experimental-warning-apply-duration 2020-11-17 17:33:19 -05:00
Gyuho Lee
1b8d2b1a47
Merge pull request #12452 from ptabor/20201104-release-mod-scripts
Release scripts for modules
2020-11-14 03:42:42 -08:00
Gyuho Lee
dc586a5ad2
Merge pull request #12459 from jingyih/proper_request_cancellation
server: proper cancellation for range request
2020-11-13 12:20:41 -08:00
spacewander
67f040f921 Update other Documentation/v2 links 2020-11-11 09:57:01 +08:00
spacewander
f2eb15a81b chore: update the documentation link in the comment
Close #12462.
2020-11-11 09:53:18 +08:00
Maciej Borsz
0bea7df7c1 Add metric tracking apply method duration:
* etcd_server_apply_duration_seconds

It can be used to understand which operations are slow,
in addition to the warning log message.
2020-11-06 11:11:16 +01:00
jingyih
0558e379c3 server: proper request cancellation for range 2020-11-05 21:30:02 -08:00
Piotr Tabor
eeafcef0d2 Use "v3.5.0-pre" to reference within-etcd modules
instead of v3.0.0-000101010000000-00000000000,
that might be misleading as we don't develop etcd v3.0.0 any longer.

This version is a virtual version and is not supposed to be tagged
within the repository. We should tag real versions like: 3.5.0-alpha.0.

Please notice that go.etcd.io/etcd/client/v2 will be versioned as `v2.305.0-pre`.
The reason is that client v2 must have v2 version. I propose a
convention to envode the major version as 100x in minor version to make
the association to the underlying repository clear, staying within v2
version family.

The change was generated using:
```
DRY_RUN=false TARGET_VERSION="v3.5.0-pre" ./scripts/release_mod.sh update_versions
```
2020-11-04 18:28:43 +01:00
Piotr Tabor
6e800b9b01
20201103 no commit title check (#12447)
* Turn off checking of format of commit message.

* scripts/fix.sh: Fix fixing whitespaces in *.sh scripts

Aparently there is a difference between:
  find ./ -print0 -name *.sh and
  find ./ -name *.sh -print0

* etcdserver unit tests: Do not call .Fatalf(...) from not test's goroutine.

Fixes following test failures:
https://travis-ci.com/github/etcd-io/etcd/jobs/425920416
```
% (cd server && go vet ./...)
stderr: # go.etcd.io/etcd/server/v3/etcdserver
stderr: etcdserver/server_test.go:1002:4: call to (*T).Fatalf from a non-test goroutine
stderr: etcdserver/server_test.go:1166:4: call to (*T).Fatalf from a non-test goroutine
FAIL: (code:2):
  % (cd server && go vet ./...)
FAIL: 'run go vet ./...' checking failed (!=0 return code)
FAIL: 'govet' failed at Tue Nov  3 04:07:47 UTC 2020
```
2020-11-03 07:59:42 -08:00
Jingyi Hu
f224fa4e42
Merge pull request #12425 from viviyww/cluster-set-version
etcdserver: updated cluster version
2020-11-03 22:41:24 +08:00
tangcong
a960d6b1c7 *: add self-signed-cert-validity flag 2020-10-30 10:10:26 +08:00
yangweiwei
aa1024a16e etcdserver: updated cluster version
during cluster version update in etcd cluster, the log should info from
XX to XX.
2020-10-27 16:32:40 +08:00
Piotr Tabor
aaf423e962 server: Update imports.
find -name '*.go' | xargs sed -i --follow-symlinks 's|etcd/v3/|etcd/server/v3/|g'
2020-10-26 13:02:32 +01:00
Piotr Tabor
6c1efd6ba5 server: Update go.mod 2020-10-26 13:02:32 +01:00
Piotr Tabor
4a5e9d1261 server: Move server files to 'server' directory.
26  git mv mvcc wal auth etcdserver etcdmain proxy embed/ lease/ server
   36  git mv go.mod go.sum server
2020-10-26 12:57:19 +01:00
Piotr Tabor
e62417297d *: Rename of imports of raft (as its now a module)
% find -name '*.go' -o -name '*.md' -o -name '*.sh' | xargs sed -i --follow-symlinks 's|etcd/v3/raft|etcd/raft/v3|g'
2020-10-16 13:58:18 +02:00
Piotr Tabor
de55bb6331 pkg: Rename imports after making 'pkg' a module
find -name '*.go' | xargs sed --follow-symlinks -i 's|go.etcd.io/etcd/v3/pkg/|go.etcd.io/etcd/pkg/v3/|g'
go fmt ./...
2020-10-13 00:09:27 +02:00
Piotr Tabor
28036db6f0 Mechanical: Move mock packages out of ./pkg
Mechanical execution of:

```
git mv ./pkg/mock/mockserver ./client/pkg/mock
git mv ./pkg/mock/{mockstorage,mockstore,mockwait} ./server/pkg/mock
```

The packages depend on etcd API / protos - so they are NOT etcd-dependencies.
In such situation thay should be placed in 'pkg' folder.
2020-10-12 23:58:09 +02:00
Xiang Li
4296cd3fa4 *: remove old server 2014-09-03 09:20:10 -07:00
Blake Mizerany
0881021e54 all config -> cfg 2014-09-03 09:20:07 -07:00
Blake Mizerany
d77773acb3 server: ignore server in build/tests 2014-09-03 09:20:06 -07:00
Xiang Li
44836d9099 etcd: move server/usage.go to etcd/v2_usage.go 2014-09-03 09:19:49 -07:00
Xiang Li
b8d71dfe70 v2: remove old tests 2014-09-03 09:19:49 -07:00
Yicheng Qin
02ced2c2d7 v1: deprecate v1 support
Etcd moves to 0.5 without the support of v1.
2014-09-03 09:19:49 -07:00
Xiang Li
8d758be3e4 server: remove unused file 2014-09-03 09:05:15 -07:00
Xiang Li
02c854717b config: make config a self-contained pkg 2014-09-03 09:05:13 -07:00
Yicheng Qin
c9db87a302 server: bump to 0.4.6+git 2014-07-29 10:33:44 -07:00
Yicheng Qin
49e0dff2b8 CHANGELOG: v0.4.6 2014-07-29 10:31:48 -07:00
Yicheng Qin
a884f2a18a Merge pull request #881 from unihorn/111
standby server: save Running info correctly
2014-07-24 17:29:25 -07:00
Yicheng Qin
c00594e680 server: fix timer leak 2014-07-21 14:23:52 -07:00
Brandon Philips
1cffdb3a48 Merge pull request #866 from coreos/qread
feat(get): get from quorum
2014-07-08 18:30:18 -07:00
Yicheng Qin
f7854c4ab9 standby server: save Running info correctly
Running should be true when Start, and set to false when switching to
the other mode.
2014-07-08 08:29:17 -07:00
Brandon Philips
13b0e72304 CHANGELOG: v0.4.5 2014-07-07 18:11:29 -07:00
Brandon Philips
43791a2f41 Merge pull request #877 from Rawn/flush-headers-on-wait-stream
server: Flush headers when using wait=true and stream=true
2014-07-07 17:04:02 -07:00
Brandon Philips
084dcb5596 etcd: add a read/write timeout to server
The default is for connections to last forever[1]. This leads to fds
leaking. I set the timeout so high by default so that watches don't have
to keep retrying but perhaps we should set it slower.

Tested on a cluster with lots of clients and it seems to have relieved
the problem.

[1] https://groups.google.com/forum/#!msg/golang-nuts/JFhGwh1q9xU/heh4J8pul3QJ
2014-07-07 11:42:56 -07:00
Yicheng Qin
8a0266a806 Merge pull request #867 from iand/json-headers
fix(server/server.go): /v2/stats endpoints emit application/json content type header
2014-07-07 10:33:27 -07:00
Christoffer Vikström
2338481bb1 server: Flush headers when using wait=true and stream=true
Many http clients will missbehave unless they get an initial http-
response, even when long-polling. It also saves the user/client from
having to handle headers on the first action of the watch, but rather
handle the response immediately.
2014-07-03 18:04:36 +02:00
Cole Gleason
ce1e19ae2f Merge pull request #849 from colegleason/max-cluster-size
docs(cluster-size): remove outdated references to flag max-cluster-size
2014-07-02 12:21:14 -07:00
Ian Davis
a288333e6f fix(server/server.go): /v2/stats endpoints emit application/json content type header 2014-06-30 10:50:43 +01:00
Brandon Philips
774cb03f83 server: bump to 0.4.4+git 2014-06-24 10:58:11 -07:00
Brandon Philips
4fb6087f4a CHANGELOG: release 0.4.4 2014-06-24 10:56:53 -07:00
Yicheng Qin
5524131a9e Merge pull request #865 from robstrong/hotfix/contentType
fix(peer_server) set content type to application/json in admin
2014-06-23 16:31:49 -07:00
Brandon Philips
3efb4d837b Merge pull request #844 from unihorn/102
chore(peer_server): improve log for auto removal
2014-06-23 14:19:48 -07:00
Xiang Li
973bde9a07 feat(get): get from quorum 2014-06-22 21:33:38 -07:00