416 Commits

Author SHA1 Message Date
Piotr Tabor
725a8c5e02 Enable configuring delegated zap-logging for embed server. 2021-03-17 08:17:36 +01:00
Piotr Tabor
6657d5907c Make sure all integration tests have BeforeTest.
The CL disallows to create NewCluster in tests without BeforeTest.
2021-03-16 23:50:01 +01:00
Piotr Tabor
809e7629ed Add integration.BeforeTest to all missing tests. 2021-03-16 23:50:01 +01:00
Piotr Tabor
a84bd093b0 Integration with grpc-settable logger. 2021-03-16 22:50:41 +01:00
Piotr Tabor
a57e967d84 Integration test flakes fixes. 2021-03-16 16:08:18 +01:00
Piotr Tabor
a47c18d30a Fix 2 remaining 'defer AfterTest' calls. 2021-03-13 23:41:29 +01:00
Piotr Tabor
b406647dd7 Fix/remove broken: TestMetricDbSizeDefragDebugging 2021-03-12 23:05:53 +01:00
Gyuho Lee
4eba403ccc
Merge pull request #12765 from ptabor/20210312-move-config
Move config (ServerConfig) out of etcdserver package.
2021-03-11 15:05:29 -08:00
Piotr Tabor
fd7fed1511 Move config (ServerConfig) out of etcdserver package.
Motivation:
  - ServerConfig is part of 'embed' public API, while etcdserver is more 'internal'
  - EtcdServer is already too big and config is pretty wide-spread leaf
if we were to split etcdserver (e.g. into pre & post-apply part).
2021-03-11 20:56:22 +01:00
Piotr Tabor
b9226d03f4
Merge pull request #12763 from hexfusion/bump-proto
vendor: bump gogo/proto to v1.3.2
2021-03-11 17:59:07 +01:00
Sam Batschelet
d3aa3fb486 vendor: bump gogo/proto to v1.3.2
Signed-off-by: Sam Batschelet <sbatsche@redhat.com>
2021-03-11 11:27:25 -05:00
Piotr Tabor
c8243a9927 Tests: Functional - in case of failure, log the exception. 2021-03-09 18:19:52 +01:00
Piotr Tabor
b6c2e87a74 Testing: Integration tests does not check whether t==nil 2021-03-09 18:19:52 +01:00
Piotr Tabor
5ddabfdb24 tests: Make tests operate in /tmp director instead of src.
Thanks to this, unix sockets should be not longer
created by integration tests in the the source code directory,
so potentially trigger IDE reloads and unnecessery load (and mess).
2021-03-09 18:19:52 +01:00
Piotr Tabor
bfe02c0526 tests: Cluster creation that failed shouldn't leak goroutines. 2021-03-09 18:19:52 +01:00
Piotr Tabor
41f6cc7234 Tests: Better isolation between store_v2v3 integration tests. 2021-03-09 18:19:51 +01:00
Piotr Tabor
fb1d48e98e Integration tests: Use BeforeTest(t) instead of defer AfterTest().
Thanks to this change, a single method BeforeTest(t) can handle
before-test logic as well as registration of cleanup code
(t.Cleanup(func)).
2021-03-09 18:19:51 +01:00
Piotr Tabor
87258efd90 Integration tests: Use zaptest.Logger based testing.TB
Thanks to this the logs:
  - are automatically printed if the test fails.
  - are in pretty consistent format.
  - are annotated by 'member' information of the cluster emitting them.

Side changes:
  - Set propert default got DefaultWarningApplyDuration (used to be '0')
  - Name the members based on their 'place' on the list (as opposed to
'random')
2021-03-09 18:19:51 +01:00
Piotr Tabor
a46a358577 --experimental-memory-mlock support
The flag protects etcd memory from being swapped out to disk.
This can happen in memory constrained systems where mmaped bbolt
area is natural condidate for swapping out.

This flag should provide better tail latency on the cost of higher RSS
ram usage. If the experiment is successful, the logic should get moved
into bbolt layer, where we can protect specific bbolt instances
(e.g. avoid protecting both during defragmentation).
2021-03-07 12:32:57 +01:00
Piotr Tabor
f7a2389992 Update version of certifi/gocertifi to get rid of WTF Public license
Seems old versions of https://github.com/certifi/gocertifi where
categorized as "Do What The F*ck You Want To Public License".

Update to newer version that is explicit `Mozilla Public License` 2.0 (MPL 2.0).
2021-03-04 09:48:34 +01:00
Piotr Tabor
102c198444
Merge pull request #12705 from astromechza/bm_etcd_peer_server_cert
etcdmain: added peer-client-{client,key}-file parameters for supporting separate client and server certs when communicating between peers
2021-03-02 09:03:35 +01:00
Piotr Tabor
5e10d12996 tests: Fixes a few recently spotted test-flakes
```
Unexpected goroutines running after all test(s).
1 instances of:
syscall.Syscall(...)
	/usr/local/go/src/syscall/asm_linux_386.s:19 +0x5
syscall.Close(...)
	/usr/local/go/src/syscall/zsyscall_linux_386.go:285 +0x3d
internal/poll.(*FD).destroy(...)
	/usr/local/go/src/internal/poll/fd_unix.go:77 +0x30
internal/poll.(*FD).decref(...)
	/usr/local/go/src/internal/poll/fd_mutex.go:213 +0x38
internal/poll.(*FD).Close(...)
	/usr/local/go/src/internal/poll/fd_unix.go:99 +0x43
net.(*netFD).Close(...)
	/usr/local/go/src/net/fd_posix.go:37 +0x49
FAIL	go.etcd.io/etcd/tests/v3/integration/client	0.039s
```

```
--- FAIL: TestServer_TCP_Secure_DelayTx (0.20s)
    server_test.go:110: took 128.026085ms with no latency
    server_test.go:125: took 62.980988ms with latency 50ms��5ms
    server_test.go:133: expected took1 128.026085ms < took2 62.980988ms (with latency)
```

https://github.com/etcd-io/etcd/issues/12372
2021-03-01 18:07:38 +01:00
Ben Meier
3d44f5bf80
*: added client-{client,key}-file parameters for supporting separate client and server certs when communicating between peers
In some environments, the CA is not able to sign certificates with both
'client auth' and 'server auth' extended usage parameters and so an operator
needs to be able to set a seperate client certificate to use when making
requests which is different to the certificate used for accepting requests.
This applies to both proxy and etcd member mode and is available as both a CLI
 flag and config file field for peer TLS.

Signed-off-by: Ben Meier <ben.meier@oracle.com>
2021-02-28 14:37:56 +00:00
Piotr Tabor
a7f340216d Reformat code according to 'gotip' rules.
In practices adds annotations in the new syntax:
```
+//go:build !linux
 // +build !linux
```

Fixes failing gotip PASSES='fmt' check:
https://travis-ci.com/github/etcd-io/etcd/jobs/486453806
2021-02-26 10:14:46 +01:00
Piotr Tabor
45b1e6b470 ClientV3: Ordering: Fix the ordering test such it does not fail.
The test depended on very subtle timing semantic and on properties of
'copied' clients.

https://travis-ci.com/github/etcd-io/etcd/jobs/486191449

Examplar failure:
```
{"level":"warn","ts":"2021-02-25T12:34:47.894Z","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc0000d6fc0/#initially=[unix://localhost:86269902489114839060]","attempt":1,"error":"rpc error: code = Unavailable desc = etcdserver: rpc not supported for learner"}
{"level":"warn","ts":"2021-02-25T12:34:48.163Z","caller":"v3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00035a000/#initially=[unix://localhost:78285857058450835940]","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: not leader"}
{"level":"info","ts":"2021-02-25T12:34:48.255Z","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"warn","ts":"2021-02-25T12:34:48.255Z","caller":"v3/maintenance.go:221","msg":"failed to receive from snapshot stream; closing","error":"rpc error: code = Canceled desc = context canceled"}
{"level":"info","ts":"2021-02-25T12:34:48.255Z","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2021-02-25T12:34:50.255Z","caller":"v3/maintenance.go:219","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2021-02-25T12:34:51.717Z","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"warn","ts":"2021-02-25T12:34:52.017Z","caller":"v3/maintenance.go:221","msg":"failed to receive from snapshot stream; closing","error":"rpc error: code = Canceled desc = context canceled"}
{"level":"info","ts":"2021-02-25T12:34:52.018Z","caller":"v3/maintenance.go:211","msg":"opened snapshot stream; downloading"}
{"level":"warn","ts":"2021-02-25T12:34:53.018Z","caller":"v3/maintenance.go:221","msg":"failed to receive from snapshot stream; closing","error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
--- FAIL: TestEndpointSwitchResolvesViolation (10.12s)
    ordering_util_test.go:81: failed to resolve order violation etcdclient: no cluster members have a revision higher than the previously received revision
```
2021-02-25 22:15:13 +01:00
Piotr Tabor
60d5159091 version: bump up to 3.5.0-alpha.0 2021-02-24 19:55:45 +00:00
Piotr Tabor
1a9c81abda Update grpc dependency to 1.32.
Simplify grpc testing infrastructure to align with upstream changes.
2021-02-23 11:31:50 +01:00
Piotr Tabor
4a1c24556c clientv3: PS: Replace balancer with upstream grpc solution
Addresses comments from: https://github.com/etcd-io/etcd/pull/12671#pullrequestreview-593942302
2021-02-23 10:03:15 +01:00
Gyuho Lee
3d7aac948b
Merge pull request #12196 from ironcladlou/metrics-watch-error-fix
etcdserver: fix incorrect metrics generated when clients cancel watches
2021-02-19 12:46:49 -08:00
Piotr Tabor
b67ed4e4aa
Merge pull request #12671 from ptabor/20210207-grpc-del-balancer
clientv3: Replace balancer with upstream grpc solution
2021-02-17 22:45:13 +01:00
Piotr Tabor
a836a8045b Get rid of legacy client/v3/naming API.
Update grpcproxy to use the new abstractions.
2021-02-09 11:56:28 +01:00
Piotr Tabor
0b75fede64 Replace client/v3/balancer with standard components: resolver + round_robin LB
This commit significantly reduces volume of custom code
in etcd client v3, while preserving full existing functionality.
2021-02-08 18:50:31 +01:00
limeng01
8feb55f65c *: implement Endpoint Watch and new Resolver 2021-02-08 20:05:45 +08:00
limeng01
571ed502d4 endpoints: implement Update method for EndpointManager.
- Add integration test for endpoints and resolver.
2021-02-04 23:30:22 +08:00
Piotr Tabor
d6d03beaea
Merge pull request #12538 from lzhfromustc/12_9_GoroutineLeak
test: change channel operations to avoid potential goroutine leaks
2021-02-01 21:16:43 +01:00
Piotr Tabor
0d7a671c75 Tests: Fix vet warnings about Fatal in sub-goroutines.
% (cd tests && go vet ./...)
stderr: # go.etcd.io/etcd/tests/v3/integration/clientv3/concurrency_test
stderr: integration/clientv3/concurrency/election_test.go:74:6: call to (*T).Fatal from a non-test goroutine
stderr: integration/clientv3/concurrency/mutex_test.go:57:4: call to (*T).Fatal from a non-test goroutine
2021-01-31 00:00:26 +01:00
Piotr Tabor
f0ecad00e3 Use temp-directory that is covered by framework level cleanup
Prior to this PR, the e2e tests where creating dirs like:
```
/tmp/testname1.etcd030299846
/tmp/testname0.etcd039445123
/tmp/testname0.etcd206372065
```
and not cleaning them, that led to disk-space-exceeded flakes.

After the PR, the testing.TB tempdir mechanism is used and the names are
being cleaned and are more miningful:

```
../../bin/etcd --name test-TestCtlV3EndpointHashKV-2 --listen-client-urls http://localhost:20010 --advertise-client-urls http://localhost:20010 --listen-peer-urls https://localhost:20011 --initial-advertise-peer-urls https://localhost:20011 --initial-cluster-token new --data-dir /tmp/TestCtlV3EndpointHashKV429176179/003 --snapshot-count 100000 --experimental-initial-corrupt-check --peer-auto-tls --initial-cluster test-TestCtlV3EndpointHashKV-0=https://localhost:20001,test-TestCtlV3EndpointHashKV-1=https://localhost:20006,test-TestCtlV3EndpointHashKV-2=https://localhost:20011
```
2021-01-30 13:25:55 +01:00
Piotr Tabor
5d7c1db3a9 Introduce grpc-1.30+ compatible client/v3/naming API.
This is not yet implementation, just API and tests to be filled
with implementation in next CLs,
tracked by: https://github.com/etcd-io/etcd/issues/12652

We propose here 3 packages:
 - clientv3/naming/endpoints ->
    That is abstraction layer over etcd that allows to write, read &
    watch Endpoints information. It's independent from GRPC API. It hides
    the storage details.

 - clientv3/naming/endpoints/internal ->
    That contains the grpc's compatible Update class to preserve the
    internal JSON mashalling format.

 - clientv3/naming/resolver ->
   That implements the GRPC resolver API, such that etcd can be
   used for connection.Dial in grpc.

Please see the grpc_naming.md document changes & grpcproxy/cluster.go
new integration, to see how the new abstractions work.
2021-01-30 12:32:19 +01:00
Piotr Tabor
9600bbd5fe Update root image for docker-test-image to ubuntu:20.10 2021-01-29 22:16:51 +00:00
Piotr Tabor
351bdb33c5 Split intengration/clientv3 tests into multiple packages
They used to take >10min with coverage, so were causing interrupted
Travis runs.

Know thay fit in 100-150s (together), thanks also to parallel
execution.
2021-01-23 11:12:44 +01:00
Piotr Tabor
598ca6caab Modernize release script:
- making sure the DRY_RUN mode can finish e2e, so e.g. commits to
local copy of repository are OK in dry-run (while git pushes are NOT).
  - better interaction with ./test_lib.sh script.
  - more consistent logging
  - bringing back s390x architecture that on go 1.14.3 seems to work as
expected.
2021-01-21 17:47:36 +01:00
Piotr Tabor
5dcd459ae9
Merge pull request #12564 from dhermes/patch-1
Adding `clientv3` import alias to match usage in `register_test.go`.
2021-01-19 23:00:19 +01:00
Piotr Tabor
1a890a4659
Merge branch 'master' into update 2021-01-16 09:35:05 +01:00
Piotr Tabor
577c898fee scripts: Integrate ./scripts/release with new code for tagging modules.
Changes:
  - signing tags.
  - allows to override BRANCH and REPOSITORY using env variables.

Tested by a release in my private fork:
  BRANCH="20201126-ptabor-release" REPOSITORY="git@github.com:ptabor/etcd.git" ./scripts/release 3.5.0-alpha.20
2021-01-15 12:31:44 +01:00
Piotr Tabor
70b5ef1d3a Fix tests flakiness: in particular TestIssue6361.
The root reason of flakes, was that server was considered as ready to
early.
In particular:
```
../../bin/etcd-2456648: {"level":"info","ts":"2021-01-11T09:56:44.474+0100","caller":"rafthttp/stream.go:274","msg":"established TCP streaming connection with remote peer","stream-writer-type":"stream Message","local-member-id":"ed5f620d34a8e61b","remote-peer-id":"ca50e9357181d758"}
../../bin/etcd-2456648: {"level":"warn","ts":"2021-01-11T09:56:49.040+0100","caller":"etcdserver/server.go:1942","msg":"failed to publish local member to cluster through raft","local-member-id":"ed5f620d34a8e61b","local-member-attributes":"{Name:infra2 ClientURLs:[http://localhost:20030]}","request-path":"/0/members/ed5f620d34a8e61b/attributes","publish-timeout":"7s","error":"etcdserver: request timed out, possibly due to connection lost"}
../../bin/etcd-2456648: {"level":"info","ts":"2021-01-11T09:56:49.049+0100","caller":"etcdserver/server.go:1921","msg":"published local member to cluster through raft","local-member-id":"ed5f620d34a8e61b","local-member-attributes":"{Name:infra2 ClientURLs:[http://localhost:20030]}","request-path":"/0/members/ed5f620d34a8e61b/attributes","cluster-id":"34f27e83b3bc2ff","publish-timeout":"7s"}
```
was taking 5s.   If this was happening concurrently with etcdctl, the
etcdctl could timeout.

The fix, requires servers to report 'ready to serve client requests' to consider them up.

Fixed also some whitelisted 'goroutines'.
2021-01-12 00:14:51 +01:00
Piotr Tabor
74274f4417 e2e: Adding better diagnostic and location for temporary files to Snapshot tests. 2021-01-12 00:14:51 +01:00
Piotr Tabor
26f9b4be8f e2e tests were leaking 'defunc' etcdctl processes.
The commit ensures that spawned etcdctl processes are "closed",
so they perform proper os wait processing.
This might have contributed to file-descriptor/open-files limit being
exceeded.
2021-01-11 11:55:30 +01:00
Piotr Tabor
e2a65bee6e Avoid 'interactive prompt' for root password in etcd tests.
Before:
```
{"level":"info","ts":1610273495.3791487,"caller":"agent/handler.go:668","msg":"cleaning up page cache"}
{"level":"info","ts":1610273495.3793094,"caller":"agent/handler.go:94","msg":"created etcd log file","path":"/tmp/etcd-functional-2/etcd.log"}
{"level":"info","ts":1610273495.379328,"caller":"agent/handler.go:668","msg":"cleaning up page cache"}
[sudo] password for ptab:
pam_glogin: invalid password
Sorry, try again.
[sudo] password for ptab:
```

Now the caches are dropped if the current users is in sudoers, bot not
in the other cases.
To be honest I don't see the purpose for dropping the caches at all in
the test.
2021-01-11 10:06:31 +01:00
Piotr Tabor
8ccd4e1146 Fix flaky tests reported due to data race on grpc logging registration.
Example:
```
    ==================
    WARNING: DATA RACE
    Write at 0x000002178320 by goroutine 575:
      google.golang.org/grpc/grpclog.SetLoggerV2()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/grpclog/loggerv2.go:70 +0x444
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging.func1.1()
          /home/ptab/corp/etcd/server/embed/config_logging.go:119 +0x345
      sync.(*Once).doSlow()
          /usr/lib/google-golang/src/sync/once.go:66 +0x109
      sync.(*Once).Do()
          /usr/lib/google-golang/src/sync/once.go:57 +0x68
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging.func1()
          /home/ptab/corp/etcd/server/embed/config_logging.go:109 +0x3b1
      go.etcd.io/etcd/server/v3/embed.(*Config).setupLogging()
          /home/ptab/corp/etcd/server/embed/config_logging.go:174 +0x6af
      go.etcd.io/etcd/server/v3/embed.(*Config).Validate()
          /home/ptab/corp/etcd/server/embed/config.go:553 +0x55
      go.etcd.io/etcd/server/v3/embed.StartEtcd()
          /home/ptab/corp/etcd/server/embed/etcd.go:93 +0x84
      go.etcd.io/etcd/tests/v3/integration.TestKVWithEmptyValue()
          /home/ptab/corp/etcd/tests/integration/v3_kv_test.go:33 +0x18c
      testing.tRunner()
          /usr/lib/google-golang/src/testing/testing.go:1123 +0x202

    Previous read at 0x000002178320 by goroutine 956:
      [failed to restore the stack]

    Goroutine 575 (running) created at:
      testing.(*T).Run()
          /usr/lib/google-golang/src/testing/testing.go:1168 +0x5bb
      testing.runTests.func1()
          /usr/lib/google-golang/src/testing/testing.go:1441 +0xa6
      testing.tRunner()
          /usr/lib/google-golang/src/testing/testing.go:1123 +0x202
      testing.runTests()
          /usr/lib/google-golang/src/testing/testing.go:1439 +0x612
      testing.(*M).Run()
          /usr/lib/google-golang/src/testing/testing.go:1347 +0x3c4
      go.etcd.io/etcd/pkg/v3/testutil.MustTestMainWithLeakDetection()
          /home/ptab/corp/etcd/pkg/testutil/leak.go:150 +0x38
      go.etcd.io/etcd/tests/v3/integration.TestMain()
          /home/ptab/corp/etcd/tests/integration/main_test.go:14 +0x272
      main.main()
          _testmain.go:349 +0x269

    Goroutine 956 (finished) created at:
      google.golang.org/grpc/internal/transport.newHTTP2Server()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/internal/transport/http2_server.go:288 +0x18a4
      google.golang.org/grpc/internal/transport.NewServerTransport()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/internal/transport/transport.go:534 +0x2f5
      google.golang.org/grpc.(*Server).newHTTP2Transport()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:726 +0x2ca
      google.golang.org/grpc.(*Server).handleRawConn()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:693 +0x60f
      google.golang.org/grpc.(*Server).Serve.func3()
          /home/ptab/private/golang/pkg/mod/google.golang.org/grpc@v1.29.1/server.go:663 +0x4c
    ==================
    ...
    {"level":"info","ts":"2021-01-09T22:21:04.550+0100","caller":"embed/etcd.go:330","msg":"closed etcd server","name":"default","data-dir":"/tmp/etcd-017337431","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"]}
    --- FAIL: TestKVWithEmptyValue (1.08s)
        v3_kv_test.go:62: my-namespace/foobar = data
        v3_kv_test.go:62: my-namespace/foobar1 = data
        v3_kv_test.go:62: namespace/foobar1 = data
        v3_kv_test.go:72: foobar = data
        v3_kv_test.go:72: foobar1 = data
        v3_kv_test.go:87: delete keys:2
        testing.go:1038: race detected during execution of test
```
2021-01-11 10:06:31 +01:00
Dan Lorenc
5b90402082 Switch from dgrijalva/jwt-go to form3tech-oss/jwt-go.
dgrijalva/jwt-go has been abandoned and contains several serious
security issues. Most projects are now switching to the form3tech fork.

See https://snyk.io/vuln/SNYK-GOLANG-GITHUBCOMDGRIJALVAJWTGO-596515 for
info on the issues.

Signed-off-by: Dan Lorenc <dlorenc@google.com>
2021-01-10 08:04:20 -06:00