Commit inspired by this failure:
https://travis-ci.com/github/etcd-io/etcd/jobs/391164537
This is not happanning locally - but can be forced by removal of go.sum
file. Let's watch how frequently we will need to refresh go.sum.
This refactoring offers following benefits:
- Unified way how go test commands are being called (in terms of flags intepretation)
- Uses standard go mechanisms (like go lists) to find files/packages that are subject for test. The mechanism are module aware.
- Added instruction how to install tools needed for the tests/checkers.
- Added colors to the output to make it easier to spot any failure.
Confirmed to work using:
- COVERDIR="./coverage" CPU="4" RACE=false COVER=false PASSES="build build_cov cov" ./test
- CPU="4" RACE=false COVER=false PASSES="e2e functional integration" ./test
- COVERDIR="./coverage" COVER="false" CPU="4" RACE="false" PASSES="fmt build unit build_cov integration e2e integration_e2e grpcproxy cov" ./test
- PASSES=unit PKG=./wal TIMEOUT=1m ./test
- PASSES=integration PKG=./clientv3 TIMEOUT=1m ./test
- PASSES=unit PKG=./wal TESTCASE=TestNew TIMEOUT=1m ./test
- PASSES=unit PKG=./wal TESTCASE="\bTestNew\b" TIMEOUT=1m ./test
- PASSES=integration PKG=./client/integration TESTCASE="\bTestV2NoRetryEOF\b" TIMEOUT=1m ./test
- COVERDIR=coverage PASSES="build_cov cov" ./test
To improve debuggability of `agreement among raft nodes before
linearized reading`, we added some tracing inside
`linearizableReadLoop`.
This will allow us to know the timing of `s.r.ReadIndex` vs
`s.applyWait.Wait(rs.Index)`.
Examplar flake: https://travis-ci.com/github/etcd-io/etcd/jobs/388806782
```
go test -timeout=5m -cpu=1 --run=Example ./client/...
ok go.etcd.io/etcd/v3/client 0.085s
testing: warning: no tests to run
PASS
Unexpected goroutines running after all test(s).
1 instances of:
text/template/parse.(*lexer).emit(...)
/usr/local/go/src/text/template/parse/lex.go:157
text/template/parse.lexText(...)
/usr/local/go/src/text/template/parse/lex.go:269 +0x4f0
text/template/parse.(*lexer).run(...)
/usr/local/go/src/text/template/parse/lex.go:230 +0x37
created by text/template/parse.lex
/usr/local/go/src/text/template/parse/lex.go:223 +0x190
FAIL go.etcd.io/etcd/v3/client/integration 0.013s
```
The grpc-proxy test logic was assuming that the context associated to client is closed,
while in practice all tests called client.Close() without explicit context close.
The current testing strategy is complicated 2 fold:
- grpc proxy works like man-in-the middle of each Connection issues
from integration tests and its lifetime is bound to the connection.
- both connections (client -> proxy, and proxy -> etcd-server) are
represented by the same ClientV3 object instance (with substituted
implementations of KV or watcher).
The fix splits context representing proxy from context representing proxy -> etcd-server connection,
thus allowing cancelation of the proxy context.
SubTraceStart and SubTraceEnd steps are only placeholders, not really
steps, we should skip them when logging the long duration steps,
otherwise these steps will lead to incorrect start time and duration
of subsequent steps.
This commit:
- Fires a critical alert when the etcd database quota is 95% full
at any given point of time to alert the user to defrag or increase
the quota in order to avoid the alarm getting triggered which blocks
all the writes to etcd meaning there can't be any new objects created.
This is needed to make sure the cluster supports running large number
of nodes and objects.
- Fires a warning when there is a sudden surge in etcd writes leading to
increase in the etcd database quota size at an alarming rate as it
is disruptive. It might be because of a rougue process and it's
important to alert the admin.
* etcdserver: check authinfo if it is not InternalAuthenticateRequest.
* credentials: let GetRequestMetadata() return nil when authToken isn't initialized.
* clientv3: get AuthToken gracefully without extra connection.
Direct syscalls using syscall.Syscall(SYS_*, ...) should no longer be
used on darwin, see [1]. Instead, use the fcntl libSystem wrappers
provided by the golang.org/x/sys/unix package which implement the same
functionality.
[1] https://golang.org/doc/go1.12#darwin
Fix test case errors in etcd-dump-logs and the error is the time zone.
When the GOST time zone is CST or PST and the test case will be failed.
So we should set UTC as the standard time zone.
Fixes:
go test -tags cluster_proxy ./clientv3/integration -v -run TestWatchRequestProgress
Does not fail the grpc-server (completely) by a not implemented RPC.
Failing whole server by remote request is anti-pattern and security
risk.
Prior to the fix, the command line above was failing with:
```
=== RUN TestWatchRequestProgress/0-watcher
panic: not implemented
goroutine 602 [running]:
go.etcd.io/etcd/v3/proxy/grpcproxy.(*watchProxyStream).recvLoop(0xc0004779d0, 0x0, 0x0)
/home/ptab/corp/etcd/proxy/grpcproxy/watch.go:275 +0xac5
go.etcd.io/etcd/v3/proxy/grpcproxy.(*watchProxy).Watch.func1(0xc0034f94a0, 0xc0004779d0)
/home/ptab/corp/etcd/proxy/grpcproxy/watch.go:129 +0x53
created by go.etcd.io/etcd/v3/proxy/grpcproxy.(*watchProxy).Watch
/home/ptab/corp/etcd/proxy/grpcproxy/watch.go:127 +0x3c8
FAIL go.etcd.io/etcd/v3/clientv3/integration 0.215s
FAIL
```
The flake happened e.g. in:
https://travis-ci.com/github/etcd-io/etcd/jobs/386607570
```
--- PASS: TestWatchClose (0.37s)
PASS
Unexpected goroutines running after all test(s).
1 instances of:
testing.runTests.func1.1(...)
/usr/local/go/src/testing/testing.go:1289 +0x60
created by testing.runTests.func1
/usr/local/go/src/testing/testing.go:1289 +0xdb
FAIL go.etcd.io/etcd/v3/clientv3/integration 344.389s
FAIL
```
This is implementation detail of Go testing.lib and we should not worry.
Marked all 'integrational, e2e' as skipped in the --short mode.
Thanks to this we will be able to significantly simplify ./test script.
The run currently takes ~23s.
With (follow up) move of ~clientv3/snapshot to integration tests (as
part of modularization), we can expect this to fall to 5-10s.
```
% time go test --short ./... --count=1
ok go.etcd.io/etcd/v3 0.098s
? go.etcd.io/etcd/v3/Documentation/learning/lock/client [no test files]
? go.etcd.io/etcd/v3/Documentation/learning/lock/storage [no test files]
ok go.etcd.io/etcd/v3/auth 0.724s
? go.etcd.io/etcd/v3/auth/authpb [no test files]
ok go.etcd.io/etcd/v3/client 0.166s
ok go.etcd.io/etcd/v3/client/integration 0.166s
ok go.etcd.io/etcd/v3/clientv3 3.219s
ok go.etcd.io/etcd/v3/clientv3/balancer 1.102s
? go.etcd.io/etcd/v3/clientv3/balancer/connectivity [no test files]
? go.etcd.io/etcd/v3/clientv3/balancer/picker [no test files]
? go.etcd.io/etcd/v3/clientv3/balancer/resolver/endpoint [no test files]
ok go.etcd.io/etcd/v3/clientv3/clientv3util 0.096s [no tests to run]
ok go.etcd.io/etcd/v3/clientv3/concurrency 3.323s
? go.etcd.io/etcd/v3/clientv3/credentials [no test files]
ok go.etcd.io/etcd/v3/clientv3/integration 0.131s
? go.etcd.io/etcd/v3/clientv3/leasing [no test files]
? go.etcd.io/etcd/v3/clientv3/mirror [no test files]
ok go.etcd.io/etcd/v3/clientv3/namespace 0.041s
ok go.etcd.io/etcd/v3/clientv3/naming 0.115s
ok go.etcd.io/etcd/v3/clientv3/ordering 0.121s
ok go.etcd.io/etcd/v3/clientv3/snapshot 19.325s
ok go.etcd.io/etcd/v3/clientv3/yaml 0.090s
ok go.etcd.io/etcd/v3/contrib/raftexample 7.572s
? go.etcd.io/etcd/v3/contrib/recipes [no test files]
ok go.etcd.io/etcd/v3/embed 0.282s
ok go.etcd.io/etcd/v3/etcdctl 0.054s
? go.etcd.io/etcd/v3/etcdctl/ctlv2 [no test files]
ok go.etcd.io/etcd/v3/etcdctl/ctlv2/command 0.117s
? go.etcd.io/etcd/v3/etcdctl/ctlv3 [no test files]
ok go.etcd.io/etcd/v3/etcdctl/ctlv3/command 0.070s
ok go.etcd.io/etcd/v3/etcdmain 0.172s
ok go.etcd.io/etcd/v3/etcdserver 1.698s
? go.etcd.io/etcd/v3/etcdserver/api [no test files]
ok go.etcd.io/etcd/v3/etcdserver/api/etcdhttp 0.075s
ok go.etcd.io/etcd/v3/etcdserver/api/membership 0.104s
? go.etcd.io/etcd/v3/etcdserver/api/membership/membershippb [no test files]
ok go.etcd.io/etcd/v3/etcdserver/api/rafthttp 0.181s
ok go.etcd.io/etcd/v3/etcdserver/api/snap 0.078s
? go.etcd.io/etcd/v3/etcdserver/api/snap/snappb [no test files]
ok go.etcd.io/etcd/v3/etcdserver/api/v2auth 0.142s
ok go.etcd.io/etcd/v3/etcdserver/api/v2discovery 0.035s
ok go.etcd.io/etcd/v3/etcdserver/api/v2error 0.043s
ok go.etcd.io/etcd/v3/etcdserver/api/v2http 0.070s
ok go.etcd.io/etcd/v3/etcdserver/api/v2http/httptypes 0.031s
? go.etcd.io/etcd/v3/etcdserver/api/v2stats [no test files]
ok go.etcd.io/etcd/v3/etcdserver/api/v2store 0.645s
ok go.etcd.io/etcd/v3/etcdserver/api/v2v3 0.218s
? go.etcd.io/etcd/v3/etcdserver/api/v3alarm [no test files]
? go.etcd.io/etcd/v3/etcdserver/api/v3client [no test files]
ok go.etcd.io/etcd/v3/etcdserver/api/v3compactor 1.765s
? go.etcd.io/etcd/v3/etcdserver/api/v3election [no test files]
? go.etcd.io/etcd/v3/etcdserver/api/v3election/v3electionpb [no test files]
? go.etcd.io/etcd/v3/etcdserver/api/v3election/v3electionpb/gw [no test files]
? go.etcd.io/etcd/v3/etcdserver/api/v3lock [no test files]
? go.etcd.io/etcd/v3/etcdserver/api/v3lock/v3lockpb [no test files]
? go.etcd.io/etcd/v3/etcdserver/api/v3lock/v3lockpb/gw [no test files]
ok go.etcd.io/etcd/v3/etcdserver/api/v3rpc 0.091s
ok go.etcd.io/etcd/v3/etcdserver/api/v3rpc/rpctypes 0.012s
ok go.etcd.io/etcd/v3/etcdserver/cindex 0.054s
ok go.etcd.io/etcd/v3/etcdserver/etcdserverpb 0.039s
? go.etcd.io/etcd/v3/etcdserver/etcdserverpb/gw [no test files]
ok go.etcd.io/etcd/v3/functional/agent 0.094s
? go.etcd.io/etcd/v3/functional/cmd/etcd-agent [no test files]
? go.etcd.io/etcd/v3/functional/cmd/etcd-proxy [no test files]
? go.etcd.io/etcd/v3/functional/cmd/etcd-runner [no test files]
? go.etcd.io/etcd/v3/functional/cmd/etcd-tester [no test files]
ok go.etcd.io/etcd/v3/functional/rpcpb 0.060s
? go.etcd.io/etcd/v3/functional/runner [no test files]
ok go.etcd.io/etcd/v3/functional/tester 0.079s
ok go.etcd.io/etcd/v3/integration 0.684s
ok go.etcd.io/etcd/v3/integration/embed 0.101s
ok go.etcd.io/etcd/v3/lease 3.455s
ok go.etcd.io/etcd/v3/lease/leasehttp 2.185s
? go.etcd.io/etcd/v3/lease/leasepb [no test files]
ok go.etcd.io/etcd/v3/mvcc 7.246s
ok go.etcd.io/etcd/v3/mvcc/backend 0.354s
? go.etcd.io/etcd/v3/mvcc/mvccpb [no test files]
ok go.etcd.io/etcd/v3/pkg/adt 0.025s
? go.etcd.io/etcd/v3/pkg/contention [no test files]
? go.etcd.io/etcd/v3/pkg/cpuutil [no test files]
ok go.etcd.io/etcd/v3/pkg/crc 0.008s
? go.etcd.io/etcd/v3/pkg/debugutil [no test files]
ok go.etcd.io/etcd/v3/pkg/expect 0.015s
ok go.etcd.io/etcd/v3/pkg/fileutil 0.268s
ok go.etcd.io/etcd/v3/pkg/flags 0.021s
ok go.etcd.io/etcd/v3/pkg/httputil 0.020s
ok go.etcd.io/etcd/v3/pkg/idutil 0.008s
ok go.etcd.io/etcd/v3/pkg/ioutil 0.025s
ok go.etcd.io/etcd/v3/pkg/logutil 0.047s
? go.etcd.io/etcd/v3/pkg/mock/mockserver [no test files]
? go.etcd.io/etcd/v3/pkg/mock/mockstorage [no test files]
? go.etcd.io/etcd/v3/pkg/mock/mockstore [no test files]
? go.etcd.io/etcd/v3/pkg/mock/mockwait [no test files]
ok go.etcd.io/etcd/v3/pkg/netutil 1.024s
ok go.etcd.io/etcd/v3/pkg/osutil 0.021s
ok go.etcd.io/etcd/v3/pkg/pathutil 0.008s
ok go.etcd.io/etcd/v3/pkg/pbutil 0.008s
ok go.etcd.io/etcd/v3/pkg/proxy 4.081s
ok go.etcd.io/etcd/v3/pkg/report 0.008s
? go.etcd.io/etcd/v3/pkg/runtime [no test files]
ok go.etcd.io/etcd/v3/pkg/schedule 0.009s
ok go.etcd.io/etcd/v3/pkg/srv 0.019s
ok go.etcd.io/etcd/v3/pkg/stringutil 0.008s
? go.etcd.io/etcd/v3/pkg/systemd [no test files]
ok go.etcd.io/etcd/v3/pkg/testutil 0.023s
ok go.etcd.io/etcd/v3/pkg/tlsutil 3.965s
ok go.etcd.io/etcd/v3/pkg/traceutil 0.034s
ok go.etcd.io/etcd/v3/pkg/transport 0.532s
ok go.etcd.io/etcd/v3/pkg/types 0.028s
ok go.etcd.io/etcd/v3/pkg/wait 0.023s
ok go.etcd.io/etcd/v3/proxy/grpcproxy 0.101s
? go.etcd.io/etcd/v3/proxy/grpcproxy/adapter [no test files]
? go.etcd.io/etcd/v3/proxy/grpcproxy/cache [no test files]
ok go.etcd.io/etcd/v3/proxy/httpproxy 0.044s
ok go.etcd.io/etcd/v3/proxy/tcpproxy 0.047s
ok go.etcd.io/etcd/v3/raft 0.312s
ok go.etcd.io/etcd/v3/raft/confchange 0.183s
ok go.etcd.io/etcd/v3/raft/quorum 0.316s
ok go.etcd.io/etcd/v3/raft/raftpb 0.024s
ok go.etcd.io/etcd/v3/raft/rafttest 0.640s
ok go.etcd.io/etcd/v3/raft/tracker 0.026s
ok go.etcd.io/etcd/v3/tests/e2e 0.077s
? go.etcd.io/etcd/v3/tools/benchmark [no test files]
? go.etcd.io/etcd/v3/tools/benchmark/cmd [no test files]
? go.etcd.io/etcd/v3/tools/etcd-dump-db [no test files]
ok go.etcd.io/etcd/v3/tools/etcd-dump-logs 0.088s
? go.etcd.io/etcd/v3/tools/etcd-dump-metrics [no test files]
? go.etcd.io/etcd/v3/tools/local-tester/bridge [no test files]
? go.etcd.io/etcd/v3/version [no test files]
ok go.etcd.io/etcd/v3/wal 1.517s
? go.etcd.io/etcd/v3/wal/walpb [no test files]
go test --short ./... --count=1 76.12s user 12.57s system 375% cpu 23.635 total
```
As we get rid of ./vendor there is no need to update the dependencies.
If anyone needs up-to-date vendor directory locally, getting it is as
simple as:
```go mod vendor```
Fixes:
go test --tags cluster_proxy --timeout=30m -run TestLeasingTxnOwnerGet -v ./clientv3/integration/...
The explicit code to close client is needed due to:
76e769ce95/clientv3/watch.go (L72)
as just ctx close by LeasingKeyValue store does not interrupts opened Watches.
The only way to interrupt open Watch is to close the 'whole' Watcher / Client.
The code used to:
- report time since previous compaction success as a 'duration' of compaction process itself.
- mix real clock with 'injected' clock. This led to strange log-lines
in tests in order of 234543543h of duration.
Added check that ensures that go.mod & go.sum files are up-to-date.
The check verifies whether 'go mod tidy' does not generate any mutations
in these files.
The check can be run on its own:
PASSES="mod_tidy" ./test
Or as part of "fmt" pass:
PASSES="fmt" ./test
Examplar outputs:
```
% PASSES="fmt" ./test
Running with TEST_CPUS: 1,2,4
Starting 'fmt' pass at Fri 11 Sep 2020 11:07:54 PM CEST
'shellcheck' started at Fri 11 Sep 2020 11:07:54 PM CEST
'shellcheck' completed at Fri 11 Sep 2020 11:07:54 PM CEST
'markdown_you' started at Fri 11 Sep 2020 11:07:54 PM CEST
'markdown_you' completed at Fri 11 Sep 2020 11:07:54 PM CEST
'goword' started at Fri 11 Sep 2020 11:07:54 PM CEST
'goword' completed at Fri 11 Sep 2020 11:07:54 PM CEST
'gofmt' started at Fri 11 Sep 2020 11:07:54 PM CEST
'gofmt' completed at Fri 11 Sep 2020 11:07:55 PM CEST
'govet' started at Fri 11 Sep 2020 11:07:55 PM CEST
'govet' completed at Fri 11 Sep 2020 11:07:57 PM CEST
'revive' started at Fri 11 Sep 2020 11:07:57 PM CEST
Skipping revive...
'revive' completed at Fri 11 Sep 2020 11:07:57 PM CEST
'license_header' started at Fri 11 Sep 2020 11:07:57 PM CEST
'license_header' completed at Fri 11 Sep 2020 11:07:58 PM CEST
'receiver_name' started at Fri 11 Sep 2020 11:07:58 PM CEST
'receiver_name' completed at Fri 11 Sep 2020 11:07:58 PM CEST
'commit_title' started at Fri 11 Sep 2020 11:07:58 PM CEST
'commit_title' completed at Fri 11 Sep 2020 11:07:58 PM CEST
'mod_tidy' started at Fri 11 Sep 2020 11:07:58 PM CEST
*** /tmp/fileiALKRA_go.mod 2020-09-11 23:07:58.838010716 +0200
--- ./go.mod 2020-09-11 23:07:58.974010922 +0200
***************
*** 29,39 ****
github.com/mattn/go-runewidth v0.0.2 // indirect
github.com/modern-go/reflect2 v1.0.1
github.com/olekukonko/tablewriter v0.0.0-20170122224234-a0225b3f23b5
github.com/prometheus/client_golang v1.0.0
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4
- github.com/prometheus/common v0.4.1
github.com/sirupsen/logrus v1.4.2 // indirect
github.com/soheilhy/cmux v0.1.4
github.com/spf13/cobra v0.0.3
github.com/spf13/pflag v1.0.1
github.com/tmc/grpc-websocket-proxy v0.0.0-20170815181823-89b8d40f7ca8
--- 29,38 ----
./go.mod is not in sync with 'go mod tidy'
```
```
% PASSES="mod_tidy" ./test
Running with TEST_CPUS: 1,2,4
Starting 'mod_tidy' pass at Fri 11 Sep 2020 11:09:21 PM CEST
*** /tmp/file9gy4so_go.mod 2020-09-11 23:09:21.166133290 +0200
--- ./go.mod 2020-09-11 23:09:21.286133466 +0200
***************
*** 29,39 ****
github.com/mattn/go-runewidth v0.0.2 // indirect
github.com/modern-go/reflect2 v1.0.1
github.com/olekukonko/tablewriter v0.0.0-20170122224234-a0225b3f23b5
github.com/prometheus/client_golang v1.0.0
github.com/prometheus/client_model v0.0.0-20190812154241-14fe0d1b01d4
- github.com/prometheus/common v0.4.1
github.com/sirupsen/logrus v1.4.2 // indirect
github.com/soheilhy/cmux v0.1.4
github.com/spf13/cobra v0.0.3
github.com/spf13/pflag v1.0.1
github.com/tmc/grpc-websocket-proxy v0.0.0-20170815181823-89b8d40f7ca8
--- 29,38 ----
./go.mod is not in sync with 'go mod tidy'
```
To fix a panic that happens when trying to get ids of etcd members in
force new cluster mode, the issue happen if the cluster previously had
etcd learner nodes added to the cluster
Fixes#12285
Currently, the grpc gateway is enabled by default when run the etcd server
using command line configuration. However, when provide config through
config file, the grpc gateway is disabled by default. We should either use
the same approach or at least document this.
related #https://github.com/etcd-io/etcd/issues/12093