A client-side optimization was made in #6100 to filter out ascending key sorts and avoid an unnecessary re-sort, since that is already the order returned by the back-end logic.
It seems to me that this really belongs on the server side: it is tied to the server implementation and should apply to any caller of the KV API (for example, non-Go clients).
Relatedly, the client/v3 syncer depends on this default sorting, which isn't explicit in the KV API contract. So I'm proposing that the required sort parameters be passed explicitly; the request takes the fast path either way.
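For illustration, a minimal clientv3 call that passes the sort parameters explicitly (rather than relying on the implicit server-side ordering) could look like the sketch below; the endpoint and key prefix are placeholders:

```go
// Minimal sketch: request a range with explicit sort parameters instead of
// relying on the server's default ascending-by-key ordering. Endpoint and
// key prefix are placeholders.
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Explicit SortByKey/SortAscend matches the order the server already
	// returns, so the client can still take the fast path that skips the re-sort.
	resp, err := cli.Get(ctx, "foo", clientv3.WithPrefix(),
		clientv3.WithSort(clientv3.SortByKey, clientv3.SortAscend))
	if err != nil {
		panic(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Printf("%s -> %s\n", kv.Key, kv.Value)
	}
}
```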
add 'FAIL:' to egrep
```shell
[root@LF-136-9 etcd]# egrep "(--- FAIL:|DATA RACE|panic: test timed out|appears to have leaked)" -B50 -A10 test-MTYyNjIyOTc0MQo.log
[root@LF-136-9 etcd]# egrep "(--- FAIL:|FAIL:|DATA RACE|panic: test timed out|appears to have leaked)" -B50 -A10 test-MTYyNjIyOTc0MQo.log
ok go.etcd.io/etcd/server/v3/auth 3.247s
ok go.etcd.io/etcd/server/v3/config 0.047s
ok go.etcd.io/etcd/server/v3/datadir 0.035s
ok go.etcd.io/etcd/server/v3/embed 1.944s
ok go.etcd.io/etcd/server/v3/etcdmain 0.326s
FAIL go.etcd.io/etcd/server/v3/etcdserver [build failed]
? go.etcd.io/etcd/server/v3/etcdserver/api [no test files]
ok go.etcd.io/etcd/server/v3/etcdserver/api/etcdhttp 0.110s
ok go.etcd.io/etcd/server/v3/etcdserver/api/membership 0.479s
ok go.etcd.io/etcd/server/v3/etcdserver/api/rafthttp 0.251s
ok go.etcd.io/etcd/server/v3/etcdserver/api/snap 0.045s
? go.etcd.io/etcd/server/v3/etcdserver/api/snap/snappb [no test files]
ok go.etcd.io/etcd/server/v3/etcdserver/api/v2auth 1.470s
ok go.etcd.io/etcd/server/v3/etcdserver/api/v2discovery 0.088s
ok go.etcd.io/etcd/server/v3/etcdserver/api/v2error 0.034s
ok go.etcd.io/etcd/server/v3/etcdserver/api/v2http 0.128s
ok go.etcd.io/etcd/server/v3/etcdserver/api/v2http/httptypes 0.033s
? go.etcd.io/etcd/server/v3/etcdserver/api/v2stats [no test files]
ok go.etcd.io/etcd/server/v3/etcdserver/api/v2store 0.068s
? go.etcd.io/etcd/server/v3/etcdserver/api/v2v3 [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3alarm [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3client [no test files]
ok go.etcd.io/etcd/server/v3/etcdserver/api/v3compactor 1.793s
? go.etcd.io/etcd/server/v3/etcdserver/api/v3election [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3election/v3electionpb [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3election/v3electionpb/gw [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3lock [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3lock/v3lockpb [no test files]
? go.etcd.io/etcd/server/v3/etcdserver/api/v3lock/v3lockpb/gw [no test files]
ok go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc 0.089s
ok go.etcd.io/etcd/server/v3/etcdserver/cindex 0.045s
ok go.etcd.io/etcd/server/v3/lease 3.324s
ok go.etcd.io/etcd/server/v3/lease/leasehttp 2.096s
? go.etcd.io/etcd/server/v3/lease/leasepb [no test files]
? go.etcd.io/etcd/server/v3/mock/mockstorage [no test files]
? go.etcd.io/etcd/server/v3/mock/mockstore [no test files]
? go.etcd.io/etcd/server/v3/mock/mockwait [no test files]
ok go.etcd.io/etcd/server/v3/mvcc 8.805s
ok go.etcd.io/etcd/server/v3/mvcc/backend 1.983s
? go.etcd.io/etcd/server/v3/mvcc/backend/testing [no test files]
? go.etcd.io/etcd/server/v3/mvcc/buckets [no test files]
? go.etcd.io/etcd/server/v3/proxy/grpcproxy [no test files]
? go.etcd.io/etcd/server/v3/proxy/grpcproxy/adapter [no test files]
? go.etcd.io/etcd/server/v3/proxy/grpcproxy/cache [no test files]
ok go.etcd.io/etcd/server/v3/proxy/httpproxy 0.046s
ok go.etcd.io/etcd/server/v3/proxy/tcpproxy 0.035s
? go.etcd.io/etcd/server/v3/verify [no test files]
ok go.etcd.io/etcd/server/v3/wal 0.513s
ok go.etcd.io/etcd/server/v3/wal/walpb 0.045s
FAIL
FAIL: (code:2):
% (cd server && env go test -short -timeout=3m --race --cpu=16 ./...)
FAIL: 'unit' failed at Wed Jul 14 10:29:37 CST 2021
```
When a unary request takes longer than a predefined duration, the request
is classified as "expensive" and a warning is printed. That duration is
hard-coded to 300 ms, which may not be enough, for example, for
transactions with many operations. The resulting warnings blow up
the log files and reduce throughput.
This fix allows the user to configure the "expensive" request duration.
Signed-off-by: Alexey Roytman <roytman@il.ibm.com>
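A minimal sketch of the idea, assuming a hypothetical `WarningUnaryRequestDuration` config field and a `warnIfExpensive` helper (the names are illustrative, not the actual etcd identifiers):

```go
// Hypothetical sketch: the warning threshold comes from configuration
// instead of a hard-coded 300 ms constant. Field and function names are
// illustrative only.
package main

import (
	"log"
	"time"
)

type ServerConfig struct {
	// WarningUnaryRequestDuration is the threshold above which a unary
	// request is considered "expensive" and logged. Hypothetical field.
	WarningUnaryRequestDuration time.Duration
}

const defaultWarningDuration = 300 * time.Millisecond

func (c *ServerConfig) warningDuration() time.Duration {
	if c.WarningUnaryRequestDuration > 0 {
		return c.WarningUnaryRequestDuration
	}
	return defaultWarningDuration
}

// warnIfExpensive logs a warning only when the request exceeded the
// configured threshold.
func warnIfExpensive(cfg *ServerConfig, method string, start time.Time) {
	if d := time.Since(start); d > cfg.warningDuration() {
		log.Printf("expensive request: method=%s took=%s threshold=%s",
			method, d, cfg.warningDuration())
	}
}

func main() {
	cfg := &ServerConfig{WarningUnaryRequestDuration: time.Second}
	start := time.Now()
	time.Sleep(50 * time.Millisecond) // simulated request handling
	warnIfExpensive(cfg, "Txn", start) // stays silent: under the 1s threshold
}
```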
The file `zap_raft.go` implements a raft.Logger proxy on top of `*zap.Logger`.
Adding such a proxy requires the option `zap.AddCallerSkip(1)`,
so that the logged message reports the correct caller.
Two of the three constructors in `zap_raft.go` add this option;
this commit fixes the third constructor so that it also adds `zap.AddCallerSkip`.
Before fix:
`{"level":"info","ts":"2021-07-22T17:46:01.435Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"bd07d29169ff0c5a [logterm: 2, index: 8, vote: 38447ba545569bbe] ignored MsgPreVote from c7baeaad79d6d5ed [logterm: 2, index: 8] at term 2: lease is not expired (remaining ticks: 10)"}`
After fix:
`{"level":"info","ts":"2021-07-22T17:46:51.227Z","logger":"raft","caller":"raft/raft.go:859","msg":"bd07d29169ff0c5a [logterm: 2, index: 8, vote: c7baeaad79d6d5ed] ignored MsgPreVote from 38447ba545569bbe [logterm: 2, index: 8] at term 2: lease is not expired (remaining ticks: 9)"}`
During refactoring, duplicate logging of the send buffer overflow event was introduced.
Both log lines record exactly the same information; the logging
context is sufficient to distinguish the cause.
Additionally, the unnecessary context (in parentheses) was removed from the log
message; it was needed with the old logger, which had no zap context,
but now only adds confusion.
If one of the nodes in the cluster has lost its DNS record,
restarting the second node will break it.
This PR adds a comparison that does not use a resolver,
which protects the cluster from DNS errors without breaking
the current logic for comparing URLs in the URLStringsEqual function.
You can read more in issue #7798.
Fixes #7798
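A rough sketch of the approach, not the actual `netutil` implementation: compare the URL strings first, and only fall back to the resolver-based comparison when the textual comparison does not match. The helper names below are hypothetical:

```go
// Rough sketch of the idea, not the actual netutil.URLStringsEqual code:
// compare URLs textually first so a DNS failure on one node cannot break
// the comparison; only fall back to a resolver-based check when needed.
package main

import (
	"fmt"
	"sort"
)

// stringsEqualNoResolve reports whether two URL lists are identical
// without touching DNS at all.
func stringsEqualNoResolve(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	as := append([]string(nil), a...)
	bs := append([]string(nil), b...)
	sort.Strings(as)
	sort.Strings(bs)
	for i := range as {
		if as[i] != bs[i] {
			return false
		}
	}
	return true
}

// urlStringsEqual tries the resolver-free comparison first and only then
// delegates to the existing resolver-based comparison.
func urlStringsEqual(a, b []string, resolveAndCompare func(a, b []string) (bool, error)) (bool, error) {
	if stringsEqualNoResolve(a, b) {
		return true, nil // equal without resolving; DNS errors cannot break this path
	}
	return resolveAndCompare(a, b)
}

func main() {
	a := []string{"https://node1.example.com:2380"}
	b := []string{"https://node1.example.com:2380"}
	eq, _ := urlStringsEqual(a, b, func(_, _ []string) (bool, error) { return false, nil })
	fmt.Println("equal:", eq)
}
```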
We found a lease leak issue:
if a new member (added via member add) is recovered from a snapshot and then
becomes leader, leases will never expire afterwards. The leader
logs revoke failures caused by "invalid auth token", because the
token provider is not functional and drops every token generated
by the upper layer, which in this case is the lease-revoking
routine.
When run 100 times in a row, those tests flaked around 10-20% of the time. Based on
some experimentation, writing 10 keys was enough to ensure that a WAL snapshot is
created, which prevented the flakes.