In v3.5 it is assumed that the logger should not be nil, however it is
still a case in v3.4. The PR targeted to v3.5 was backported to 3.4 and
that's why it's possible to get panic on nil logger in 3.4. This commit
fixed this issue.
Fixes#14402
Signed-off-by: Vladimir Sokolov <vsvastey@gmail.com>
To avoid inconsistant behavior during cluster upgrade we are feature
gating persistance behind cluster version. This should ensure that
all cluster members are upgraded to v3.6 before changing behavior.
To allow backporting this fix to v3.5 we are also introducing flag
--experimental-enable-lease-checkpoint-persist that will allow for
smooth upgrade in v3.5 clusters with this feature enabled.
Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
This is a backport of https://github.com/etcd-io/etcd/pull/13435 and is
part of the work for 3.4.20
https://github.com/etcd-io/etcd/issues/14232.
The original change had a second commit that modifies a changelog file.
The 3.4 branch does not include any changelog file, so that part was not
cherry-picked.
Local Testing:
- `make build`
- `make test`
Both succeed.
Signed-off-by: Ramsés Morales <ramses@gmail.com>
To improve debuggability of `agreement among raft nodes before
linearized reading`, we added some tracing inside
`linearizableReadLoop`.
This will allow us to know the timing of `s.r.ReadIndex` vs
`s.applyWait.Wait(rs.Index)`.
Signed-off-by: Chao Chen <chaochn@amazon.com>
Cherry pick https://github.com/etcd-io/etcd/pull/13932 to 3.4.
When etcdserver receives a LeaseRenew request, it may be still in
progress of processing the LeaseGrantRequest on exact the same
leaseID. Accordingly it may return a TTL=0 to client due to the
leaseID not found error. So the leader should wait for the appliedID
to be available before processing client requests.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
The first bug fix is to resolve the race condition between goroutine
and channel on the same leases to be revoked. It's a classic mistake
in using Golang channel + goroutine. Please refer to
https://go.dev/doc/effective_go#channels
The second bug fix is to resolve the issue that etcd lessor may
continue to schedule checkpoint after stepping down the leader role.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Items resolved:
1. fix the vet error: possible misuse of reflect.SliceHeader;
2. fix the vet error: call to (*T).Fatal from a non-test goroutine;
3. bump package golang.org/x/crypto, net and sys;
4. bump boltdb from 1.3.3 to 1.3.6;
5. remove the vendor directory;
6. remove go 1.12.17 and 1.15.15, add go 1.16.15 into pipeline;
7. bump go version to 1.16 in go.mod;
8. fix the issue: compile: version go1.16.15 does not match go tool version go1.17.11,
refer to https://github.com/actions/setup-go/issues/107;
9. fix data race on compactMainRev and watcherGauge;
10. fix test failure for TestLeasingTxnOwnerGet in cluster_proxy mode.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
To fix a panic that happens when trying to get ids of etcd members in
force new cluster mode, the issue happen if the cluster previously had
etcd learner nodes added to the cluster
Fixes#12285
Similar counts are exposed via Prometheus.
This adds the one that are perceived by etcd server.
e.g.
os_fd_limit 120000
os_fd_used 14
process_cpu_seconds_total 0.31
process_max_fds 120000
process_open_fds 17
Signed-off-by: Gyuho Lee <leegyuho@amazon.com>
When an etcd instance attempts to perform service discovery, if a
cluster size with negative value is provided, the etcd instance
will panic without recovery because of
This makes it possible to run an etcd node for testing and development
without placing lots of load on the file system.
Fixes#11930.
Signed-off-by: David Crawshaw <crawshaw@tailscale.com>