The alarm list is the only exception that doesn't move consistent_index
forward. The reproduction steps are as simple as,
```
etcd --snapshot-count=5 &
for i in {1..6}; do etcdctl alarm list; done
kill -9 <etcd_pid>
etcd
```
Signed-off-by: Benjamin Wang <wachao@vmware.com>
For a cluster with only one member, the raft always send identical
unstable entries and committed entries to etcdserver, and etcd
responds to the client once it finishes (actually partially) the
applying workflow.
When the client receives the response, it doesn't mean etcd has already
successfully saved the data, including BoltDB and WAL, because:
1. etcd commits the boltDB transaction periodically instead of on each request;
2. etcd saves WAL entries in parallel with applying the committed entries.
Accordingly, it may run into a situation of data loss when the etcd crashes
immediately after responding to the client and before the boltDB and WAL
successfully save the data to disk.
Note that this issue can only happen for clusters with only one member.
For clusters with multiple members, it isn't an issue, because etcd will
not commit & apply the data before it being replicated to majority members.
When the client receives the response, it means the data must have been applied.
It further means the data must have been committed.
Note: for clusters with multiple members, the raft will never send identical
unstable entries and committed entries to etcdserver.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
As protobuf doesn't have required field, user may send an empty
WatchRequest by mistake. Currently, etcd will ignore the invalid request
and keep the stream opening. If we don't reject the invalid request by
closing the stream, it would be better to leave a log there.
This commit also fixes a typo in the comment.
Signed-off-by: spacewander <spacewanderlzx@gmail.com>
Problem: We pass grpc context down to applier in readonly serializable txn.
This context can be cancelled for example due to timeout.
This will trigger panic inside applyTxn
Solution: Only panic for transactions with write operations
fixes https://github.com/etcd-io/etcd/issues/14110
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
There is no update on the original PR (see below) for more then 2
weeks. So Benjamin(@ahrtr) continues to work on the PR. The first
step is to rebase the PR, because there are lots of conflicts with
the main branch.
The change to go.mod and go.sum reverted, because they are not needed.
The e2e test cases are also reverted, because they are not correct.
```
https://github.com/etcd-io/etcd/pull/14081
```
Signed-off-by: nic-chen <chenjunxu6@gmail.com>
Signed-off-by: Benjamin Wang <wachao@vmware.com>
We have already defined all the constant etcd versions in the
centralized place api/version/version.go. So we should replace all
the versions with the centralized definitions.