Problem: We pass grpc context down to applier in readonly serializable txn.
This context can be cancelled for example due to timeout.
This will trigger panic inside applyTxn
Solution: Only panic for transactions with write operations
fixes https://github.com/etcd-io/etcd/issues/14110
Signed-off-by: Bogdan Kanivets <bkanivets@apple.com>
The `ErrFileNotFound` was used for for three cases:
1. There is no any WAL files (probably due to no read permission);
2. There is no WAL files matching the snapshot index;
3. The WAL file seqs do not increase continuously.
It's not good for debug when users see the `ErrFileNotFound` error,
so in this PR, a different error is returned for each case above.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
Currently the max size of each WAL entry is hard coded as 10MB. If users
set a value > 10MB for the flag --max-request-bytes, then etcd may run
into a situation that it successfully processes a big request, but fails
to decode it when replaying the WAL file on startup.
On the other hand, we can't just remove the limitation, because if a
WAL entry is somehow corrupted, and its recByte is a huge value, then
etcd may run out of memory. So the solution is to restrict the max size
of each WAL entry as a dynamic value, which is the remaining size of
the WAL file.
Signed-off-by: Benjamin Wang <wachao@vmware.com>
The TestFilePipelineFailLockFile case maybe test the line 77 in the alloc function. But the LockFile will not throw an error when the file lock is hold. And TestFilePipelineFailPreallocate and TestFilePipelineFailLockFile are exactly the same except for the comments.
Signed-off-by: SimFG <1142838399@qq.com>
We have already defined all the constant etcd versions in the
centralized place api/version/version.go. So we should replace all
the versions with the centralized definitions.
Reason to store CI and term in backend was to make db fully independent
snapshot, it was never meant to interfere with apply logic. Skip of CI
was introduced for v2->v3 migration where we wanted to prevent it from
decreasing when replaying wal in
https://github.com/etcd-io/etcd/pull/5391. By mistake it was added to
apply flow during refactor in
https://github.com/etcd-io/etcd/pull/12855#commitcomment-70713670.
Consistency index and term should only be negotiated and used by raft to make
decisions. Their values should only driven by raft state machine and
backend should only be responsible for storing them.
Removed the fields consistentIdx and consistentTerm from struct EtcdServer,
and added applyingIndex and applyingTerm into struct consistentIndex in
package cindex. We may remove the two fields completely if we decide to
remove the OnPreCommitUnsafe, and it will depend on the performance test
result.
Previously the SetConsistentIndex() is called during the apply workflow,
but it's outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to db.
Eventually etcd runs into the data inconsistency issue.
In this commit, we move the SetConsistentIndex into a txPostLockHook, so
it will be executed inside the transaction lock.