Previously the SetConsistentIndex() is called during the apply workflow,
but it's outside the db transaction. If a commit happens between SetConsistentIndex
and the following apply workflow, and etcd crashes for whatever reason right
after the commit, then etcd commits an incomplete transaction to db.
Eventually etcd runs into the data inconsistency issue.
In this commit, we move the SetConsistentIndex into a txPostLockHook, so
it will be executed inside the transaction lock.
Thanks to this change:
- all the maps bucket -> buffer are indexed by int's instead of
string. No need to do: byte[] -> string -> hash conversion on each
access.
- buckets are strongly typed in backend/mvcc API.
This makes (bbolt) backend a full feature snapshot in term of WAL/raft,
i.e. carries:
- commit : (applied_index)
- confState
Benefits:
- Backend will be a sufficient point in time definition sufficient to
start replaying WAL. We have applied_index & confState in consistent
state.
- In case of emergency a backend state can be used for recovery
correct 'backend' (bbolt) context in aspect of membership.
Prior to this change the 'restored' backend used to still contain:
- old memberid (mvcc deletion used, why the membership is in bolt
bucket, but not mvcc part):
```
mvs := mvcc.NewStore(s.lg, be, lessor, ci, mvcc.StoreConfig{CompactionBatchLimit: math.MaxInt32})
defer mvs.Close()
txn := mvs.Write(traceutil.TODO())
btx := be.BatchTx()
del := func(k, v []byte) error {
txn.DeleteRange(k, nil)
return nil
}
// delete stored members from old cluster since using new members
btx.UnsafeForEach([]byte("members"), del)
```
- didn't get new members added.