Windows requires this lock to be released before the directory is
renamed. But on unix-like operating systems, releasing the lock and
trying to reacquire it immediately can be flaky if a process is forked
around the same time. The file descriptors are marked as close-on-exec
by the Go runtime, but there is a window between the fork and exec where
another process will be holding the lock.
Whenever the WAL is opened for writes, it should write zeroes to its tail
starting from the first zero record. Otherwise, if there are entries past
the first zero record due to a torn write, any new writes that overlap the
old entries will lead to a garbage record on the tail and cause a CRC
mismatch.
Code is only there to handle an edge case where the tail wasn't preallocated
already (e.g., via old etcd version or a crash). It also triggers tmpfs
corruption, so remove it.
On ReadAll, WAL seeks to the end of the last record in the tail. If the tail did not
end with preallocated space, the decoder would report 0 as the last offset and begin
writing at offset 0 of the tail.
Fixes#4903
File lock interface was more verbose than it needed to be while
simultaneously making it difficult to support systems (e.g., Windows)
that only permit locked writes on a single fd holding the lock.
This patch adds error logging in wal.Close() if unlocking and
destroying fail. Though it is hard to handling the errors, logging
would be helpful for trouble shooting.
Backup process should be able to read all WALs until io.EOF to
generate a point-in-time backup.
Our WAL file is append-only. And the backup process will lock all
files before start reading, which can prevent the gc routine from
removing any files in the middle.
If the process dies during wal.cut(), it might leave a corrupted wal
file. This commit solves the problem by creating a temp wal file first,
then atomically rename it to a wal file when we are sure it is vaild.
SaveSnap and Save are called in separate goroutines now. Allow at most
one WAL function being called at one time to protect internal fields and
guarantee execution order.
Or one possible bug is that the new cut file is started with snapshot
entry instead of crc entry.
It is safe to repair the unexpectedEOF error in the last wal. raft
will not send out message before the entry successfully comitted
into wal. Thus we can safely truncate the last entry in the wal
to repair.
ReleaseLockTo should not release the lock on the WAL
segment that is right before the given index. When
restarting etcd, etcd needs to read from the WAL segment
that has a smaller index than the snapshot index.
The correct behavior is that ReleaseLockTo releases
the locks w is holding so that w only holds one lock
that has an index smaller than the given index.
WAL should control the cut logic itself. We want to do falloc to
per allocate the space for a segmented wal file at the beginning
and cut it when it size reaches the limit.