Re: [PATCH] Btrfs: fix partial loss of prealloc extent past i_size after fsync

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 21, 2020 at 11:25:20AM +0100, fdmanana@xxxxxxxxxx wrote:
> From: Filipe Manana <fdmanana@xxxxxxxx>
> 
> When we have an inode with a prealloc extent that starts at an offset
> lower than the i_size and there is another prealloc extent that starts at
> an offset beyond i_size, we can end up losing part of the first prealloc
> extent (the part that starts at i_size) and have an implicit hole if we
> fsync the file and then have a power failure.
> 
> Consider the following example with comments explaining how and why it
> happens.
> 
>   $ mkfs.btrfs -f /dev/sdb
>   $ mount /dev/sdb /mnt
> 
>   # Create our test file with 2 consecutive prealloc extents, each with a
>   # size of 128Kb, and covering the range from 0 to 256Kb, with a file
>   # size of 0.
>   $ xfs_io -f -c "falloc -k 0 128K" /mnt/foo
>   $ xfs_io -c "falloc -k 128K 128K" /mnt/foo
> 
>   # Fsync the file to record both extents in the log tree.
>   $ xfs_io -c "fsync" /mnt/foo
> 
>   # Now do a redudant extent allocation for the range from 0 to 64Kb.
>   # This will merely increase the file size from 0 to 64Kb. Instead we
>   # could also do a truncate to set the file size to 64Kb.
>   $ xfs_io -c "falloc 0 64K" /mnt/foo
> 
>   # Fsync the file, so we update the inode item in the log tree with the
>   # new file size (64Kb). This also ends up setting the number of bytes
>   # for the first prealloc extent to 64Kb. This is done by the truncation
>   # at btrfs_log_prealloc_extents().
>   # This means that if a power failure happens after this, a write into
>   # the file range 64Kb to 128Kb will not use the prealloc extent and
>   # will result in allocation of a new extent.
>   $ xfs_io -c "fsync" /mnt/foo
> 
>   # Now set the file size to 256K with a truncate and then fsync the file.
>   # Since no changes happened to the extents, the fsync only updates the
>   # i_size in the inode item at the log tree. This results in an implicit
>   # hole for the file range from 64Kb to 128Kb, something which fsck will
>   # complain when not using the NO_HOLES feature if we replay the log
>   # after a power failure.
>   $ xfs_io -c "truncate 256K" -c "fsync" /mnt/foo
> 
> So instead of always truncating the log to the inode's current i_size at
> btrfs_log_prealloc_extents(), check first if there's a prealloc extent
> that starts at an offset lower than the i_size and with a length that
> crosses the i_size - if there is one, just make sure we truncate to a
> size that corresponds to the end offset of that prealloc extent, so
> that we don't lose the part of that extent that starts at i_size if a
> power failure happens.
> 
> A test case for fstests follows soon.
> 
> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>

Added to misc-next, thanks.



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux