Re: [PATCH v2] Btrfs: fix missing hole after hole punching and fsync when using NO_HOLES

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 19, 2019 at 12:07:33PM +0000, fdmanana@xxxxxxxxxx wrote:
> From: Filipe Manana <fdmanana@xxxxxxxx>
> 
> When using the NO_HOLES feature, if we punch a hole into a file and then
> fsync it, there are cases where a subsequent fsync will miss the fact that
> a hole was punched, resulting in the holes not existing after replaying
> the log tree.
> 
> Essentially these cases all imply that, tree-log.c:copy_items(), is not
> invoked for the leafs that delimit holes, because nothing changed those
> leafs in the current transaction. And it's precisely copy_items() where
> we currenly detect and log holes, which works as long as the holes are
> between file extent items in the input leaf or between the beginning of
> input leaf and the previous leaf or between the last item in the leaf
> and the next leaf.
> 
> First example where we miss a hole:
> 
>   *) The extent items of the inode span multiple leafs;
> 
>   *) The punched hole covers a range that affects only the extent items of
>      the first leaf;
> 
>   *) The fsync operation is done in full mode (BTRFS_INODE_NEEDS_FULL_SYNC
>      is set in the inode's runtime flags).
> 
>   That results in the hole not existing after replaying the log tree.
> 
>   For example, if the fs/subvolume tree has the following layout for a
>   particular inode:
> 
>       Leaf N, generation 10:
> 
>       [ ... INODE_ITEM INODE_REF EXTENT_ITEM (0 64K) EXTENT_ITEM (64K 128K) ]
> 
>       Leaf N + 1, generation 10:
> 
>       [ EXTENT_ITEM (128K 64K) ... ]
> 
>   If at transaction 11 we punch a hole coverting the range [0, 128K[, we end
>   up dropping the two extent items from leaf N, but we don't touch the other
>   leaf, so we end up in the following state:
> 
>       Leaf N, generation 11:
> 
>       [ ... INODE_ITEM INODE_REF ]
> 
>       Leaf N + 1, generation 10:
> 
>       [ EXTENT_ITEM (128K 64K) ... ]
> 
>   A full fsync after punching the hole will only process leaf N because it
>   was modified in the current transaction, but not leaf N + 1, since it
>   was not modified in the current transaction (generation 10 and not 11).
>   As a result the fsync will not log any holes, because it didn't process
>   any leaf with extent items.
> 
> Second example where we will miss a hole:
> 
>   *) An inode as its items spanning 5 (or more) leafs;
> 
>   *) A hole is punched and it covers only the extents items of the 3rd
>      leaf. This resulsts in deleting the entire leaf and not touching any
>      of the other leafs.
> 
>   So the only leaf that is modified in the current transaction, when
>   punching the hole, is the first leaf, which contains the inode item.
>   During the full fsync, the only leaf that is passed to copy_items()
>   is that first leaf, and that's not enough for the hole detection
>   code in copy_items() to determine there's a hole between the last
>   file extent item in the 2nd leaf and the first file extent item in
>   the 3rd leaf (which was the 4th leaf before punching the hole).
> 
> Fix this by scanning all leafs and punch holes as necessary when doing a
> full fsync (less common than a non-full fsync) when the NO_HOLES feature
> is enabled. The lack of explicit file extent items to mark holes makes it
> necessary to scan existing extents to determine if holes exist.
> 
> A test case for fstests follows soon.
> 
> Fixes: 16e7549f045d33 ("Btrfs: incompatible format change to remove hole extents")
> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>

Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

Thanks,

Josef



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux