Re: [PATCH] Btrfs: fix race between shrinking truncate and fiemap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 07, 2020 at 12:23:09PM +0000, fdmanana@xxxxxxxxxx wrote:
> From: Filipe Manana <fdmanana@xxxxxxxx>
> 
> When there is a fiemap executing in parallel with a shrinking truncate we
> can end up in a situation where we have extent maps for which we no longer
> have corresponding file extent items. This is generally harmless and at
> the moment the only consequences are missing file extent items representing
> holes after we expand the file size again after the truncate operation
> removed the prealloc extent items, and stale information for future fiemap
> calls (reporting extents that no longer exist or may have been reallocated
> to other files for example).
> 
> Consider the following example:
> 
> 1) Our inode has a size of 128Kb, one 128Kb extent at file offset 0 and
>    a 1Mb prealloc extent at file offset 128Kb;
> 
> 2) Task A starts doing a shrinking truncate of our inode to reduce it to
>    a size of 64Kb. Before it searches the subvolume tree for file extent
>    items to delete, it drops all the extent maps in the range from 64Kb
>    to (u64)-1 by calling btrfs_drop_extent_cache();
> 
> 3) Task B starts doing a fiemap against our inode. When looking up for the
>    inode's extent maps in the range from 128Kb to (u64)-1, it doesn't find
>    any in the inode's extent map tree, since they were removed by task A.
>    Because it didn't find any in the extent map tree, it scans the inode's
>    subvolume tree for file extent items, and it finds the 1Mb prealloc
>    extent at file offset 128Kb, then it creates an extent map based on
>    that file extent item and adds it to inode's extent map tree (this ends
>    up being done by btrfs_get_extent() <- btrfs_get_extent_fiemap() <-
>    get_extent_skip_holes());
> 
> 4) Task A then drops the prealloc extent at file offset 128Kb and shrinks
>    the 128Kb extent file offset 0 to a length of 64Kb. The truncation
>    operation finishes and we end up with an extent map representing a 1Mb
>    prealloc extent at file offset 128Kb, despite we don't have any more
>    that extent;
> 
> After this the two types of problems we have are:
> 
> 1) Future calls to fiemap always report that a 1Mb prealloc extent exists
>    at file offset 128Kb. This is stale information, no longer correct;
> 
> 2) If the size of the file is increased, by a truncate operation that
>    increases the file size or by a write into a file offset > 64Kb for
>    example, we end up not inserting file extent items to represent
>    holes for any range between 128Kb and 128Kb + 1Mb, since the hole
>    expansion function, btrfs_cont_expand() will skip hole insertion
>    for any range for which an extent map exists that represents a
>    prealloc extent. This causes fsck to complain about missing file
>    extent items when not using the NO_HOLES feature.
> 
> The second issue could be often triggered by test case generic/561 from
> fstests, which runs fsstress and duperemove in parallel, and duperemove
> does frequent fiemap calls.
> 
> Essentially the problems happens because fiemap does not acquire the
> inode's lock while truncate does, and fiemap locks the file range in the
> inode's iotree while truncate does not. So fix the issue by making
> btrfs_truncate_inode_items() lock the file range from the new file size
> to (u64)-1, so that it serializes with fiemap.
> 
> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>

Added to misc-next, thanks.



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux