On Tue, Nov 12, 2019 at 5:35 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 12, 2019 at 03:13:31PM +0000, fdmanana@xxxxxxxxxx wrote:
> > From: Filipe Manana <fdmanana@xxxxxxxx>
> >
> > When using the NO_HOLES feature, if we punch a hole into a file and then
> > fsync it, there is a case where a subsequent fsync will miss the fact that
> > a hole was punched:
> >
> > 1) The extent items of the inode span multiple leaves;
> >
> > 2) The hole covers a range that affects only the extent items of the first
> > leaf;
> >
> > 3) The fsync operation is done in full mode (BTRFS_INODE_NEEDS_FULL_SYNC
> > is set in the inode's runtime flags).
> >
> > That results in the hole not existing after replaying the log tree.
> >
> > For example, if the fs/subvolume tree has the following layout for a
> > particular inode:
> >
> > Leaf N, generation 10:
> >
> > [ ... INODE_ITEM INODE_REF EXTENT_ITEM (0 64K) EXTENT_ITEM (64K 64K) ]
> >
> > Leaf N + 1, generation 10:
> >
> > [ EXTENT_ITEM (128K 64K) ... ]
> >
> > If at transaction 11 we punch a hole covering the range [0, 128K[, we end
> > up dropping the two extent items from leaf N, but we don't touch the other
> > leaf, so we end up in the following state:
> >
> > Leaf N, generation 11:
> >
> > [ ... INODE_ITEM INODE_REF ]
> >
> > Leaf N + 1, generation 10:
> >
> > [ EXTENT_ITEM (128K 64K) ... ]
> >
> > A full fsync after punching the hole will only process leaf N because it
> > was modified in the current transaction, but not leaf N + 1, since it was
> > not modified in the current transaction (generation 10 and not 11). As
> > a result the fsync will not log any holes, because it didn't process any
> > leaf with extent items.
> >
> > So fix this by detecting any leading hole in the file for a full fsync
> > when using the NO_HOLES feature if we didn't process any extent items for
> > the file.
> >
> > A test case for fstests follows soon.
> >
> > Fixes: 16e7549f045d33 ("Btrfs: incompatible format change to remove hole extents")
> > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
>
> This adds an extra search for every FULL_SYNC, can we just catch this case in
> the main loop, say we keep track of the last extent we found,
It's already doing that by checking if "last_extent == 0" before
calling the new function.
Having last_extent == 0 (no extents processed) is very rare, since it
requires hitting that specific item layout and hole range.
> and then when we
> end up with ret > 1 || a min_key that's past the end of the last extent we saw
> we know we had a hole punch? Thanks,
>
> Josef
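
To make the discussion above concrete, here is a toy model in plain C
of the behaviour described in the changelog. It is only a sketch and not
kernel code: the toy_* types and toy_full_fsync() are hypothetical names
made up for illustration, leaves are reduced to a generation number plus
an array of (offset, length) extents, and the detect_leading_hole flag
simply switches the extra search on or off so both behaviours can be
compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy types; these are NOT the btrfs on-disk or in-memory structures. */
struct toy_extent {
	uint64_t offset; /* file offset where the extent starts */
	uint64_t len;    /* length of the extent in bytes */
};

struct toy_leaf {
	uint64_t generation;              /* transaction that last modified the leaf */
	const struct toy_extent *extents; /* extent items stored in this leaf */
	size_t nr_extents;
};

struct toy_log_result {
	uint64_t last_extent;     /* end of the last extent processed, 0 if none */
	bool leading_hole_logged; /* true if a hole starting at offset 0 was logged */
	uint64_t hole_end;        /* end of that hole, valid only if logged */
};

static struct toy_log_result toy_full_fsync(const struct toy_leaf *leaves,
					    size_t nr_leaves, uint64_t cur_trans,
					    bool detect_leading_hole)
{
	struct toy_log_result res = { 0, false, 0 };
	size_t i, j;

	assert(leaves != NULL || nr_leaves == 0);

	/* Main loop: only leaves modified in the current transaction are visited. */
	for (i = 0; i < nr_leaves; i++) {
		if (leaves[i].generation < cur_trans)
			continue;
		for (j = 0; j < leaves[i].nr_extents; j++)
			res.last_extent = leaves[i].extents[j].offset +
					  leaves[i].extents[j].len;
	}

	/* Extra search, done only when the main loop saw no extent items at all. */
	if (detect_leading_hole && res.last_extent == 0) {
		for (i = 0; i < nr_leaves; i++) {
			if (leaves[i].nr_extents == 0)
				continue;
			/* First extent of the file: a non-zero offset means a leading hole. */
			if (leaves[i].extents[0].offset > 0) {
				res.leading_hole_logged = true;
				res.hole_end = leaves[i].extents[0].offset;
			}
			break;
		}
	}
	return res;
}
```

With the layout from the changelog (leaf N at generation 11 with no
extent items, leaf N + 1 at generation 10 holding the extent at 128K),
calling toy_full_fsync() for transaction 11 with detect_leading_hole
set to false leaves last_extent at 0 and logs no hole. With it set to
true, the extra search runs and reports a leading hole ending at 128K.
If leaf N still holds extent items, last_extent is non-zero and the
extra search is skipped, which is why the cost is only paid in the rare
case.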