Re: [PATCH] btrfs: clear file extent mapping for punch past i_size

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 20, 2020 at 2:30 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> In my stress testing we were still seeing some hole's with my patches to
> fix missing hole extents.  Turns out we do not fill in holes during hole
> punch if the punch is past i_size.  I incorrectly assumed this was fine,
> because anybody extending would use btrfs_cont_expand, however there is
> a corner that still can give us trouble.  Start with an empty file and
>
> fallocate KEEP_SIZE 1m-2m
>
> We now have a 0 length file, and a hole file extent from 0-1m, and a
> prealloc extent from 1m-2m.  Now
>
> punch 1m-1.5m
>
> Because this is past i_size we have
>
> [HOLE EXTENT][ NOTHING ][PREALLOC]
> [0        1m][1m   1.5m][1.5m  2m]
>
> with an i_size of 0.  Now if we pwrite 0-1.5m we'll increas our i_size
> to 1.5m, but our disk_i_size is still 0 until the ordered extent
> completes.
>
> However if we now immediately truncate 2m on the file we'll just call
> btrfs_cont_expand(inode, 1.5m, 2m), since our old i_size is 1.5m.  If we
> commit the transaction here and crash we'll expose the gap.
>
> To fix this we need to clear the file extent mapping for the range that
> we punched but didn't insert a corresponding file extent for.  This will
> mean the truncate will only get an disk_i_size set to 1m if we crash
> before the finish ordered io happens.
>
> I've written an xfstest to reproduce the problem and validate this fix.
>
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

I hit another case, besides the one I reported last week, in my
automated tests during one of these evenings.
This is probably it, but I haven't tried to debug it yet - reusing the
same fsstress seed didn't trigger it, as the test used 20 fsstress
processes (with dm log writes).

Looks good,

Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx>

> ---
> - Dave, this needs to be folded into "btrfs: use the file extent tree
>   infrastructure" and the changelog needs to be adjusted since I incorrectly
>   point out that we don't need to clear for hole punch.  We definitely need to
>   clear for the case that we're punching past i_size as we aren't inserting hole
>   file extents.
>
>  fs/btrfs/file.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 5cdcdbdd908b..6f6f1805e6fd 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -2597,6 +2597,24 @@ int btrfs_punch_hole_range(struct inode *inode, struct btrfs_path *path,
>                                 btrfs_abort_transaction(trans, ret);
>                                 break;
>                         }
> +               } else if (!clone_info && cur_offset < drop_end) {
> +                       /*
> +                        * We are past the i_size here, but since we didn't
> +                        * insert holes we need to clear the mapped area so we
> +                        * know to not set disk_i_size in this area until a new
> +                        * file extent is inserted here.
> +                        */
> +                       ret = btrfs_inode_clear_file_extent_range(BTRFS_I(inode),
> +                                       cur_offset, drop_end - cur_offset);
> +                       if (ret) {
> +                               /*
> +                                * We couldn't clear our area, so we could
> +                                * presumably adjust up and corrupt the fs, so
> +                                * we need to abort.
> +                                */
> +                               btrfs_abort_transaction(trans, ret);
> +                               break;
> +                       }
>                 }
>
>                 if (clone_info && drop_end > clone_info->file_offset) {
> @@ -2687,6 +2705,15 @@ int btrfs_punch_hole_range(struct inode *inode, struct btrfs_path *path,
>                         btrfs_abort_transaction(trans, ret);
>                         goto out_trans;
>                 }
> +       } else if (!clone_info && cur_offset < drop_end) {
> +               /* See the comment in the loop above for the reasoning here. */
> +               ret = btrfs_inode_clear_file_extent_range(BTRFS_I(inode),
> +                                       cur_offset, drop_end - cur_offset);
> +               if (ret) {
> +                       btrfs_abort_transaction(trans, ret);
> +                       goto out_trans;
> +               }
> +
>         }
>         if (clone_info) {
>                 ret = btrfs_insert_clone_extent(trans, inode, path, clone_info,
> --
> 2.24.1
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux