Re: [PATCH v2 03/15] btrfs: fix double __endio_write_update_ordered in direct I/O

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 17.04.20 г. 0:46 ч., Omar Sandoval wrote:
> From: Omar Sandoval <osandov@xxxxxx>
> 
> In btrfs_submit_direct(), if we fail to allocate the btrfs_dio_private,
> we complete the ordered extent range. However, we don't mark that the
> range doesn't need to be cleaned up from btrfs_direct_IO() until later.
> Therefore, if we fail to allocate the btrfs_dio_private, we complete the
> ordered extent range twice. We could fix this by updating
> unsubmitted_oe_range earlier, but it's cleaner to reorganize the code so
> that creating the btrfs_dio_private and submitting the bios are
> separate, and once the btrfs_dio_private is created, cleanup always
> happens through the btrfs_dio_private.
> 
> Fixes: f28a49287817 ("Btrfs: fix leaking of ordered extents after direct IO write error")
> Signed-off-by: Omar Sandoval <osandov@xxxxxx>

Generally looks ok, so :

Reviewed-by: Nikolay Borisov <nborisov@xxxxxxxx>, however please see
below for some remarks on the logic around unsubmitted_oe_range_start/end

> ---
>  fs/btrfs/inode.c | 174 ++++++++++++++++++-----------------------------
>  1 file changed, 66 insertions(+), 108 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index b628c319a5b6..f6ce9749adb6 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7903,14 +7903,60 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio,
>  	return ret;
>  }
>  
> -static int btrfs_submit_direct_hook(struct btrfs_dio_private *dip)
> +/*
> + * If this succeeds, the btrfs_dio_private is responsible for cleaning up locked
> + * or ordered extents whether or not we submit any bios.
> + */
> +static struct btrfs_dio_private *btrfs_create_dio_private(struct bio *dio_bio,
> +							  struct inode *inode,
> +							  loff_t file_offset)
>  {
> -	struct inode *inode = dip->inode;
> +	const bool write = (bio_op(dio_bio) == REQ_OP_WRITE);
> +	struct btrfs_dio_private *dip;
> +	struct bio *bio;
> +
> +	dip = kzalloc(sizeof(*dip), GFP_NOFS);
> +	if (!dip)
> +		return NULL;
> +
> +	bio = btrfs_bio_clone(dio_bio);
> +	bio->bi_private = dip;
> +	btrfs_io_bio(bio)->logical = file_offset;
> +
> +	dip->private = dio_bio->bi_private;
> +	dip->inode = inode;
> +	dip->logical_offset = file_offset;
> +	dip->bytes = dio_bio->bi_iter.bi_size;
> +	dip->disk_bytenr = (u64)dio_bio->bi_iter.bi_sector << 9;
> +	dip->orig_bio = bio;
> +	dip->dio_bio = dio_bio;
> +	atomic_set(&dip->pending_bios, 1);
> +
> +	if (write) {
> +		struct btrfs_dio_data *dio_data = current->journal_info;
> +
> +		dio_data->unsubmitted_oe_range_end = dip->logical_offset +
> +			dip->bytes;
> +		dio_data->unsubmitted_oe_range_start =
> +			dio_data->unsubmitted_oe_range_end;

The logic around those 2 members is really subtle. We really have the
following:

1. btrfs_direct_IO sets those two to the same value.

2. When we call __blockdev_direct_IO unless
btrfs_get_blocks_direct->btrfs_get_blocks_direct_write is called to
modify unsubmitted_oe_range_start so that start < end. Cleanup won't
happen.

3. We come into btrfs_submit_direct - if it dip allocation fails we'd
return with oe_range_end now modified so cleanup will happen.

4. If we manage to allocate the dip we reset the unsubmitted range
members to be equal so that cleanup happens from btrfs_endio_direct_write.

This 4-step logic is not really obvious, especially given it's scattered
across 3 functions. Perhaps a comment saying that having the 2 members
being equal means cleanup in btrfs_DIRECT_io is disabled.

I'd rather have it spelled out in the changelog, I guess David can also
do that ?

> +
> +		bio->bi_end_io = btrfs_endio_direct_write;
> +	} else {
> +		bio->bi_end_io = btrfs_endio_direct_read;
> +		dip->subio_endio = btrfs_subio_endio_read;
> +	}
> +	return dip;
> +}
> +

<snip>




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux