Re: [PATCH] Btrfs: Fix deadlock between direct IO and fast fsync

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 23, 2016 at 9:30 AM, Chandan Rajendra
<chandan@xxxxxxxxxxxxxxxxxx> wrote:
> The following deadlock is seen when executing generic/113 test,
>
>  ---------------------------------------------------------+----------------------------------------------------
>   Direct I/O task                                           Fast fsync task
>  ---------------------------------------------------------+----------------------------------------------------
>   btrfs_direct_IO
>     __blockdev_direct_IO
>      do_blockdev_direct_IO
>       do_direct_IO
>        btrfs_get_blocks_direct
>         while (blocks needs to written)
>          get_more_blocks (first iteration)
>           btrfs_get_blocks_direct
>            btrfs_create_dio_extent
>              down_read(&BTRFS_I(inode) >dio_sem)
>              Create and add extent map and ordered extent
>              up_read(&BTRFS_I(inode) >dio_sem)
>                                                             btrfs_sync_file
>                                                               btrfs_log_dentry_safe
>                                                                btrfs_log_inode_parent
>                                                                 btrfs_log_inode
>                                                                  btrfs_log_changed_extents
>                                                                   down_write(&BTRFS_I(inode) >dio_sem)
>                                                                    Collect new extent maps and ordered extents
>                                                                     wait for ordered extent completion
>          get_more_blocks (second iteration)
>           btrfs_get_blocks_direct
>            btrfs_create_dio_extent
>              down_read(&BTRFS_I(inode) >dio_sem)
>  --------------------------------------------------------------------------------------------------------------
>
> In the above description, Btrfs direct I/O code path has not yet started
> submitting bios for file range covered by the initial ordered
> extent. Meanwhile, The fast fsync task obtains the write semaphore and
> waits for I/O on the ordered extent to get completed. However, the
> Direct I/O task is now blocked on obtaining the read semaphore.
>
> To resolve the deadlock, this commit modifies the Direct I/O code path
> to obtain the read semaphore before invoking
> __blockdev_direct_IO(). The semaphore is then given up after
> __blockdev_direct_IO() returns. This allows the Direct I/O code to
> complete I/O on all the ordered extents it creates.
>
> Signed-off-by: Chandan Rajendra <chandan@xxxxxxxxxxxxxxxxxx>

As pointed out by Liu Bo, this was known, discussed and with a
proposed fix in another older thread:
https://www.marc.info/?l=linux-btrfs&m=147998603528213&w=2

I was on vacations and waiting for Josef's feedback as well (cc'ed but
never replied) before making a change log and formalizing the proposed
patch in that thread.

>From my point of view, it's ok and you can have:

Reviewed-by: Filipe Manana <fdmanana@xxxxxxxx>

thanks

> ---
>  fs/btrfs/inode.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 5ca88f0..f796037 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7325,7 +7325,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
>         struct extent_map *em = NULL;
>         int ret;
>
> -       down_read(&BTRFS_I(inode)->dio_sem);
>         if (type != BTRFS_ORDERED_NOCOW) {
>                 em = create_pinned_em(inode, start, len, orig_start,
>                                       block_start, block_len, orig_block_len,
> @@ -7344,7 +7343,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
>                 em = ERR_PTR(ret);
>         }
>   out:
> -       up_read(&BTRFS_I(inode)->dio_sem);
>
>         return em;
>  }
> @@ -8800,6 +8798,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>                 dio_data.unsubmitted_oe_range_start = (u64)offset;
>                 dio_data.unsubmitted_oe_range_end = (u64)offset;
>                 current->journal_info = &dio_data;
> +               down_read(&BTRFS_I(inode)->dio_sem);
>         } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK,
>                                      &BTRFS_I(inode)->runtime_flags)) {
>                 inode_dio_end(inode);
> @@ -8812,6 +8811,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
>                                    iter, btrfs_get_blocks_direct, NULL,
>                                    btrfs_submit_direct, flags);
>         if (iov_iter_rw(iter) == WRITE) {
> +               up_read(&BTRFS_I(inode)->dio_sem);
>                 current->journal_info = NULL;
>                 if (ret < 0 && ret != -EIOCBQUEUED) {
>                         if (dio_data.reserve)
> --
> 2.5.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Filipe David Manana,

"People will forget what you said,
 people will forget what you did,
 but people will never forget how you made them feel."
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux