On Friday, December 23, 2016 03:57:40 PM Chandan Rajendra wrote:
> On Friday, December 23, 2016 03:00:18 PM Chandan Rajendra wrote:
> > The following deadlock is seen when executing generic/113 test,
> >
> > ---------------------------------------------------------+----------------------------------------------------
> >  Direct I/O task                                           Fast fsync task
> > ---------------------------------------------------------+----------------------------------------------------
> >  btrfs_direct_IO
> >   __blockdev_direct_IO
> >    do_blockdev_direct_IO
> >     do_direct_IO
> >      btrfs_get_blocks_direct
> >      while (blocks need to be written)
> >       get_more_blocks (first iteration)
> >        btrfs_get_blocks_direct
> >         btrfs_create_dio_extent
> >          down_read(&BTRFS_I(inode)->dio_sem)
> >          Create and add extent map and ordered extent
> >          up_read(&BTRFS_I(inode)->dio_sem)
> >                                                            btrfs_sync_file
> >                                                             btrfs_log_dentry_safe
> >                                                              btrfs_log_inode_parent
> >                                                               btrfs_log_inode
> >                                                                btrfs_log_changed_extents
> >                                                                 down_write(&BTRFS_I(inode)->dio_sem)
> >                                                                 Collect new extent maps and ordered extents
> >                                                                 wait for ordered extent completion
> >       get_more_blocks (second iteration)
> >        btrfs_get_blocks_direct
> >         btrfs_create_dio_extent
> >          down_read(&BTRFS_I(inode)->dio_sem)
> > --------------------------------------------------------------------------------------------------------------
> >
> > In the above description, the Btrfs direct I/O code path has not yet started
> > submitting bios for the file range covered by the initial ordered
> > extent. Meanwhile, the fast fsync task obtains the write semaphore and
> > waits for I/O on the ordered extent to complete. However, the
> > Direct I/O task is now blocked on obtaining the read semaphore.
> >
> > To resolve the deadlock, this commit modifies the Direct I/O code path
> > to obtain the read semaphore before invoking
> > __blockdev_direct_IO(). The semaphore is then given up after
> > __blockdev_direct_IO() returns. This allows the Direct I/O code to
> > complete I/O on all the ordered extents it creates.
> >
>
> Btw, I was able to reproduce the issue on the kdave/for-next branch with "Merge
> branch 'for-next-next-4.9-20161125' into for-next-20161125" as the topmost
> commit. The issue cannot be reproduced yet on the latest code available from
> the kdave/for-next branch.
>

Changes in upstream may have masked the issue in the recent
kdave/for-next branch. I say that because 'git bisect' resulted in the
following commit ...

e3597e6090ddf40904dce6d0a5a404e2c490cac6
Author:     Chris Mason <clm@xxxxxx>
AuthorDate: Tue Nov 1 12:54:45 2016 -0700
Commit:     Chris Mason <clm@xxxxxx>
CommitDate: Tue Nov 1 12:54:45 2016 -0700

Parent:     570dd45 btrfs: fix races on root_log_ctx lists
Parent:     9d1032c btrfs: fix WARNING in btrfs_select_ref_head()
Merged:     btrfs-next-for-linus-4.8 kdave-master linus-v4.7-rc6 local-v4.7-rc4
Containing: direct-io-fsync-deadlock kdave-for-next
Follows:    v4.8-rc8 (57)
Precedes:   next-20161219 (30006)

Merge branch 'for-4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.9

5 files changed, 29 insertions(+), 9 deletions(-)
 fs/btrfs/extent-tree.c |  3 +++
 fs/btrfs/extent_io.c   |  8 ++++----
 fs/btrfs/inode.c       | 13 +++++++++----
 fs/btrfs/ioctl.c       |  5 +++++
 fs/btrfs/relocation.c  |  9 ++++++++-

-- 
chandan
