On Thu, Feb 07, 2013 at 03:12:07AM -0700, Miao Xie wrote:
> The deadlock problem happened when running fsstress(a test program in LTP).
>
> Steps to reproduce:
>  # mkfs.btrfs -b 100M <partition>
>  # mount <partition> <mnt>
>  # <Path>/fsstress -p 3 -n 10000000 -d <mnt>
>
> The reason is:
> btrfs_direct_IO()
>  |->do_direct_IO()
>       |->get_page()
>       |->get_blocks()
>       |    |->btrfs_delalloc_resereve_space()
>       |    |->btrfs_add_ordered_extent() ------- Add a new ordered extent
>       |->dio_send_cur_page(page0) -------------- We didn't submit bio here
>       |->get_page()
>       |->get_blocks()
>            |->btrfs_delalloc_resereve_space()
>                 |->flush_space()
>                      |->btrfs_start_ordered_extent()
>                           |->wait_event() ---------- Wait the completion of
>                                                      the ordered extent that is
>                                                      mentioned above
>
> But because we didn't submit the bio that is mentioned above, the ordered
> extent can not complete, we would wait for its completion forever.
>
> There are two methods which can fix this deadlock problem:
> 1. submit the bio before we invoke get_blocks()
> 2. reserve the space before we do dio
>
> Though the 1st is the simplest way, we need modify the code of VFS, and it
> is likely to break contiguous requests, and introduce performance regression
> for the other filesystems.
>
> So we have to choose the 2nd way.

The 3rd option is to have get_blocks return -EAGAIN to the direct-io.c
code and let the higher levels submit the bios they have built.

Josef will probably go for option #4, which is dropping the generic code
completely and doing it all ourselves.

But I do like your approach, it makes sense here.

-chris
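
For illustration only, here is a rough userspace sketch of the control flow
the 3rd option implies. It is not kernel code and not a proposed patch: every
name in it (struct toy_dio, toy_get_blocks, toy_submit_pending,
toy_do_direct_io) is hypothetical, and it assumes a toy model in which a
submitted bio completes immediately and a completed ordered extent hands its
reservation back.

```c
#include <errno.h>

/* Hypothetical toy state; none of these names exist in the kernel. */
struct toy_dio {
	int pending_bios;     /* bios built but not yet submitted */
	int ordered_inflight; /* ordered extents whose bio has not completed */
	int free_space;       /* units of space that can still be reserved */
	int submitted;        /* bios handed to the (imaginary) block layer */
};

/* Toy stand-in for get_blocks(): reserve one unit, or ask the caller to flush. */
static int toy_get_blocks(struct toy_dio *d)
{
	if (d->free_space == 0) {
		/*
		 * Reclaiming space means waiting on ordered extents. If the
		 * caller still holds an unsubmitted bio, that wait could never
		 * finish, so back off with -EAGAIN instead of sleeping.
		 */
		if (d->pending_bios > 0)
			return -EAGAIN;
		if (d->ordered_inflight == 0)
			return -ENOSPC;
		/* Toy assumption: all bios are done, extents release their space. */
		d->free_space += d->ordered_inflight;
		d->ordered_inflight = 0;
	}
	d->free_space--;
	d->ordered_inflight++;
	return 0;
}

/* Toy stand-in for the higher level submitting the bios it has built. */
static void toy_submit_pending(struct toy_dio *d)
{
	d->submitted += d->pending_bios;
	d->pending_bios = 0;
}

/* Toy stand-in for the do_direct_IO() loop with the -EAGAIN retry. */
static int toy_do_direct_io(struct toy_dio *d, int pages)
{
	for (int i = 0; i < pages; i++) {
		int ret = toy_get_blocks(d);

		if (ret == -EAGAIN) {
			/* Submit what we have built, then retry once. */
			toy_submit_pending(d);
			ret = toy_get_blocks(d);
		}
		if (ret)
			return ret;
		d->pending_bios++; /* page queued, bio not submitted yet */
	}
	toy_submit_pending(d);
	return 0;
}
```

With one unit of free space and two pages, the second toy_get_blocks() call
hits the -EAGAIN path, the pending bio goes out, and the retry succeeds
rather than waiting forever on the first ordered extent.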
