Hi Josef, Can you please help me with another question. I am looking at your patch: Btrfs: fix chunk allocation error handling https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0448748849ef7c593be40e2c1404f7974bd3aac6 Here you changed the order of btrfs_make_block_group() vs btrfs_alloc_dev_extent(), because we could have allocated from the in-memory block group, before we have inserted the dev extent into a tree. However, with this fix, I hit the deadlock[1] of btrfs_alloc_dev_extent() that also wants to allocate a chunk and recursively calls do_chunk_alloc, but then is stuck on chunk_mutex. Was this patch: Btrfs: don't re-enter when allocating a chunk https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6b305a89b1903d63652691ad5eb9f05aa0326b8 introduced to fix this deadlock? Thanks, Alex. [1] [<ffffffffa044e57d>] do_chunk_alloc+0x8d/0x510 [btrfs] [<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs] [<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs] [<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs] [<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs] [<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs] [<ffffffffa04425b1>] btrfs_search_slot+0x381/0x820 [btrfs] [<ffffffffa044463c>] btrfs_insert_empty_items+0x7c/0x120 [btrfs] [<ffffffffa048f31b>] btrfs_alloc_dev_extent+0x9b/0x1c0 [btrfs] [<ffffffffa048f9ca>] __btrfs_alloc_chunk+0x58a/0x850 [btrfs] [<ffffffffa049239f>] btrfs_alloc_chunk+0xbf/0x160 [btrfs] [<ffffffffa044e81b>] do_chunk_alloc+0x32b/0x510 [btrfs] [<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs] [<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs] [<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs] [<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs] [<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs] [<ffffffffa0441613>] push_leaf_right+0x133/0x1a0 [btrfs] [<ffffffffa0441d51>] split_leaf+0x5e1/0x770 [btrfs] [<ffffffffa04429b5>] btrfs_search_slot+0x785/0x820 [btrfs] [<ffffffffa0449c0e>] lookup_inline_extent_backref+0x8e/0x5b0 [btrfs] [<ffffffffa044a193>] insert_inline_extent_backref+0x63/0x130 [btrfs] [<ffffffffa044abaf>] __btrfs_inc_extent_ref+0x9f/0x240 [btrfs] [<ffffffffa0451841>] run_clustered_refs+0x971/0xd00 [btrfs] [<ffffffffa0455db0>] btrfs_run_delayed_refs+0xd0/0x330 [btrfs] [<ffffffffa0467a87>] __btrfs_end_transaction+0xf7/0x440 [btrfs] [<ffffffffa0467e20>] btrfs_end_transaction+0x10/0x20 [btrfs] On Mon, Jun 24, 2013 at 9:56 PM, Alex Lyakas <alex.btrfs@xxxxxxxxxxxxxxxxx> wrote: > > Thanks for commenting Josef. I hope your head will get better:) > Actually, while re-looking at the code, I see that there are couple of > "goto cleanup;", before we ensure that all the writers have detached > from the committing transaction. So Liu's code is still needed, looks > like. > > Thanks, > Alex. > > On Mon, Jun 24, 2013 at 7:24 PM, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote: > > On Sun, Jun 23, 2013 at 09:52:14PM +0300, Alex Lyakas wrote: > >> Hello Josef, Liu, > >> I am reviewing commits in the mainline tree: > >> > >> e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't > >> committing just end the transaction if we error out > >> (call end_transaction() instead of goto cleanup_transaction) - Josef > >> > >> f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after > >> aborting a transaction > >> (wait until all writers detach, before setting running_transaction to > >> NULL) - Liu > >> > >> 66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on > >> transaction waiting list > >> (check if transaction was already removed from the transactions list) - > >> Liu > >> > >> Josef, according to your fix, if the committer encounters a problem > >> early, it just doesn't commit. Instead it aborts the transaction > >> (possibly setting FS to read-only) and detaches from the transaction. > >> So if this was the only committer (e.g., the transaction_kthread), > >> then transaction commit will not happen at all. Is this what you > >> intended? So then the user will notice that FS went read-only, and she > >> will unmount the FS, and transaction will be cleaned up in > >> btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in > >> cleanup_transaction() via btrfs_commit_transaction(). Is my > >> understanding correct? > >> > >> Liu, it looks like after having Josef's fix, the above two commits of > >> yours are not strictly needed, right? Because Josef's fix ensures that > >> only the "real" committer will call cleanup_transaction(), so at this > >> point there is only one writer attached to the transaction, which is > >> the committer itself (your fixes doesn't hurt though). Is that > >> correct? > >> > > > > I've looked at the patches and I'm going to say yes with the caveat that > > I > > stopped thinking about it when my head started hurting :). Thanks, > > > > Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
