Re: question about transaction-abort-related commits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Josef,
Can you please help me with another question.

I am looking at your patch:
Btrfs: fix chunk allocation error handling
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0448748849ef7c593be40e2c1404f7974bd3aac6

Here you changed the order of btrfs_make_block_group() vs
btrfs_alloc_dev_extent(), because we could have allocated from the
in-memory block group, before we have inserted the dev extent into a
tree. However, with this fix, I hit the deadlock[1] of
btrfs_alloc_dev_extent() that also wants to allocate a chunk and
recursively calls do_chunk_alloc, but then is stuck on chunk_mutex.

Was this patch:
Btrfs: don't re-enter when allocating a chunk
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c6b305a89b1903d63652691ad5eb9f05aa0326b8
introduced to fix this deadlock?

Thanks,
Alex.

[1]
[<ffffffffa044e57d>] do_chunk_alloc+0x8d/0x510 [btrfs]
[<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs]
[<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs]
[<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs]
[<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs]
[<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs]
[<ffffffffa04425b1>] btrfs_search_slot+0x381/0x820 [btrfs]
[<ffffffffa044463c>] btrfs_insert_empty_items+0x7c/0x120 [btrfs]
[<ffffffffa048f31b>] btrfs_alloc_dev_extent+0x9b/0x1c0 [btrfs]
[<ffffffffa048f9ca>] __btrfs_alloc_chunk+0x58a/0x850 [btrfs]
[<ffffffffa049239f>] btrfs_alloc_chunk+0xbf/0x160 [btrfs]
[<ffffffffa044e81b>] do_chunk_alloc+0x32b/0x510 [btrfs]
[<ffffffffa04532ad>] find_free_extent+0x9cd/0xb90 [btrfs]
[<ffffffffa0453510>] btrfs_reserve_extent+0xa0/0x1b0 [btrfs]
[<ffffffffa0453bc9>] btrfs_alloc_free_block+0xf9/0x570 [btrfs]
[<ffffffffa043d9e6>] __btrfs_cow_block+0x126/0x500 [btrfs]
[<ffffffffa043dfba>] btrfs_cow_block+0x17a/0x230 [btrfs]
[<ffffffffa0441613>] push_leaf_right+0x133/0x1a0 [btrfs]
[<ffffffffa0441d51>] split_leaf+0x5e1/0x770 [btrfs]
[<ffffffffa04429b5>] btrfs_search_slot+0x785/0x820 [btrfs]
[<ffffffffa0449c0e>] lookup_inline_extent_backref+0x8e/0x5b0 [btrfs]
[<ffffffffa044a193>] insert_inline_extent_backref+0x63/0x130 [btrfs]
[<ffffffffa044abaf>] __btrfs_inc_extent_ref+0x9f/0x240 [btrfs]
[<ffffffffa0451841>] run_clustered_refs+0x971/0xd00 [btrfs]
[<ffffffffa0455db0>] btrfs_run_delayed_refs+0xd0/0x330 [btrfs]
[<ffffffffa0467a87>] __btrfs_end_transaction+0xf7/0x440 [btrfs]
[<ffffffffa0467e20>] btrfs_end_transaction+0x10/0x20 [btrfs]




On Mon, Jun 24, 2013 at 9:56 PM, Alex Lyakas
<alex.btrfs@xxxxxxxxxxxxxxxxx> wrote:
>
> Thanks for commenting Josef. I hope your head will get better:)
> Actually, while re-looking at the code, I see that there are couple of
> "goto cleanup;", before we ensure that all the writers have detached
> from the committing transaction. So Liu's code is still needed, looks
> like.
>
> Thanks,
> Alex.
>
> On Mon, Jun 24, 2013 at 7:24 PM, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
> > On Sun, Jun 23, 2013 at 09:52:14PM +0300, Alex Lyakas wrote:
> >> Hello Josef, Liu,
> >> I am reviewing commits in the mainline tree:
> >>
> >> e4a2bcaca9643e7430207810653222fc5187f2be Btrfs: if we aren't
> >> committing just end the transaction if we error out
> >> (call end_transaction() instead of goto cleanup_transaction) - Josef
> >>
> >> f094ac32aba3a51c00e970a2ea029339af2ca048 Btrfs: fix NULL pointer after
> >> aborting a transaction
> >> (wait until all writers detach, before setting running_transaction to
> >> NULL) - Liu
> >>
> >> 66b6135b7cf741f6f44ba938b27583ea3b83bd12 Btrfs: avoid deadlock on
> >> transaction waiting list
> >> (check if transaction was already removed from the transactions list) -
> >> Liu
> >>
> >> Josef, according to your fix, if the committer encounters a problem
> >> early, it just doesn't commit. Instead it aborts the transaction
> >> (possibly setting FS to read-only) and detaches from the transaction.
> >> So if this was the only committer (e.g., the transaction_kthread),
> >> then transaction commit will not happen at all. Is this what you
> >> intended? So then the user will notice that FS went read-only, and she
> >> will unmount the FS, and transaction will be cleaned up in
> >> btrfs_error_commit_super()=>btrfs_cleanup_transaction(), and not in
> >> cleanup_transaction() via btrfs_commit_transaction(). Is my
> >> understanding correct?
> >>
> >> Liu, it looks like after having Josef's fix, the above two commits of
> >> yours are not strictly needed, right? Because Josef's fix ensures that
> >> only the "real" committer will call cleanup_transaction(), so at this
> >> point there is only one writer attached to the transaction, which is
> >> the committer itself (your fixes doesn't hurt though). Is that
> >> correct?
> >>
> >
> > I've looked at the patches and I'm going to say yes with the caveat that
> > I
> > stopped thinking about it when my head started hurting :).  Thanks,
> >
> > Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux