Re: [PATCH] Btrfs: fix the deadlock between the transaction attach and commit

On Thu, Feb 07, 2013 at 11:55:51PM -0700, Miao Xie wrote:
> Here is the whole story:
> 	Trans_Attach_Task		Trans_Commit_Task
> 					btrfs_commit_transaction()
> 					 |->wait writers to be 1
> 	btrfs_attach_transaction()	 |
> 	btrfs_commit_transaction()	 |
> 	 |				 |->set trans_no_join to 1
> 	 |				 |  (close join transaction)
> 	 |->btrfs_run_ordered_operations |
> 	    (Those ordered operations	 |
> 	     are added when releasing	 |
> 	     file)			 |
> 	     |->btrfs_join_transaction() |
> 		|->wait_commit()	 |
> 					 |->wait writers to be 1
> 
> Then these two tasks waited for each other.
> 
> As we know, btrfs_attach_transaction() is used to catch the current
> transaction and commit it, so if someone has committed the transaction,
> it is unnecessary to join it and commit it; waiting is the best choice
> for it. In this way, we can fix the above problem.
> 
> Signed-off-by: Miao Xie <miaox@xxxxxxxxxxxxxx>

This caused another problem:

[ 8050.503904] btrfs-transacti D 0000000000000000     0  5546      2 0x00000080
[ 8050.503913]  ffff88037bfb9d18 0000000000000046 ffff88037bfb9cb8 ffffffff810c6d4d
[ 8050.503924]  ffff88037c4d8000 ffff88037bfb9fd8 ffff88037bfb9fd8 ffff88037bfb9fd8
[ 8050.503933]  ffff88042f17a000 ffff88037c4d8000 ffff88042c33b000 ffff88037ba0bdb8
[ 8050.503943] Call Trace:
[ 8050.503953]  [<ffffffff810c6d4d>] ? trace_hardirqs_on+0xd/0x10
[ 8050.503962]  [<ffffffff816507c9>] schedule+0x29/0x70
[ 8050.504002]  [<ffffffffa084eb75>] wait_current_trans+0xb5/0x110 [btrfs]
[ 8050.504011]  [<ffffffff810891f0>] ? __init_waitqueue_head+0x60/0x60
[ 8050.504047]  [<ffffffffa08503c0>] start_transaction+0x160/0x4e0 [btrfs]
[ 8050.504082]  [<ffffffffa0850757>] btrfs_attach_transaction+0x17/0x20 [btrfs]
[ 8050.504114]  [<ffffffffa084857a>] transaction_kthread+0x15a/0x240 [btrfs]
[ 8050.504147]  [<ffffffffa0848420>] ? btrfs_destroy_delayed_refs+0x330/0x330 [btrfs]
[ 8050.504155]  [<ffffffff8108883a>] kthread+0xea/0xf0
[ 8050.504166]  [<ffffffff81088750>] ? flush_kthread_worker+0x150/0x150
[ 8050.504175]  [<ffffffff8165a06c>] ret_from_fork+0x7c/0xb0
[ 8050.504183]  [<ffffffff81088750>] ? flush_kthread_worker+0x150/0x150
[ 8050.504189] sync            D 0000000000000000     0  5572   5342 0x00000080
[ 8050.504198]  ffff88037c235dd8 0000000000000046 ffff88037c235d78 ffffffff810c6d4d
[ 8050.504207]  ffff88037ca8a000 ffff88037c235fd8 ffff88037c235fd8 ffff88037c235fd8
[ 8050.504217]  ffff88042f184000 ffff88037ca8a000 ffff88042c33b000 ffff88037ba0bdb8
[ 8050.504227] Call Trace:
[ 8050.504236]  [<ffffffff810c6d4d>] ? trace_hardirqs_on+0xd/0x10
[ 8050.504245]  [<ffffffff816507c9>] schedule+0x29/0x70
[ 8050.504278]  [<ffffffffa084eb75>] wait_current_trans+0xb5/0x110 [btrfs]
[ 8050.504287]  [<ffffffff810891f0>] ? __init_waitqueue_head+0x60/0x60
[ 8050.504322]  [<ffffffffa08503c0>] start_transaction+0x160/0x4e0 [btrfs]
[ 8050.504360]  [<ffffffffa0866d94>] ? btrfs_wait_ordered_extents+0x174/0x230 [btrfs]
[ 8050.504395]  [<ffffffffa0850757>] btrfs_attach_transaction+0x17/0x20 [btrfs]
[ 8050.504420]  [<ffffffffa0820133>] btrfs_sync_fs+0x53/0x130 [btrfs]
[ 8050.504430]  [<ffffffff811cac30>] ? __sync_filesystem+0x60/0x60
[ 8050.504438]  [<ffffffff811cac30>] ? __sync_filesystem+0x60/0x60
[ 8050.504447]  [<ffffffff811cac50>] sync_fs_one_sb+0x20/0x30
[ 8050.504455]  [<ffffffff8119e0c1>] iterate_supers+0xf1/0x100
[ 8050.504463]  [<ffffffff811cad25>] sys_sync+0x55/0x90
[ 8050.504472]  [<ffffffff8165a119>] system_call_fastpath+0x16/0x1b

So we're getting stuck in the

if (may_wait_transaction())
	wait_current_trans();

thing.  If we set blocked in __btrfs_end_transaction we'll just sit there
forever because nobody can actually commit the transaction.  Probably need to
change this to

if (type == TRANS_ATTACH && trans->in_commit)

or something like that.  kdave and I reproduced it by running test 274 in a
loop, and it happened pretty quickly.  I'd fix it myself but I have to leave my
house for people to come look at it.  If you haven't fixed this by tomorrow
I'll fix it up.  Thanks,

Josef
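
For illustration only, here is a minimal userspace sketch of the two wait
conditions discussed above. It is a toy model, not kernel code: only
TRANS_ATTACH, blocked and in_commit come from the mail, every other name is
hypothetical, and it assumes one reading of the suggestion, namely that an
attacher should wait only when a commit is actually in progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the btrfs transaction types. */
enum model_trans_type {
	MODEL_TRANS_START,
	MODEL_TRANS_JOIN,
	MODEL_TRANS_ATTACH,
};

/* Hypothetical, stripped-down view of the running transaction. */
struct model_transaction {
	bool blocked;   /* someone asked for the transaction to end */
	bool in_commit; /* a task is actually committing it */
};

/*
 * Behaviour reported in the mail: an attacher waits as soon as the
 * transaction is marked blocked, even if nobody is committing it.
 */
bool attach_waits_when_blocked(enum model_trans_type type,
			       const struct model_transaction *cur)
{
	assert(cur != NULL);
	return type == MODEL_TRANS_ATTACH && cur->blocked;
}

/*
 * Assumed reading of the suggested condition: an attacher waits only
 * while a commit is in progress, so a waker is guaranteed to exist.
 */
bool attach_waits_when_in_commit(enum model_trans_type type,
				 const struct model_transaction *cur)
{
	assert(cur != NULL);
	return type == MODEL_TRANS_ATTACH && cur->in_commit;
}
```

With blocked set and in_commit clear (the state behind the traces above), the
first predicate says "wait" although no committer exists to wake the waiter,
while the second lets the attacher proceed. Whether the real check belongs in
may_wait_transaction() or in start_transaction() is left open here.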

