Re: 3.14rc3 kernel also hangs some processes with btrfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 24, 2014 at 09:35:19AM -0800, Marc MERLIN wrote:
> On Mon, Feb 24, 2014 at 07:29:58AM +0000, Duncan wrote:
> > > But I'm still seeing these, albeit less often.
> > > Any idea what they could be linked to?
> > > (I have a btrs send/receive going right now, it could hanging
> > > /mnt/btrfs_pool1 in a way that affects smbd, but the array feels ok
> > > otherwise, weird...)
> > > 
> > > [ 1332.548370] INFO: task smbd:21882 blocked for more than 120 seconds.
> > > [ 1332.587455]       Not tainted 3.13.5-ia32-i915-preempt-20140216 #1
> > > [ 1332.625478] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> > > disables this message.
> > 
> > I've not seen anything like that here, but there are several kernel 
> > 3.13/3.14-rc reports of similar behavior on the list.
>  
> I'll have to try this. Interestingly, I just got the following overnight.
> 
> Something killed 3 different drives from software raid arrays, and the
> drives are all fine.
> 
> It looks like btrfs hung my kernel hard enough that it messed up
> software raid, killed my array on which btrfs was, and then caused a
> cascade of further btrfs failures.
> Are those 2 traces useful?

I upgraded to 3.14rc3  and I'm also seeing hangs, although not as severe

To answer earlier questions:
1) the filesystem I'm copying is brand new, I just created it
2) files were contiguous and not being written randomly (mp3s)

Here are the logs:

[  381.512765] BTRFS: device label bigbackup devid 3 transid 179 /dev/dm-5
[  382.875847] BTRFS: device label bigbackup devid 2 transid 179 /dev/dm-6
[  384.795334] BTRFS: device label bigbackup devid 4 transid 179 /dev/dm-7
[  387.265961] BTRFS: device label bigbackup devid 5 transid 179 /dev/dm-8
[  389.904834] BTRFS: device label bigbackup devid 1 transid 179 /dev/dm-9
[  390.054771] BTRFS: device label bigbackup devid 3 transid 179 /dev/mapper/crypt_sdo1
[  390.107006] BTRFS: device label bigbackup devid 3 transid 179 /dev/dm-5
[  390.129071] BTRFS: device label bigbackup devid 2 transid 179 /dev/mapper/crypt_sdn1
[  390.177737] BTRFS: device label bigbackup devid 2 transid 179 /dev/dm-6
[  390.221597] BTRFS: device label bigbackup devid 4 transid 179 /dev/mapper/crypt_sdp1
[  390.291095] BTRFS: device label bigbackup devid 1 transid 179 /dev/mapper/crypt_sdm1
[  390.334910] BTRFS: device label bigbackup devid 1 transid 179 /dev/dm-9
[  390.357362] BTRFS: device label bigbackup devid 4 transid 179 /dev/dm-7
[  390.378629] BTRFS: device label bigbackup devid 5 transid 179 /dev/mapper/crypt_sdq1
[  390.441157] BTRFS: device label bigbackup devid 1 transid 179 /dev/mapper/crypt_sdm1
[  390.468279] BTRFS info (device dm-9): disk space caching is enabled
[  390.476577] BTRFS: device label bigbackup devid 5 transid 179 /dev/dm-8
[  725.501091] INFO: task btrfs-transacti:3218 blocked for more than 120 seconds.
[  725.529118]       Not tainted 3.14.0-rc3 #1
[  725.542713] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  725.567489] btrfs-transacti D ffff88020d5fa880     0  3218      2 0x00000000
[  725.591645]  ffff88020cad3dc8 0000000000000046 ffff88020cad3fd8 ffff88020d5fa350
[  725.616304]  00000000000141c0 ffff88020d5fa350 ffff8801a5ed82a0 ffff8801a5ed8310
[  725.640545]  ffff8801a5ed82a0 ffff8801a5ed82a0 0000000000000000 ffff88020cad3dd8
[  725.665227] Call Trace:
[  725.674397]  [<ffffffff8160b059>] schedule+0x73/0x75
[  725.690761]  [<ffffffff81229973>] wait_for_commit.isra.10+0x50/0x67
[  725.711610]  [<ffffffff81085062>] ? finish_wait+0x65/0x65
[  725.729355]  [<ffffffff8122acd8>] btrfs_commit_transaction+0x143/0x849
[  725.750238]  [<ffffffff81227598>] transaction_kthread+0xf8/0x1ab
[  725.769867]  [<ffffffff812274a0>] ? btrfs_cleanup_transaction+0x43f/0x43f
[  725.791465]  [<ffffffff8106bc1b>] kthread+0xae/0xb6
[  725.807384]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[  725.826176]  [<ffffffff8161413c>] ret_from_fork+0x7c/0xb0
[  725.843645]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[  845.962068] INFO: task btrfs-transacti:3218 blocked for more than 120 seconds.
[  845.986941]       Not tainted 3.14.0-rc3 #1
[  846.000617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  846.025730] btrfs-transacti D ffff88020d5fa880     0  3218      2 0x00000000
[  846.048610]  ffff88020cad3dc8 0000000000000046 ffff88020cad3fd8 ffff88020d5fa350
[  846.072100]  00000000000141c0 ffff88020d5fa350 ffff8801a5ed82a0 ffff8801a5ed8310
[  846.095559]  ffff8801a5ed82a0 ffff8801a5ed82a0 0000000000000000 ffff88020cad3dd8
[  846.119008] Call Trace:
[  846.127440]  [<ffffffff8160b059>] schedule+0x73/0x75
[  846.144015]  [<ffffffff81229973>] wait_for_commit.isra.10+0x50/0x67
[  846.164512]  [<ffffffff81085062>] ? finish_wait+0x65/0x65
[  846.182412]  [<ffffffff8122acd8>] btrfs_commit_transaction+0x143/0x849
[  846.203734]  [<ffffffff81227598>] transaction_kthread+0xf8/0x1ab
[  846.223484]  [<ffffffff812274a0>] ? btrfs_cleanup_transaction+0x43f/0x43f
[  846.245587]  [<ffffffff8106bc1b>] kthread+0xae/0xb6
[  846.261992]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[  846.281143]  [<ffffffff8161413c>] ret_from_fork+0x7c/0xb0
[  846.299126]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[  966.418867] INFO: task btrfs-transacti:3218 blocked for more than 120 seconds.
[  966.442696]       Not tainted 3.14.0-rc3 #1
[  966.457407] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  966.483016] btrfs-transacti D ffff88020d5fa880     0  3218      2 0x00000000
[  966.506440]  ffff88020cad3dc8 0000000000000046 ffff88020cad3fd8 ffff88020d5fa350
[  966.530916]  00000000000141c0 ffff88020d5fa350 ffff8801a5ed82a0 ffff8801a5ed8310
[  966.555390]  ffff8801a5ed82a0 ffff8801a5ed82a0 0000000000000000 ffff88020cad3dd8
[  966.579921] Call Trace:
[  966.589422]  [<ffffffff8160b059>] schedule+0x73/0x75
[  966.606433]  [<ffffffff81229973>] wait_for_commit.isra.10+0x50/0x67
[  966.627339]  [<ffffffff81085062>] ? finish_wait+0x65/0x65
[  966.645553]  [<ffffffff8122acd8>] btrfs_commit_transaction+0x143/0x849
[  966.667268]  [<ffffffff81227598>] transaction_kthread+0xf8/0x1ab
[  966.687403]  [<ffffffff812274a0>] ? btrfs_cleanup_transaction+0x43f/0x43f
[  966.709881]  [<ffffffff8106bc1b>] kthread+0xae/0xb6
[  966.726630]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[  966.746227]  [<ffffffff8161413c>] ret_from_fork+0x7c/0xb0
[  966.764083]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[ 1086.879792] INFO: task btrfs-transacti:3218 blocked for more than 120 seconds.
[ 1086.903594]       Not tainted 3.14.0-rc3 #1
[ 1086.918322] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1086.943833] btrfs-transacti D ffff88020d5fa880     0  3218      2 0x00000000
[ 1086.967139]  ffff88020cad3dc8 0000000000000046 ffff88020cad3fd8 ffff88020d5fa350
[ 1086.991476]  00000000000141c0 ffff88020d5fa350 ffff8801a5ed82a0 ffff8801a5ed8310
[ 1087.015821]  ffff8801a5ed82a0 ffff8801a5ed82a0 0000000000000000 ffff88020cad3dd8
[ 1087.039461] Call Trace:
[ 1087.048083]  [<ffffffff8160b059>] schedule+0x73/0x75
[ 1087.064249]  [<ffffffff81229973>] wait_for_commit.isra.10+0x50/0x67
[ 1087.084318]  [<ffffffff81085062>] ? finish_wait+0x65/0x65
[ 1087.101780]  [<ffffffff8122acd8>] btrfs_commit_transaction+0x143/0x849
[ 1087.122610]  [<ffffffff81227598>] transaction_kthread+0xf8/0x1ab
[ 1087.141906]  [<ffffffff812274a0>] ? btrfs_cleanup_transaction+0x43f/0x43f
[ 1087.163541]  [<ffffffff8106bc1b>] kthread+0xae/0xb6
[ 1087.179446]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[ 1087.198812]  [<ffffffff8161413c>] ret_from_fork+0x7c/0xb0
[ 1087.216988]  [<ffffffff8106bb6d>] ? __kthread_parkme+0x61/0x61
[ 1693.760950] perf samples too long (2502 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 1693.789603] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 28.656 msecs

My btrfs copy is working now, without hangs, but 2mn hangs on the above is still bad, especially
on a brand new filesystem.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux