Re: btrfs on 3.14rc5 stuck on "btrfs_tree_read_lock sync"

On Mon, Apr 07, 2014 at 12:10:52PM -0400, Josef Bacik wrote:
> On 04/07/2014 12:05 PM, Marc MERLIN wrote:
> >I was debugging why my backup failed to run, and eventually found it was
> >stuck on sync:
> >14080       18:18 btrfs_tree_read_lock           sync
> >
> >This was hung for hours on this lock.
> >
> >Strangely, it looks like taking my sysrq-w hung the machine pretty hard for
> >close to 30sec, but this seems to have unhung sync and in the end btrfs send
> >completed after that.
> >
> >Sysrq-w is here:
> >http://marc.merlins.org/tmp/sysrq-btrfs-sync-hang.txt
> 
> Try Chris's integration branch in a few hours and see if that fixes
> it.  Thanks,

Mmmh, so I rebooted that server with 3.14.0 (no rc), and it was
deadlocked for a long time during boot (about 10 minutes) before it
unlocked itself and finished booting.

This is a bit vexing: I don't yet know which of my 3 btrfs filesystems
is causing this, or how to fix it.
After boot, it seems ok enough.

You're recommending that I try btrfs-next on a pre-3.15 kernel, correct?
If so, would it be likely to fix my filesystem and let me go back to a
stable 3.14? (I'm a bit wary about running some unstable 3.15 on it :).

Is there a chance balance or some file system cleaning will fix this?

For now, during boot, I get:
INFO: task btrfs-transacti:3633 blocked for more than 120 seconds.
      Not tainted 3.14.0-rc5-amd64-i915-preempt-20140216c #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D ffff88020d762680     0  3633      2 0x00000000
 ffff88020c6c7dc0 0000000000000046 ffff88020c6c7fd8 ffff88020d762150
 00000000000141c0 ffff88020d762150 ffff88020e11be90 ffff8802106271e8
 0000000000000000 ffff880210627000 ffff8800c5c82740 ffff88020c6c7dd0
Call Trace:
 [<ffffffff8160c331>] schedule+0x73/0x75
 [<ffffffff8122a5f9>] wait_current_trans.isra.15+0x98/0xf4
 [<ffffffff810850c9>] ? finish_wait+0x65/0x65
 [<ffffffff8122b812>] start_transaction+0x202/0x4f2
 [<ffffffff8122bb9e>] btrfs_attach_transaction+0x17/0x19
 [<ffffffff812277a8>] transaction_kthread+0xd6/0x1ab
 [<ffffffff812276d2>] ? btrfs_cleanup_transaction+0x43f/0x43f
 [<ffffffff8106bc56>] kthread+0xae/0xb6
 [<ffffffff8106bba8>] ? __kthread_parkme+0x61/0x61
 [<ffffffff816153fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106bba8>] ? __kthread_parkme+0x61/0x61
INFO: task btrfs-transacti:3633 blocked for more than 120 seconds.
      Not tainted 3.14.0-rc5-amd64-i915-preempt-20140216c #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D ffff88020d762680     0  3633      2 0x00000000
 ffff88020c6c7dc0 0000000000000046 ffff88020c6c7fd8 ffff88020d762150
 00000000000141c0 ffff88020d762150 ffff88020e11be90 ffff8802106271e8
 0000000000000000 ffff880210627000 ffff8800c5c82740 ffff88020c6c7dd0
Call Trace:
 [<ffffffff8160c331>] schedule+0x73/0x75
 [<ffffffff8122a5f9>] wait_current_trans.isra.15+0x98/0xf4
 [<ffffffff810850c9>] ? finish_wait+0x65/0x65
 [<ffffffff8122b812>] start_transaction+0x202/0x4f2
 [<ffffffff8122bb9e>] btrfs_attach_transaction+0x17/0x19
 [<ffffffff812277a8>] transaction_kthread+0xd6/0x1ab
 [<ffffffff812276d2>] ? btrfs_cleanup_transaction+0x43f/0x43f
 [<ffffffff8106bc56>] kthread+0xae/0xb6
 [<ffffffff8106bba8>] ? __kthread_parkme+0x61/0x61
 [<ffffffff816153fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106bba8>] ? __kthread_parkme+0x61/0x61
INFO: task btrfs-transacti:3633 blocked for more than 120 seconds.
      Not tainted 3.14.0-rc5-amd64-i915-preempt-20140216c #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
btrfs-transacti D ffff88020d762680     0  3633      2 0x00000000
 ffff88020c6c7dc0 0000000000000046 ffff88020c6c7fd8 ffff88020d762150
 00000000000141c0 ffff88020d762150 ffff88020e11be90 ffff8802106271e8
 0000000000000000 ffff880210627000 ffff8800c5c82740 ffff88020c6c7dd0
Call Trace:
 [<ffffffff8160c331>] schedule+0x73/0x75
 [<ffffffff8122a5f9>] wait_current_trans.isra.15+0x98/0xf4
 [<ffffffff810850c9>] ? finish_wait+0x65/0x65
 [<ffffffff8122b812>] start_transaction+0x202/0x4f2
 [<ffffffff8122bb9e>] btrfs_attach_transaction+0x17/0x19
 [<ffffffff812277a8>] transaction_kthread+0xd6/0x1ab
 [<ffffffff812276d2>] ? btrfs_cleanup_transaction+0x43f/0x43f
 [<ffffffff8106bc56>] kthread+0xae/0xb6
 [<ffffffff8106bba8>] ? __kthread_parkme+0x61/0x61
 [<ffffffff816153fc>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106bba8>] ? __kthread_parkme+0x61/0x61

Eventually the boot finishes, but it hangs way too long.
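
For anyone triaging a log like the one above, here is a minimal sketch of
how the repeated hung-task reports could be condensed. It is not a btrfs or
kernel tool: the function name summarize_hung_tasks and both regular
expressions are my own assumptions, written against the message format shown
in this mail only. It counts how often each task was reported and keeps the
reliable frames (those without a "?") from the most recent report.

```python
import re

# Matches the header line printed by the hung-task detector, e.g.
# "INFO: task btrfs-transacti:3633 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# Matches stack frames such as " [<ffffffff8160c331>] schedule+0x73/0x75".
# The optional "? " marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<sym>[\w.]+)\+0x")


def summarize_hung_tasks(log_text):
    """Return {(comm, pid): {"reports": n, "frames": [...]}} for each hung task.

    "frames" holds the reliable frames of the most recent report only.
    (Hypothetical helper, assuming the log format quoted in this thread.)
    """
    summary = {}
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            key = (header.group("comm"), int(header.group("pid")))
            entry = summary.setdefault(key, {"reports": 0, "frames": []})
            entry["reports"] += 1
            entry["frames"] = []  # start over for the newest report
            current = entry
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame.group("unreliable"):
            current["frames"].append(frame.group("sym"))
    return summary
```

Fed the saved kernel log as a string (for example the text captured from
dmesg), it should make it easier to see that the same task is reported
repeatedly while waiting in wait_current_trans.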

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901



