On Tue, Jan 14, 2020 at 10:41 AM jakub nantl <jn@xxxxxxxxxx> wrote:
>
> hello,
>
> thank for reply, here is the call trace, no need to reboot it, so I am
> waiting :)
>
> [538847.101197] sysrq: Show Blocked State
> [538847.101206] task PC stack pid father
> [538847.101321] btrfs D 0 16014 1 0x00004004
> [538847.101324] Call Trace:
> [538847.101335] __schedule+0x2e3/0x740
> [538847.101339] ? __switch_to_asm+0x40/0x70
> [538847.101342] ? __switch_to_asm+0x34/0x70
> [538847.101345] schedule+0x42/0xb0
> [538847.101348] schedule_timeout+0x203/0x2f0
> [538847.101351] ? __schedule+0x2eb/0x740
> [538847.101355] io_schedule_timeout+0x1e/0x50
> [538847.101358] wait_for_completion_io+0xb1/0x120
> [538847.101363] ? wake_up_q+0x70/0x70
> [538847.101401] write_all_supers+0x896/0x960 [btrfs]
> [538847.101426] btrfs_commit_transaction+0x6ea/0x960 [btrfs]
> [538847.101456] prepare_to_merge+0x210/0x250 [btrfs]
> [538847.101484] relocate_block_group+0x36b/0x5f0 [btrfs]
> [538847.101512] btrfs_relocate_block_group+0x15e/0x300 [btrfs]
> [538847.101539] btrfs_relocate_chunk+0x2a/0x90 [btrfs]
> [538847.101566] __btrfs_balance+0x409/0xa50 [btrfs]
> [538847.101593] btrfs_balance+0x3ae/0x530 [btrfs]
> [538847.101621] btrfs_ioctl_balance+0x2c1/0x380 [btrfs]
> [538847.101648] btrfs_ioctl+0x836/0x20d0 [btrfs]
> [538847.101652] ? do_anonymous_page+0x2e6/0x650
> [538847.101656] ? __handle_mm_fault+0x760/0x7a0
> [538847.101662] do_vfs_ioctl+0x407/0x670
> [538847.101664] ? do_vfs_ioctl+0x407/0x670
> [538847.101669] ? do_user_addr_fault+0x216/0x450
> [538847.101672] ksys_ioctl+0x67/0x90
> [538847.101675] __x64_sys_ioctl+0x1a/0x20
> [538847.101680] do_syscall_64+0x57/0x190
> [538847.101683] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [538847.101687] RIP: 0033:0x7f3cb04c85d7
> [538847.101695] Code: Bad RIP value.
> [538847.101697] RSP: 002b:00007ffcd4e5fe88 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [538847.101701] RAX: ffffffffffffffda RBX: 0000000000000001 RCX:
> 00007f3cb04c85d7
> [538847.101704] RDX: 00007ffcd4e5ff18 RSI: 00000000c4009420 RDI:
> 0000000000000003
> [538847.101707] RBP: 00007ffcd4e5ff18 R08: 0000000000000078 R09:
> 0000000000000000
> [538847.101710] R10: 0000559f27675010 R11: 0000000000000246 R12:
> 0000000000000003
> [538847.101713] R13: 00007ffcd4e62734 R14: 0000000000000001 R15:
> 0000000000000000
> [538847.101718] btrfs D 0 30196 1 0x00000004
> [538847.101720] Call Trace:
> [538847.101724] __schedule+0x2e3/0x740
> [538847.101727] schedule+0x42/0xb0
> [538847.101753] btrfs_cancel_balance+0xf8/0x170 [btrfs]
> [538847.101759] ? wait_woken+0x80/0x80
> [538847.101786] btrfs_ioctl+0x13af/0x20d0 [btrfs]
> [538847.101789] ? do_anonymous_page+0x2e6/0x650
> [538847.101793] ? __handle_mm_fault+0x760/0x7a0
> [538847.101797] do_vfs_ioctl+0x407/0x670
> [538847.101800] ? do_vfs_ioctl+0x407/0x670
> [538847.101803] ? do_user_addr_fault+0x216/0x450
> [538847.101806] ksys_ioctl+0x67/0x90
> [538847.101809] __x64_sys_ioctl+0x1a/0x20
> [538847.101813] do_syscall_64+0x57/0x190
> [538847.101856] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [538847.101859] RIP: 0033:0x7fa33680c5d7
> [538847.101864] Code: Bad RIP value.
> [538847.101873] RSP: 002b:00007ffdbe2b9c58 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [538847.101888] RAX: ffffffffffffffda RBX: 0000000000000003 RCX:
> 00007fa33680c5d7
> [538847.101897] RDX: 0000000000000002 RSI: 0000000040049421 RDI:
> 0000000000000003
> [538847.101908] RBP: 00007ffdbe2ba1d8 R08: 0000000000000078 R09:
> 0000000000000000
> [538847.101918] R10: 00005604500f4010 R11: 0000000000000246 R12:
> 00007ffdbe2ba735
> [538847.101928] R13: 00007ffdbe2ba1c0 R14: 0000000000000000 R15:
> 0000000000000000
>

I think it got clipped. And also the MUA is wrapping it and making it hard to read.

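
For anyone trying to read a trace that has already been wrapped like the one above, here is a minimal Python sketch that strips one level of '>' quoting and rejoins the wrapped pieces. It assumes every real kernel log line starts with a "[seconds.microseconds]" timestamp and that anything without one is a continuation of the previous line. The function name unwrap_quoted_dmesg is made up for illustration, and this only tidies what was posted; it cannot recover anything that got clipped.

```python
import re

# Assumption: a genuine dmesg line begins with a "[seconds.micros]" timestamp.
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]")


def unwrap_quoted_dmesg(text):
    """Strip one level of '>' quoting and rejoin lines the MUA wrapped.

    Hypothetical helper: feed it only the trace portion of a mail, since
    ordinary prose has no timestamps and would be glued together.
    """
    out = []
    for raw in text.splitlines():
        line = raw.lstrip()
        if line.startswith(">"):
            line = line[1:]          # drop a single quote marker
        line = line.strip()
        if not line:
            continue                 # skip empty quoted lines
        if TIMESTAMP.match(line) or not out:
            out.append(line)         # start of a new log line
        else:
            out[-1] += " " + line    # no timestamp: continuation of previous
    return out


if __name__ == "__main__":
    sample = (
        "> [538847.101697] RSP: 002b:00007ffcd4e5fe88 EFLAGS: 00000246 ORIG_RAX:\n"
        "> 0000000000000010\n"
        "> [538847.101718] btrfs D 0 30196 1 0x00000004\n"
    )
    for entry in unwrap_quoted_dmesg(sample):
        print(entry)
```

Run on the two wrapped register lines in the sample, this gives back one line per timestamp, with the ORIG_RAX value rejoined to its RSP line.
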
I suggest 'journalctl -k -o short-monotonic' because what started the problem might actually be much earlier, and there's no way to know that without the entire thing. Put that up in a dropbox or pastebin or google drive or equivalent. And hopefully a dev will be able to figure out why it's hung up.

All I can tell from the above is that it's hung up on cancelling, which doesn't say much. __handle_mm_fault is suspicious.

On second thought, I suggest doing sysrq+t. And then output journalctl -k, and post that. It'll have the complete dmesg, the sysrq+w, and +t. That for sure won't post to the list: it'll be too long, and the way MUAs wrap it, it's hard to read.

-- 
Chris Murphy
