On 2019/4/23 4:37 AM, Nathan Dehnel wrote:
> I have a raid10 volume that frequently locks up when I try to write to
> it or delete things. Any command that touches it will hang (and can't
> be killed) and I have to start a new ssh session to get into the
> computer again. Nothing fixes it besides a reboot, and the volume will
> fail to unmount while the computer is shutting down.
>
> [ 302.360912] sysrq: SysRq : Show Blocked State
> [ 302.360951] task PC stack pid father
> [ 302.360987] btrfs-transacti D 0 2187 2 0x80000000
> [ 302.360993] Call Trace:
> [ 302.361007] ? __schedule+0x59d/0x5f1
> [ 302.361012] schedule+0x6a/0x85
> [ 302.361019] btrfs_commit_transaction+0x219/0x7ac

Btrfs is waiting for another transaction to be committed.

At the time of the freeze, could you check which thread is taking CPU
time?

Are you using background balance or qgroup?

If you have tons of snapshots, they may hugely slow down qgroup.
Or, if you just have a running balance with qgroup enabled, it will
almost hang your system.

That is going to be fixed in v5.2, and between v4.19 and v5.1 we have
added a lot of optimizations to make balance + qgroup much faster.

Thanks,
Qu

> [ 302.361027] ? wait_woken+0x6d/0x6d
> [ 302.361031] transaction_kthread+0xc9/0x135
> [ 302.361036] ? btrfs_cleanup_transaction+0x4c7/0x4c7
> [ 302.361041] kthread+0x115/0x11d
> [ 302.361046] ? kthread_park+0x76/0x76
> [ 302.361050] ret_from_fork+0x35/0x40
> [ 302.361064] nfsd D 0 2292 2 0x80000000
> [ 302.361067] Call Trace:
> [ 302.361072] ? __schedule+0x59d/0x5f1
> [ 302.361077] schedule+0x6a/0x85
> [ 302.361120] wait_current_trans+0x9b/0xd8
> [ 302.361126] ? wait_woken+0x6d/0x6d
> [ 302.361131] start_transaction+0x1ae/0x38e
> [ 302.361135] btrfs_create+0x59/0x1d0
> [ 302.361142] vfs_create+0xbf/0xef
> [ 302.361160] do_nfsd_create+0x2be/0x41d [nfsd]
> [ 302.361214] nfsd4_open+0x223/0x578 [nfsd]
> [ 302.361229] nfsd4_proc_compound+0x44a/0x562 [nfsd]
> [ 302.361240] nfsd_dispatch+0xb9/0x16e [nfsd]
> [ 302.361258] svc_process+0x524/0x6e2 [sunrpc]
> [ 302.361270] ? nfsd_destroy+0x5f/0x5f [nfsd]
> [ 302.361278] nfsd+0xf9/0x150 [nfsd]
> [ 302.361284] kthread+0x115/0x11d
> [ 302.361289] ? kthread_park+0x76/0x76
> [ 302.361292] ret_from_fork+0x35/0x40
> [ 302.361297] nfsd D 0 2293 2 0x80000000
> [ 302.361300] Call Trace:
> [ 302.361305] ? __schedule+0x59d/0x5f1
> [ 302.361309] schedule+0x6a/0x85
> [ 302.361314] rwsem_down_write_failed+0x1af/0x210
> [ 302.361325] ? nfsd_permission+0xa3/0xe8 [nfsd]
> [ 302.361330] call_rwsem_down_write_failed+0x13/0x20
> [ 302.361335] down_write+0x20/0x2e
> [ 302.361345] nfsd_unlink+0xb1/0x16b [nfsd]
> [ 302.361359] nfsd4_remove+0x4e/0x10a [nfsd]
> [ 302.361371] nfsd4_proc_compound+0x44a/0x562 [nfsd]
> [ 302.361381] nfsd_dispatch+0xb9/0x16e [nfsd]
> [ 302.361395] svc_process+0x524/0x6e2 [sunrpc]
> [ 302.361401] ? __mutex_unlock_slowpath.isra.6+0x1e8/0x20a
> [ 302.361410] ? nfsd_destroy+0x5f/0x5f [nfsd]
> [ 302.361419] nfsd+0xf9/0x150 [nfsd]
> [ 302.361424] kthread+0x115/0x11d
> [ 302.361428] ? kthread_park+0x76/0x76
> [ 302.361434] ret_from_fork+0x35/0x40
> [ 302.361441] rm D 0 2388 2334 0x00000004
> [ 302.361444] Call Trace:
> [ 302.361449] ? __schedule+0x59d/0x5f1
> [ 302.361453] schedule+0x6a/0x85
> [ 302.361457] wait_current_trans+0x9b/0xd8
> [ 302.361462] ? wait_woken+0x6d/0x6d
> [ 302.361466] start_transaction+0x1ae/0x38e
> [ 302.361471] btrfs_start_transaction_fallback_global_rsv+0x32/0x127
> [ 302.361475] btrfs_unlink+0x30/0xc0
> [ 302.361478] vfs_unlink+0xd2/0x147
> [ 302.361482] do_unlinkat+0x112/0x223
> [ 302.361488] do_syscall_64+0x7e/0x133
> [ 302.361492] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 302.361496] RIP: 0033:0x7f681509b5d7
> [ 302.361504] Code: Bad RIP value.
> [ 302.361506] RSP: 002b:00007fffb1aed668 EFLAGS: 00000202 ORIG_RAX: 0000000000000107
> [ 302.361510] RAX: ffffffffffffffda RBX: 000055672760c6c0 RCX: 00007f681509b5d7
> [ 302.361512] RDX: 0000000000000000 RSI: 000055672760b490 RDI: 00000000ffffff9c
> [ 302.361514] RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000
> [ 302.361516] R10: fffffffffffff12b R11: 0000000000000202 R12: 00007fffb1aed848
> [ 302.361518] R13: 000055672760b400 R14: 0000000000000002 R15: 0000000000000000
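
For the question of which thread is taking CPU time during the freeze, one possible approach is to sample per-thread CPU counters from /proc twice and rank the difference. The following is only a rough sketch, not something from this thread: the helper names (parse_stat, snapshot, busiest, report) and the 5-second interval are illustrative assumptions, and it relies on the /proc/<pid>/task/<tid>/stat layout described in proc(5), where utime and stime are fields 14 and 15.

```python
import os
import time


def parse_stat(line):
    """Return (comm, state, cpu_ticks) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    head, tail = line.rsplit(")", 1)
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, which are indexes 11 and 12 in this list.
    return comm, fields[0], int(fields[11]) + int(fields[12])


def snapshot():
    """Map every visible thread ID to (comm, state, cpu_ticks)."""
    result = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            tids = os.listdir(f"/proc/{pid}/task")
        except OSError:
            continue  # process exited while scanning
        for tid in tids:
            try:
                with open(f"/proc/{pid}/task/{tid}/stat") as f:
                    result[int(tid)] = parse_stat(f.read())
            except OSError:
                continue  # thread exited while scanning
    return result


def busiest(before, after, top=10):
    """Rank threads by CPU ticks consumed between two snapshots."""
    deltas = [
        (ticks - before[tid][2], tid, comm, state)
        for tid, (comm, state, ticks) in after.items()
        if tid in before
    ]
    return sorted(deltas, reverse=True)[:top]


def report(interval=5.0):
    """Print the threads that used the most CPU over `interval` seconds."""
    first = snapshot()
    time.sleep(interval)  # sampling interval, chosen arbitrarily
    for delta, tid, comm, state in busiest(first, snapshot()):
        print(f"{delta:8d} ticks  tid={tid:<7d} state={state}  {comm}")
```

Calling report() from a session that is still responsive while the volume is hung should list the busiest threads first. Tasks blocked in state D, like the ones in the trace above, are sleeping, so the thread of interest is more likely one in state R. Whether quota groups or a balance are active can be checked separately with `btrfs qgroup show <mountpoint>` and `btrfs balance status <mountpoint>`.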
