Re: reading/writing btrfs volume regularly freezes system

On 2019/4/23 4:37 AM, Nathan Dehnel wrote:
> I have a raid10 volume that frequently locks up when I try to write to
> it or delete things. Any command that touches it will hang (and can't
> be killed) and I have to start a new ssh session to get into the
> computer again. Nothing fixes it besides a reboot, and the volume will
> fail to unmount while the computer is shutting down.
> 
> [  302.360912] sysrq: SysRq : Show Blocked State
> [  302.360951]   task                        PC stack   pid father
> [  302.360987] btrfs-transacti D    0  2187      2 0x80000000
> [  302.360993] Call Trace:
> [  302.361007]  ? __schedule+0x59d/0x5f1
> [  302.361012]  schedule+0x6a/0x85
> [  302.361019]  btrfs_commit_transaction+0x219/0x7ac

Btrfs is waiting for another transaction to be committed.

When the freeze happens, could you check which thread is taking CPU time?
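
If it helps, here is a rough Python sketch of one way to do that check
without extra tools. It samples /proc/<pid>/task/<tid>/stat twice and
prints the threads that accumulated the most CPU ticks in between. The
function names and the 2 second interval are my own choices for
illustration; `top -H` should give you the same information.

```python
import glob
import time


def parse_stat(line):
    """Return (tid, comm, state, cpu_ticks) from one /proc/.../stat line."""
    lpar = line.index("(")
    rpar = line.rindex(")")      # comm may itself contain ')' or spaces
    tid = int(line[:lpar])
    comm = line[lpar + 1:rpar]
    rest = line[rpar + 2:].split()
    state = rest[0]              # field 3 of proc(5) stat
    ticks = int(rest[11]) + int(rest[12])  # utime (field 14) + stime (field 15)
    return tid, comm, state, ticks


def snapshot():
    """Map tid -> (comm, state, cpu_ticks) for every thread visible in /proc."""
    threads = {}
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                tid, comm, state, ticks = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue             # thread exited between glob and read
        threads[tid] = (comm, state, ticks)
    return threads


def busiest(before, after, top=10):
    """Return (tick_delta, tid, comm, state) rows, largest delta first."""
    rows = []
    for tid, (comm, state, ticks) in after.items():
        if tid in before:
            rows.append((ticks - before[tid][2], tid, comm, state))
    rows.sort(reverse=True)
    return rows[:top]


def report(interval=2.0):
    """Sample twice, `interval` seconds apart, and print the busiest threads."""
    first = snapshot()
    time.sleep(interval)
    for delta, tid, comm, state in busiest(first, snapshot()):
        print(f"{delta:8d} ticks  tid={tid:<7d} state={state}  {comm}")
```

Call report() from a second ssh session while the volume is frozen. A
btrfs kernel thread near the top of the list is the one worth looking
at; threads in state D with a delta of 0 are only waiting.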

Are you using background balance or qgroups?
If you have tons of snapshots, they can slow qgroup accounting down
enormously. And if you have a balance running with qgroups enabled, it
will almost hang your system.
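
A small sketch of how those three things could be checked in one go,
assuming btrfs-progs is installed and the script runs as root. The
mount point /mnt/raid10 is a placeholder, not taken from your report,
and the helper names are made up for this example. Note that these
commands touch the filesystem, so they may hang too if you run them
during a freeze; run them right after a reboot instead.

```python
import subprocess


def diagnostic_commands(mountpoint):
    """Return the btrfs-progs invocations to run, as argv lists."""
    return [
        ["btrfs", "balance", "status", mountpoint],        # is a balance running?
        ["btrfs", "qgroup", "show", mountpoint],           # errors out if quotas are off
        ["btrfs", "subvolume", "list", "-s", mountpoint],  # snapshots only
    ]


def run_all(mountpoint):
    """Run each command and print whatever it wrote to stdout or stderr."""
    for argv in diagnostic_commands(mountpoint):
        print("$", " ".join(argv))
        try:
            result = subprocess.run(argv, capture_output=True, text=True)
        except FileNotFoundError:
            print("  btrfs not found in PATH")
            return
        print(result.stdout or result.stderr)
```

For example, run_all("/mnt/raid10") with the real mount point
substituted. If the qgroup command prints a table rather than an error
and the snapshot list is long, that matches the slow case described
above.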

That is going to be fixed in v5.2. In the meantime, v4.19 through v5.1
already contain a lot of optimizations that make balance + qgroup much
faster.

Thanks,
Qu



> [  302.361027]  ? wait_woken+0x6d/0x6d
> [  302.361031]  transaction_kthread+0xc9/0x135
> [  302.361036]  ? btrfs_cleanup_transaction+0x4c7/0x4c7
> [  302.361041]  kthread+0x115/0x11d
> [  302.361046]  ? kthread_park+0x76/0x76
> [  302.361050]  ret_from_fork+0x35/0x40
> [  302.361064] nfsd            D    0  2292      2 0x80000000
> [  302.361067] Call Trace:
> [  302.361072]  ? __schedule+0x59d/0x5f1
> [  302.361077]  schedule+0x6a/0x85
> [  302.361120]  wait_current_trans+0x9b/0xd8
> [  302.361126]  ? wait_woken+0x6d/0x6d
> [  302.361131]  start_transaction+0x1ae/0x38e
> [  302.361135]  btrfs_create+0x59/0x1d0
> [  302.361142]  vfs_create+0xbf/0xef
> [  302.361160]  do_nfsd_create+0x2be/0x41d [nfsd]
> [  302.361214]  nfsd4_open+0x223/0x578 [nfsd]
> [  302.361229]  nfsd4_proc_compound+0x44a/0x562 [nfsd]
> [  302.361240]  nfsd_dispatch+0xb9/0x16e [nfsd]
> [  302.361258]  svc_process+0x524/0x6e2 [sunrpc]
> [  302.361270]  ? nfsd_destroy+0x5f/0x5f [nfsd]
> [  302.361278]  nfsd+0xf9/0x150 [nfsd]
> [  302.361284]  kthread+0x115/0x11d
> [  302.361289]  ? kthread_park+0x76/0x76
> [  302.361292]  ret_from_fork+0x35/0x40
> [  302.361297] nfsd            D    0  2293      2 0x80000000
> [  302.361300] Call Trace:
> [  302.361305]  ? __schedule+0x59d/0x5f1
> [  302.361309]  schedule+0x6a/0x85
> [  302.361314]  rwsem_down_write_failed+0x1af/0x210
> [  302.361325]  ? nfsd_permission+0xa3/0xe8 [nfsd]
> [  302.361330]  call_rwsem_down_write_failed+0x13/0x20
> [  302.361335]  down_write+0x20/0x2e
> [  302.361345]  nfsd_unlink+0xb1/0x16b [nfsd]
> [  302.361359]  nfsd4_remove+0x4e/0x10a [nfsd]
> [  302.361371]  nfsd4_proc_compound+0x44a/0x562 [nfsd]
> [  302.361381]  nfsd_dispatch+0xb9/0x16e [nfsd]
> [  302.361395]  svc_process+0x524/0x6e2 [sunrpc]
> [  302.361401]  ? __mutex_unlock_slowpath.isra.6+0x1e8/0x20a
> [  302.361410]  ? nfsd_destroy+0x5f/0x5f [nfsd]
> [  302.361419]  nfsd+0xf9/0x150 [nfsd]
> [  302.361424]  kthread+0x115/0x11d
> [  302.361428]  ? kthread_park+0x76/0x76
> [  302.361434]  ret_from_fork+0x35/0x40
> [  302.361441] rm              D    0  2388   2334 0x00000004
> [  302.361444] Call Trace:
> [  302.361449]  ? __schedule+0x59d/0x5f1
> [  302.361453]  schedule+0x6a/0x85
> [  302.361457]  wait_current_trans+0x9b/0xd8
> [  302.361462]  ? wait_woken+0x6d/0x6d
> [  302.361466]  start_transaction+0x1ae/0x38e
> [  302.361471]  btrfs_start_transaction_fallback_global_rsv+0x32/0x127
> [  302.361475]  btrfs_unlink+0x30/0xc0
> [  302.361478]  vfs_unlink+0xd2/0x147
> [  302.361482]  do_unlinkat+0x112/0x223
> [  302.361488]  do_syscall_64+0x7e/0x133
> [  302.361492]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  302.361496] RIP: 0033:0x7f681509b5d7
> [  302.361504] Code: Bad RIP value.
> [  302.361506] RSP: 002b:00007fffb1aed668 EFLAGS: 00000202 ORIG_RAX: 0000000000000107
> [  302.361510] RAX: ffffffffffffffda RBX: 000055672760c6c0 RCX: 00007f681509b5d7
> [  302.361512] RDX: 0000000000000000 RSI: 000055672760b490 RDI: 00000000ffffff9c
> [  302.361514] RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000000
> [  302.361516] R10: fffffffffffff12b R11: 0000000000000202 R12: 00007fffb1aed848
> [  302.361518] R13: 000055672760b400 R14: 0000000000000002 R15: 0000000000000000
> 
