On 28/01/2020 at 02:23, Qu Wenruo wrote:
>
> On 2020/1/28 5:20 AM, Pepie 34 wrote:
>> Dear BTRFS community,
>>
>> I've had a raid 1 setup on two luks encrypted drives for 4 years that serves
>> me as a btrbk backup target from another computer.
>> There are a lot of ro snapshots on it.
>>
>> I've mistakenly launched a balance on it which was extremely slow and
>> tried to cancel it.
>> After two days of cancelling without results, I decided to power off the
>> computer.
>>
>> After the reboot, even with the skip_balance mount option, the mounting
>> is endless, no error in the kernel message and it never mounts.
> Is there anything like "relocating block group XXXX flags XXXX" ?

No, but there are other messages; see below.

>
>> What I have done so far:
>> - mount the volume with the ro option (fast to mount, data OK).
>> - scrub in ro mode, no error found
> So data are all OK.
> Just need a way to cancel the balance.
>
>> - btrfs check
>> In the extent check there are plenty of errors like this :
>> =>
>> ref mismatch on [9404816285696 32768] extent item 6, found 5
>>
>> incorrect local backref count on 9404816285696 parent 5712684302336
>> owner 0 offset 0 found 0 wanted 1 back 0x55f371ee1ad0
>> backref disk bytenr does not match extent record, bytenr=9404816285696,
>> ref bytenr=0
>> backpointer mismatch on [9404816285696 32768]
>> <=
> It could be caused by half-balanced fs.
> Need to re-check after we cancel the balance.
>
>> No errors in other checks, though checking "quota groups" is very slow.
> That's caused by the nature of qgroup.
>
>> What should I do ? btrfs check --repair ?
>> btrfs check --init-extent-tree ?
>> btrfs --clear-space-cache ?
> None of the options should affect data, but none of them are recommended.
>
> Since the problem is about the balance.
>
> Have you tried to mount the fs with RO,skip_balance, then remount it rw?

I mounted it with ro,skip_balance and then remounted it rw. The rw remount
has now been running for 12 hours without completing.
The kernel log shows messages about tasks blocked for more than 120 seconds.
Some samples:

[43621.876315] INFO: task btrfs-transacti:21846 blocked for more than 120 seconds.
[43621.876325] Not tainted 4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[43621.876327] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[43621.876331] btrfs-transacti D 0 21846 2 0x80000000
[43621.876334] Call Trace:
[43621.876345] ? __schedule+0x2a2/0x870
[43621.876347] schedule+0x28/0x80
[43621.876394] btrfs_commit_transaction+0x75f/0x880 [btrfs]
[43621.876399] ? finish_wait+0x80/0x80
[43621.876419] transaction_kthread+0x147/0x180 [btrfs]
[43621.876440] ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
[43621.876443] kthread+0x112/0x130
[43621.876445] ? kthread_bind+0x30/0x30
[43621.876447] ret_from_fork+0x22/0x40
[44346.867777] INFO: task mount:21595 blocked for more than 120 seconds.
[44346.867788] Not tainted 4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[44346.867791] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[44346.867795] mount D 0 21595 21594 0x00000000
[44346.867797] Call Trace:
[44346.867809] ? __schedule+0x2a2/0x870
[44346.867812] ? __wake_up_common+0x7a/0x190
[44346.867814] schedule+0x28/0x80
[44346.867859] wait_current_trans+0xc3/0xf0 [btrfs]
[44346.867863] ? finish_wait+0x80/0x80
[44346.867884] start_transaction+0x317/0x3e0 [btrfs]
[44346.867908] merge_reloc_root+0xf5/0x560 [btrfs]
[44346.867933] merge_reloc_roots+0xda/0x1f0 [btrfs]
[44346.867957] btrfs_recover_relocation+0x42d/0x490 [btrfs]
[44346.867978] open_ctree+0x1860/0x1bf0 [btrfs]
[44346.867995] btrfs_mount_root+0x682/0x740 [btrfs]
[44346.867999] ? cpumask_next+0x16/0x20
[44346.868002] ? pcpu_alloc+0x321/0x640
[44346.868005] mount_fs+0x3e/0x145
[44346.868008] vfs_kern_mount.part.36+0x54/0x120
[44346.868024] btrfs_mount+0x16f/0x860 [btrfs]
[44346.868027] ? path_lookupat.isra.48+0xa3/0x220
[44346.868028] ? legitimize_path.isra.41+0x2d/0x60
[44346.868030] ? cpumask_next+0x16/0x20
[44346.868031] ? pcpu_alloc+0x321/0x640
[44346.868032] ? mount_fs+0x3e/0x145
[44346.868034] mount_fs+0x3e/0x145
[44346.868035] vfs_kern_mount.part.36+0x54/0x120
[44346.868037] do_mount+0x20e/0xcc0
[44346.868039] ? _cond_resched+0x15/0x30
[44346.868041] ? kmem_cache_alloc_trace+0x155/0x1d0
[44346.868043] ksys_mount+0xb6/0xd0
[44346.868044] __x64_sys_mount+0x21/0x30
[44346.868047] do_syscall_64+0x53/0x110
[44346.868050] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[44346.868052] RIP: 0033:0x7ff50cb41fea
[44346.868060] Code: Bad RIP value.
[44346.868061] RSP: 002b:00007ffd2257b2e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[44346.868063] RAX: ffffffffffffffda RBX: 000055cc47409a40 RCX: 00007ff50cb41fea
[44346.868064] RDX: 000055cc4740be00 RSI: 000055cc47409c50 RDI: 000055cc4740aa50
[44346.868065] RBP: 00007ff50ce961c4 R08: 000055cc47409c70 R09: 000055cc474119e0
[44346.868065] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[44346.868066] R13: 0000000000000000 R14: 000055cc4740aa50 R15: 000055cc4740be00

Besides shutting down the computer, is there a proper way to stop the mount?

Best regards,

Pepie 34

>
> Thanks,
> Qu
>
>> Will the "init extent tree" option break btrfs receive with old snapshot
>> parents ?
>>
>> Best regards,
>>
>> Pepie34
>>
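
The hung-task reports in the kernel log above are long, so it can help to condense them to the blocked task and the btrfs functions it is waiting in. What follows is a minimal Python sketch and not part of the original exchange: the function name `summarize_hung_tasks`, the two regular expressions, and the `SAMPLE` excerpt are illustrative assumptions based only on the log format quoted in this message, and other kernel versions may print these lines differently. Frames marked with `?` are skipped because the kernel flags them as unreliable.

```python
import re

# First line of a hung-task report, e.g.
# "[44346.867777] INFO: task mount:21595 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

# One call-trace frame, e.g. "[44346.867859] wait_current_trans+0xc3/0xf0 [btrfs]".
# A leading "? " marks a frame the kernel considers unreliable.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<unsure>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?\s*$"
)


def summarize_hung_tasks(log_text):
    """Return one dict per hung-task report with its reliable btrfs frames."""
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.match(line)
        if header:
            current = {
                "timestamp": float(header["ts"]),
                "task": header["comm"],
                "pid": int(header["pid"]),
                "timeout_s": int(header["secs"]),
                "btrfs_frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.match(line)
        if (
            frame
            and current is not None
            and frame["module"] == "btrfs"
            and not frame["unsure"]
        ):
            current["btrfs_frames"].append(frame["func"])
    return reports


# Short excerpt copied from the log in this message (hypothetical test input).
SAMPLE = """\
[44346.867777] INFO: task mount:21595 blocked for more than 120 seconds.
[44346.867797] Call Trace:
[44346.867809] ? __schedule+0x2a2/0x870
[44346.867814] schedule+0x28/0x80
[44346.867859] wait_current_trans+0xc3/0xf0 [btrfs]
[44346.867863] ? finish_wait+0x80/0x80
[44346.867884] start_transaction+0x317/0x3e0 [btrfs]
[44346.867908] merge_reloc_root+0xf5/0x560 [btrfs]
"""

if __name__ == "__main__":
    for report in summarize_hung_tasks(SAMPLE):
        print(report["task"], report["pid"], " <- ".join(report["btrfs_frames"]))
```

To run it against a live system, one could pass the saved output of `dmesg` to `summarize_hung_tasks` instead of `SAMPLE`. For the mount task above, the innermost btrfs frames sit under `btrfs_recover_relocation`, which is consistent with the mount waiting on the interrupted balance.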
