Re: Endless mount and backpointer mismatch

On 28/01/2020 at 02:23, Qu Wenruo wrote:
>
> On 2020/1/28 5:20 AM, Pepie 34 wrote:
>> Dear BTRFS community,
>>
>> I have had a RAID 1 setup on two LUKS-encrypted drives for 4 years; it
>> serves as a btrbk backup target for another computer.
>> There are a lot of ro snapshots on it.
>>
>> I mistakenly launched a balance on it, which was extremely slow, and
>> tried to cancel it.
>> After two days of cancelling without results, I decided to power off the
>> computer.
>>
>> After the reboot, even with the skip_balance mount option, the mount
>> never completes, and there is no error in the kernel log.
> Is there anything like "relocating block group XXXX flags XXXX" ?

No, but there are other messages; see below.


>
>> What I have done so far:
>> - mount the volume with the ro option (fast to mount, data OK).
>> - scrub in ro mode, no error found
> So data are all OK.
> Just need a way to cancel the balance.
>
>> - btrfs check
>> In the extent check there are plenty of errors like this:
>> =>
>> ref mismatch on [9404816285696 32768] extent item 6, found 5
>>
>> incorrect local backref count on 9404816285696 parent 5712684302336
>> owner 0 offset 0 found 0 wanted 1 back 0x55f371ee1ad0
>> backref disk bytenr does not match extent record, bytenr=9404816285696,
>> ref bytenr=0
>> backpointer mismatch on [9404816285696 32768]
>> <=
> It could be caused by half-balanced fs.
> Need to re-check after we cancel the balance.
>
>> No errors in other checks, though checking "quota groups" is very slow.
> That's caused by the nature of qgroup.
>
>> What should I do? btrfs check --repair?
>> btrfs check --init-extent-tree?
>> btrfs check --clear-space-cache?
> None of the options should affect data, but none of them are recommended.
>
> Since the problem is about the balance.
>
> Have you tried to mount the fs with RO,skip_balance, then remount it rw?

I mounted it ro,skip_balance and then remounted it rw.

The rw remount has now been running for 12 hours.
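
For reference, a minimal sketch of that two-step sequence, written as a small Python helper that only builds and prints the commands. The device path and mount point are hypothetical placeholders, not the real ones from this system.

```python
import shlex

def mount_sequence(device, mountpoint):
    """Build the two mount commands: read-only with skip_balance, then remount rw."""
    return [
        # Step 1: mount read-only and do not resume the interrupted balance.
        ["mount", "-o", "ro,skip_balance", device, mountpoint],
        # Step 2: switch the already mounted filesystem to read-write.
        ["mount", "-o", "remount,rw", mountpoint],
    ]

# Hypothetical device and mount point, for illustration only.
for argv in mount_sequence("/dev/mapper/backup1", "/mnt/backup"):
    print(shlex.join(argv))
```

The helper prints the commands instead of running them, so it can be inspected safely before anything touches the filesystem.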

The kernel log contains messages about tasks being blocked for more than
120 seconds.

Some samples:

[43621.876315] INFO: task btrfs-transacti:21846 blocked for more than 120 seconds.
[43621.876325]       Not tainted 4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[43621.876327] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[43621.876331] btrfs-transacti D    0 21846      2 0x80000000
[43621.876334] Call Trace:
[43621.876345]  ? __schedule+0x2a2/0x870
[43621.876347]  schedule+0x28/0x80
[43621.876394]  btrfs_commit_transaction+0x75f/0x880 [btrfs]
[43621.876399]  ? finish_wait+0x80/0x80
[43621.876419]  transaction_kthread+0x147/0x180 [btrfs]
[43621.876440]  ? btrfs_cleanup_transaction+0x530/0x530 [btrfs]
[43621.876443]  kthread+0x112/0x130
[43621.876445]  ? kthread_bind+0x30/0x30
[43621.876447]  ret_from_fork+0x22/0x40

[44346.867777] INFO: task mount:21595 blocked for more than 120 seconds.
[44346.867788]       Not tainted 4.19.0-6-amd64 #1 Debian 4.19.67-2+deb10u2
[44346.867791] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[44346.867795] mount           D    0 21595  21594 0x00000000
[44346.867797] Call Trace:
[44346.867809]  ? __schedule+0x2a2/0x870
[44346.867812]  ? __wake_up_common+0x7a/0x190
[44346.867814]  schedule+0x28/0x80
[44346.867859]  wait_current_trans+0xc3/0xf0 [btrfs]
[44346.867863]  ? finish_wait+0x80/0x80
[44346.867884]  start_transaction+0x317/0x3e0 [btrfs]
[44346.867908]  merge_reloc_root+0xf5/0x560 [btrfs]
[44346.867933]  merge_reloc_roots+0xda/0x1f0 [btrfs]
[44346.867957]  btrfs_recover_relocation+0x42d/0x490 [btrfs]
[44346.867978]  open_ctree+0x1860/0x1bf0 [btrfs]
[44346.867995]  btrfs_mount_root+0x682/0x740 [btrfs]
[44346.867999]  ? cpumask_next+0x16/0x20
[44346.868002]  ? pcpu_alloc+0x321/0x640
[44346.868005]  mount_fs+0x3e/0x145
[44346.868008]  vfs_kern_mount.part.36+0x54/0x120
[44346.868024]  btrfs_mount+0x16f/0x860 [btrfs]
[44346.868027]  ? path_lookupat.isra.48+0xa3/0x220
[44346.868028]  ? legitimize_path.isra.41+0x2d/0x60
[44346.868030]  ? cpumask_next+0x16/0x20
[44346.868031]  ? pcpu_alloc+0x321/0x640
[44346.868032]  ? mount_fs+0x3e/0x145
[44346.868034]  mount_fs+0x3e/0x145
[44346.868035]  vfs_kern_mount.part.36+0x54/0x120
[44346.868037]  do_mount+0x20e/0xcc0
[44346.868039]  ? _cond_resched+0x15/0x30
[44346.868041]  ? kmem_cache_alloc_trace+0x155/0x1d0
[44346.868043]  ksys_mount+0xb6/0xd0
[44346.868044]  __x64_sys_mount+0x21/0x30
[44346.868047]  do_syscall_64+0x53/0x110
[44346.868050]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[44346.868052] RIP: 0033:0x7ff50cb41fea
[44346.868060] Code: Bad RIP value.
[44346.868061] RSP: 002b:00007ffd2257b2e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[44346.868063] RAX: ffffffffffffffda RBX: 000055cc47409a40 RCX: 00007ff50cb41fea
[44346.868064] RDX: 000055cc4740be00 RSI: 000055cc47409c50 RDI: 000055cc4740aa50
[44346.868065] RBP: 00007ff50ce961c4 R08: 000055cc47409c70 R09: 000055cc474119e0
[44346.868065] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[44346.868066] R13: 0000000000000000 R14: 000055cc4740aa50 R15: 000055cc4740be00
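
In case it helps anyone sift through a longer log, here is a minimal Python sketch that pulls the hung-task reports out of kernel log text. It assumes each report header sits on a single line, as dmesg prints it (mail clients may wrap these lines). The function name `find_hung_tasks` and the sample text are illustrative only.

```python
import re

# Matches header lines such as:
# [43621.876315] INFO: task btrfs-transacti:21846 blocked for more than 120 seconds.
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

def find_hung_tasks(log_text):
    """Return (timestamp, task name, pid, seconds) for each hung-task report."""
    reports = []
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            reports.append(
                (float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"]))
            )
    return reports

# Sample built from the two report headers in the log above.
sample = """\
[43621.876315] INFO: task btrfs-transacti:21846 blocked for more than 120 seconds.
[43621.876334] Call Trace:
[44346.867777] INFO: task mount:21595 blocked for more than 120 seconds.
"""

for ts, comm, pid, secs in find_hung_tasks(sample):
    print(f"{ts:.6f} {comm} (pid {pid}) blocked > {secs}s")
```

Here it would list the btrfs-transacti kernel thread and the mount process, the two tasks reported as blocked above.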

Besides shutting down the computer, is there a proper way to abort the
mount?

Best regards,

Pepie 34


>
> Thanks,
> Qu
>
>> Will the "init extent tree" option break btrfs receive with old snapshot
>> parents?
>>
>> Best regards,
>>
>> Pepie34
>>




