Re: failed to read block groups: -5; open_ctree failed

On 2020/3/18 10:26 AM, Liwei wrote:
> Hi list,
> I'm getting the following log while trying to mount my filesystem:
> [   23.403026] BTRFS: device label dstore devid 1 transid 1288839 /dev/dm-8
> [   23.491459] BTRFS info (device dm-8): enabling auto defrag
> [   23.491461] BTRFS info (device dm-8): disk space caching is enabled
> [   23.717506] BTRFS info (device dm-8): bdev /dev/mapper/vg-dstore errs: wr 0, rd 728, flush 0, corrupt 16, gen 0
> [   32.108724] BTRFS error (device dm-8): bad tree block start, want 39854304329728 have 0
> [   32.110570] BTRFS error (device dm-8): bad tree block start, want 39854304329728 have 0
> [   32.112030] BTRFS error (device dm-8): failed to read block groups: -5
> [   32.273712] BTRFS error (device dm-8): open_ctree failed

Extent tree corruption.

And it's not a small problem, but data loss.
The on-disk data is completely wiped (all zeros).
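
You can confirm the block really reads back as zeros by dumping it
directly. Just a sketch; the bytenr and device path are taken from your
log and check output:

# btrfs inspect-internal dump-tree -b 39854304329728 /dev/mapper/recovery

On a wiped block this fails with the same checksum/bytenr mismatch
instead of printing an extent tree node.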

> 
> A check gives me:
> #btrfs check /dev/mapper/recovery
> Opening filesystem to check...
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> checksum verify failed on 39854304329728 found E4E3BDB6 wanted 00000000
> bad tree block 39854304329728, bytenr mismatch, want=39854304329728, have=0
> ERROR: cannot open file system
> 
> The same thing happens with the other superblock copies; none of the
> superblocks themselves are corrupted.

Superblocks are only 4K in size; you can't expect them to hold all of
your metadata, right?
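
The superblock only stores pointers (root tree, chunk tree, backup
roots and so on), not the trees themselves. A quick sketch, using the
same device path as your check command:

# btrfs inspect-internal dump-super -f /dev/mapper/recovery

The tree blocks those pointers reference live elsewhere on disk, and
that's where the zeroed ranges are.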

> 
> The reason this happened is that a controller failure occurred while
> trying to expand the underlying raid6, causing some pretty nasty drive
> dropouts. Looking through older generations of tree roots, I'm getting
> the same zeroed node at 39854304329728.
> 
> It seems that at some point md (or rather I) messed up recovering from
> the controller failure, and I am getting a lot of zeroed-out/corrupted
> areas?

Yes, that's exactly the case.

> Can someone confirm if that is the
> case or if it is just some weird state the filesystem is in?
> 
> I'm not hung up about hosing the filesystem, as we have a complete
> backup from before doing the raid expansion, but it'd be great if I
> could avoid a full restore since that would take a very long time.

Since part of your on-disk data/metadata is wiped, I don't believe the
damage is limited to the extent tree.

But if you're really lucky and the wiped range only covers the extent
tree, btrfs restore should be able to recover most of your data.
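
For example (just a sketch; /mnt/restore-target is a placeholder for
any location with enough free space):

# btrfs restore -D -v /dev/mapper/recovery /mnt/restore-target
# btrfs restore -v /dev/mapper/recovery /mnt/restore-target

The first command with -D is a dry run that only lists what would be
recovered; the second does the actual restore. btrfs restore walks the
fs trees offline and doesn't need the extent tree, which is why it can
still work here.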

Thanks,
Qu

> 
> Other obligatory information:
> # uname -a
> Linux dstore-1 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15)
> x86_64 GNU/Linux
> # btrfs --version
> btrfs-progs v4.20.1
> 
> Thank you very much!
> Liwei
> 
