Re: BTRFS errors, and won't mount

On 2019/10/4 下午2:59, Patrick Dijkgraaf wrote:
> Hi guys,
> 
> During the night, I started getting the following errors and data was
> no longer accessible:
> 
> [Fri Oct  4 08:04:26 2019] btree_readpage_end_io_hook: 2522 callbacks
> suppressed
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 17686343003259060482 7808404996096

The tree block at address 7808404996096 is completely broken.

All the other messages mentioning 7808404996096 show that btrfs is trying
every possible device combination to rebuild that tree block, but
obviously they all failed.

I'm not sure why the tree block got corrupted, but it's quite possible
that the RAID5/6 write hole destroyed any chance of recovering it.
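If you want to see how badly that block is damaged, a read-only dump of
it may help. Just a sketch, using the device and bytenr from your own
log:

  # read-only, does not modify the fs; 7808404996096 is the bytenr from dmesg
  btrfs inspect-internal dump-tree -b 7808404996096 /dev/sde2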

> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 254095834002432 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 2574563607252646368 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 17873260189421384017 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 9965805624054187110 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 15108378087789580224 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 7914705769619568652 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 16752645757091223687 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 9617669583708276649 7808404996096
> [Fri Oct  4 08:04:26 2019] BTRFS error (device sde2): bad tree block
> start 3384408928046898608 7808404996096
[...]
> Decided to reboot (for another reason) and tried to mount afterwards:
> 
> [Fri Oct  4 08:29:42 2019] BTRFS info (device sde2): disk space caching
> is enabled
> [Fri Oct  4 08:29:42 2019] BTRFS info (device sde2): has skinny extents
> [Fri Oct  4 08:29:44 2019] BTRFS error (device sde2): parent transid
> verify failed on 5483020828672 wanted 470169 found 470108
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286352011705795888 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286318771218040112 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286363934109025584 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286229742125204784 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286353230849918256 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286246155688035632 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286321695890425136 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286384677254874416 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286386365024912688 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): bad tree block
> start 2286284400752608560 5483020828672
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): failed to recover
> balance: -5
> [Fri Oct  4 08:29:45 2019] BTRFS error (device sde2): open_ctree failed

You're lucky: the failure comes from balance recovery, so you may still
have a chance to mount the filesystem read-only.
Since the fs can progress as far as btrfs_recover_relocation(), the most
essential trees should be OK, which is why a read-only mount has a good
chance of working.
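A read-only mount does not resume the interrupted balance, so something
like the following may get your data accessible again (the mount point
is just an example):

  # read-only mount; balance is not resumed on an RO mount
  mount -o ro /dev/sde2 /mnt
  # if that still fails, falling back to an older tree root may help
  mount -o ro,usebackuproot /dev/sde2 /mnt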

> 
> The FS info is shown below. It is a RAID6.
> 
> Label: 'data'  uuid: 43472491-7bb3-418c-b476-874a52e8b2b0
> 	Total devices 16 FS bytes used 36.73TiB

You won't want to salvage data from a nearly 40 TiB fs...
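(For reference, offline salvage would normally mean "btrfs restore",
which copies files out to a separate healthy filesystem, so you would
need roughly that much spare space. Just a sketch, the target path is an
example:)

  # dry run: list what could be recovered without writing anything
  btrfs restore -D /dev/sde2 /mnt/salvage
  # actual salvage onto a separate, healthy filesystem
  btrfs restore /dev/sde2 /mnt/salvage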

> 	devid    1 size 7.28TiB used 2.66TiB path /dev/sde2
> 	devid    2 size 3.64TiB used 2.66TiB path /dev/sdf2
> 	devid    3 size 3.64TiB used 2.66TiB path /dev/sdg2
> 	devid    4 size 7.28TiB used 2.66TiB path /dev/sdh2
> 	devid    5 size 3.64TiB used 2.66TiB path /dev/sdi2
> 	devid    6 size 7.28TiB used 2.66TiB path /dev/sdj2
> 	devid    7 size 3.64TiB used 2.66TiB path /dev/sdk2
> 	devid    8 size 3.64TiB used 2.66TiB path /dev/sdl2
> 	devid    9 size 7.28TiB used 2.66TiB path /dev/sdm2
> 	devid   10 size 3.64TiB used 2.66TiB path /dev/sdn2
> 	devid   11 size 7.28TiB used 2.66TiB path /dev/sdo2
> 	devid   12 size 3.64TiB used 2.66TiB path /dev/sdp2
> 	devid   13 size 7.28TiB used 2.66TiB path /dev/sdq2
> 	devid   14 size 7.28TiB used 2.66TiB path /dev/sdr2
> 	devid   15 size 3.64TiB used 2.66TiB path /dev/sds2
> 	devid   16 size 3.64TiB used 2.66TiB path /dev/sdt2

And you won't want to use btrfs RAID6 if you're expecting it to tolerate
two failed disks.

Since btrfs RAID5/6 suffers from the write-hole problem, any unexpected
power loss or disk error can reduce the error tolerance step by step if
you're not running scrub regularly.
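Regular scrubs let btrfs repair stale stripes before enough of them
accumulate to lose redundancy. On a mounted fs that's simply (the mount
point is an example):

  # start a background scrub of the whole filesystem
  btrfs scrub start /mnt
  # check progress and any uncorrectable errors
  btrfs scrub status /mnt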

> 
> The initial error refers to sdw, so possibly something happened that
> caused one or more disks in the external cabinet to disappear and
> reappear.
> 
> Kernel is 4.18.16-arch1-1-ARCH. Very hesitant to upgrade it, because
> previously I had to downgrade the kernel to get the volume mounted
> again.
> 
> Question: I know that running checks on BTRFS can be dangerous, what
> can you recommend me doing to get the volume back online?

"btrfs check" is not dangerous at all. In fact it's pretty safe and it's
the main tool we use to expose any problem.

It's "btrfs check --repair" dangerous, but way less dangerous in recent
years. (although in your case, --repair is completely unrelated and
won't help at all)

"btrfs check" output from latest btrfs-progs would help.

Thanks,
Qu

> 
