Decided to upgrade my system to the latest and give it a shot:

# btrfs check /dev/sde2
Opening filesystem to check...
parent transid verify failed on 4314780106752 wanted 470169 found 470107
checksum verify failed on 4314780106752 found 7077566E wanted 9494EBD8
checksum verify failed on 4314780106752 found 489FC179 wanted 73D057EA
checksum verify failed on 4314780106752 found 489FC179 wanted 73D057EA
bad tree block 4314780106752, bytenr mismatch, want=4314780106752, have=20212047631104
ERROR: cannot open file system

# uname -r
5.3.1-arch1-1-ARCH

# btrfs --version
btrfs-progs v5.2.2

Does that help at all?

--
Groet / Cheers,
Patrick Dijkgraaf

On Fri, 2019-10-04 at 09:41 +0200, Patrick Dijkgraaf wrote:
> Hi Qu,
>
> I know about the RAID5/6 risks, so I won't blame anyone but myself.
> I'm currently working on another solution, but I was not quite there
> yet...
>
> mount -o ro /dev/sdh2 /mnt/data gives me:
>
> [Fri Oct 4 09:36:27 2019] BTRFS info (device sde2): disk space caching is enabled
> [Fri Oct 4 09:36:27 2019] BTRFS info (device sde2): has skinny extents
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): parent transid verify failed on 5483020828672 wanted 470169 found 470108
> [Fri Oct 4 09:36:27 2019] btree_readpage_end_io_hook: 5 callbacks suppressed
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286352011705795888 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286318771218040112 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286363934109025584 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286229742125204784 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286353230849918256 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286246155688035632 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286321695890425136 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286384677254874416 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286386365024912688 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): bad tree block start 2286284400752608560 5483020828672
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): failed to recover balance: -5
> [Fri Oct 4 09:36:27 2019] BTRFS error (device sde2): open_ctree failed
>
> Do you think there is any chance to recover?
>
> Thanks,
> Patrick.
>
>
> On Fri, 2019-10-04 at 15:22 +0800, Qu Wenruo wrote:
> > On 2019/10/4 2:59 PM, Patrick Dijkgraaf wrote:
> > > Hi guys,
> > >
> > > During the night, I started getting the following errors and data
> > > was no longer accessible:
> > >
> > > [Fri Oct 4 08:04:26 2019] btree_readpage_end_io_hook: 2522 callbacks suppressed
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 17686343003259060482 7808404996096
> >
> > The tree block at address 7808404996096 is completely broken.
> >
> > All the other messages mentioning 7808404996096 show that btrfs tried
> > all possible device combinations to rebuild that tree block, but all
> > of them failed.
> >
> > I'm not sure why the tree block got corrupted, but it's quite possible
> > that the RAID5/6 write hole ruined your chance of recovering it.
> >
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 254095834002432 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 2574563607252646368 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 17873260189421384017 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 9965805624054187110 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 15108378087789580224 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 7914705769619568652 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 16752645757091223687 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 9617669583708276649 7808404996096
> > > [Fri Oct 4 08:04:26 2019] BTRFS error (device sde2): bad tree block start 3384408928046898608 7808404996096
> >
> > [...]
> >
> > > Decided to reboot (for another reason) and tried to mount afterwards:
> > >
> > > [Fri Oct 4 08:29:42 2019] BTRFS info (device sde2): disk space caching is enabled
> > > [Fri Oct 4 08:29:42 2019] BTRFS info (device sde2): has skinny extents
> > > [Fri Oct 4 08:29:44 2019] BTRFS error (device sde2): parent transid verify failed on 5483020828672 wanted 470169 found 470108
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286352011705795888 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286318771218040112 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286363934109025584 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286229742125204784 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286353230849918256 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286246155688035632 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286321695890425136 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286384677254874416 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286386365024912688 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): bad tree block start 2286284400752608560 5483020828672
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): failed to recover balance: -5
> > > [Fri Oct 4 08:29:45 2019] BTRFS error (device sde2): open_ctree failed
> >
> > You're lucky: since the failure is in balance recovery, you may still
> > have a chance to mount the filesystem read-only.
> > Your fs can progress all the way to btrfs_recover_relocation(), so the
> > most essential trees should be OK and an RO mount may work.
> >
> > > The FS info is shown below. It is a RAID6.
> > >
> > > Label: 'data'  uuid: 43472491-7bb3-418c-b476-874a52e8b2b0
> > >         Total devices 16 FS bytes used 36.73TiB
> >
> > You won't want to have to salvage data from a nearly 40T fs...
> >
> > >         devid    1 size 7.28TiB used 2.66TiB path /dev/sde2
> > >         devid    2 size 3.64TiB used 2.66TiB path /dev/sdf2
> > >         devid    3 size 3.64TiB used 2.66TiB path /dev/sdg2
> > >         devid    4 size 7.28TiB used 2.66TiB path /dev/sdh2
> > >         devid    5 size 3.64TiB used 2.66TiB path /dev/sdi2
> > >         devid    6 size 7.28TiB used 2.66TiB path /dev/sdj2
> > >         devid    7 size 3.64TiB used 2.66TiB path /dev/sdk2
> > >         devid    8 size 3.64TiB used 2.66TiB path /dev/sdl2
> > >         devid    9 size 7.28TiB used 2.66TiB path /dev/sdm2
> > >         devid   10 size 3.64TiB used 2.66TiB path /dev/sdn2
> > >         devid   11 size 7.28TiB used 2.66TiB path /dev/sdo2
> > >         devid   12 size 3.64TiB used 2.66TiB path /dev/sdp2
> > >         devid   13 size 7.28TiB used 2.66TiB path /dev/sdq2
> > >         devid   14 size 7.28TiB used 2.66TiB path /dev/sdr2
> > >         devid   15 size 3.64TiB used 2.66TiB path /dev/sds2
> > >         devid   16 size 3.64TiB used 2.66TiB path /dev/sdt2
> >
> > And you won't want to use RAID6 if you're expecting it to tolerate a
> > 2-disk failure.
> >
> > Because btrfs RAID5/6 has the write-hole problem, every unexpected
> > power loss or disk error can reduce the error tolerance step by step
> > if you're not running scrub regularly.
> >
> > > The initial error refers to sdw, so possibly something happened that
> > > caused one or more disks in the external cabinet to disappear and
> > > reappear.
> > >
> > > Kernel is 4.18.16-arch1-1-ARCH. I'm very hesitant to upgrade it,
> > > because previously I had to downgrade the kernel to get the volume
> > > mounted again.
> > >
> > > Question: I know that running checks on BTRFS can be dangerous. What
> > > can you recommend I do to get the volume back online?
> >
> > "btrfs check" is not dangerous at all. In fact it's pretty safe, and
> > it's the main tool we use to expose any problem.
> >
> > It's "btrfs check --repair" that is dangerous, though it has become
> > much less so in recent years. (Although in your case, --repair is
> > unrelated to the problem and won't help at all.)
> >
> > "btrfs check" output from the latest btrfs-progs would help.
> >
> > Thanks,
> > Qu
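
A note on the "failed to recover balance: -5" failure above (-5 is EIO, i.e.
the kernel could not read metadata it needed): btrfs has a skip_balance mount
option that avoids resuming an interrupted balance at mount time, and
usebackuproot, which falls back to an older copy of the tree root. Combined
with ro, neither writes to the disks. Using the device and mountpoint from
this thread, and with no guarantee it helps if the balance item itself cannot
be read, a cheap next attempt would be:

# mount -o ro,skip_balance /dev/sde2 /mnt/data

and, if that still fails:

# mount -o ro,skip_balance,usebackuproot /dev/sde2 /mnt/data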

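If no read-only mount variant works, the usual non-destructive salvage path
is "btrfs restore", which reads the unmounted devices directly and never
writes to them. A minimal sketch, assuming a hypothetical target directory
/mnt/rescue on a separate, healthy filesystem with enough free space:

# btrfs restore --dry-run -v /dev/sde2 /mnt/rescue
# btrfs restore -i -x -v /dev/sde2 /mnt/rescue

The first command only lists what would be restored, writing nothing; the
second does the actual salvage, with -i ignoring errors and continuing, and
-x also restoring extended attributes. How far either gets depends on how
much of the metadata trees survived, and with ~37TiB in use it is very much
a last resort.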