Re: BTRFS Raid5 error during Scrub.


On Thu, Oct 3, 2019 at 6:18 AM Robert Krig
<robert.krig@xxxxxxxxxxxxxxxxxx> wrote:
>
> By the way, how serious is the error I've encountered?
> I've run a second scrub in the meantime, it aborted when it came close
> to the end, just like the first time.
> If the files that are corrupt have been deleted is this error going to
> go away?

Maybe.


> > > > Opening filesystem to check...
> > > > Checking filesystem on /dev/sda
> > > > UUID: f7573191-664f-4540-a830-71ad654d9301
> > > > [1/7] checking root items                      (0:01:17 elapsed,
> > > > 5138533 items checked)
> > > > parent transid verify failed on 48781340082176 wanted 109181
> > > > found 109008
> > > > parent transid verify failed on 48781340082176 wanted 109181
> > > > found 109008
> > > > parent transid verify failed on 48781340082176 wanted 109181
> > > > found 109008

These look suspiciously like the 5.2 regression:
https://lore.kernel.org/linux-btrfs/20190911145542.1125-1-fdmanana@xxxxxxxxxx/T/#u

You should either revert to a 5.1 kernel, or use 5.2.15+.
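
To check quickly whether a machine is running a kernel in that range,
something like the following might do. It is only a sketch: the range
5.2.0 through 5.2.14 is inferred from the advice above rather than taken
from an official list, and is_affected_kernel is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: treats 5.2.0 through 5.2.14 as affected, inferred from the
# advice "revert to 5.1, or use 5.2.15+". Not an authoritative list.
# is_affected_kernel is a hypothetical helper, not part of any tool.
is_affected_kernel() {
    # Use the argument if given, otherwise the running kernel release.
    release=${1:-$(uname -r)}
    # Keep only the leading numeric part, e.g. "5.2.14" from "5.2.14-200.fc30".
    version=${release%%[!0-9.]*}
    # Split on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $version
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}
    # Succeeds (exit status 0) only for 5.2.x with x below 15.
    [ "$major" -eq 5 ] && [ "$minor" -eq 2 ] && [ "$patch" -lt 15 ]
}
```

Called with no argument it tests the running kernel, for example:
is_affected_kernel && echo "kernel is in the 5.2.0 to 5.2.14 range"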

As far as I'm aware, it's not possible to repair this kind of corruption,
so I suggest refreshing your backups while you can still mount this
file system, and preparing to recreate it from scratch.


> > > > Ignoring transid failure
> > > > leaf parent key incorrect 48781340082176
> > > > bad block 48781340082176
> > > > [2/7] checking extents                         (0:03:22 elapsed,
> > > > 1143429 items checked)
> > > > ERROR: errors found in extent allocation tree or chunk allocation

That's usually not a good sign.



> > > > [3/7] checking free space cache                (0:05:10 elapsed,
> > > > 7236 items checked)
> > > > parent transid verify failed on 48781340082176 wanted 109181
> > > > found 109008
> > > > Ignoring transid failure
> > > > root 15197 inode 81781 errors 1000, some csum missing

That's inode 81781 in the subvolume with ID 15197. I'm not sure what
error 1000 is, but btrfs check is a bit fussy when it encounters files
that are marked +C (nocow) but have been compressed. This used to be
possible with older kernels when nocow files were defragmented while
the file system was mounted with compression enabled. If that sounds
like your use case, that might be what's going on here, in which case
it's actually a benign message. It's normal for nocow files to be
missing csums. To confirm, use 'find /pathtosubvol/ -inum 81781' to
find the file, then run lsattr on it and see whether +C is set.
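
The find and lsattr steps could be wrapped up roughly like this. It is a
sketch under assumptions: find_inode_in_subvol and has_nocow are
made-up helper names, /pathtosubvol/ is still a placeholder for wherever
subvolume 15197 is mounted, and -xdev is my addition to keep find from
descending into nested subvolumes, where inode numbers are unrelated.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Print every path under a subvolume that has the given inode number.
# -xdev (an assumption, not part of the original command) stops find from
# crossing into other file systems or nested subvolumes.
find_inode_in_subvol() {
    subvol=$1
    inum=$2
    find "$subvol" -xdev -inum "$inum"
}

# Exit status 0 if lsattr reports the C (nocow) attribute on the file,
# 1 if it does not, 2 if lsattr itself fails.
has_nocow() {
    attrs=$(lsattr -d "$1" 2>/dev/null) || return 2
    # The first whitespace-separated field of lsattr output is the flag string.
    case ${attrs%% *} in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}
```

For example, find_inode_in_subvol /pathtosubvol/ 81781 should print the
file name, and has_nocow on that path then tells you whether the missing
csums are expected.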

You have a few options, but the first thing to do is refresh your backups
and prepare to lose this file system:

a. Bail now: create a new Btrfs from scratch and restore from backup.
b. Try 'btrfs check --repair' to see if the transid problems get fixed; if not,
c. try 'btrfs check --repair --init-extent-tree'. There's a good chance
this fails and makes things worse, but it's probably faster to try than
restoring from backup.

-- 
Chris Murphy


