Re: 5.6 pretty massive unexplained btrfs corruption: parent transid verify failed + open_ctree failed

On 7/6/20 11:55 PM, Marc MERLIN wrote:
I'd love to know what went wrong so that it doesn't happen again, but let me know if you'd like data off this
filesystem before I wipe it (which I assume is the only way out at this point).
myth:~# btrfs check --mode=lowmem /dev/mapper/crypt_bcache0
Opening filesystem to check...
parent transid verify failed on 7325633544192 wanted 359658 found 359661
parent transid verify failed on 7325633544192 wanted 359658 found 359661
parent transid verify failed on 7325633544192 wanted 359658 found 359661
Ignoring transid failure
leaf parent key incorrect 7325633544192
ERROR: failed to read block groups: Operation not permitted
ERROR: cannot open file system
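
For anyone comparing several such check runs, here is a minimal Python sketch that pulls the block address and the wanted/found generations out of "parent transid verify failed" lines and collapses the repeats. The function name summarize_transid_failures is hypothetical, and the regular expression assumes the exact message wording shown in the output above; other btrfs-progs or kernel versions may word it differently.

```python
import re

# Assumes the message wording seen in the check output above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def summarize_transid_failures(lines):
    """Group identical transid failures and report the generation gap.

    Returns a list of dicts, one per distinct (bytenr, wanted, found)
    triple, in order of first appearance.
    """
    counts = {}
    for line in lines:
        m = TRANSID_RE.search(line)
        if m is None:
            continue
        key = (int(m["bytenr"]), int(m["wanted"]), int(m["found"]))
        counts[key] = counts.get(key, 0) + 1

    return [
        {
            "bytenr": bytenr,
            "wanted": wanted,
            "found": found,
            # Positive gap: the block on disk carries a newer generation
            # than the parent pointer expected.
            "gap": found - wanted,
            "count": count,
        }
        for (bytenr, wanted, found), count in counts.items()
    ]


if __name__ == "__main__":
    sample = [
        "parent transid verify failed on 7325633544192 wanted 359658 found 359661",
        "parent transid verify failed on 7325633544192 wanted 359658 found 359661",
        "parent transid verify failed on 7325633544192 wanted 359658 found 359661",
        "Ignoring transid failure",
    ]
    for entry in summarize_transid_failures(sample):
        print(entry)
```

Applied to the three lines above, this yields a single entry for block 7325633544192 with a gap of 3 generations between what the parent expected and what was found.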


I did run bees on that filesystem, but I also just did a full btrfs check on it, and it came back clean:

Opening filesystem to check...
Checking filesystem on /dev/mapper/crypt_bcache4
UUID: 36f5079e-ca6c-4855-8639-ccb82695c18d
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
No device size related problem found
[3/7] checking free space cache
cache and super generation don't match, space cache will be invalidated
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 18089211043840 bytes used, no error found
total csum bytes: 17580412652
total tree bytes: 82326192128
total fs tree bytes: 56795086848
total extent tree bytes: 5154258944
btree space waste bytes: 13682108904
file data blocks allocated: 24050542804992


I then moved it to the target machine and started a btrfs send to it, which failed quickly. (By mistake
I had an old btrfs binary on that server, but I'm hoping most of the work is done in kernel space and that the user space
btrfs should not corrupt the disk even if it's a bit old.)

myth:/mnt# uname -r
5.6.5-amd64-preempt-sysrq-20190817

Soon after, the copy failed:
[ 2575.931316] BTRFS info (device dm-0): use zlib compression, level 3
[ 2575.931329] BTRFS info (device dm-0): disk space caching is enabled
[ 2575.931343] BTRFS info (device dm-0): has skinny extents
[ 2577.286749] BTRFS info (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0

You have a nonzero corrupt counter here at mount time. Do your logs go back far enough to see when those errors came in? Thanks,

Josef
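
One way to search older logs for this is sketched below in Python. It is only an illustration: it scans saved kernel log lines for the per-device "errs:" summary that btrfs prints at mount time and returns the first line where a given counter is nonzero. The names parse_btrfs_errs and first_nonzero are hypothetical, and the pattern assumes the line format shown in the mount log above.

```python
import re

# Assumes the mount-time summary format seen in the log above:
#   bdev <path> errs: wr N, rd N, flush N, corrupt N, gen N
ERRS_RE = re.compile(
    r"bdev (?P<bdev>\S+) errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)

COUNTERS = ("wr", "rd", "flush", "corrupt", "gen")


def parse_btrfs_errs(line):
    """Return (device_path, counters_dict) or None if the line does not match."""
    m = ERRS_RE.search(line)
    if m is None:
        return None
    return m["bdev"], {name: int(m[name]) for name in COUNTERS}


def first_nonzero(lines, counter="corrupt"):
    """Return the first log line whose given counter is above zero, else None."""
    for line in lines:
        parsed = parse_btrfs_errs(line)
        if parsed is not None and parsed[1][counter] > 0:
            return line
    return None


if __name__ == "__main__":
    import sys

    # Usage: feed saved kernel logs on stdin, oldest first.
    hit = first_nonzero(sys.stdin)
    print(hit.rstrip() if hit else "no nonzero corrupt counter found")
```

Since the counters are persistent, a mount-time line only shows that the errors happened at some point before that mount. The earliest matching line narrows the window, and the kernel messages from the preceding session are where the original errors would be logged, if the logs go back that far.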


