Re: Help needed to recover from partition resize/move

On Sat, Apr 25, 2020 at 4:24 AM Yegor Yegorov <gochkin@xxxxxxxxx> wrote:
>
> Hi, I have been stupid enough to try to move and extend my btrfs
> partition using a GUI software of my distro. The operation ended with
> an error. From the logs of the operation, I understood that the
> movement of the partition succeeded, but some finishing operation is
> failed. I don't have this log anymore, so I can't provide further
> information on that.
>
> Now I ended up with btrfs partition that can't be mounted. Here the
> output of the various system and btrfs tools:
>
> $> mount -t btrfs /dev/nvme0n1p3 /mnt/
> mount: /mnt/: wrong fs type, bad option, bad superblock on
> /dev/nvme0n1p3, missing codepage or helper program, or other error.
>
> $>dmesg | tail
> [11637.931751] BTRFS info (device nvme0n1p3): disk space caching is enabled
> [11637.931754] BTRFS info (device nvme0n1p3): has skinny extents
> [11637.936339] BTRFS error (device nvme0n1p3): bad tree block start,
> want 1048576 have 6267530653245814412
> [11637.936350] BTRFS error (device nvme0n1p3): failed to read chunk root
> [11637.950289] BTRFS error (device nvme0n1p3): open_ctree failed
> [11637.950893] audit: type=1106 audit(1587809374.388:663): pid=14229
> uid=0 auid=1000 ses=2 msg='op=PAM:session_close
> grantors=pam_limits,pam_unix,pam_permit acct="root"
> exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/1 res=success'
> [11637.951039] audit: type=1104 audit(1587809374.388:664): pid=14229
> uid=0 auid=1000 ses=2 msg='op=PAM:setcred
> grantors=pam_unix,pam_permit,pam_env acct="root" exe="/usr/bin/sudo"
> hostname=? addr=? terminal=/dev/pts/1 res=success'
> [11674.981082] audit: type=1101 audit(1587809411.415:665): pid=14277
> uid=1000 auid=1000 ses=2 msg='op=PAM:accounting
> grantors=pam_unix,pam_permit,pam_time acct="go4a" exe="/usr/bin/sudo"
> hostname=? addr=? terminal=/dev/pts/1 res=success'
> [11674.981423] audit: type=1110 audit(1587809411.415:666): pid=14277
> uid=0 auid=1000 ses=2 msg='op=PAM:setcred
> grantors=pam_unix,pam_permit,pam_env acct="root" exe="/usr/bin/sudo"
> hostname=? addr=? terminal=/dev/pts/1 res=success'
> [11674.985959] audit: type=1105 audit(1587809411.422:667): pid=14277
> uid=0 auid=1000 ses=2 msg='op=PAM:session_open
> grantors=pam_limits,pam_unix,pam_permit acct="root"
> exe="/usr/bin/sudo" hostname=? addr=? terminal=/dev/pts/1 res=success'
>
> $> btrfs check /dev/nvme0n1p3
> Opening filesystem to check...
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> bad tree block 1048576, bytenr mismatch, want=1048576, have=6267530653245814412
> ERROR: cannot read chunk root
> ERROR: cannot open file system
>
> $> btrfs restore /dev/nvme0n1p3 /mnt/
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> bad tree block 1048576, bytenr mismatch, want=1048576, have=6267530653245814412
> ERROR: cannot read chunk root
> Could not open root, trying backup super
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> bad tree block 1048576, bytenr mismatch, want=1048576, have=6267530653245814412
> ERROR: cannot read chunk root
> Could not open root, trying backup super
> ERROR: superblock bytenr 274877906944 is larger than device size 188743680000


This suggests the partition was shrunk before the file system shrink
had completed, or that the partition was changed even though the file
system shrink failed. At this point, make no changes to the file
system, including repairs, until the partition size matches the file
system size as reported by the superblock.
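
To put the two numbers from that last error line into units, here is a small Python sketch that uses only the values quoted above. As far as I know, 256 GiB is also the fixed offset of the third btrfs superblock copy, so the size to compare against the device is total_bytes from the superblock rather than this offset. Treat that reading as an assumption.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Both values are copied from the "btrfs restore" error message.
device_size = 188743680000    # bytes: size of /dev/nvme0n1p3 as seen by the tool
super_offset = 274877906944   # bytes: offset of the superblock copy it tried to read

print(f"device size:       {device_size / GIB:.2f} GiB ({device_size // MIB} MiB)")
print(f"superblock offset: {super_offset / GIB:.2f} GiB")
print(f"offset beyond end of device: {super_offset > device_size}")
```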


> Could not open root, trying backup super
>
> $> btrfs rescue chunk-recover /dev/nvme0n1p3
> Scanning: DONE in dev0
> Check chunks successfully with no orphans
> Chunk tree recovered successfully


My suspicion is that this might have made things worse. I don't see
how chunk-recover can do the right thing if the partition is wrong.
Further, I think all of the Btrfs tools should do an early check that
the partition size matches the file system size, and if that check
fails, they should warn and stop.
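
A minimal sketch of the kind of early check proposed here, in Python. This is not how btrfs-progs behaves today. The function names are hypothetical, and fs_device_bytes stands for the per-device size recorded in the superblock (shown as dev_item.total_bytes by dump-super).

```python
import os

def block_device_size(path):
    """Return the size in bytes of a block device (or regular file).

    The device is opened read-only and we seek to its end; nothing is written.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.lseek(fd, 0, os.SEEK_END)
    finally:
        os.close(fd)

def early_size_check(device_bytes, fs_device_bytes):
    """Return a warning string if the device is smaller than the file
    system expects, otherwise None."""
    if device_bytes < fs_device_bytes:
        missing = fs_device_bytes - device_bytes
        return (f"device is {missing} bytes smaller than the file system "
                f"expects ({device_bytes} < {fs_device_bytes}); stopping")
    return None
```

A tool would call early_size_check() before touching any tree and refuse to continue if it returns a message.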



>
> $>btrfs rescue super-recover /dev/nvme0n1p3
> All supers are valid, no need to recover

True only in the narrow sense that each superblock's checksum matches
its contents, it passes the sanity tests, and the copies match each
other. But I still think even this tool should confirm or deny the
very basic thing: that the partition size (or block device size)
matches the file system size.


>
> $>btrfs rescue zero-log /dev/nvme0n1p3
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> bad tree block 1048576, bytenr mismatch, want=1048576, have=6267530653245814412
> ERROR: cannot read chunk root
> ERROR: could not open ctree

OK, sorry, but now you're just throwing spaghetti at a wall. It's
desperation, which is a great way to cause further damage to a file
system. Fortunately the log is not critical: zeroing it only risks
losing the most recent data that was written and fsync'd.



>
> $>btrfs-find-root /dev/nvme0n1p3
> WARNING: cannot read chunk root, continue anyway
> Superblock thinks the generation is 49
> Superblock thinks the level is 1
>
> $>btrfs check --repair /dev/nvme0n1p3
> Starting repair.
> Opening filesystem to check...
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> bad tree block 1048576, bytenr mismatch, want=1048576, have=6267530653245814412
> ERROR: cannot read chunk root
> ERROR: cannot open file system

The want and have values are far apart; the file system is damaged.
And I don't see how it can be fixed while the partition is smaller
than the file system. Too much is missing and repair is impossible.

Doing a repair in this case will make things worse. But again, I argue
check even with --repair should quickly fail if there's a block device
size and file system size mismatch, specifically the case where block
device size is smaller than file system size, as in this case.


> $> btrfs check --repair --init-csum-tree --init-extent-tree /dev/nvme0n1p3
> Starting repair.
> Opening filesystem to check...
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> checksum verify failed on 1048576 found 00000067 wanted 0000006E
> bad tree block 1048576, bytenr mismatch, want=1048576, have=6267530653245814412
> ERROR: cannot read chunk root
> ERROR: cannot open file system

The heavy hammer. Again, likely to make things worse, but luckily it
can't find enough to proceed with the repair.

You need to fix the partition/filesystem size mismatch first. Compare
the output of:

fdisk -l /dev/nvme0n1
btrfs inspect-internal dump-super /dev/nvme0n1p3

where the second argument is the partition holding this file system.
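
The comparison between those two outputs can be sketched in Python as follows. This is an illustration with hypothetical function names, not part of any Btrfs tool. It assumes dump-super prints one "name value" pair per line and includes a dev_item.total_bytes field, and that partition_bytes is worked out by hand from the fdisk listing (the partition's sector count times the sector size).

```python
def parse_dump_super(text):
    """Collect the integer-valued 'name value' fields from dump-super output."""
    fields = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only lines whose second column is a plain decimal number.
        if len(parts) >= 2 and parts[1].isdigit():
            fields.setdefault(parts[0], int(parts[1]))
    return fields

def size_mismatch(dump_super_text, partition_bytes):
    """Return how many bytes the partition falls short of the size the
    superblock records for this device (0 if it is large enough)."""
    fields = parse_dump_super(dump_super_text)
    needed = fields["dev_item.total_bytes"]
    return max(0, needed - partition_bytes)
```

A non-zero result is the amount by which the partition would have to grow before any further recovery attempt makes sense.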


-- 
Chris Murphy


