Re: unable to mount btrfs partition, please help :(

On Sat, Mar 19, 2016 at 5:35 PM, Patrick Tschackert <Killing-Time@xxxxxx> wrote:
> Hi Chris,
>
> thank you for answering so quickly!
>
>> Try 'btrfs check' without any options first.
> $ btrfs check /dev/mapper/storage
> checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
> checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
> bytenr mismatch, want=36340960788480, have=4530277753793296986
> Couldn't read chunk tree
> Couldn't open file system
>
>> To me it seems the problem is instigated by lower layers either not
>> completing critical writes at the time of the power failure, or not
>> rebuilding correctly.
>
> There wasn't a power failure; a VM crashed whilst writing to the btrfs filesystem.

OK, I went back and read this again: the host is managing the md
raid5, and the guest is writing Btrfs to an "encrypted container",
but what is that? A LUKS-encrypted LVM LV that VirtualBox uses
directly as a raw device? It's hard to say which layer broke this.
But the VM crashing is in effect like a power failure, and it's an
open question (for me) how this setup handles barriers. A
'shutdown -r now' should still cleanly stop the array, so I wouldn't
expect an array problem, but then you also report a device failure.
Bad luck.
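
If you can paste the full stack, that would remove the guesswork. As
a sketch, something like this on the host shows how the layers nest
(the column names assume a reasonably recent util-linux):

$ lsblk -o NAME,TYPE,FSTYPE,SIZE,MOUNTPOINT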

I think in retrospect the safe way to do these kinds of VirtualBox
updates, which require kernel module updates, would have been to shut
down the VM and stop the array. *shrug*


>
>> You should check the SCT ERC setting on each drive with 'smartctl -l
>> scterc /dev/sdX' and also the kernel command timer setting with 'cat
>> /sys/block/sdX/device/timeout' also for each device. The SCT ERC value
>> must be less than the command timer. It's a common misconfiguration
>> with raid setups.
>
> $ smartctl -l scterc /dev/sda (sdb, sdc, sde, sdg)
> gives me
>
> smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control command not supported

These drives are technically not suitable for use in any kind of raid
except linear and raid0 (which have no redundancy, so they aren't
really raid). You'd have to dig up the drive specs, assuming they're
published, to see what the recovery times are for these drive models
when a bad sector is encountered. But it's typical for such drives to
exceed 30 seconds for recovery, with some drives reported to take 2+
minutes. To properly configure them, you'll have to increase the
kernel's SCSI command timer to at least 120 seconds, to make sure
there's enough time for the drive to explicitly report a read error
back to the kernel. Otherwise the kernel gives up after 30 seconds
and resets the link to the drive, and any chance of fixing the bad
sector via the raid read-error fixup mechanism is thwarted. This is
really common; the linux-raid@ list has many threads with this
misconfiguration as the source problem.
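
As a sketch, raising the timer on one of the non-ERC drives looks
like this (run as root, and substitute your actual device names; 180
just gives headroom over the 120-second minimum, and the setting does
not survive a reboot, so you'd want it in rc.local or a udev rule):

$ echo 180 > /sys/block/sda/device/timeout
$ cat /sys/block/sda/device/timeout
180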

>
> while
> $ smartctl -l scterc /dev/sdf (sdh, sdi, sdj, sdk)
> gives me
>
> smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
>            Read:     70 (7.0 seconds)
>           Write:     70 (7.0 seconds)

These drives are suitable for raid out of the box.


>
> $ cat /sys/block/sdX/device/timeout
> gives me "30" for every device
>
> Does that mean my settings for the device timeouts are wrong?

For the first listing of drives, yes. And 120-second delays might be
too long for your use case, but that's the reality.

You should change the command timer for the drives that do not
support configurable SCT ERC, then run a scrub check. Once it
finishes, check /sys/block/mdX/md/mismatch_cnt, which ideally should
be 0, and also check kernel messages for libata read errors.
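
Roughly this sequence, assuming md0 from the layout you describe
below (as root):

$ echo check > /sys/block/md0/md/sync_action   # start the md scrub (check only)
$ cat /proc/mdstat                             # watch progress
$ cat /sys/block/md0/md/mismatch_cnt           # ideally 0 when done
$ dmesg | grep -i ata                          # look for libata read errors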


>
>> After that's fixed you should do a scrub, and I'm thinking it's best
>> to do only a check, which means 'echo check >
>> /sys/block/mdX/md/sync_action' rather than issuing repair which
>> assumes data strips are correct and parity strips are wrong and
>> rebuilds all parity strips.
>
> I don't quite understand; I thought a scrub could only be done on a mounted filesystem?

There are two scrubs: a Btrfs scrub and an md scrub. I'm referring
to the latter.
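
The md scrub is the 'echo check' sequence above and runs against the
raw array whether or not the file system is mounted. The Btrfs scrub
does need a mounted file system, e.g.:

$ btrfs scrub start -B /media/storage    # -B stays in the foreground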


> Do you really mean executing the command "echo check > /sys/block/md0/md/sync_action"? At the moment it says "idle" in that file.
> Also, the btrfs filesys sits in an encrypted container, so the setup looks like this:
>
> /dev/md0 (this is the Raid device)
> /dev/mapper/storage (after cryptsetup luksOpen, this is where filesys should be mounted from)
> /media/storage (I always mounted the filesystem into this folder by executing "mount /dev/mapper/storage /media/storage")
>
> Apologies if I didn't make that clear enough in my initial email

OK, so the host is writing Btrfs to /dev/mapper/storage? Then I
don't understand the relevance of VirtualBox and that crash. Is it
writing VDI files onto the host-mounted Btrfs?


>
>
>>> $ uname -a
>>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
>>> (2016-02-29) x86_64 GNU/Linux
>> This is old. You should upgrade to something newer, ideally 4.5, but
>> 4.4.6 is also good, and the oldest I'd suggest is 4.1.20.
>
> Shouldn't I be able to get the newest kernel by executing "apt-get update && apt-get dist-upgrade"?
> That's what I ran just now, and it doesn't install a newer kernel. Do I really have to manually upgrade to a newer one?

I'm not sure. You might search the list archives for Debian, as I
know Debian users are running newer kernels that they didn't build
themselves.
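
If this is Debian 8 (jessie), my understanding is that the
jessie-backports repo carries a newer kernel image. An untested
sketch, assuming backports isn't already in your sources (as root):

# echo 'deb http://ftp.debian.org/debian jessie-backports main' > /etc/apt/sources.list.d/backports.list
# apt-get update
# apt-get -t jessie-backports install linux-image-amd64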


> On top of the sticky situation I'm already in, I'm not sure if I trust myself to manually build a new kernel. Should I?

>
>> What do you get for
>> btrfs-find-root /dev/mdX
>> btrfs-show-super -fa /dev/mdX
>
> $ btrfs-find-root /dev/mapper/storage
> Couldn't read chunk tree
> Open ctree failed

Hmm, not good. See this similar thread:

http://www.spinics.net/lists/linux-btrfs/msg51711.html

> generation              1322969
> root                    24022309593088
> chunk_root_generation   1275381
> chunk_root              36340959809536

The backups in all superblocks have the same chunk_root, so there's
no alternative chunk root to try.

So at the moment I think it's worth trying a newer kernel version
and mounting normally; then mounting with -o recovery; then with
-o recovery,ro.
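
Concretely, with your paths, and stopping at the first step that
succeeds:

$ mount /dev/mapper/storage /media/storage
$ mount -o recovery /dev/mapper/storage /media/storage
$ mount -o recovery,ro /dev/mapper/storage /media/storage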

If that doesn't work, you're best off waiting for a developer to
give advice on the next step. 'btrfs rescue chunk-recover' seems
most appropriate, though someone else a while back had success with
'btrfs rescue zero-log'; it's hard to say whether the two cases are
really similar, and maybe that person just got lucky. Both of those
change the file system in irreversible ways, which is why I suggest
waiting or asking on IRC.


-- 
Chris Murphy