Re: btrfs dev del not transaction protected?

On 2019/12/20 2:37 PM, Marc Lehmann wrote:
> On Fri, Dec 20, 2019 at 01:24:20PM +0800, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>>> I used btrfs del /somedevice /mountpoint to remove a device, and then typed
>>> sync. A short time later the system had a hard reset.
>>
>> Then it doesn't look like the title.
> 
> Hmm, I am not sure I understand: do you mean the subject?

Oh, sorry, I mean the subject line "btrfs dev del not transaction protected".

> The command here
> is obviously not copied and pasted, and when typing it into my mail client,
> I forgot the "dev" part. The exact command, I think, was this:

No big deal, as we all get the point.

> 
>    btrfs dev del /dev/mapper/xmnt-cold13 /oldcold
> 
>> Normally on sync, btrfs will commit the transaction, so even if something
>> like the subject happened, you shouldn't be affected at all.
> 
> Exactly, that is my expectation.
> 
>>> [  247.385346] BTRFS error (device dm-32): devid 1 uuid f5c3dc63-1fac-45b3-b9ba-ed1ec5f92403 is missing
>>> [  247.386942] BTRFS error (device dm-32): failed to read chunk tree: -2
>>> [  247.462693] BTRFS error (device dm-32): open_ctree failed
>>
>> Is that devid 1 the device you tried to delete?
>> Or some unrelated device?
> 
> I think the device I removed had devid 1. I am not 100% sure, but I am
> reasonably sure because I had "watch -n10 btrfs dev us" running while
> waiting for the removal to finish and not being able to control the device
> ids triggers my ocd reflexes (mostly because btrfs fi res needs the device
> id even for some single-device filesystems :), so I kind of memorised
> them.

Then it looks like a big deal.

After looking into the code (at least in the v5.5-rc kernel), btrfs commits
the transaction after deleting the device item in btrfs_rm_dev_item().

So even if no manual sync is called, as long as "btrfs dev del" reports no
error, this case shouldn't happen.

> 
>>> The thing is, the device is still there and accessible, but btrfs no longer
>>> recognises it, as it already deleted it before the crash.
>>
>> I think it's not what you thought, but btrfs device scan is not properly
>> triggered.
> 
> Quite possible - I based my statement that it is no longer recognized
> on the fact that a) blkid also didn't recognize a filesystem on
> the removed device anymore and b) btrfs found the other two remaining
> devices, so if btrfs scan is not properly triggered, then this is a
> serious issue in current GNU/Linux distributions (I use debian buster on
> that server).

a) means btrfs has wiped the superblock, which happens after
btrfs_rm_dev_item().
Something is not sane now.

> 
> I assume that the device is not recognised as btrfs by blkid anymore
> because the signature had been wiped by btrfs dev del, based on previous
> experience, but I of course can't exactly know it's not, say, a hardware
> error that wiped that disk, although I would find that hard to believe :)
> 
>> Would you please give some more dmesg? As each scanned btrfs device will
>> show up in dmesg.
> 
> Here should be all btrfs-related messages for this (from grep -i btrfs):
> 
>  [   10.288533] BTRFS: device label ROOT devid 1 transid 2106939 /dev/mapper/vg_doom-root
>  [   10.314498] BTRFS info (device dm-0): disk space caching is enabled
>  [   10.316488] BTRFS info (device dm-0): has skinny extents
>  [   10.900930] BTRFS info (device dm-0): enabling ssd optimizations
>  [   10.902741] BTRFS info (device dm-0): disk space caching is enabled
>  [   11.524129] BTRFS info (device dm-0): device fsid bb3185c8-19f0-4018-b06f-38678c06c7c2 devid 1 moved old:/dev/mapper/vg_doom-root new:/dev/dm-0
>  [   11.528554] BTRFS info (device dm-0): device fsid bb3185c8-19f0-4018-b06f-38678c06c7c2 devid 1 moved old:/dev/dm-0 new:/dev/mapper/vg_doom-root
>  [   42.273530] BTRFS: device label LOCALVOL3 devid 1 transid 1240483 /dev/dm-28
>  [   42.312354] BTRFS info (device dm-28): enabling auto defrag
>  [   42.314152] BTRFS info (device dm-28): force zstd compression, level 12
>  [   42.315938] BTRFS info (device dm-28): using free space tree
>  [   42.317696] BTRFS info (device dm-28): has skinny extents
>  [   49.115007] BTRFS: device label LOCALVOL5 devid 1 transid 146201 /dev/dm-29
>  [   49.138816] BTRFS info (device dm-29): using free space tree
>  [   49.140590] BTRFS info (device dm-29): has skinny extents
>  [  102.348872] BTRFS info (device dm-29): checking UUID tree
>  [  102.393185] BTRFS: device label COLD1 devid 5 transid 1876906 /dev/dm-30

dm-30 is one transaction older than the other devices.

Is that expected? If not, it may explain why we see the dead device: we
may be using the older superblock, which may point to an older chunk tree
that still has the device item.

>  [  109.626550] BTRFS: device label COLD1 devid 4 transid 1876907 /dev/dm-32
>  [  109.654401] BTRFS: device label COLD1 devid 3 transid 1876907 /dev/dm-31

I'm also curious about the 7s delay between the detection of devid 5 and
devid 3/4.

Can you find a way to make devid 3/4 show up before devid 5 and try again?
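
To make the comparison less error-prone, here is a minimal sketch that pulls the scan messages out of a dmesg dump and flags any device whose transid lags behind the others. The function names are made up for illustration. It groups devices by label only, which is an assumption: with two filesystems sharing the COLD1 label on this machine, the messages would need to be filtered to the right fs first.

```python
import re

# Matches kernel scan messages such as:
#   [  102.393185] BTRFS: device label COLD1 devid 5 transid 1876906 /dev/dm-30
# Assumption: the label contains no whitespace.
SCAN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BTRFS: device label (?P<label>\S+) "
    r"devid (?P<devid>\d+) transid (?P<transid>\d+) (?P<path>\S+)"
)


def parse_scan_lines(lines):
    """Return one dict per 'BTRFS: device label ...' scan message."""
    found = []
    for line in lines:
        m = SCAN_RE.search(line)
        if m:
            found.append({
                "ts": float(m["ts"]),
                "label": m["label"],
                "devid": int(m["devid"]),
                "transid": int(m["transid"]),
                "path": m["path"],
            })
    return found


def stale_devices(scans, label):
    """Devices with the given label whose transid is below the highest seen."""
    same_label = [s for s in scans if s["label"] == label]
    if not same_label:
        return []
    newest = max(s["transid"] for s in same_label)
    return [s for s in same_label if s["transid"] < newest]


# The three COLD1 scan messages quoted in this mail.
LOG = """\
 [  102.393185] BTRFS: device label COLD1 devid 5 transid 1876906 /dev/dm-30
 [  109.626550] BTRFS: device label COLD1 devid 4 transid 1876907 /dev/dm-32
 [  109.654401] BTRFS: device label COLD1 devid 3 transid 1876907 /dev/dm-31
""".splitlines()

scans = parse_scan_lines(LOG)
for s in stale_devices(scans, "COLD1"):
    print(f"devid {s['devid']} ({s['path']}) is behind at transid {s['transid']}")
```

On the messages above this reports devid 5 on /dev/dm-30 as the one device that is behind.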

And if you find a way to mount the volume RW, please write a single
empty file, sync the fs, then umount the fs, and ensure "btrfs ins
dump-super" gives the same transid for all 3 related disks.

Then the problem *may* be gone if it matches my assumption.
(Assuming all of this succeeds, please do an unmounted btrfs check
just to make sure nothing is wrong)
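
The transid comparison could be scripted roughly as below. This is only a sketch under assumptions: it expects "btrfs inspect-internal dump-super" to print a line that starts with "generation" followed by the number (the superblock generation is what the scan messages call transid), and the helper names are hypothetical. It has to run as root on the unmounted devices.

```python
import re
import subprocess

# Assumption: dump-super prints a line of the form "generation <number>".
# Anchoring at line start avoids matching fields like chunk_root_generation.
GENERATION_RE = re.compile(r"^generation\s+(\d+)\s*$", re.MULTILINE)


def parse_generation(dump_super_text):
    """Extract the superblock generation from dump-super output."""
    m = GENERATION_RE.search(dump_super_text)
    if m is None:
        raise ValueError("no 'generation' line found")
    return int(m.group(1))


def read_generation(device):
    """Run dump-super on one device and return its generation."""
    result = subprocess.run(
        ["btrfs", "inspect-internal", "dump-super", device],
        check=True, capture_output=True, text=True,
    )
    return parse_generation(result.stdout)


def all_match(generations):
    """True if at least one device was checked and all generations agree."""
    return len(set(generations.values())) == 1


def check_devices(devices):
    """Print the generation of each device and return True if they agree."""
    generations = {dev: read_generation(dev) for dev in devices}
    for dev, gen in generations.items():
        print(f"{dev}: generation {gen}")
    return all_match(generations)
```

For the three remaining disks this would be called as check_devices(["/dev/mapper/xmnt-cold11", "/dev/mapper/xmnt-oldcold12", "/dev/mapper/xmnt-cold14"]). A False result means at least one superblock is still behind.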

>  [  109.656171] BTRFS info (device dm-32): use zstd compression, level 12
>  [  109.657924] BTRFS info (device dm-32): using free space tree
>  [  109.660917] BTRFS info (device dm-32): has skinny extents
>  [  109.662687] BTRFS error (device dm-32): devid 1 uuid f5c3dc63-1fac-45b3-b9ba-ed1ec5f92403 is missing
>  [  109.664832] BTRFS error (device dm-32): failed to read chunk tree: -2
>  [  109.742501] BTRFS error (device dm-32): open_ctree failed
> 
> At this point, /dev/mapper/xmnt-cold11 (dm-32),
> /dev/mapper/xmnt-oldcold12 (dm-31) and /dev/mapper/xmnt-cold14 (dm-30)
> were the remaining disks in the filesystem, while xmnt-cold13 was the
> device I had formerly removed (which doesn't show up).
> 
> (There are two btrfs filesystems with the COLD1 label in this machine at
> the moment, as I was migrating the fs, but the above COLD1 messages should
> all relate to the same fs).
> 
> "blkid -o value -s TYPE /dev/mapper/xmnt-cold13" didn't give any output
> (the mounting script checks for that and pauses to make provisioning
> of new disks easier), while normally it would give "btrfs" on volume
> members. This, I think, would be normal behaviour for devices that have
> been removed from a btrfs.
> 
> BTW, the four devices in question are all dmcrypt-on-lvm and are single
> devices in a hardware raid controller (a perc h740).
> 
>>> Probably not related, but maybe worth mentioning: I found that system
>>> crashes (resets, not power failures) cause btrfs to not mount the first
>>> time a mount is attempted, but it always succeeds the second time, e.g.:
>>>
>>>    # mount /device /mnt
>>>    ... no errors or warnings in kernel log, except:
>>>    BTRFS error (device dm-34): open_ctree failed
>>>    # mount /device /mnt
>>>    magically succeeds
>>
>> Yep, this makes it sound more like a scan related bug.
> 
> BTW, this (second issue) also happens with filesystems that are not
> multi-device.

Single-device btrfs doesn't need a device scan.
If that happens, something insane is going on again...

Thanks,
Qu

> Not sure if that means that btrfs scan would be involved, as
> I would assume the only device btrfs would need in such cases is the one
> given to mount, but maybe that also needs a working btrfs scan?
> 
> Thanks for your work on btrfs btw. :)
> 


