Re: btrfs device ready purpose

On Sat, Jul 22, 2017 at 1:58 PM, Adam Borowski <kilobyte@xxxxxxxxxx> wrote:
> On Sat, Jul 22, 2017 at 06:15:58PM +0000, Hugo Mills wrote:
>> On Sat, Jul 22, 2017 at 12:06:17PM -0600, Chris Murphy wrote:
>> > I just did an additional test that's pretty icky behavior.
>> >
>> > 2x HDD device Btrfs volume. Add both devices and `btrfs device ready`
>> > exits with 0 as expected. Physically remove both USB devices.
>> > Reconnect one device. `btrfs device ready` still exits 0. That's
>> > definitely not good. (If I leave that one device connected and reboot,
>> > `btrfs device ready` exits 1).
>>
>>    In a slightly less-specific way, this has been a problem pretty
>> much since the inception of the FS. It's not possible to do the
>> reverse of the "scan" operation on a device -- that is, invalidate/
>> remove the device's record in the kernel. So, as you've discovered
>> here, if you have a device which is removed (overwritten, unplugged),
>> the kernel still thinks it's a part of the FS.
>
> Alas, this needs to be fixed.  The reproducers I posted last week give data
> corruption in case a device that was once a part of the FS is reconnected.
> It doesn't matter what it contains now -- be it another part of the FS or
> something totally unrelated, as far as the device node (/dev/loop0,
> /dev/sda1, etc) is reused, degraded mounts get confused.
>
> It wasn't urgent before as degraded mounts were broken before Qu's chunk
> check patch (that's not even merged yet) -- but once running degraded is
> not an emergency, there'll be folks doing so for an extended time.
>
>>    It's something I recall being talked about a bit, some years ago. I
>> don't recall now why it was going to be useful, though. I think you
>> have a good use-case for such a new ioctl (or extension to the
>> SCAN_DEV ioctl) now, though.
>
> Such an ioctl would be inherently racy.  Even current udev code is --
> mounting right after losetup often fails, sometimes you even need to sleep
> longer than 1 second.  With the above in mind, I see no way other than
> invalidating and re-checking all known devices at mount time.


If we go back even further in time, what I'm trying to avoid is the
problem with DEs where the user connects a two-device Btrfs and then
wants to eject it. The DE is already confused because behind the
scenes it has actually mounted each device at a different mount
point, which Btrfs allows (it's one file system, on two mount
points). That's confusing, but not a big problem. The big problem
happens when the user wants to stop using that file system. They
eject one of the two devices shown (there should of course be only
one entry for a Btrfs volume), and behind the scenes udisksd unmounts
just one of the mount points and then appears to delete that device
node, which in effect makes the still-mounted file system degraded,
and results in corruption.

Btrfs fixes this up on the next mount of both devices. But it's just
asking for trouble.

Output of this behavior here:
https://bugs.freedesktop.org/show_bug.cgi?id=87277#c3

So then I started to look at whether it's possible to easily determine
in advance if a Btrfs file system is single-device or multiple-device,
and let udisksd have a policy where it just ignores multiple-device
Btrfs entirely: don't support it at all until the guts of all this
infrastructure get better.

'strace btrfs filesystem show' curiously shows that BTRFS_IOC_FS_INFO
is only called for single-device Btrfs. For multiple-device Btrfs
volumes there is seemingly a much more esoteric, btrfs-progs-only
method for getting the information. So I'm not certain whether
BTRFS_IOC_FS_INFO supports multiple-device Btrfs and would return
num_devices, which is what's needed to know whether to ignore the
devices of a multiple-device Btrfs volume.
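
For anyone who wants to test this directly, here is a rough sketch of
calling BTRFS_IOC_FS_INFO from Python on an already mounted file system
and reading num_devices back. It is not taken from btrfs-progs or
udisks. The assumptions: the struct layout (max_id, num_devices, a
16-byte fsid, padded to 1024 bytes) and the ioctl number (magic 0x94,
nr 31) are as I read them in the kernel's btrfs ioctl header, and the
_IOR encoding is the generic one used on x86 and ARM. The helper names
are made up for illustration. Note that it needs a mount point, so by
itself it does not answer the "in advance" question.

```python
import fcntl
import os
import struct
import sys

# Assumed from the kernel's btrfs ioctl header: magic 0x94, request nr 31,
# and a struct btrfs_ioctl_fs_info_args that is 1024 bytes in total.
BTRFS_IOCTL_MAGIC = 0x94
FS_INFO_ARGS_SIZE = 1024


def _ior(magic, nr, size):
    """Generic Linux _IOR() encoding (x86/ARM; some architectures differ)."""
    return (2 << 30) | (size << 16) | (magic << 8) | nr


BTRFS_IOC_FS_INFO = _ior(BTRFS_IOCTL_MAGIC, 31, FS_INFO_ARGS_SIZE)


def parse_fs_info(buf):
    """Decode the leading fields of the buffer filled in by the ioctl."""
    max_id, num_devices = struct.unpack_from("=QQ", buf, 0)
    fsid = bytes(buf[16:32])
    return {"max_id": max_id, "num_devices": num_devices, "fsid": fsid.hex()}


def fs_info(mountpoint):
    """Query a mounted Btrfs file system (hypothetical helper)."""
    buf = bytearray(FS_INFO_ARGS_SIZE)
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        # mutate_flag=True so the kernel's answer is written back into buf
        fcntl.ioctl(fd, BTRFS_IOC_FS_INFO, buf, True)
    finally:
        os.close(fd)
    return parse_fs_info(buf)


def is_multi_device(info):
    """The policy test discussed above: more than one device in the volume."""
    return info["num_devices"] > 1


if __name__ == "__main__":
    info = fs_info(sys.argv[1])
    print(info, "multi-device" if is_multi_device(info) else "single-device")
```

Run it with the mount point as the only argument. If num_devices comes
back greater than 1 on a two-device volume, that would at least settle
whether the ioctl reports it for the mounted case.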

*sigh*


Chris Murphy