Re: Can I see what device was used to mount btrfs?

On 2017-05-03 14:12, Andrei Borzenkov wrote:
03.05.2017 14:26, Austin S. Hemmelgarn wrote:
On 2017-05-02 15:50, Goffredo Baroncelli wrote:
On 2017-05-02 20:49, Adam Borowski wrote:
It could be some daemon that waits for btrfs to become complete.  Do we
have something?
Such a daemon would also have to read the chunk tree.

I don't think that a daemon is necessary. As proof of concept, in the
past I developed a mount helper [1] which handled the mount of a btrfs
filesystem:
this handler first checks whether the filesystem is a multi-device
one; if so, it waits until all the devices have appeared, and finally
it mounts the filesystem.

It's not so simple -- such a btrfs device would have THREE states:

1. not mountable yet (multi-device with not enough disks present)
2. mountable ro / rw-degraded
3. healthy

My mount.btrfs could be "programmed" to wait a timeout, then it mounts
the filesystem as degraded if not all devices are present. This is a
very simple strategy, but this could be expanded.
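
A minimal sketch of the wait-then-degrade strategy described above, in Python. This is not the code of the mount helper referenced as [1]; the function name pick_mount_options and the all_present callback are hypothetical, and how "all devices present" is actually determined is left to the caller.

```python
import time
from typing import Callable, List


def pick_mount_options(
    all_present: Callable[[], bool],
    timeout_s: float,
    poll_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Wait up to timeout_s for every member device to show up.

    Returns [] when all devices appeared in time (normal mount), or
    ["degraded"] when the timeout expired first.
    all_present is a caller-supplied check (hypothetical).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if all_present():
            return []
        if time.monotonic() >= deadline:
            return ["degraded"]
        sleep(poll_s)
```

The returned list is meant to be joined into the -o argument of the eventual mount call; a real helper would still have to handle that mount failing.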

I am inclined to think that the current approach doesn't fit the btrfs
requirements well.  The roles and responsibilities are spread across too
many layers (udev, systemd, mount)... I hoped that my helper could be
adopted in order to concentrate all the responsibility in a single
binary; this would reduce the number of interfaces with the other
subsystems (e.g. systemd, udev).
The primary problem is that systemd treats BTRFS like a block-layer
instead of a filesystem (so it assumes all devices need to be present),
and that it doesn't trust the kernel's mount function to work correctly.

My understanding is that before kernel mount can succeed for
multi-device btrfs, kernel must be made aware of devices that comprise
this filesystem. This is done by using (equivalent of) "btrfs device
scan" or "btrfs device ready". Am I wrong here?
That is correct, the kernel needs to be notified about the devices via 'btrfs device scan' (or directly with the ioctl that it calls). Udev calls this automatically on newly connected block devices though, so currently there is no reason to manually run it on most systems. Ideally, this should be in a mount helper and possibly triggered by 'btrfs filesystem show'. Unless you're mounting a BTRFS volume or listing what the kernel knows about, there is no reason the kernel needs to be tracking the FS, so there is no point in regularly wasting time in udev processing by scanning all newly connected devices.

As for 'btrfs device ready', that only tells you whether the kernel thinks the filesystem is mountable _and_ not degraded. It's usually correct, but watching it has the usual TOCTOU races present in any kind of status checking system, and it's useless if you want to mount degraded.
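
To make the race concrete, here is a rough Python sketch of a readiness check built on 'btrfs device ready'. It assumes an exit status of 0 means "ready"; the device_ready name and the injectable run parameter are illustrative only. Note that the answer can already be stale by the time mount is called, so the result is a hint and not a guarantee.

```python
import subprocess
from typing import Callable, Optional, Sequence


def device_ready(
    device: str,
    run: Optional[Callable[[Sequence[str]], int]] = None,
) -> bool:
    """Ask the kernel whether the filesystem containing `device` is complete.

    Assumption: 'btrfs device ready <device>' exits 0 when all member
    devices are registered, non-zero otherwise.
    `run` takes an argv list and returns an exit status; it exists only so
    the check can be exercised without a real btrfs volume.
    """
    argv = ["btrfs", "device", "ready", device]
    if run is None:
        run = lambda a: subprocess.run(a, check=False).returncode
    # A device can vanish (or appear) right after this returns: TOCTOU.
    return run(argv) == 0
```

Because of that window, a caller still has to treat a failing mount as the authoritative signal, whatever this check reported.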

As a result, it assumes that the mount operation will fail if it
doesn't see all the devices instead of just trying it like it should.

So do you suggest that mount will succeed even if kernel is not made
aware of all devices? If not, could you elaborate how btrfs should be
mounted on boot - we must give mount command some device, right? How
should we choose this device?
See my above comment on kernel awareness.

If you have 'degraded' in the mount options, the mount can succeed even if not all the devices are present. Systemd refuses to even try the mount if it doesn't see all the devices, and then *unmounts* the FS if it gets mounted manually and not all devices are present. Both of these are undesired behaviors for many people (the second more than the first).

I think I've outlined my thoughts on all of this somewhere before, but I can't find them, so I might as well do so here:

1. Device scanning should be done by a mount helper, not udev. This closes a serious data safety/security issue present in the current combined implementation (if you plug in a device that has the same UUID as an existing BTRFS volume on the system and both volumes are marked as multi-device, you can cause data loss in the existing volume), allows for more concise tracking of devices, and also eliminates the need for system-wide scanning in some cases (if you use 'device=' mount options that cover all the devices in the filesystem). It also saves some time in processing of uevents for hot-plugged devices.

2. Systemd should not default to unmounting filesystems it thinks aren't ready yet when they've been manually mounted. This behavior is highly counter-intuitive for most users ('The mount command didn't complain and returned 0 and dmesg has no errors, why the hell is the filesystem I just mounted not mounted?'), and more importantly in this context, makes it impossible to manually repair a BTRFS filesystem that's listed in a mount unit without dropping to emergency mode, which largely defeats the purpose of using a multi-device filesystem that can be repaired online.

3. For BTRFS, and possibly under special circumstances with other filesystems (partially present ZFS pool, partially assembled LVM or MD array that can run degraded, etc), systemd should try to mount the FS when it times out waiting for devices, and there should be an option to control this behavior. While I don't advocate mounting filesystems degraded then letting the system run, some people do, and I still expect it to work, but currently it does not when using systemd. Alternatively, it could do a polling loop with a delay to call mount instead of using 'btrfs device ready'.
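
As a rough illustration of points 1 and 3, the following Python sketch builds a mount command that names every member device through 'device=' options and then simply retries the mount with a delay instead of consulting 'btrfs device ready'. The function names, device paths and retry parameters are made up for the example; this is not how systemd or any existing helper is implemented.

```python
import subprocess
import time
from typing import Callable, List, Optional, Sequence


def build_mount_argv(
    device: str,
    target: str,
    members: Sequence[str] = (),
    degraded: bool = False,
) -> List[str]:
    """Build a mount command listing the other member devices explicitly."""
    opts = [f"device={m}" for m in members]
    if degraded:
        opts.append("degraded")
    argv = ["mount", "-t", "btrfs"]
    if opts:
        argv += ["-o", ",".join(opts)]
    return argv + [device, target]


def mount_with_retry(
    argv: Sequence[str],
    attempts: int,
    delay_s: float,
    run: Optional[Callable[[Sequence[str]], int]] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call mount up to `attempts` times, sleeping delay_s between tries.

    The mount call itself is the readiness test: exit status 0 is success.
    `run` is injectable so the loop can be tried without touching a disk.
    """
    if run is None:
        run = lambda a: subprocess.run(a, check=False).returncode
    for attempt in range(attempts):
        if run(argv) == 0:
            return True
        if attempt + 1 < attempts:
            sleep(delay_s)
    return False
```

For example, build_mount_argv("/dev/sda", "/mnt", ["/dev/sdb", "/dev/sdc"]) yields a command equivalent to 'mount -t btrfs -o device=/dev/sdb,device=/dev/sdc /dev/sda /mnt'. A caller could fall back to degraded=True once the retries are exhausted, if that is the policy they want.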