On Sat, Jan 4, 2020 at 3:46 PM Georg Großmann <georg@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Dear btrfs community,
>
> I wanted to use a setup with Open Suse Tumbleweed together with a
> btrfs raid 1 on two disks in my virtual box. I want a system that can
> still boot if one of the disks fails so I installed a bootloader to each
> of the disks in /dev/sda1 and /dev/sdb1.
>
> I then used /dev/sda2 and /dev/sdb2 for the btrfs raid 1. After
> unplugging one disk, the boot process always fails with the message
> "timed out waiting for device dev-disk-by\x2duuid". I found a mailing
> list here
> https://lists.freedesktop.org/archives/systemd-devel/2014-May/019217.html
> which pretty well describes my problem. Unfortunately, I can't find an
> appropriate solution there. Since this mailing list is from 2014, has
> there been some progress in the meantime? Or is this the expected
> behaviour and the user has to help himself out manually?

It's the same situation. Most distributions have a udev rule that
waits indefinitely for all Btrfs member devices to appear. This is
done because Btrfs doesn't have automatic degraded mount. If a mount
is attempted while any device is missing, the mount fails, even when
the device is missing only because of a brief delay (somewhat common)
rather than a device failure. So instead, udev waits. Mount isn't
even attempted.

Should the udev rule wait for 1-2 minutes, similar to the dracut
script for mdadm arrays? Even if it did, it just means we get to
mount after the wait, and now mount fails because Btrfs doesn't have
automatic degraded mount.

What's the trouble with deleting this udev rule, and then always
using the degraded mount option in fstab or as a kernel rootflags
parameter? If there is any small delay with any device becoming
available at mount time, you get a degraded mount. And however
briefly, the drives can be out of sync.
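
For concreteness, the "always degraded" setup described above would
look roughly like the following. This is only a sketch: the UUID is a
placeholder, the other mount options are assumptions, and it presumes
the udev rule that waits for all member devices has already been
removed.

```text
# /etc/fstab: placeholder UUID; "degraded" allows the mount to proceed
# even when a member device is missing at mount time
UUID=<filesystem-uuid>  /  btrfs  defaults,degraded  0  0

# or, for the root filesystem, appended to the kernel command line:
rootflags=degraded
```

Either form applies on every boot, so a device that is merely slow to
appear produces a degraded mount just as a failed device would.
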
There is no automatic resync once all devices do become available,
and Btrfs has no concept of becoming "undegraded". All of this makes
things messy for the casual user, so the decision so far is to just
wait indefinitely, using this udev rule.

And the open question is what should this look like in 5 or 10 years?
The btrfs on-disk format has enough information to figure out how to
do a partial resync to catch up a slow device, similar to the mdadm
write intent bitmap + resync. But does this need some enhancement so
it can be totally unattended? Like a partial scrub capability? There
are more questions than answers so far, that's why it requires
intervention.

-- 
Chris Murphy
