Hi!
I need to boot with dracut to get my btrfs root partition properly
initialized (because it is a multi-device btrfs). Today, after upgrading to
systemd v220, I tracked a booting issue down to what looks like a general
problem with the btrfs udev rules distributed with systemd:
If I drop to an emergency shell via rd.break=pre-mount and try to mount
sysroot, I get the errors "open_ctree failed" and "BTRFS: failed to read the
system array". These generally show up when the btrfs member devices haven't
been probed yet.
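For reference, the missing probe can be done by hand from the emergency
shell. The following is only a minimal sketch: the function name and the
retry count are made up for illustration, and it assumes btrfs-progs is
available in the initramfs as "btrfs". It relies on "btrfs device scan" and
"btrfs device ready", the latter exiting 0 once all member devices are known
to the kernel (as far as I understand it).

```shell
# Sketch: register all btrfs member devices with the kernel, then poll until
# the filesystem containing "$1" reports that every member is present.
# Assumes the btrfs-progs binary is on PATH as "btrfs".
btrfs_scan_and_wait() {
    dev=$1
    tries=${2:-10}    # hypothetical default: up to 10 attempts, 1 s apart

    # Ask the kernel to look at all block devices for btrfs signatures.
    btrfs device scan >/dev/null 2>&1

    while [ "$tries" -gt 0 ]; do
        # Exit status 0 means all devices of this filesystem are registered.
        if btrfs device ready "$dev" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage from the emergency shell (device taken from this report):
#   btrfs_scan_and_wait /dev/bcache0 && mount -t btrfs -o ro /dev/bcache0 /sysroot
```

If the function returns non-zero, at least one member device is still
unknown to the kernel and the mount would most likely fail the same way.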
So I looked into the dracut sources and found that it brings its own udev
rule which does this properly. The caveat, however: if it already finds a
udev rule for btrfs, it won't install its own. The rule in question is:
$ cat 64-btrfs.rules
# do not edit this file, it will be overwritten on update
SUBSYSTEM!="block", GOTO="btrfs_end"
ACTION=="remove", GOTO="btrfs_end"
ENV{ID_FS_TYPE}!="btrfs", GOTO="btrfs_end"
# let the kernel know about this btrfs filesystem, and check if it is complete
IMPORT{builtin}="btrfs ready $devnode"
# mark the device as not ready to be used by the system
ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0"
LABEL="btrfs_end"
It is distributed with systemd, so I believe this is a systemd issue. I
worked around it by adding the following dracut module:
/usr/lib/dracut/modules.d/99btrfs-device-scan/btrfs_device_scan.sh:
#!/bin/sh
type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh
info "Scanning for all btrfs devices"
/sbin/btrfs device scan >/dev/null 2>&1
/usr/lib/dracut/modules.d/99btrfs-device-scan/module-setup.sh:
#!/bin/bash

# called by dracut
check() {
    local _rootdev
    # if we don't have btrfs installed on the host system,
    # no point in trying to support it in the initramfs.
    require_binaries btrfs || return 1

    [[ $hostonly ]] || [[ $mount_needs ]] && {
        for fs in ${host_fs_types[@]}; do
            [[ "$fs" == "btrfs" ]] && return 0
        done
        return 255
    }

    return 0
}

# called by dracut
depends() {
    echo btrfs
    return 0
}

# called by dracut
install() {
    inst_hook pre-mount 99 "$moddir/btrfs_device_scan.sh"
}
This issues an explicit "btrfs device scan" in the pre-mount hook. However,
looking at systemd's udev rules for btrfs, they should accomplish more or
less the same. So something is buggy or racy there. I noticed that only one
of the following lines appeared in dmesg when the problem was present:
[ 5.514318] BTRFS: device label system devid 5 transid 2779055 /dev/bcache2
[ 5.514422] BTRFS: device label system devid 6 transid 2779055 /dev/bcache1
[ 5.514521] BTRFS: device label system devid 4 transid 2779055 /dev/bcache0
Without my "fix", only one line showed up in the log, probably exactly at
mount time when systemd's sysroot.mount unit started. It wasn't always the
same device, though.
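To make that check repeatable, something like the following could count how
many distinct devids the kernel has registered for a given label. It is only
a sketch: the function name is invented, and it assumes each "BTRFS: device
label ..." message sits on a single line, as dmesg prints it.

```shell
# Sketch: read kernel log text on stdin and print the number of distinct
# devids registered for the btrfs label given as "$1".
count_btrfs_devids() {
    awk -v label="$1" '
        $0 ~ ("BTRFS: device label " label " devid ") {
            # The devid is the field following the word "devid".
            for (i = 1; i < NF; i++)
                if ($i == "devid")
                    seen[$(i + 1)] = 1
        }
        END {
            n = 0
            for (k in seen)
                n++
            print n
        }
    '
}

# Usage:
#   dmesg | count_btrfs_devids system
```

On this setup the count should equal the number of member devices (three)
before sysroot is mounted; a lower number means the probe is incomplete.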
With v219 I only had this problem sometimes, and a reboot usually fixed it.
This supports my theory that the rule is racy somewhere, especially around
the line ENV{ID_BTRFS_READY}=="0", ENV{SYSTEMD_READY}="0".
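One way to see what the rule decided for a given device is to look at the
udev properties it sets. The helper below is a hypothetical sketch (the
function name is not part of any tool); it only parses the KEY=VALUE output
of "udevadm info --query=property" and reports the value of ID_BTRFS_READY.

```shell
# Sketch: read "udevadm info --query=property" output on stdin and print the
# value of ID_BTRFS_READY, or "unset" if the property is not present.
btrfs_ready_state() {
    awk -F= '
        $1 == "ID_BTRFS_READY" { state = $2; found = 1 }
        END { print (found ? state : "unset") }
    '
}

# Usage (device taken from this report):
#   udevadm info --query=property --name=/dev/bcache0 | btrfs_ready_state
```

A value of "0" would mean the rule marked the device as not ready, "1" that
the builtin considered the filesystem complete, and "unset" that the btrfs
rule apparently never ran for that device.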
My btrfs setup looks like this:
Overall:
    Device size:          2.71TiB
    Device allocated:     1.85TiB
    Device unallocated:   880.47GiB
    Device missing:       0.00B
    Used:                 1.30TiB
    Free (estimated):     1.41TiB (min: 1003.50GiB)
    Data ratio:           1.00
    Metadata ratio:       2.00
    Global reserve:       512.00MiB (used: 0.00B)

Data,RAID0: Size:1.84TiB, Used:1.29TiB
    /dev/bcache0   628.00GiB
    /dev/bcache1   628.00GiB
    /dev/bcache2   628.00GiB

Metadata,RAID1: Size:6.00GiB, Used:4.21GiB
    /dev/bcache0   4.00GiB
    /dev/bcache1   4.00GiB
    /dev/bcache2   4.00GiB

System,RAID1: Size:32.00MiB, Used:120.00KiB
    /dev/bcache0   32.00MiB
    /dev/bcache2   32.00MiB

Unallocated:
    /dev/bcache0   293.48GiB
    /dev/bcache1   293.51GiB
    /dev/bcache2   293.48GiB
Dracut is v041, systemd is v220, kernel is 4.0.4, cmdline is:
root=/dev/bcache0 ro snd_hda_intel.enable_msi=1 rootfstype=btrfs rootflags=compress=lzo zswap.enabled=1 splash quiet
It may be worth noting that I'm using bcache, whose udev rules may interfere
with those for btrfs.
CC'ing bcache-devel and btrfs-devel just in case, f'up btrfs-devel.
--
Replies to list only preferred.