On 7/21/20 10:55 PM, Steven Davies wrote:
On 21/07/2020 21:33, Goffredo Baroncelli wrote:
Hi all,
this is an RFC to discuss an idea of mine to allow a simple rollback of the
root filesystem at boot time.
The problem that I want to solve is the following: DPKG is very slow on
a BTRFS filesystem. The reason is that DPKG makes heavy use of
sync()/fsync() to guarantee that the filesystem is always coherent, even
in case of a sudden shutdown.
The same approach could also be useful for RPM-based Linux distributions
(which, however, suffer less than DPKG-based ones).
A way to avoid the sync()/fsync() calls without losing the DPKG
guarantees is:
1) perform a snapshot of the root filesystem (the rollback one)
2) upgrade the filesystem without using sync/fsync
3) final (global) sync
4) destroy the rollback snapshot
If an unclean shutdown happens between 1) and 4), two subvolumes exist:
the 'main' one and the 'rollback' one (which is the snapshot taken before the
update). In this case the system at boot time should mount the "rollback"
subvolume instead of the "main" one. Otherwise, in case of a "clean" boot, the
"rollback" subvolume doesn't exist and only the "main" one can be
mounted.
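As a rough sketch of steps 1) to 4) above (not the implementation from [1]):
TOP is a hypothetical mount point of the top-level subvolume, the subvolume
names are the ones used in this mail, and running the upgrade under eatmydata
(which turns sync/fsync into no-ops) is only one assumed way of doing step 2.
With DRY_RUN=1 the commands are only printed.

```shell
#!/bin/sh
# Sketch only. Assumptions: TOP is where the top-level subvolume (subvolid=5)
# is mounted, and eatmydata is installed to suppress sync()/fsync().
TOP=${TOP:-/var/btrfs}
DRY_RUN=${DRY_RUN:-1}        # 1 = print the commands instead of running them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

upgrade_with_rollback() {
    # 1) snapshot the root filesystem (the rollback one)
    run btrfs subvolume snapshot "$TOP/main" "$TOP/rollback" || return 1
    # 2) upgrade without sync/fsync
    run eatmydata apt-get -y dist-upgrade
    status=$?
    # 3) final (global) sync
    run sync
    # 4) destroy the rollback snapshot; in this sketch a failed but
    #    uninterrupted upgrade is left to the package manager to repair
    run btrfs subvolume delete "$TOP/rollback" || return 1
    return "$status"
}
```

Calling upgrade_with_rollback with DRY_RUN=1 prints the four commands in
order, which is a cheap way to check the paths before running it for real.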
In [1] I discussed a way to implement steps 1 to 4 (OK, I missed
step 3).
The part that has been missing until now is an automatic way to mount the
rollback subvolume at boot time when it is present.
My idea is to allow multiple 'subvol=' options. In this case BTRFS tries all the
passed subvolumes in order until the first mount succeeds. So invoking the kernel as:
linux root=UUID=xxxx rootflags=subvol=rollback,subvol=main ro
First, the kernel tries to mount the 'rollback' subvolume. If the rollback
subvolume doesn't exist then it mounts the 'main' subvolume.
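The intended semantic can be emulated in userspace (for example from an
initramfs script) with today's single 'subvol=' option. This is only a
sketch of the fallback order, not the proposed kernel change; the function
name and the MOUNT override are hypothetical and exist just for illustration
and testing.

```shell
#!/bin/sh
# Sketch only: try each subvolume in order and stop at the first one that
# mounts. MOUNT can be overridden (hypothetical hook, useful for testing).
MOUNT=${MOUNT:-mount}

# usage: mount_first_subvol <device> <target> <subvol>...
mount_first_subvol() {
    dev=$1
    target=$2
    shift 2
    for sub in "$@"; do
        if $MOUNT -t btrfs -o "ro,subvol=$sub" "$dev" "$target" 2>/dev/null; then
            echo "$sub"      # report which subvolume was mounted
            return 0
        fi
    done
    return 1                 # none of the subvolumes could be mounted
}
```

For the example above this would be called as
mount_first_subvol /dev/xxx /sysroot rollback main, where the device and
target are placeholders.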
Of course after the mount, the system should perform a cleanup of the
subvolumes: i.e. if a rollback subvolume exists, the system should destroy
the "main" one (which contains garbage) and rename "rollback" to "main".
To be more precise:
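if test -d "rollback"; then
    # remove a stale "old" left by a previously interrupted cleanup
    if test -d "old"; then
        btrfs sub del "old"
    fi
    # "main" contains garbage: move it out of the way
    if test -d "main"; then
        mv "main" "old"
    fi
    mv "rollback" "main"
    # "old" exists only if "main" was present above
    if test -d "old"; then
        btrfs sub del "old"
    fi
fi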
Comments are welcome
I like this idea. Do we have an easy way of detecting which subvolume has been mounted (through sysfs or similar), or would you expect to always be testing this based on the existence of certain subvolumes/directories?
You can use findmnt or cat /proc/self/mountinfo:
$ findmnt | egrep btrfs
/ /dev/sde3[/debian] btrfs rw,noatime,nodiratime,nossd,space_cache,subvolid=257,subvol=/debian
├─/boot /dev/sde3[/boot] btrfs rw,noatime,nodiratime,nossd,space_cache,subvolid=258,subvol=/boot
├─/var/btrfs /dev/sde3 btrfs rw,noatime,nodiratime,nossd,space_cache,subvolid=5,subvol=/
└─/mnt/btrfs-raid1 /dev/sdd2 btrfs rw,noatime,nodiratime,space_cache,subvolid=5,subvol=/
$ cat /proc/self/mountinfo | egrep btrfs
26 1 0:22 /debian / rw,noatime,nodiratime shared:1 - btrfs /dev/sde3 rw,nossd,space_cache,subvolid=257,subvol=/debian
113 26 0:22 / /var/btrfs rw,noatime,nodiratime shared:61 - btrfs /dev/sde3 rw,nossd,space_cache,subvolid=5,subvol=/
112 26 0:22 /boot /boot rw,noatime,nodiratime shared:63 - btrfs /dev/sde3 rw,nossd,space_cache,subvolid=258,subvol=/boot
127 26 0:46 / /mnt/btrfs-raid1 rw,noatime,nodiratime shared:71 - btrfs /dev/sdd2 rw,space_cache,subvolid=5,subvol=/
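If a script needs the answer, the fourth field of mountinfo (the root of the
mount inside the filesystem) can be read for the entry whose mount point is
"/". A minimal sketch, where the function names are hypothetical and
"/rollback" assumes the subvolume naming used earlier in this thread:

```shell
#!/bin/sh
# Sketch only: print the subvolume path mounted at "/".
# An optional argument names a mountinfo-format file (default: the live one).
root_subvol() {
    # field 4 = root of the mount within the filesystem, field 5 = mount point;
    # if "/" was mounted over several times, the last entry wins
    awk '$5 == "/" { root = $4 } END { if (root != "") print root }' \
        "${1:-/proc/self/mountinfo}"
}

# True when the running system was booted from the rollback subvolume.
booted_from_rollback() {
    [ "$(root_subvol "$@")" = "/rollback" ]
}
```

On the system shown above, root_subvol would print /debian.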
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5