Re: [RFC] btrfs: strategy to perform a rollback at boot time

On Tue, Jul 21, 2020 at 2:33 PM Goffredo Baroncelli <kreijack@xxxxxxxxx> wrote:
>
>
> Hi all,
>
> this is an RFC to discuss an idea of mine to allow a simple rollback of
> the root filesystem at boot time.
>
> The problem that I want to solve is the following: DPKG is very slow on
> a BTRFS filesystem. The reason is that DPKG massively uses
> sync()/fsync() to guarantee that the filesystem is always coherent even
> in case of sudden shutdown.
>
> The same approach could also help RPM-based Linux distributions (which,
> however, suffer less than DPKG-based ones).
>
> A way to avoid the sync()/fsync() calls without losing the DPKG
> guarantees is:
> 1) perform a snapshot of the root filesystem (the rollback one)
> 2) upgrade the filesystem without using sync/fsync
> 3) final (global) sync
> 4) destroy the rollback snapshot
>
> If an unclean shutdown happens between 1) and 4), two subvolumes exist:
> the 'main' one and the 'rollback' one (which is the snapshot taken before
> the update). In this case the system should mount the "rollback" subvolume
> at boot time instead of the "main" one. Otherwise, after a "clean" boot,
> the "rollback" subvolume doesn't exist and only the "main" one can be
> mounted.
>
> In [1] I discussed a way to implement steps 1 to 4 (OK, I missed
> step 3).
>
> The part that has been missing until now is an automatic way to mount the
> rollback subvolume at boot time when it is present.
>
> My idea is to allow multiple 'subvol=' options. In this case BTRFS tries
> each passed subvolume in order until one mounts successfully. So invoking
> the kernel as:
>
>   linux root=UUID=xxxx rootflags=subvol=rollback,subvol=main ro
>
> First, the kernel tries to mount the 'rollback' subvolume. If the rollback
> subvolume doesn't exist then it mounts the 'main' subvolume.
>
> Of course after the mount, the system should perform a cleanup of the
> subvolumes: i.e. if a rollback subvolume exists, the system should destroy
> the "main" one (which contains garbage) and rename "rollback" to "main".
> To be more precise:
>
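>         if test -d "rollback"; then
>                 # drop a stale "old" left by an earlier, interrupted cleanup
>                 if test -d "old"; then
>                         btrfs sub del "old"
>                 fi
>                 if test -d "main"; then
>                         mv "main" "old"
>                 fi
>                 mv "rollback" "main"
>                 # "old" exists here only if "main" was present above
>                 if test -d "old"; then
>                         btrfs sub del "old"
>                 fi
>         fi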
>
> Comments are welcome
> BR
> G.Baroncelli
>
> [1] http://lore.kernel.org/linux-btrfs/69396573-b5b3-b349-06f5-f5b74eb9720d@xxxxxxxxx/
>
> P.S.
> I wonder whether an idea like this can be applied to a file. E.g. a sqlite
> database that, instead of relying on sync/fsync, creates a reflink file as
> a "rollback" in case something goes wrong... The ordering is preserved.
> Not the durability.

One way:
btrfs sub snap main rollback
change the bootloader to rootflags=subvol=rollback and update /etc/fstab
(or use btrfs sub set-default)
do the update to main
- if it blows up at any time, rollback is what gets booted; delete main
and rename rollback to main
- if it succeeds, revert the bootloader changes so main boots, but
keep rollback in case booting main fails
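
The first flow could be scripted roughly as below. This is only a sketch
under assumptions that are not in the mail above: the top-level subvolume
is mounted at $TOP, set_boot_subvol is a hypothetical placeholder for
whatever rewrites rootflags=subvol= and /etc/fstab on a given distribution,
and DRY_RUN defaults to 1 so the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only. TOP, DRY_RUN and set_boot_subvol are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}     # 1 = print commands, 0 = really run them
TOP=${TOP:-/mnt/top}      # assumed mount point of the top-level subvolume

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical hook: a real one would edit the bootloader entry and fstab.
set_boot_subvol() {
    echo "# placeholder: boot from subvol=$1"
}

# Before the update: snapshot main and make the snapshot the boot target.
begin_update() {
    run btrfs subvolume snapshot "$TOP/main" "$TOP/rollback"
    set_boot_subvol rollback
}

# After a successful update: boot main again, keep rollback as a fallback.
finish_update() {
    set_boot_subvol main
}

# After booting rollback because the update blew up: discard the broken main.
recover_after_failure() {
    run btrfs subvolume delete "$TOP/main"
    run mv "$TOP/rollback" "$TOP/main"
    set_boot_subvol main
}
```

begin_update would run before the package manager and finish_update after
it; recover_after_failure is meant for the first boot from rollback.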

Another way:
btrfs sub snap main update
lock /var, /etc and /boot on main against changes: no configuration
changes, no package changes, but the user can keep working on user space
things
use bwrap/nspawn/podman to load up and assemble the update tree and
perform the update out of band
- if the update blows up, just delete the update snapshot, and then unlock
the system from the disallowed changes
- if the update succeeds, main can be renamed mainold and update can be
renamed main, then update the bootloader stuff; everything still stays
locked and the user can keep working on user space things until they're
ready to reboot

A nice thing about containers is that you can apply cgroup v2 controls so
the update does not compete for resources with the user's current work.
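
The out-of-band flow might look something like the sketch below. Several
things here are assumptions rather than part of the proposal: $TOP as the
mount point of the top-level subvolume, apt-get as the package manager,
systemd-nspawn as the container tool with CPUWeight/IOWeight picked
arbitrarily as the cgroup v2 knobs, and lock_system/unlock_system as
hypothetical placeholders, since no locking mechanism is specified.
DRY_RUN defaults to 1, so commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch only. All variable and helper names are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}     # 1 = print commands, 0 = really run them
TOP=${TOP:-/mnt/top}      # assumed mount point of the top-level subvolume
UPDATE_CMD=${UPDATE_CMD:-"apt-get -y full-upgrade"}   # assumed package manager

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical hooks: how to block system changes is left open in the mail.
lock_system()   { echo "# placeholder: block changes to /var /etc /boot"; }
unlock_system() { echo "# placeholder: allow system changes again"; }

out_of_band_update() {
    run btrfs subvolume snapshot "$TOP/main" "$TOP/update"
    lock_system
    # Low CPU and IO weights keep the update from starving the user's work.
    # $UPDATE_CMD is left unquoted on purpose so it splits into words.
    if run systemd-nspawn --property=CPUWeight=20 --property=IOWeight=20 \
            -D "$TOP/update" $UPDATE_CMD; then
        run mv "$TOP/main" "$TOP/mainold"
        run mv "$TOP/update" "$TOP/main"
        echo "# placeholder: update bootloader; stay locked until reboot"
    else
        # Update failed: throw the snapshot away and lift the lock.
        run btrfs subvolume delete "$TOP/update"
        unlock_system
        return 1
    fi
}
```

On success the system deliberately stays locked until the reboot; only the
failure path calls unlock_system.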

Personally I prefer the latter, doing the update out of band rather
than applying the update either on a running sysroot or having to do
an offline (reboot to a minimal environment) update. I think locking
the user out of system changes is acceptable for such an out of band
update. The alternative is something like merging the /etc and /var
changes made while the update was in progress. I don't think it's worth
that complexity, but if someone wants to build that, OK.


-- 
Chris Murphy


