Re: [PATCH][RFC] btrfs: introduce rescue=onlyfs

On Wed, Jul 01, 2020 at 05:39:19PM +0200, David Sterba wrote:
> On Wed, Jul 01, 2020 at 05:22:18PM +0200, Lukas Straub wrote:
> > On Wed,  1 Jul 2020 10:44:38 -0400
> > Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
> > 
> > > One of the things that came up consistently in talking with Fedora about
> > > switching to btrfs as default is that btrfs is particularly vulnerable
> > > to metadata corruption.  If any of the core global roots are corrupted,
> > > the fs is unmountable and fsck can't usually do anything for you without
> > > some special options.
> > > 
> > > Qu addressed this sort of with rescue=skipbg, but that's poorly named as
> > > what it really does is just allow you to operate without an extent root.
> > > However there are a lot of other roots, and I'd rather not have to do
> > > 
> > > mount -o rescue=skipbg,rescue=nocsum,rescue=nofreespacetree,rescue=blah
> > > 
> > > Instead take his original idea and modify it so it just works for
> > > everything.  Turn it into rescue=onlyfs, and then any major root we fail
> > > to read just gets left empty and we carry on.
> > > 
> > > Obviously if the fs roots are screwed then the user is in trouble, but
> > > otherwise this makes it much easier to pull stuff off the disk without
> > > needing our special rescue tools.  I tested this with my TEST_DEV that
> > > had a bunch of data on it by corrupting the csum tree and then reading
> > > files off the disk.
> > > 
> > > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> > > ---
> > > 
> > > I'm not married to the rescue=onlyfs name, if we can think of something better
> > > I'm good.
> > 
> > Maybe you could go a step further and automatically switch to rescue
> > mode if something is corrupt. This is easier for the user than having
> > to remember the mount flags.
> 
> We don't want to do the auto-switching in general as it's a non-standard
> situation.  It's better to get user attention than to silently mount
> with limited capabilities and then let the user find out that something
> went wrong, eg. system services randomly failing to start or work.

To be fair, auto-switching is almost exactly what happens now, with mounts
being the exception.  If the tree corruption is detected after mounting,
we panic and flip the RO bit right in the middle of the work day.
When mounting with default options, if a tree error is detected during the
mount, we fail to mount at all--IMHO that's the non-standard situation.


