Re: [PATCH][RFC] btrfs: introduce rescue=onlyfs


 



On 7/1/20 11:09 PM, Qu Wenruo wrote:


On 2020/7/2 3:53 AM, Josef Bacik wrote:
On 7/1/20 3:43 PM, waxhead wrote:


Josef Bacik wrote:
One of the things that came up consistently in talking with Fedora about
switching to btrfs as default is that btrfs is particularly vulnerable
to metadata corruption.  If any of the core global roots are corrupted,
the fs is unmountable and fsck can't usually do anything for you without
some special options.

Qu sort of addressed this with rescue=skipbg, but that's poorly named, as
what it really does is just allow you to operate without an extent root.
However there are a lot of other roots, and I'd rather not have to do

mount -o rescue=skipbg,rescue=nocsum,rescue=nofreespacetree,rescue=blah

Instead take his original idea and modify it so it just works for
everything.  Turn it into rescue=onlyfs, and then any major root we fail
to read just gets left empty and we carry on.

Obviously if the fs roots are screwed then the user is in trouble, but
otherwise this makes it much easier to pull stuff off the disk without
needing our special rescue tools.  I tested this with my TEST_DEV that
had a bunch of data on it by corrupting the csum tree and then reading
files off the disk.
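
The fallback behaviour described above can be illustrated with a small toy model. This is only a sketch in Python, not the kernel patch itself: `load_roots`, `demo_read_root`, `RootReadError`, and the root names are hypothetical names chosen for illustration. The only assumption carried over from the description is that a root which fails to read is left empty rather than failing the mount.

```python
# Toy model of the rescue=onlyfs idea: roots that cannot be read are
# left empty instead of aborting the mount. All names are hypothetical.

class RootReadError(Exception):
    """Raised when a tree root cannot be read (e.g. it is corrupted)."""


def load_roots(root_names, read_root, rescue_onlyfs=False):
    """Return a dict mapping root name -> root object (None if skipped).

    read_root is a callable that returns a root or raises RootReadError.
    Without rescue_onlyfs, the first failure aborts the "mount".
    """
    roots = {}
    for name in root_names:
        try:
            roots[name] = read_root(name)
        except RootReadError:
            if not rescue_onlyfs:
                raise  # normal mount: an unreadable global root is fatal
            roots[name] = None  # rescue mode: leave it empty and carry on
    return roots


def demo_read_root(name):
    # Pretend the csum tree is corrupted, as in the test described above.
    if name == "csum":
        raise RootReadError(name)
    return {"name": name}


if __name__ == "__main__":
    names = ["extent", "csum", "free_space"]
    print(load_roots(names, demo_read_root, rescue_onlyfs=True))
```

In this model the caller has to cope with a root being `None`, which mirrors the point that code paths touching a skipped root must tolerate it being empty.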

Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
---

Just an idea inspired by RAID1c3 and RAID1c4: how about introducing
DUP2 and/or even DUP3, making multiple copies of the metadata to
increase the chance of recovering metadata even on a single storage device?

Because this only works on HDDs.  On SSDs, concurrent writes will often
be shunted to the same erase block, and if the whole erase block goes,
so do all of your copies.  This is why we default to 'single' for SSDs.
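
The erase-block argument can be made concrete with a deliberately simplified model. This is a sketch under stated assumptions, not a description of any real flash translation layer: the function names, `pages_per_block`, and `region_size` are all hypothetical. It assumes only that an SSD places writes by arrival order, while an HDD places them by logical address.

```python
# Toy model only: real flash translation layers are far more complex.

def place_on_ssd(writes, pages_per_block=4):
    """Append writes in arrival order; return {write_id: erase_block}."""
    return {w: i // pages_per_block for i, w in enumerate(writes)}


def place_on_hdd(writes_with_lba, region_size=1024):
    """Map each write to a region derived from its logical address."""
    return {w: lba // region_size for w, lba in writes_with_lba}


def survives_region_loss(placement, copies, lost_region):
    """True if at least one copy lives outside the lost region."""
    return any(placement[c] != lost_region for c in copies)


if __name__ == "__main__":
    copies = ["meta_a", "meta_b"]
    # Two copies written back to back share an erase block on the SSD.
    ssd = place_on_ssd(copies)
    # The same copies at distant logical addresses stay apart on the HDD.
    hdd = place_on_hdd([("meta_a", 0), ("meta_b", 500000)])
    print(survives_region_loss(ssd, copies, lost_region=0))
    print(survives_region_loss(hdd, copies, lost_region=0))
```

Under these assumptions, extra copies on the SSD add no protection against losing that one erase block, whereas the HDD copies fail independently.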

The one thing I _do_ want to do is make better use of the backup roots.
Right now we always free the pinned extents once the transaction
commits, which makes the backup roots useless as we're likely to re-use
those blocks.

IIRC Filipe tried this before and didn't go in that direction due to ENOSPC,
as we would need to commit multiple transactions to free the pinned extents.

But maybe the latest async pinned extent drop could solve the problem?


Yeah, before it was tricky, but Nikolay's work made async pinned extent drop possible, and I've been testing that patch internally.

Now it's just a matter of keeping the last 4 transactions' worth of pinned extents around and only unpinning under ENOSPC conditions. I'll dig out the async unpinning and send that up next week since that's already valuable by itself, and then we can talk about wiring up the ENOSPC part of it. Thanks,

Josef
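
The retention policy discussed above (keep the pinned extents of the last 4 transactions, unpin earlier only under ENOSPC) can be sketched as a small model. This is an illustration in Python, not btrfs code: `PinnedTracker` and its methods are hypothetical names, and extents are represented as plain set members.

```python
from collections import deque


class PinnedTracker:
    """Toy model: keep pinned extents for the last `keep` transactions."""

    def __init__(self, keep=4):
        self.keep = keep
        self.history = deque()  # one set per transaction, oldest first
        self.free = set()       # extents that have been unpinned

    def commit(self, pinned):
        """Record the extents pinned by a newly committed transaction."""
        self.history.append(set(pinned))
        # Anything older than the last `keep` transactions is unpinned.
        while len(self.history) > self.keep:
            self.free |= self.history.popleft()

    def reclaim_for_enospc(self):
        """Under ENOSPC, unpin the oldest retained transaction early.

        Returns False when nothing is left to reclaim.
        """
        if not self.history:
            return False
        self.free |= self.history.popleft()
        return True


if __name__ == "__main__":
    tracker = PinnedTracker(keep=4)
    for txn in range(5):
        tracker.commit({txn})
    print(sorted(tracker.free), len(tracker.history))
```

As long as an extent stays in `history`, the blocks it covers are not reused, which is what would keep the corresponding backup roots usable in this model.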



