Re: Feature requests: online backup - defrag - change RAID level

On 2019-09-12 19:54, Zygo Blaxell wrote:
On Thu, Sep 12, 2019 at 06:57:26PM -0400, General Zed wrote:

Quoting Chris Murphy <lists@xxxxxxxxxxxxxxxxx>:

On Thu, Sep 12, 2019 at 3:34 PM General Zed <general-zed@xxxxxxxxx> wrote:


Quoting Chris Murphy <lists@xxxxxxxxxxxxxxxxx>:

On Thu, Sep 12, 2019 at 1:18 PM <webmaster@xxxxxxxxx> wrote:

It is normal and common for a defrag operation to use some disk space
while it is running. I estimate that a reasonable limit would be to
use up to 1% of the total partition size. So, if the partition size is 100
GB, the defrag can use 1 GB. Let's call this "defrag operation space".

In the simplest case of a file with no shared extents, the minimum free
space should be set to the potential maximum rewrite of the file, i.e.
100% of the file size. Since Btrfs is COW, the entire operation must
succeed or fail, with no possibility of an ambiguous in-between state, and
this does apply to defragment.

So if you're defragging a 10GiB file, you need 10GiB minimum free
space to COW those extents to a new, mostly contiguous, set of extents.
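For concreteness, the pre-check implied here could look roughly like the
sketch below. It is an illustration only: it uses plain stat(2)/statvfs(3)
and assumes a flat 1:1 free-space requirement, whereas btrfs's real
accounting (separate data and metadata chunks, shared extents) is more
involved.

/* Sketch: refuse a whole-file COW defrag unless the filesystem reports
 * at least as much free space as the file currently occupies.
 * Illustration only; btrfs's own data/metadata accounting is more complex. */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/statvfs.h>

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <file>\n", argv[0]);
                return 1;
        }

        struct stat st;
        struct statvfs vfs;
        if (stat(argv[1], &st) != 0 || statvfs(argv[1], &vfs) != 0) {
                perror("stat/statvfs");
                return 1;
        }

        unsigned long long used  = (unsigned long long)st.st_blocks * 512;
        unsigned long long avail = (unsigned long long)vfs.f_bavail * vfs.f_frsize;

        printf("file occupies %llu bytes, %llu bytes available\n", used, avail);
        if (avail < used) {
                fprintf(stderr, "not enough free space for a full COW rewrite\n");
                return 1;
        }
        return 0;
}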

False.

You can defragment just 1 GB of that file, and then just write out to
disk (in new extents) an entire new version of the b-trees.
Of course, you don't really need to do all that, as usually only a
small part of the b-trees needs to be updated.
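For reference, a range-limited request of exactly this kind can already be
expressed through the kernel's BTRFS_IOC_DEFRAG_RANGE ioctl, the interface
that btrfs-progs' defragment command drives. A minimal sketch asking for
only the first 1 GiB of a file (the 1 GiB figure mirrors the example above
and is otherwise arbitrary):

/* Sketch: ask the kernel to defragment only the first 1 GiB of a file
 * via BTRFS_IOC_DEFRAG_RANGE. Range and threshold values are
 * illustrative; error handling is minimal. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <file-on-btrfs>\n", argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_RDWR);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        struct btrfs_ioctl_defrag_range_args range;
        memset(&range, 0, sizeof(range));
        range.start = 0;                 /* start of the file             */
        range.len   = 1ULL << 30;        /* touch only the first 1 GiB    */
        range.extent_thresh = 0;         /* 0 lets the kernel pick its default */

        if (ioctl(fd, BTRFS_IOC_DEFRAG_RANGE, &range) < 0)
                perror("BTRFS_IOC_DEFRAG_RANGE");

        close(fd);
        return 0;
}

Looping this over successive start offsets is what a portion-at-a-time
defrag amounts to from user space.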

The `-l` option allows the user to choose a maximum amount to
defragment. Setting up a default defragment behavior that has a
variable outcome is not idempotent and probably not a good idea.

We are talking about a future, imagined defrag. It has no -l option, as
we haven't discussed one yet.

As for kernel behavior, it presumably could defragment in portions,
but it would have to completely update all affected metadata after
each e.g. 1GiB section, translating into 10 separate rewrites of file
metadata, all affected nodes, all the way up the tree to the super.
There is no such thing as metadata overwrites in Btrfs. You're
familiar with the wandering trees problem?

No, but it doesn't matter.

At worst, it just has to completely write out "all metadata", all the way up
to the super. It needs to be done just once, because what's the point of
writing it 10 times over? Then, the super is updated as the final commit.
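As an aside, user space can already ask for that single final commit:
BTRFS_IOC_SYNC (the ioctl behind `btrfs filesystem sync`) requests a commit
of the running transaction, and writing the superblock is the last step of
that commit. A minimal sketch, assuming the defrag requests have already
been issued:

/* Sketch: force one transaction commit after a batch of defrag work.
 * The kernel also commits transactions on its own schedule, so this is
 * illustrative only. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/btrfs.h>

int main(int argc, char **argv)
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s <any-path-on-the-filesystem>\n", argv[0]);
                return 1;
        }

        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) {
                perror("open");
                return 1;
        }

        /* ... a batch of BTRFS_IOC_DEFRAG_RANGE calls would go here ... */

        if (ioctl(fd, BTRFS_IOC_SYNC) < 0)
                perror("BTRFS_IOC_SYNC");

        close(fd);
        return 0;
}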

This is kind of a silly discussion.  The biggest extent possible on
btrfs is 128MB, and the incremental gains of forcing 128MB extents to
be consecutive are negligible.  If you're defragging a 10GB file, you're
just going to end up doing 80 separate defrag operations.
Do you have a source for this claim of a 128MB max extent size? Because
everything I've seen indicates the max extent size is a full data chunk
(so 1GB for the common case, potentially up to about 5GB for really big
filesystems).
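(Either way, the arithmetic behind the figure above is just file size over
maximum extent size: with 128 MiB extents a 10 GiB file is
10 * 1024 / 128 = 80 maximal extents, hence 80 operations; with 1 GiB
extents it would be 10.)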

128MB is big enough you're going to be seeking in the middle of reading
an extent anyway.  Once you have the file arranged in 128MB contiguous
fragments (or even a tenth of that on medium-fast spinning drives),
the job is done.

On my computer the ENTIRE METADATA is 1 GB. That would be very tolerable and
doable.

You must have a small filesystem...mine range from 16 to 156GB, a bit too
big to fit in RAM comfortably.

Don't forget you have to write new checksum and free space tree pages.
In the worst case, you'll need about 1GB of new metadata pages for each
128MB you defrag (though you get to delete 99.5% of them immediately
after).

But that is a very bad case, because usually not much metadata has to be
updated or written out to disk.

So, there is no problem.





