On Sun, Sep 15, 2019 at 02:05:47PM -0400, General Zed wrote:
>
> Quoting Zygo Blaxell <ce3g8jdj@xxxxxxxxxxxxxxxxxxxxx>:
> > 3% of 45TB is 1.35TB...seems a little harsh. Recall no extent can be
> > larger than 128MB, so we're talking about enough space for ten thousand
> > of defrag's worst-case output extents. A limit based on absolute numbers
> > might make more sense, though the only way to really know what the limit is
> > on any given filesystem is to try to reach it.
>
> Nah.
>
> The free space minimum limit must, unfortunately, be based on percentages
> rather than absolute numbers. There is no better way. The problem is that,
> in order for defrag to work, it has to (partially) consolidate some of the
> free space, in order to produce a contiguous free area which will be the
> destination for the defragmented data.

One quirk of btrfs is that it has two levels of allocation: it divides
disks into multi-GB block groups, then allocates extents within the
block groups. Any unallocated space on the disks ("unallocated" meaning
"not allocated to a block group") is contiguous, so as long as there is
unallocated space, there are guaranteed to be contiguous areas at least
8 times the maximum extent size to defrag into (a new data block group
is at least 1 GB, i.e. 8 times the 128 MB extent size limit). So 3%
free space on a big disk ("big" meaning "relative to the maximum extent
size") can mean a lot of contiguous space left, more than enough room
to defrag while moving each extent exactly once.
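
A back-of-the-envelope sketch of that guarantee (a toy program, not
btrfs code; the 1 GB data block group size is an assumption, and the
1.35 TB figure is just the 3%-of-45TB example from above):

#include <stdio.h>
#include <stdint.h>

#define MAX_EXTENT (128ULL << 20)   /* btrfs maximum extent size */
#define BG_SIZE    (1ULL << 30)     /* assumed data block group size */

int main(void)
{
    uint64_t unallocated = 1350ULL << 30;   /* ~1.35 TB unallocated */
    uint64_t new_bgs = unallocated / BG_SIZE;

    /* Each block group carved from unallocated space is one contiguous
     * area, and holds BG_SIZE / MAX_EXTENT worst-case output extents. */
    printf("contiguous 128 MB areas guaranteed: %llu\n",
           (unsigned long long)(new_bgs * (BG_SIZE / MAX_EXTENT)));
    return 0;
}

which prints 10800, i.e. the "ten thousand worst-case output extents"
mentioned above.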
Not necessarily, of course: if you fill all the way to 100%, there's no
unallocated space any more, and if you then delete 3% of the data at
random, you have a severe fragmentation problem (every block group is
still ~97% occupied) and no space to fix it (no unallocated space left
from which to create new block groups).
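
Concretely, with the 45 TB example and assuming 1 GB block groups: 3%
of 45 TB is ~1.35 TB of freed space spread across ~46,000 block groups,
i.e. roughly 30 MB of free space per block group on average, well short
of the 128 MB needed to place even one worst-case defrag output extent.
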
> In order to be able to produce this contiguous free space area, it is of
> utmost importance that there is sufficient free space left on the partition.
> Otherwise, this free space consolidation operation will take too much time
> (too much disk I/O). There is no good way around it in the common cases
> of free space fragmentation.
>
> If you reduce the free space minimum limit below 3%, you are likely to spend
> 2x more I/O in consolidating free space than what is needed to actually
> defrag the data. I mean, the defrag will still work, but I think that the
> slowdown is unacceptable.
>
> I mean, the user should just free some space! A filesystem should not be
> left with less than 10% free space; that's simply bad management on the
> user's part, and the user should accept the consequences.

Well, yes, the performance of the allocator drops exponentially once
you go past 90% usage of the allocated block groups (there's no
optimization like a free-space btree with lengths as keys).
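
For illustration, here's roughly what such an index would buy (a toy
sketch, not btrfs code; as I understand it the real free space tree is
keyed by offset, not length): with free extents ordered by length, a
best-fit allocation becomes a binary search instead of a walk over all
free extents.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

struct free_extent {
    uint64_t offset;
    uint64_t length;
};

/* Sort comparator: ascending extent length. */
static int by_length(const void *a, const void *b)
{
    const struct free_extent *x = a, *y = b;

    return (x->length > y->length) - (x->length < y->length);
}

/* Lower-bound search: smallest free extent with length >= need. */
static struct free_extent *best_fit(struct free_extent *ext, size_t n,
                                    uint64_t need)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (ext[mid].length < need)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < n ? &ext[lo] : NULL;
}

int main(void)
{
    struct free_extent ext[] = {
        { 4096,       12ULL << 20 },
        { 1ULL << 30, 200ULL << 20 },
        { 3ULL << 30, 64ULL << 20 },
    };
    size_t n = sizeof(ext) / sizeof(ext[0]);
    struct free_extent *hit;

    qsort(ext, n, sizeof(ext[0]), by_length);
    hit = best_fit(ext, n, 128ULL << 20);   /* one worst-case extent */
    if (hit)
        printf("best fit: offset %llu, %llu MB long\n",
               (unsigned long long)hit->offset,
               (unsigned long long)(hit->length >> 20));
    else
        printf("no contiguous area large enough\n");
    return 0;
}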