On Fri, Mar 13, 2020 at 03:28:48PM -0400, Josef Bacik wrote:
> Nikolay noticed a bunch of test failures with my global rsv steal
> patches. At first he thought they were introduced by them, but they've
> been failing for a while with 64k nodes.
>
> The problem is that with 64k nodes we have a global reserve that
> calculates out to 13MiB on a freshly made file system, which only has
> 8MiB of metadata space. Because of changes I previously made, we no
> longer account for the global reserve in the overcommit logic, which
> means we correctly allow overcommit to happen even though we are
> already overcommitted.
>
> However, in some corner cases, for example btrfs/170, we will fill the
> entire file system with data chunks before we have enough space
> pressure to allocate a metadata chunk. Then once the fs is full we
> ENOSPC out because we cannot overcommit and the global reserve is taking
> up all of the available space.
>
> The ideal way to deal with this is to change our space reservation
> code to take into account the height of the trees that we're
> modifying, so that our global reserve calculation does not end up so
> obscenely large.
>
> However that is a huuuuuuge undertaking. Instead fix this by forcing a
> chunk allocation if the global reserve is larger than the total metadata
> space. This gives us essentially the same behavior that happened
> before: we get a chunk allocated and these tests can pass.
>
> This is meant to be a stop-gap measure until we can tackle the "tree
> height only" project.
>
> Fixes: 0096420adb03 ("btrfs: do not account global reserve in can_overcommit")
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
Added to misc-next, thanks.
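
For readers following the thread, the stop-gap condition described in the
quoted message (force a chunk allocation when the global reserve is larger
than the total metadata space) can be modelled roughly as below. This is
only an illustrative sketch, not the code from the patch: the struct, its
fields, and the function name are hypothetical, and the real kernel logic
works on btrfs's own space-info and block-reserve structures.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of the two quantities the commit
 * message compares. These are NOT the kernel's data structures.
 */
struct space_model {
	uint64_t metadata_total_bytes;  /* total metadata space allocated so far */
	uint64_t global_reserve_bytes;  /* size the global reserve calculates out to */
};

/*
 * Return true when a metadata chunk allocation should be forced, i.e.
 * when the global reserve alone is larger than all of the metadata
 * space. Hypothetical helper name, used here for illustration only.
 */
static bool should_force_metadata_chunk(const struct space_model *m)
{
	return m->global_reserve_bytes > m->metadata_total_bytes;
}
```

With the numbers from the message (a 13MiB global reserve against 8MiB of
metadata space) the check is true, so a chunk would be allocated. Once
total metadata space is at least as large as the reserve, it is false.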