Re: [PATCH] btrfs: statfs: Don't reset f_bavail if we're over committing metadata space

On 2020/1/17 10:10 PM, David Sterba wrote:
> On Fri, Jan 17, 2020 at 09:32:49AM +0800, Qu Wenruo wrote:
>> On 2020/1/17 8:54 AM, Qu Wenruo wrote:
>>> On 2020/1/16 10:29 PM, David Sterba wrote:
>>>> On Wed, Jan 15, 2020 at 11:41:28AM +0800, Qu Wenruo wrote:
>>>>> [BUG]
>>>>> When a lot of metadata space is reserved, e.g. after balancing a
>>>>> data block with many extents, vanilla df reports 0 available space.
>>>>>
>>>>> [CAUSE]
>>>>> btrfs_statfs() would report 0 available space if its metadata space is
>>>>> exhausted.
>>>>> And the calculation is based on currently reserved space vs on-disk
>>>>> available space, with a small headroom as buffer.
>>>>> When there is not enough headroom, btrfs_statfs() will report 0
>>>>> available space.
>>>>>
>>>>> The problem is, since commit ef1317a1b9a3 ("btrfs: do not allow
>>>>> reservations if we have pending tickets"), we allow btrfs to over commit
>>>>> metadata space, as long as we have enough space to allocate new metadata
>>>>> chunks.
>>>>>
>>>>> This makes the old calculation unreliable, so it falsely reports 0
>>>>> available space.
>>>>>
>>>>> [FIX]
>>>>> Don't do such a naive check in btrfs_statfs() anymore.
>>>>> Also remove the comment about "0 available space when metadata is
>>>>> exhausted".
>>>>
>>>> This is intentional and was added to prevent a situation where 'df'
>>>> reports available space but exhausted metadata doesn't allow creating
>>>> a new inode.
>>>
>>> But this behavior itself is not accurate.
>>>
>>> We have global reservation, which is normally always larger than the
>>> immediate number 4M.
>>>
>>> So that check will never really be triggered.
>>>
>>> Thus invalidating most of your argument.
>>>>
>>>> If it gets removed you are trading one bug for another. With the changed
>>>> logic in the referenced commit, the metadata exhaustion is more likely
>>>> but it's also temporary.
>>
>> Furthermore, the point of the patch is, current check doesn't play well
>> with metadata over-commit.
> 
> The recent overcommit updates broke statfs in a new way and left us
> almost nothing to make it better.

In fact, it's not impossible to solve.

Exporting can_overcommit() would work pretty well in this particular case.
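
To illustrate the idea, here is a rough user-space model, not kernel code.
The struct, its fields, and all three helper names are made up for this
sketch; model_can_overcommit() is only a stand-in that assumes over-commit
is possible whenever one more metadata chunk fits into unallocated space,
which is simpler than what the real can_overcommit() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4M (4ULL * 1024 * 1024)

/* Hypothetical snapshot of the numbers a statfs-like check would look at. */
struct meta_snapshot {
	uint64_t free_meta;   /* free bytes inside allocated metadata chunks */
	uint64_t global_rsv;  /* size of the global reserve */
	uint64_t unallocated; /* device space not yet assigned to any chunk */
	uint64_t meta_chunk;  /* assumed size of one new metadata chunk */
};

/* Old heuristic: no headroom left beyond the global reserve plus 4M. */
static bool meta_looks_exhausted(const struct meta_snapshot *s)
{
	return s->free_meta < s->global_rsv + SZ_4M;
}

/* Stand-in for can_overcommit(): is there room for another metadata chunk? */
static bool model_can_overcommit(const struct meta_snapshot *s)
{
	assert(s->meta_chunk > 0);
	return s->unallocated >= s->meta_chunk;
}

/* Zero f_bavail only if the heuristic fires AND over-commit is impossible. */
static bool should_zero_bavail(const struct meta_snapshot *s)
{
	return meta_looks_exhausted(s) && !model_can_overcommit(s);
}
```

With 8M free metadata, a 16M global reserve and 10G unallocated, the old
heuristic alone would fire, but should_zero_bavail() returns false because a
new metadata chunk can still be allocated.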

> 
>> If it's before v5.4, I won't touch the check considering it will never
>> hit anyway.
>>
>> But now for v5.4, either:
>> - We over-commit metadata
>>   Meaning we have unallocated space, nothing to worry
> 
> Can we estimate how much unallocated data are there? I don't know how,
> and "nothing to worry" always worries me.

Data never over-commits. We always ensure that enough data chunks are
allocated before we allocate data extents.

> 
>> - No more space for over-commit
>>   But in that case, we still have global rsv to update essential trees.
>>   Please note that, btrfs should never fall into a status where no files
>>   can be deleted.
> 
> Of course, the global reserve is there for last resort actions and will
> be used for deletion and updating essential trees. What statfs says is
> how much data is there left for the user. New files, writing more data
> etc.
> 
>> Consider all these, we're no longer able to really hit that case.
>>
>> So that's why I'm proposing deleting that. I see no reason why that
>> magic number 4M would still work nowadays.
> 
> So, the corner case that resulted in the guesswork needs to be
> reevaluated then, the space reservations and related updates clearly
> affect that. That's out of 5.5-rc timeframe though.

Although we can still solve the problem using only the facilities in v5.5,
I'm still not happy with the idea that "one exhausted resource results in
a different resource being reported as exhausted".

I still believe that we should report different things independently.
(Which obviously makes our lives easier in the statfs case.)

That's also why we require reporters to include the 'btrfs fi df' result
rather than vanilla 'df', because we have different internals.

Or, can we reuse the f_files/f_ffree facility to report metadata space,
and forget all this mess?
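
For reference, this is where user space would see those fields. A minimal
sketch using POSIX statvfs(); print_statvfs() is just a name picked for
this example, and what btrfs would put into f_files/f_ffree under such a
scheme is left open here.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/statvfs.h>

/*
 * Print the space and inode counters of the filesystem containing path.
 * Returns 0 on success, -1 if statvfs() fails.
 */
static int print_statvfs(const char *path)
{
	struct statvfs st;

	assert(path != NULL);
	if (statvfs(path, &st) != 0) {
		perror("statvfs");
		return -1;
	}

	/* f_bavail is counted in units of f_frsize bytes. */
	printf("f_bavail: %llu blocks of %lu bytes\n",
	       (unsigned long long)st.f_bavail, (unsigned long)st.f_frsize);
	/* Inode counters, the fields that could carry metadata information. */
	printf("f_files:  %llu\n", (unsigned long long)st.f_files);
	printf("f_ffree:  %llu\n", (unsigned long long)st.f_ffree);
	return 0;
}
```

Calling print_statvfs("/mnt/btrfs") (the mount point is only an example)
shows the same numbers that 'df' and 'df -i' are derived from.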

Thanks,
Qu


