On 2020/1/7 11:09 PM, Josef Bacik wrote:
> On 1/7/20 6:08 AM, Qu Wenruo wrote:
>>
>>
>> On 2020/1/7 12:50 AM, Josef Bacik wrote:
>>> btrfs/061 has been failing consistently for me recently with a
>>> transaction abort. We run out of space in the system chunk array, which
>>> means we've allocated far more system chunks than we need.
>>
>> Isn't that caused by scrubbing creating unnecessary system chunks?
>>
>> IIRC I had a patch to address that problem by simply not allocating
>> system chunks for scrub.
>> ("btrfs: scrub: Don't check free space before marking a block group RO")
>>
>
> This addresses the symptoms, not the root cause of the problem. Your
> fix is valid, because we probably shouldn't be doing that, but we also
> shouldn't be forcing restriping of block groups arbitrarily.
>
>> Although that doesn't address the whole problem, it should at least
>> reduce the possibility.
>>
>>
>> Furthermore, with the newer over-commit behavior for inc_block_group_ro
>> ("btrfs: use btrfs_can_overcommit in inc_block_group_ro"), we won't
>> really allocate new system chunks anymore if we can over-commit.
>>
>> With those two patches, I guess we should have solved the problem.
>> Or did I miss something?
>>
> You are missing that we're getting forced to allocate a system chunk
> from this
>
> 	alloc_flags = update_block_group_flags(fs_info, cache->flags);
> 	if (alloc_flags != cache->flags) {
> 		ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
>
> which you move down in your patch, but will still get tripped by
> rebalance. So you sort of paper over the real problem, we just don't
> get bitten by it as hard with 061 because balance takes longer than
> scrub does. If we let it run longer per fs type we'd still hit the same
> problem.
>
> In short, your patches do make it better, and are definitely correct
> because we probably shouldn't be allocating new chunks for scrub, but
> they don't address the real cause of the problem. All the patches are
> needed. Thanks,

Indeed.

Then the patch looks good to me.

Reviewed-by: Qu Wenruo <wqu@xxxxxxxx>

And thanks again for fixing the missing piece of the unnecessary chunk
allocation.

Thanks,
Qu

>
> Josef
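
For readers following the thread, the failure mode discussed above can be
sketched as a toy model. This is not kernel code: every toy_* name is
hypothetical, the 2048-byte capacity and 97-byte entry size are assumed
values chosen for illustration, and the model assumes that each forced
allocation adds one entry to the system chunk array and that nothing is ever
removed from it. It only shows why forcing an allocation on every flag
mismatch eventually fills a fixed-size array, while skipping the forced
allocation does not.

```c
#include <assert.h>

/* Assumed sizes, for illustration only (not taken from the thread). */
#define TOY_SYS_ARRAY_SIZE 2048u /* capacity of the system chunk array, bytes */
#define TOY_ENTRY_SIZE     97u   /* space used by one chunk entry, bytes */

/* Hypothetical filesystem state: only tracks array usage. */
struct toy_fs {
	unsigned int sys_array_used;
};

/* Hypothetical stand-in for a system chunk allocation.
 * Returns 0 on success, -1 when the array cannot hold another entry. */
static int toy_alloc_system_chunk(struct toy_fs *fs)
{
	if (fs->sys_array_used + TOY_ENTRY_SIZE > TOY_SYS_ARRAY_SIZE)
		return -1;
	fs->sys_array_used += TOY_ENTRY_SIZE;
	return 0;
}

/* Hypothetical stand-in for marking one block group read-only.
 * flags_differ models "alloc_flags != cache->flags";
 * force_on_mismatch models whether a mismatch forces an allocation. */
static int toy_mark_ro(struct toy_fs *fs, int flags_differ,
		       int force_on_mismatch)
{
	if (flags_differ && force_on_mismatch)
		return toy_alloc_system_chunk(fs);
	return 0;
}

/* Marks up to `rounds` block groups read-only, each with mismatched flags.
 * Returns how many calls succeeded before the first failure
 * (equal to `rounds` if none failed). */
static int toy_run(int rounds, int force_on_mismatch)
{
	struct toy_fs fs = { 0 };
	int i;

	assert(rounds >= 0);
	for (i = 0; i < rounds; i++) {
		if (toy_mark_ro(&fs, 1, force_on_mismatch) < 0)
			break;
	}
	return i;
}
```

Under these assumptions, toy_run(100, 1) stops after 21 successful calls
because a 22nd entry no longer fits, whereas toy_run(100, 0) completes all
100 rounds. A longer-running operation simply reaches the limit later, which
matches the point that balance would hit the same problem given enough time.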
