On 2020/7/2 下午9:56, Josef Bacik wrote:
> On 7/2/20 9:50 AM, Qu Wenruo wrote:
>>
>>
>> On 2020/7/2 下午9:40, Josef Bacik wrote:
>>> On 7/1/20 8:14 PM, Qu Wenruo wrote:
>>>> [PROBLEM]
>>>> Before this patch, when btrfs_qgroup_reserve_data() fails, we free all
>>>> reserved space of the changeset.
>>>>
>>>> This means the following call is not possible:
>>>> ret = btrfs_qgroup_reserve_data();
>>>> if (ret == -EDQUOT) {
>>>> /* Do something to free some qgroup space */
>>>> ret = btrfs_qgroup_reserve_data();
>>>> }
>>>>
>>>> As if the first btrfs_qgroup_reserve_data() fails, it will free all
>>>> reserved qgroup space, so the next btrfs_qgroup_reserve_data() will
>>>> always success, and can go beyond qgroup limit.
>>>>
>>>> [CAUSE]
>>>> This is mostly due to the fact that we don't have a good way to revert
>>>> changeset modification accurately.
>>>>
>>>> Currently the changeset infrastructure is implemented using ulist,
>>>> which
>>>> can only store two u64 values, used as start and length for each
>>>> changed
>>>> extent range.
>>>>
>>>> So we can't record which changed extent is modified in last
>>>> modification, thus unable to revert to previous status.
>>>>
>>>> [FIX]
>>>> This patch will re-implement using pure rbtree, adding a new member,
>>>> changed_extent::seq, so we can remove changed extents which is
>>>> modified in previous modification.
>>>>
>>>> This allows us to implement qgroup_revert(), which allow btrfs to
>>>> revert
>>>> its modification to the io_tree.
>>>>
>>>
>>> I'm having a hard time groking what's going on here. These changesets
>>> are limited to a [start, end] range correct?
>>
>> Yes, but we may be only responsible for part of the changeset.
>>
>> One example is we want to falloc range [0, 16K)
>> And [0, 4K), [8K, 12K) has already one existing file extent.
>>
>> Then we succeeded in allocating space for [4K, 8K), but failed to
>> allocating space for [8K, 12K).
Sorry, this should be [12K, 16K).
>>
>> In that case, if we just return EDQUOT and clear the range for [4K, 8k)
>> and [8K, 12K), everything is completely fine.
>>
>> But if we want to retry, then we should only clear the range for [8K,
>> 12K), but not to clear [4K, 8K).
>>
>> That's what we need for the next patch, just revert what we did in
>> previous set_extent_bit(), but not touching previously set bits.
>>
>
> Ok so how do we get to this case then? The changeset already has the
> range we're responsible for correct?
> Why do we need the sequence
> number? Like we should have a changeset for [4k, 8k) and one for [8k,
> 12k) right? Or are we handed back the whole thing? If we fail _just_
> for the [8k, 12k) part, do we know that? Or do we just know our whole
> [0k, 16k) thing failed, and so we need all this convoluted tracking to
> be able to retry for the range that we fucked up, and this is what you
> are facilitating? Thanks,
Oh, I got your point.
Right, when the failure happens, the range passed in is not [0, 16K),
but the last failed one.
We can just check the range, free any existing entry in the failure
range, and call it a day...
And I made the case over complicated them.
Thanks for the advice, I'll try to use the existing ulist mechanism to
do the same thing.
Thanks,
Qu
>
> Josef
Attachment:
signature.asc
Description: OpenPGP digital signature
