On Mon, Mar 13, 2017 at 12:27 PM, Goldwyn Rodrigues <rgoldwyn@xxxxxxx> wrote:
>
>
> On 03/13/2017 07:14 AM, Filipe Manana wrote:
>> On Sat, Mar 11, 2017 at 11:02 PM, Goldwyn Rodrigues <rgoldwyn@xxxxxxx> wrote:
>>> From: Goldwyn Rodrigues <rgoldwyn@xxxxxxxx>
>>>
>>> We are facing the same problem with EDQUOT which was experienced with
>>> ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but
>>> here is a fix. Let me know if it is too big a hammer.
>>>
>>> Quotas are reserved during the start of an operation, incrementing
>>> qg->reserved. However, it is written to disk in a commit_transaction
>>> which could take as long as commit_interval. In the meantime there
>>> could be deletions which are not accounted for because deletions are
>>> accounted for only while committed (free_refroot). So, when we get
>>> a EDQUOT flush the data to disk and try again.
>>
>> So this doesn't explain why the problem happens, taking into account
>> the reproducer below.
>> You mention that deletions are only accounted at transaction commit
>> time, but the reproducer does not do any deletions.
>
> Yes it does, after Cleanup. It is a loop. I suppose you got confused by
> the indent.
Indeed, the indentation is screwed up.
>
>> So why does the reproducer gets EDQUOT errors? The second sentence
>> "However, it is written to disk in a commit_transaction
>> which could take as long as commit_interval." also does not explain
>> why is that a problem, why it makes the reproducer get EDQUOT errors.
>>
>> I would like to see a clear and complete explanation in the changelog.
>
> I suppose this comment is irrelevant now?
Yes.
>
>>
>>>
>>> I combined the conditions of rfer and excl to reduce code lines, though
>>> the condition looks uglier.
>>
>> This sentence is outdated, it's no longer valid for v2 as it now calls
>> qgroup_check_limits().
>>
>>>
>>> Here is a sample script which shows this issue.
>>>
>>> DEVICE=/dev/vdb
>>> MOUNTPOINT=/mnt
>>> TESTVOL=$MOUNTPOINT/tmp
>>> QUOTA=5
>>> PROG=btrfs
>>> DD_BS="4k"
>>> DD_COUNT="256"
>>> RUN_TIMES=5000
>>>
>>> mkfs.btrfs -f $DEVICE
>>> mount -o commit=240 $DEVICE $MOUNTPOINT
>>> $PROG subvolume create $TESTVOL
>>> $PROG quota enable $TESTVOL
>>> $PROG qgroup limit ${QUOTA}G $TESTVOL
>>>
>>> typeset -i DD_RUN_GOOD
>>> typeset -i QUOTA
>>>
>>> function _check_cmd() {
>>> if [[ ${?} > 0 ]]; then
>>> echo -n "$(date) E: Running previous command"
>>> echo ${*}
>>> echo "Without sync"
>>> $PROG qgroup show -pcreFf ${TESTVOL}
>>> echo "With sync"
>>> $PROG qgroup show -pcreFf --sync ${TESTVOL}
>>> exit 1
>>> fi
>>> }
>>>
>>> while true; do
>>> DD_RUN_GOOD=$RUN_TIMES
>>>
>>> while (( ${DD_RUN_GOOD} != 0 )); do
>>> dd if=/dev/zero of=${TESTVOL}/quotatest${DD_RUN_GOOD} bs=${DD_BS} count=${DD_COUNT}
>>> _check_cmd "dd if=/dev/zero of=${TESTVOL}/quotatest${DD_RUN_GOOD} bs=${DD_BS} count=${DD_COUNT}"
>>> DD_RUN_GOOD=(${DD_RUN_GOOD}-1)
>>> done
>>>
>>> $PROG qgroup show -pcref $TESTVOL
>>> echo "----------- Cleanup ---------- "
>>> rm $TESTVOL/quotatest*
>
> Here is the deletion.
>
>>>
>>> done
>
> And the infinite loop ending.
>
>>
>> Please write a test case for fstests.
>
> You mean xfstests. will do.
xfstests got renamed to fstests long time ago, and obviously it's
xfstests as that's the test suite we all use.
thanks
>
>>
>>>
>>> Changes since v1:
>>> - Changed start_delalloc_roots() to start_delalloc_inode() to target
>>> the root in question only to reduce the amount of flush to be done.
>>> - Added wait_ordered_extents().
>>
>> Listing what changed between versions is not meant to appear in the
>> changelog. This should go after the --- below. You should also prefix
>> the patch subject with "btrfs:" (or "btrfs-progs:"). See the examples
>> from others or the wiki for intsructions
>> (https://btrfs.wiki.kernel.org/index.php/Writing_patch_for_btrfs#Updating_patches).
>>
>
> Ok, understood. Thanks.
>
>> thanks
>>
>>>
>>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@xxxxxxxx>
>>> Reviewed-by: Qu Wenruo <quwenruo@xxxxxxxxxxxxxx>
>>> ---
>>> fs/btrfs/qgroup.c | 15 +++++++++++++++
>>> 1 file changed, 15 insertions(+)
>>>
>>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>>> index a5da750..6215264 100644
>>> --- a/fs/btrfs/qgroup.c
>>> +++ b/fs/btrfs/qgroup.c
>>> @@ -2367,6 +2367,7 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce)
>>> struct btrfs_fs_info *fs_info = root->fs_info;
>>> u64 ref_root = root->root_key.objectid;
>>> int ret = 0;
>>> + int retried = 0;
>>> struct ulist_node *unode;
>>> struct ulist_iterator uiter;
>>>
>>> @@ -2376,6 +2377,7 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce)
>>> if (num_bytes == 0)
>>> return 0;
>>>
>>> +retry:
>>> spin_lock(&fs_info->qgroup_lock);
>>> quota_root = fs_info->quota_root;
>>> if (!quota_root)
>>> @@ -2402,6 +2404,19 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce)
>>> qg = unode_aux_to_qgroup(unode);
>>>
>>> if (enforce && !qgroup_check_limits(qg, num_bytes)) {
>>> + if (!retried && qg->reserved > 0) {
>>> + struct btrfs_trans_handle *trans;
>>> + spin_unlock(&fs_info->qgroup_lock);
>>> + btrfs_start_delalloc_inodes(root, 0);
>>> + btrfs_wait_ordered_extents(root, -1, 0,
>>> + (u64)-1);
>>> + trans = btrfs_join_transaction(root);
>>> + if (IS_ERR(trans))
>>> + return PTR_ERR(trans);
>>> + btrfs_commit_transaction(trans);
>>> + retried++;
>>> + goto retry;
>>> + }
>>> ret = -EDQUOT;
>>> goto out;
>>> }
>>> --
>>> 2.10.2
>>>
>
> --
> Goldwyn
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html