On 2018年02月22日 09:50, Jeff Mahoney wrote:
> On 2/21/18 8:36 PM, Qu Wenruo wrote:
>>
>>
>> On 2018年02月22日 04:19, jeffm@xxxxxxxx wrote:
>>> From: Jeff Mahoney <jeffm@xxxxxxxx>
>>>
>>> There are several places where we call btrfs_qgroup_reserve_meta and
>>> assume that a return value of 0 means that the reservation was successful.
>>>
>>> Later, we use the original bytes value passed to that call to free
>>> bytes during error handling or to pass the number of bytes reserved to
>>> the caller.
>>>
>>> This patch returns -ENODATA when we don't perform a reservation so that
>>> callers can make the distinction. This also lets call sites not
>>> necessarily care whether qgroups are enabled.
>>
>> IMHO if we don't need to reserve, returning 0 seems good enough.
>> Caller doesn't really need to care if it has reserved some bytes.
>>
>> Or is there any special case where we need to distinguish such case?
>
> Anywhere where the reservation takes place prior to the transaction
> starting, which is pretty much everywhere. We wait until transaction
> commit to flip the bit to turn on quotas, which means that if a
> transaction commits that enables quotas lands in between the reservation
> being take and any error handling that involves freeing the reservation,
> we'll end up with an underflow.
So the same case as btrfs_qgroup_reserve_data().
In that case we could use ret > 0 to indicates the real bytes we
reserved, instead of -ENODATA which normally means error.
>
> This is the first patch of a series I'm working on, but it can stand
> alone. The rest is the patch set I mentioned when we talked a few
> months ago where the lifetimes of reservations are incorrect. We can't
> just drop all the reservations at the end of the transaction because 1)
> the lifetime of some reservations can cross transactions and 2) because
> especially in the start_transaction case, we do the reservation prior to
> waiting to join the transaction. So if the transaction we're waiting on
> commits, our reservation goes away with it but we continue on as if we
> still have it.
Right, the same problem I also addressed in patchset "[PATCH v2 00/10]
Use split qgroup rsv type".
In 6th patch, "[PATCH v2 06/10] btrfs: qgroup: Use
root->qgroup_meta_rsv_* to record qgroup meta reserved space" qgroup
meta reserve will only be increased if we succeeded in reserving
metadata, so later free won't underflow that number.
Thanks,
Qu
>
> -Jeff
>
>> Thanks,
>> Qu
>>
>>>
>>> Signed-off-by: Jeff Mahoney <jeffm@xxxxxxxx>
>>> ---
>>> fs/btrfs/extent-tree.c | 33 ++++++++++++++++-----------------
>>> fs/btrfs/qgroup.c | 4 ++--
>>> fs/btrfs/transaction.c | 5 ++++-
>>> 3 files changed, 22 insertions(+), 20 deletions(-)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index c1618ab9fecf..2d5e963fae88 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -5988,20 +5988,18 @@ int btrfs_subvolume_reserve_metadata(struct btrfs_root *root,
>>> u64 *qgroup_reserved,
>>> bool use_global_rsv)
>>> {
>>> - u64 num_bytes;
>>> int ret;
>>> struct btrfs_fs_info *fs_info = root->fs_info;
>>> struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
>>> + /* One for parent inode, two for dir entries */
>>> + u64 num_bytes = 3 * fs_info->nodesize;
>>>
>>> - if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) {
>>> - /* One for parent inode, two for dir entries */
>>> - num_bytes = 3 * fs_info->nodesize;
>>> - ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
>>> - if (ret)
>>> - return ret;
>>> - } else {
>>> + ret = btrfs_qgroup_reserve_meta(root, num_bytes, true);
>>> + if (ret == -ENODATA) {
>>> num_bytes = 0;
>>> - }
>>> + ret = 0;
>>> + } else if (ret)
>>> + return ret;
>>>
>>> *qgroup_reserved = num_bytes;
>>>
>>> @@ -6057,6 +6055,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
>>> enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_FLUSH_ALL;
>>> int ret = 0;
>>> bool delalloc_lock = true;
>>> + u64 qgroup_reserved;
>>>
>>> /* If we are a free space inode we need to not flush since we will be in
>>> * the middle of a transaction commit. We also don't need the delalloc
>>> @@ -6090,17 +6089,17 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
>>> btrfs_calculate_inode_block_rsv_size(fs_info, inode);
>>> spin_unlock(&inode->lock);
>>>
>>> - if (test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags)) {
>>> - ret = btrfs_qgroup_reserve_meta(root,
>>> - nr_extents * fs_info->nodesize, true);
>>> - if (ret)
>>> - goto out_fail;
>>> - }
>>> + qgroup_reserved = nr_extents * fs_info->nodesize;
>>> + ret = btrfs_qgroup_reserve_meta(root, qgroup_reserved, true);
>>> + if (ret == -ENODATA) {
>>> + ret = 0;
>>> + qgroup_reserved = 0;
>>> + } if (ret)
>>> + goto out_fail;
>>>
>>> ret = btrfs_inode_rsv_refill(inode, flush);
>>> if (unlikely(ret)) {
>>> - btrfs_qgroup_free_meta(root,
>>> - nr_extents * fs_info->nodesize);
>>> + btrfs_qgroup_free_meta(root, qgroup_reserved);
>>> goto out_fail;
>>> }
>>>
>>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
>>> index aa259d6986e1..5d9e011243c6 100644
>>> --- a/fs/btrfs/qgroup.c
>>> +++ b/fs/btrfs/qgroup.c
>>> @@ -3025,7 +3025,7 @@ int btrfs_qgroup_reserve_meta(struct btrfs_root *root, int num_bytes,
>>>
>>> if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
>>> !is_fstree(root->objectid) || num_bytes == 0)
>>> - return 0;
>>> + return -ENODATA;
>>>
>>> BUG_ON(num_bytes != round_down(num_bytes, fs_info->nodesize));
>>> trace_qgroup_meta_reserve(root, (s64)num_bytes);
>>> @@ -3057,7 +3057,7 @@ void btrfs_qgroup_free_meta(struct btrfs_root *root, int num_bytes)
>>> struct btrfs_fs_info *fs_info = root->fs_info;
>>>
>>> if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags) ||
>>> - !is_fstree(root->objectid))
>>> + !is_fstree(root->objectid) || num_bytes == 0)
>>> return;
>>>
>>> BUG_ON(num_bytes != round_down(num_bytes, fs_info->nodesize));
>>> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
>>> index 04f07144b45c..ab67b73bd7fa 100644
>>> --- a/fs/btrfs/transaction.c
>>> +++ b/fs/btrfs/transaction.c
>>> @@ -510,7 +510,10 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
>>> qgroup_reserved = num_items * fs_info->nodesize;
>>> ret = btrfs_qgroup_reserve_meta(root, qgroup_reserved,
>>> enforce_qgroups);
>>> - if (ret)
>>> + if (ret == -ENODATA) {
>>> + ret = 0;
>>> + qgroup_reserved = 0;
>>> + } else if (ret)
>>> return ERR_PTR(ret);
>>>
>>> num_bytes = btrfs_calc_trans_metadata_size(fs_info, num_items);
>>>
>>
>
>
Attachment:
signature.asc
Description: OpenPGP digital signature
