Re: [PATCH v3] btrfs: qgroup: Remove qgroup items along with subvolume deletion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2018年08月08日 10:56, Misono Tomohiro wrote:
> On 2018/08/08 1:23, Lu Fengqi wrote:
>> On Mon, Aug 06, 2018 at 01:53:28PM +0900, Misono Tomohiro wrote:
>>> When qgroup is on, subvolume deletion does not remove qgroup items
>>> of the subvolume (qgroup info, limit, relation) from quota tree and
>>> they need to get removed manually by "btrfs qgroup destroy".
>>>
>>> Since level 0 qgroup cannot be used/inherited by any other subvolume,
>>> let's remove them automatically when subvolume is deleted
>>> (to be precise, when the subvolume root is dropped).
>>>
>>> Reviewed-by: Lu Fengqi <lufq.fnst@xxxxxxxxxxxxxx>
>>> Reviewed-by: Qu Wenruo <wqu@xxxxxxxx>
>>> Signed-off-by: Misono Tomohiro <misono.tomohiro@xxxxxxxxxxxxxx>
>>> ---
>>> v2 -> v3:
>>>  Use root->root_key.objectid instead of root->objectid
>>>  Add Reviewed-by tag
>>>
>>> v1 -> v2:
>>>  Move call of btrfs_remove_qgroup() from btrfs_delete_subvolume()
>>>  to btrfs_snapshot_destroy() so that it will be called after the
>>>  subvolume root is really dropped
>>>
>>> fs/btrfs/extent-tree.c | 16 ++++++++++++----
>>> 1 file changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>> index 9e7b237b9547..48edf839ed2c 100644
>>> --- a/fs/btrfs/extent-tree.c
>>> +++ b/fs/btrfs/extent-tree.c
>>> @@ -8871,12 +8871,13 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>> 	struct btrfs_root_item *root_item = &root->root_item;
>>> 	struct walk_control *wc;
>>> 	struct btrfs_key key;
>>> +	u64 objectid = root->root_key.objectid;
>>> 	int err = 0;
>>> 	int ret;
>>> 	int level;
>>> 	bool root_dropped = false;
>>>
>>> -	btrfs_debug(fs_info, "Drop subvolume %llu", root->objectid);
>>> +	btrfs_debug(fs_info, "Drop subvolume %llu", objectid);
>>>
>>> 	path = btrfs_alloc_path();
>>> 	if (!path) {
>>> @@ -9030,7 +9031,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>> 		goto out_end_trans;
>>> 	}
>>>
>>> -	if (root->root_key.objectid != BTRFS_TREE_RELOC_OBJECTID) {
>>> +	if (objectid != BTRFS_TREE_RELOC_OBJECTID) {
>>> 		ret = btrfs_find_root(tree_root, &root->root_key, path,
>>> 				      NULL, NULL);
>>> 		if (ret < 0) {
>>> @@ -9043,8 +9044,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>> 			 *
>>> 			 * The most common failure here is just -ENOENT.
>>> 			 */
>>> -			btrfs_del_orphan_item(trans, tree_root,
>>> -					      root->root_key.objectid);
>>> +			btrfs_del_orphan_item(trans, tree_root, objectid);
>>> 		}
>>> 	}
>>>
>>> @@ -9056,6 +9056,14 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>> 		btrfs_put_fs_root(root);
>>> 	}
>>> 	root_dropped = true;
>>> +
>>> +	 /* Remove level-0 qgroup items since no other subvolume can use them */
>>> +	ret = btrfs_remove_qgroup(trans, objectid);
>>> +	if (ret && ret != -EINVAL && ret != -ENOENT) {
>>
>> I'm sorry for missing the snapshot case. If it is a snapshot, then when
>> we remove the relevant qgroup, we will not be able to perform
>> quick_update_accounting(), and it will return 1. So we shouldn't abort
>> the transaction when the return value = 1.
>>
>> btrfs_remove_qgroup
>> -> __del_qgroup_relation
>>    -> quick_update_accounting << if qgroup->excl != qgroup->rfer; return 1
>>
> 
> oh, thanks for pointing out, I missed that case...
> I wonder if there is a way to keep qgroup consistency without enforcing rescan. 

In theory, it's possible by just dirtying all extents of related level 0
qgroups.

In practical, not really possible.
As above method will hang current transaction almost forever.

> 
> btw, I noticed that removing qgroup relation when excl != rfer causes
> some problem without this patch:
> 
> (4.18-rc7)
> $ mkfs.btrfs -fq $DEV
> $ mount $DEV /mnt
> 
> $ btrfs quota enable /mnt
> $ btrfs qgroup create 1/0 /mnt
> $ btrfs sub create /mnt/sub
> $ btrfs qgroup assign 0/257 1/0 /mnt
> 
> $ dd if=/dev/urandom of=/mnt/sub/file bs=1k count=1000
> $ btrfs sub snap /mnt/sub /mnt/snap

In fact, at this point we should already mark btrfs quota inconsistent.

When creating qgroup 0/258 we're already doing quick hack for its
number, but since 0/257 is already belong to 1/0, the quick update
itself is already making quota inconsistent.

(Yes, snapshot creation under multi-level qgroups is the biggest
challenge for btrfs quota)

> $ dd if=/dev/urandom of=/mnt/sub/file2 bs=1k count=1000
> 
> $ btrfs fi sync /mnt
> $ btrfs qgroup show -pcre /mnt

At this point, qgroups is already inconsistent.
I'll try to fix it.

Thanks,
Qu

> qgroupid         rfer         excl     max_rfer     max_excl parent  child
> --------         ----         ----     --------     -------- ------  -----
> 0/5          16.00KiB     16.00KiB         none         none ---     ---
> 0/257         1.97MiB   1016.00KiB         none         none 1/0     ---
> 0/258      1016.00KiB     16.00KiB         none         none ---     ---
> 1/0           1.97MiB      1.97MiB         none         none ---     0/257
>                            ^^^^^^^ should be 1016Kib
> 
> $ btrfs qgroup remove 0/257 1/0 /mnt
> WARNING: quotas may be inconsistent, rescan needed
> $ btrfs qgroup show -pcre  /mnt
> WARNING: qgroup data inconsistent, rescan recommended
> qgroupid         rfer         excl     max_rfer     max_excl parent  child
> --------         ----         ----     --------     -------- ------  -----
> 0/5          16.00KiB     16.00KiB         none         none ---     ---
> 0/258      1016.00KiB   1016.00KiB         none         none ---     ---
> 1/0           1.97MiB      1.97MiB         none         none ---     ---
> 
> $ btrfs quota rescan -w /mnt
> $ btrfs qgroup show -pcre /mnt
> qgroupid         rfer         excl     max_rfer     max_excl parent  child
> --------         ----         ----     --------     -------- ------  -----
> 0/5          16.00KiB     16.00KiB         none         none ---     ---
> 0/258      1016.00KiB   1016.00KiB         none         none ---     ---
> 1/0           1.97MiB      1.97MiB         none         none ---     ---
>                ^^^^^^      ^^^^^^^ not changed, although inconsistency flag is cleared
> 
> I will look at the case to better understand qgroup.
> 
> Thanks,
> Misono
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux