On 5/3/18 3:20 AM, Qu Wenruo wrote:
> When doing qgroup rescan using the following script (modified from
> btrfs/017 test case), we can sometimes hit qgroup corruption.
>
> ------
> umount $dev &> /dev/null
> umount $mnt &> /dev/null
>
> mkfs.btrfs -f -n 64k $dev
> mount $dev $mnt
>
> extent_size=8192
>
> xfs_io -f -d -c "pwrite 0 $extent_size" $mnt/foo > /dev/null
> btrfs subvolume snapshot $mnt $mnt/snap
>
> xfs_io -f -c "reflink $mnt/foo" $mnt/foo-reflink > /dev/null
> xfs_io -f -c "reflink $mnt/foo" $mnt/snap/foo-reflink > /dev/null
> xfs_io -f -c "reflink $mnt/foo" $mnt/snap/foo-reflink2 > /dev/unll
> btrfs quota enable $mnt
>
> # -W is the new option to only wait rescan while not starting new one
> btrfs quota rescan -W $mnt
> btrfs qgroup show -prce $mnt
>
> # Need to patch btrfs-progs to report qgroup mismatch as error
> btrfs check $dev || _fail
> ------
>
> For fast machine, we can hit some corruption which missed accounting
> tree blocks:
> ------
> qgroupid rfer excl max_rfer max_excl parent child
> -------- ---- ---- -------- -------- ------ -----
> 0/5 8.00KiB 0.00B none none --- ---
> 0/257 8.00KiB 0.00B none none --- ---
> ------
>
> This is due to the fact that we're always searching commit root for
> btrfs_find_all_roots() at qgroup_rescan_leaf(), but the leaf we get is
> from current transaction, not commit root.
>
> And if our tree blocks get modified in current transaction, we won't
> find any owner in commit root, thus causing the corruption.
>
> Fix it by searching commit root for extent tree for
> qgroup_rescan_leaf().
>
> Reported-by: Nikolay Borisov <nborisov@xxxxxxxx>
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> ---
>
> Please keep in mind that it is possible to hit another type of race
> which double accounting tree blocks:
> ------
> qgroupid rfer excl max_rfer max_excl parent child
> -------- ---- ---- -------- -------- ------ -----
> 0/5 136.00KiB 128.00KiB none none --- ---
> 0/257 136.00KiB 128.00KiB none none --- ---
> ------
> For this type of corruption, this patch could reduce the possibility,
> but the root cause is race between transaction commit and qgroup rescan,
> which needs to be addressed in another patch.
> ---
> fs/btrfs/qgroup.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> index 4baa4ba2d630..829e8fe5c97e 100644
> --- a/fs/btrfs/qgroup.c
> +++ b/fs/btrfs/qgroup.c
> @@ -2681,6 +2681,11 @@ static void btrfs_qgroup_rescan_worker(struct btrfs_work *work)
> path = btrfs_alloc_path();
> if (!path)
> goto out;
> + /*
> + * Rescan should only search for commit root, and any later difference
> + * should be recorded by qgroup
> + */
> + path->search_commit_root = 1;
>
> err = 0;
> while (!err && !btrfs_fs_closing(fs_info)) {
>
If we're searching the commit root here, do we need the tree mod
sequence number dance in qgroup_rescan_leaf anymore?
-Jeff
--
Jeff Mahoney
SUSE Labs
Attachment:
signature.asc
Description: OpenPGP digital signature
