On Mon, Nov 19, 2018 at 11:52 AM Filipe Manana <fdmanana@xxxxxxxxxx> wrote:
>
> On Mon, Nov 19, 2018 at 11:35 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> >
> >
> >
> > On 2018/11/19 7:13 PM, Filipe Manana wrote:
> > > On Mon, Nov 19, 2018 at 11:09 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 2018/11/19 5:48 PM, fdmanana@xxxxxxxxxx wrote:
> > >>> From: Filipe Manana <fdmanana@xxxxxxxx>
> > >>>
> > >>> If the quota enable and snapshot creation ioctls are called concurrently
> > >>> we can get into a deadlock where the task enabling quotas will deadlock
> > >>> on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
> > >>> twice. The following time diagram shows how this happens.
> > >>>
> > >>>             CPU 0                                    CPU 1
> > >>>
> > >>>  btrfs_ioctl()
> > >>>   btrfs_ioctl_quota_ctl()
> > >>>    btrfs_quota_enable()
> > >>>     mutex_lock(fs_info->qgroup_ioctl_lock)
> > >>>     btrfs_start_transaction()
> > >>>
> > >>>                                              btrfs_ioctl()
> > >>>                                               btrfs_ioctl_snap_create_v2
> > >>>                                                create_snapshot()
> > >>>                                                 --> adds snapshot to the
> > >>>                                                     list pending_snapshots
> > >>>                                                     of the current
> > >>>                                                     transaction
> > >>>
> > >>>     btrfs_commit_transaction()
> > >>>      create_pending_snapshots()
> > >>>       create_pending_snapshot()
> > >>>        qgroup_account_snapshot()
> > >>>         btrfs_qgroup_inherit()
> > >>>          mutex_lock(fs_info->qgroup_ioctl_lock)
> > >>>           --> deadlock, mutex already locked
> > >>>               by this task at
> > >>>               btrfs_quota_enable()
> > >>
> > >> The backtrace looks valid.
> > >>
> > >>>
> > >>> So fix this by adding a flag to the transaction handle that signals if the
> > >>> transaction is being used for enabling quotas (only seen by the task doing
> > >>> it) and do not lock the mutex qgroup_ioctl_lock at btrfs_qgroup_inherit()
> > >>> if the transaction handle corresponds to the one being used to enable the
> > >>> quotas.
> > >>>
> > >>> Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
> > >>> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
> > >>> ---
> > >>> fs/btrfs/qgroup.c | 10 ++++++++--
> > >>> fs/btrfs/transaction.h | 1 +
> > >>> 2 files changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
> > >>> index d4917c0cddf5..3aec3bfa3d70 100644
> > >>> --- a/fs/btrfs/qgroup.c
> > >>> +++ b/fs/btrfs/qgroup.c
> > >>> @@ -908,6 +908,7 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info)
> > >>>  		trans = NULL;
> > >>>  		goto out;
> > >>>  	}
> > >>> +	trans->enabling_quotas = true;
> > >>
> > >> Should we put enabling_quotas bit into btrfs_transaction instead of
> > >> btrfs_trans_handle?
> > >
> > > Why?
> > > Only the task which is enabling quotas needs to know about it.
> >
> > But it's btrfs_qgroup_inherit() that uses the trans handle to avoid
> > the deadlock.
> >
> > What makes sure btrfs_qgroup_inherit() gets exactly the same trans
> > handle allocated here?
>
> If it's the other task (the one creating a snapshot) that starts the
> transaction commit, it will have to wait for the task enabling quotas
> to release the transaction. Once that task also calls
> commit_transaction(), it will skip doing the commit itself and wait for
> the snapshot task to finish the commit, while still holding the qgroup
> mutex (this is the part I missed before).
> So yes, we'll have to use a bit in the transaction itself instead.
That (using a flag in the transaction itself) wouldn't be good: it would
allow concurrent and unprotected access to qgroup state in
btrfs_qgroup_inherit() by anyone who calls it (currently only subvolume
creation). Fortunately there's a much simpler solution in v2.
>
> >
> > >
> > >>
> > >> Isn't it possible to have different trans handles pointing to the
> > >> same transaction?
> > >
> > > Yes.
> > >
> > >>
> > >> And I'm not really sure about the naming "enabling_quotas".
> > >> What about "quota_ioctl_mutex_hold"? (Well, this also sounds awful)
> > >
> > > Too long.
> >
> > Anyway, the current name doesn't really show why we can skip the
> > mutex locking. I just hope we can find a better name.
>
> No name will ever show you that.
> You'll always have to see where and how it's used, unless you want a
> name like "dont_lock_mutex_because_we_locked_it_at_btrfs...".
>
> >
> > Thanks,
> > Qu
> >
> > >
> > >
> > >>
> > >> Thanks,
> > >> Qu
> > >>
> > >>>
> > >>>  	fs_info->qgroup_ulist = ulist_alloc(GFP_KERNEL);
> > >>>  	if (!fs_info->qgroup_ulist) {
> > >>> @@ -2250,7 +2251,11 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
> > >>>  	u32 level_size = 0;
> > >>>  	u64 nums;
> > >>>
> > >>> -	mutex_lock(&fs_info->qgroup_ioctl_lock);
> > >>> +	if (trans->enabling_quotas)
> > >>> +		lockdep_assert_held(&fs_info->qgroup_ioctl_lock);
> > >>> +	else
> > >>> +		mutex_lock(&fs_info->qgroup_ioctl_lock);
> > >>> +
> > >>>  	if (!test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags))
> > >>>  		goto out;
> > >>>
> > >>> @@ -2413,7 +2418,8 @@ int btrfs_qgroup_inherit(struct btrfs_trans_handle *trans, u64 srcid,
> > >>>  unlock:
> > >>>  	spin_unlock(&fs_info->qgroup_lock);
> > >>>  out:
> > >>> -	mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > >>> +	if (!trans->enabling_quotas)
> > >>> +		mutex_unlock(&fs_info->qgroup_ioctl_lock);
> > >>>  	return ret;
> > >>>  }
> > >>>
> > >>> diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
> > >>> index 703d5116a2fc..a5553a1dee30 100644
> > >>> --- a/fs/btrfs/transaction.h
> > >>> +++ b/fs/btrfs/transaction.h
> > >>> @@ -122,6 +122,7 @@ struct btrfs_trans_handle {
> > >>>  	bool reloc_reserved;
> > >>>  	bool sync;
> > >>>  	bool dirty;
> > >>> +	bool enabling_quotas;
> > >>>  	struct btrfs_root *root;
> > >>>  	struct btrfs_fs_info *fs_info;
> > >>>  	struct list_head new_bgs;
> > >>>
> > >>
> >