Re: [PATCH] Btrfs: fix race setting up and completing qgroup rescan workers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 24, 2019 at 3:07 PM Nikolay Borisov <nborisov@xxxxxxxx> wrote:
>
>
>
> On 24.09.19 г. 12:49 ч., fdmanana@xxxxxxxxxx wrote:
> > From: Filipe Manana <fdmanana@xxxxxxxx>
> >
> > There is a race between setting up a qgroup rescan worker and completing
> > a qgroup rescan worker that can lead to callers of the qgroup rescan wait
> > ioctl to either not wait for the rescan worker to complete or to hang
> > forever due to missing wake ups. The following diagram shows a sequence
> > of steps that illustrates the race.
> >
> >         CPU 1                                                         CPU 2                                  CPU 3
> >
> >  btrfs_ioctl_quota_rescan()
> >   btrfs_qgroup_rescan()
> >    qgroup_rescan_init()
> >     mutex_lock(&fs_info->qgroup_rescan_lock)
> >     spin_lock(&fs_info->qgroup_lock)
> >
> >     fs_info->qgroup_flags |=
> >       BTRFS_QGROUP_STATUS_FLAG_RESCAN
> >
> >     init_completion(
> >       &fs_info->qgroup_rescan_completion)
> >
> >     fs_info->qgroup_rescan_running = true
> >
> >     mutex_unlock(&fs_info->qgroup_rescan_lock)
> >     spin_unlock(&fs_info->qgroup_lock)
> >
> >     btrfs_init_work()
> >      --> starts the worker
> >
> >                                                         btrfs_qgroup_rescan_worker()
> >                                                          mutex_lock(&fs_info->qgroup_rescan_lock)
> >
> >                                                          fs_info->qgroup_flags &=
> >                                                            ~BTRFS_QGROUP_STATUS_FLAG_RESCAN
> >
> >                                                          mutex_unlock(&fs_info->qgroup_rescan_lock)
> >
> >                                                          starts transaction, updates qgroup status
> >                                                          item, etc
> >
> >                                                                                                            btrfs_ioctl_quota_rescan()
> >                                                                                                             btrfs_qgroup_rescan()
> >                                                                                                              qgroup_rescan_init()
> >                                                                                                               mutex_lock(&fs_info->qgroup_rescan_lock)
> >                                                                                                               spin_lock(&fs_info->qgroup_lock)
> >
> >                                                                                                               fs_info->qgroup_flags |=
> >                                                                                                                 BTRFS_QGROUP_STATUS_FLAG_RESCAN
> >
> >                                                                                                               init_completion(
> >                                                                                                                 &fs_info->qgroup_rescan_completion)
> >
> >                                                                                                               fs_info->qgroup_rescan_running = true
> >
> >                                                                                                               mutex_unlock(&fs_info->qgroup_rescan_lock)
> >                                                                                                               spin_unlock(&fs_info->qgroup_lock)
> >
> >                                                                                                               btrfs_init_work()
> >                                                                                                                --> starts another worker
> >
> >                                                          mutex_lock(&fs_info->qgroup_rescan_lock)
> >
> >                                                          fs_info->qgroup_rescan_running = false
> >
> >                                                          mutex_unlock(&fs_info->qgroup_rescan_lock)
> >
> >                                                        complete_all(&fs_info->qgroup_rescan_completion)
> >
> > Before the rescan worker started by the task at CPU 3 completes, if another
> > task calls btrfs_ioctl_quota_rescan(), it will get -EINPROGRESS because the
> > flag BTRFS_QGROUP_STATUS_FLAG_RESCAN is set at fs_info->qgroup_flags, which
> > is expected and correct behaviour.
> >
> > However if other task calls btrfs_ioctl_quota_rescan_wait() before the
> > rescan worker started by the task at CPU 3 completes, it will return
> > immediately without waiting for the new rescan worker to complete,
> > because fs_info->qgroup_rescan_running is set to false by CPU 2.
> >
> > This race is making test case btrfs/171 (from fstests) to fail often:
> >
> >   btrfs/171 9s ... - output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/171.out.bad)
> >       --- tests/btrfs/171.out     2018-09-16 21:30:48.505104287 +0100
> >       +++ /home/fdmanana/git/hub/xfstests/results//btrfs/171.out.bad      2019-09-19 02:01:36.938486039 +0100
> >       @@ -1,2 +1,3 @@
> >        QA output created by 171
> >       +ERROR: quota rescan failed: Operation now in progress
> >        Silence is golden
> >       ...
> >       (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/171.out /home/fdmanana/git/hub/xfstests/results//btrfs/171.out.bad'  to see the entire diff)
> >
> > That is because the test calls the btrfs-progs commands "qgroup quota
> > rescan -w", "qgroup assign" and "qgroup remove" in a sequence that makes
> > calls to the rescan start ioctl fail with -EINPROGRESS (note the "btrfs"
> > commands 'qgroup assign' and 'qgroup remove' often call the rescan start
> > ioctl after calling the qgroup assign ioctl, btrfs_ioctl_qgroup_assign()),
> > since previous waits didn't actually wait for a rescan worker to complete.
> >
> > Another problem the race can cause is missing wake ups for waiters, since
> > the call to complete_all() happens outside a critical section and after
> > clearing the flag BTRFS_QGROUP_STATUS_FLAG_RESCAN. In the sequence diagram
> > above, if we have a waiter for the first rescan task (executed by CPU 2),
> > then fs_info->qgroup_rescan_completion.wait is not empty, and if after the
> > rescan worker clears BTRFS_QGROUP_STATUS_FLAG_RESCAN and before it calls
> > complete_all() against fs_info->qgroup_rescan_completion, the task at CPU 3
> > calls init_completion() against fs_info->qgroup_rescan_completion which
> > re-initilizes its wait queue to an empty queue, therefore causing the
> > rescan worker at CPU 2 to call complete_all() against an empty queue, never
> > waking up the task waiting for that rescan worker.
> >
> > Fix this by clearing BTRFS_QGROUP_STATUS_FLAG_RESCAN and setting
> > fs_info->qgroup_rescan_running to false in the same critical section,
> > delimited by the mutex fs_info->qgroup_rescan_lock, as well as doing
>
> Why do we need both the RESCAN flag and qgroup_rescan_running having
> them both and having btrfs_qgroup_wait_for_completion rely on
> qgroup[_rescan_running just makes it easier to screw up. IMO it makes
> sense to remove qgroup_rescan_running in this patch as well and the code
> should only use RESCAN flag.

No, they are both needed. Otherwise the issue fixed by the commit
mentioned in the second "Fixes:" tag is re-introduced.

>




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux