We need to get this series moving because the bug affects a few stable
versions.
On Thu, Dec 12, 2019 at 08:39:43AM +0800, Qu Wenruo wrote:
> On 2019/12/11 下午11:34, David Sterba wrote:
> > On Wed, Dec 11, 2019 at 01:00:01PM +0800, Qu Wenruo wrote:
> >> Due to commit d2311e698578 ("btrfs: relocation: Delay reloc tree
> >> deletion after merge_reloc_roots"), reloc tree lifespan is extended.
> >>
> >> Although we always set root->reloc_root to NULL before we drop the reloc
> >> tree, but that's not multi-core safe since we have no proper memory
> >> barrier to ensure other cores can see the same root->reloc_root.
> >>
> >> The proper root fix should be some proper root refcount, and make
> >> btrfs_drop_snapshot() to wait for all other root owner to release the
> >> root before dropping it.
> >
> > This would block cleaning deleted subvolumes, no? We can skip the dead
> > tree (and add it back to the list) in that can and not wait. The
> > cleaner thread is able to process the list repeatedly.
>
> What I mean is:
> - For consumer (reading root->reloc_root)
> spin_lock(&root->reloc_lock);
> if (!root->reloc_root) {
> spin_unlock(&root->reloc_lock);
> return NULL
> }
> refcount_inc(&root->reloc_root->refcount);
> return(root->reloc_root);
> spin_unlock(&root->reloc_lock);
>
> And of cource, release it after grabbing reloc_root.
>
> - For cleaner
> grab reloc_root just like consumer.
> retry:
> wait_event(refcount_read(&root->reloc_root->ref_count) == 1);
> spin_lock(&root->reloc_lock);
> if (&root->reloc_root->ref_count != 1){
> spin_unlock(); goto retry;
> }
> root->reloc_root = NULL;
> spin_unlock(&root->reloc_lock);
> /* Now we're the only owner, delete the root */
So it's one bit vs refcount and a lock. For the backports I'd go with
the bit, but this needs the barriers as mentioned in my previous reply.
Can you please update the patches?