Re: [PATCH 2/2] Btrfs: fix unprotected deletion from pending_chunks list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Filipe,

Thank you for your response. I realize it was a long time, ago, but we
are just now in the process of moving to stable kernel 4.14.x.

Regarding the fix, I see now the relevant code in "btrfs_remove_block_group":
    mutex_lock(&fs_info->chunk_mutex);
    if (!list_empty(&em->list)) {
        /* We're in the transaction->pending_chunks list. */
        free_extent_map(em);
    }
...
However, this brings another doubt. Let's say we indeed performed
free_extent_map in the above code. But later we may do:
        /*
         * Our em might be in trans->transaction->pending_chunks which
         * is protected by fs_info->chunk_mutex ([lock|unlock]_chunks),
         * and so is the fs_info->pinned_chunks list.
         *
         * So at this point we must be holding the chunk_mutex to avoid
         * any races with chunk allocation (more specifically at
         * volumes.c:contains_pending_extent()), to ensure it always
         * sees the em, either in the pending_chunks list or in the
         * pinned_chunks list.
         */
        list_move_tail(&em->list, &fs_info->pinned_chunks);

So we have dropped the ref that was held by
"transaction->pending_chunks" list, and now we moved the "em" to the
pinned_chunks without a ref. But the code assumes that "pinned_chunks"
also has a ref on the "em". For example in close_ctree, we do:
    while (!list_empty(&fs_info->pinned_chunks)) {
        struct extent_map *em;

        em = list_first_entry(&fs_info->pinned_chunks,
                      struct extent_map, list);
        list_del_init(&em->list);
        free_extent_map(em);
    }

Can you please comment on that?

Thanks,
Alex.


On Mon, Jan 21, 2019 at 10:06 PM Filipe Manana <fdmanana@xxxxxxxxxx> wrote:
>
> On Mon, Jan 21, 2019 at 7:07 PM Alex Lyakas <alex@xxxxxxxxxx> wrote:
> >
> > Hi Filipe,
> >
> > On Tue, Dec 2, 2014 at 8:08 PM Filipe Manana <fdmanana@xxxxxxxx> wrote:
> > >
> > > On block group remove if the corresponding extent map was on the
> > > transaction->pending_chunks list, we were deleting the extent map
> > > from that list, through remove_extent_mapping(), without any
> > > synchronization with chunk allocation (which iterates that list
> > > and adds new elements to it). Fix this by ensure that this is done
> > > while the chunk mutex is held, since that's the mutex that protects
> > > the list in the chunk allocation code path.
> > >
> > > This applies on top (depends on) of my previous patch titled:
> > > "Btrfs: fix race between fs trimming and block group remove/allocation"
> > >
> > > But the issue in fact was already present before that change, it only
> > > became easier to hit after Josef's 3.18 patch that added automatic
> > > removal of empty block groups.
> > >
> > > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
> > > ---
> > >  fs/btrfs/extent-tree.c | 8 +++++++-
> > >  1 file changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > > index 17d429d..a7b81b4 100644
> > > --- a/fs/btrfs/extent-tree.c
> > > +++ b/fs/btrfs/extent-tree.c
> > > @@ -9524,19 +9524,25 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
> > >                 list_move_tail(&em->list, &root->fs_info->pinned_chunks);
> > >         }
> > >         spin_unlock(&block_group->lock);
> > > -       unlock_chunks(root);
> > >
> > >         if (remove_em) {
> > >                 struct extent_map_tree *em_tree;
> > >
> > >                 em_tree = &root->fs_info->mapping_tree.map_tree;
> > >                 write_lock(&em_tree->lock);
> > > +               /*
> > > +                * The em might be in the pending_chunks list, so make sure the
> > > +                * chunk mutex is locked, since remove_extent_mapping() will
> > > +                * delete us from that list.
> > > +                */
> > >                 remove_extent_mapping(em_tree, em);
> > >                 write_unlock(&em_tree->lock);
> > If the "em" was in pending_chunks, it will be deleted from that list
> > by "remove_extent_mapping". But it looks like in this case we also
> > need to drop the extra ref on "em", which was held by pending_chunks
> > list. I don't see it being done anywhere else. So we should check
> > before the remove_extent_mapping() call whether "em" was in
> > pending_chunks, and, if yes, drop the extra ref?
>
> This was part of a large patch set that fixed multiple issues with
> automatic removal of block groups.
> Dropping the extent map reference was done on another patch of that patch set:
>
> commit 495e64f4fe0363bc79fa0dfb41c271787e01b5c3
> Author: Filipe Manana <fdmanana@xxxxxxxx>
> Date:   Tue Dec 2 18:07:30 2014 +0000
>
>     Btrfs: fix fs mapping extent map leak
>
> Over 4 years ago....
>
> >
> > Thanks,
> > Alex.
> >
> >
> > >                 /* once for the tree */
> > >                 free_extent_map(em);
> > >         }
> > >
> > > +       unlock_chunks(root);
> > > +
> > >         btrfs_put_block_group(block_group);
> > >         btrfs_put_block_group(block_group);
> > >
> > > --
> > > 2.1.3
> > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux