On Thu, Sep 26, 2019 at 02:35:45PM +0800, Qu Wenruo wrote:
> [BUG]
> There is one BUG_ON() report where a transaction is aborted during
> balance, then kernel BUG_ON() in merge_reloc_roots():
Do you have details from the report, eg. what's the error code?
> void merge_reloc_roots(struct reloc_control *rc)
> {
> ...
> BUG_ON(!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root)); <<<
> }
>
> [CAUSE]
> It's still uncertain why we can get to such situation.
> As all __add_reloc_root() calls will also link that reloc root to
> rc->reloc_roots, and in merge_reloc_roots() we cleanup rc->reloc_roots.
>
> So the root cause is still uncertain.
>
> [FIX]
> But we can still hunt down all the BUG_ON() in merge_reloc_roots().
>
> There are 3 BUG_ON()s in it:
> - BUG_ON() for read_fs_root() result
> - BUG_ON() for root->reloc_root != reloc_root case
> - BUG_ON() for the non-empty reloc_root_tree
relocation.c is worst regarding number of BUG_ONs, some of them look
like runtime assertions of the invariants but some of them are
substituting for error handling.
The first one BUG_ON(IS_ERR(root)); is clearly the latter, the other are
assertions
>
> For the first two, just grab the return value and goto out tag for
> cleanup.
>
> For the last one, make it more graceful by:
> - grab the lock before doing read/write
> - warn instead of panic
> - cleanup the nodes in rc->reloc_root_tree
>
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> ---
> Reason for RFC:
> The root cause to leak nodes in reloc_root_tree is still unknown.
> ---
> fs/btrfs/relocation.c | 39 ++++++++++++++++++++++++++++++++++++---
> 1 file changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
> index 655f1d5a8c27..d562b5c52a40 100644
> --- a/fs/btrfs/relocation.c
> +++ b/fs/btrfs/relocation.c
> @@ -2484,11 +2484,26 @@ void merge_reloc_roots(struct reloc_control *rc)
> if (btrfs_root_refs(&reloc_root->root_item) > 0) {
> root = read_fs_root(fs_info,
> reloc_root->root_key.offset);
> - BUG_ON(IS_ERR(root));
> - BUG_ON(root->reloc_root != reloc_root);
> + if (IS_ERR(root)) {
This is bug_on -> error handling, ok
> + ret = PTR_ERR(root);
> + btrfs_err(fs_info,
> + "failed to read root %llu: %d",
> + reloc_root->root_key.offset, ret);
> + goto out;
> + }
> + if (root->reloc_root != reloc_root) {
With this one I'm not sure it could happen but replacing the bug on is
still good.
> + btrfs_err(fs_info,
> + "reloc root mismatch for root %llu",
Would be good to print the number of the other root as well.
> + root->root_key.objectid);
> + ret = -EINVAL;
> + goto out;
> + }
>
> ret = merge_reloc_root(rc, root);
> if (ret) {
> + btrfs_err(fs_info,
> + "failed to merge reloc tree for root %llu: %d",
> + root->root_key.objectid, ret);
> if (list_empty(&reloc_root->root_list))
> list_add_tail(&reloc_root->root_list,
> &reloc_roots);
> @@ -2520,7 +2535,25 @@ void merge_reloc_roots(struct reloc_control *rc)
> free_reloc_roots(&reloc_roots);
> }
>
> - BUG_ON(!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root));
This one looks more like the invariant, the tree should be really empty
here. While the cleanup is trying to make things work despite there's a
problem, I think the warning should not be debugging only.
> + spin_lock(&rc->reloc_root_tree.lock);
> + /* Cleanup dirty reloc tree nodes */
> + if (!RB_EMPTY_ROOT(&rc->reloc_root_tree.rb_root)) {
> + struct mapping_node *node;
> + struct mapping_node *next;
> +
> + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG));
...