Re: Kernel BUG on Snapshot Deletion (3.11.0-rc5)

On Wed, Aug 21, 2013 at 08:44:55AM -0500, Mitch Harder wrote:
> On Thu, Aug 15, 2013 at 12:29 PM, Mitch Harder
> <mitch.harder@xxxxxxxxxxxxxxxx> wrote:
> > I'm running into a curious problem.
> >
> > In the process of making my script portable, I am breaking the ability
> > to replicate the error.
> >
> > I'm trying to isolate the aspect of my local script that is triggering
> > the error.  No firm insights yet.
> >
> >
> > On Tue, Aug 13, 2013 at 11:03 AM, Mitch Harder
> > <mitch.harder@xxxxxxxxxxxxxxxx> wrote:
> >> Let me work on making that script more portable, and hopefully quicker
> >> to reproduce.
> >>
> >> On Tue, Aug 13, 2013 at 9:15 AM, Josef Bacik <jbacik@xxxxxxxxxxxx> wrote:
> >>> On Mon, Aug 12, 2013 at 11:06:27PM -0500, Mitch Harder wrote:
> >>>> I'm hitting a btrfs Kernel BUG running a snapshot stress script with
> >>>> linux-3.11.0-rc5.
> >>>>
> >>>
> >>> I can haz script?  Thanks,
> >>>
> 
> I've had a hard time assembling a portable reproducer for this issue.
> 
> I discovered that my reproducer was highly dependent on a local
> archive of out-of-date git kernel sources.  My efforts to reproduce
> the error with a portable set of scripts with publicly available
> kernel git sources weren't successful.
> 
> It seems like this issue is related to a corner-case workload that is
> difficult to reproduce.
> 
> So I've bisected the error I was seeing with my local script, and
> identified the following commit as triggering my issue:
> 
> commit:    3c64a1aba7cfcb04f79e76f859b3d66660275d59
> Btrfs: cleanup: don't check the same thing twice
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/fs/btrfs?h=for-linus&id=3c64a1aba7cfcb04
> 
> I tested a kernel which reverted this change, and also added WARN_ON
> lines to provide a back trace.
> 
> diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
> index 4b86916..336d628 100644
> --- a/fs/btrfs/export.c
> +++ b/fs/btrfs/export.c
> @@ -82,6 +82,12 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
>          goto fail;
>      }
> 
> +    if (btrfs_root_refs(&root->root_item) == 0) {
> +        WARN_ON(1);
> +        err = -ENOENT;
> +        goto fail;
> +    }
> +
>      key.objectid = objectid;
>      btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
>      key.offset = 0;
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 94413af..4010257 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -310,6 +310,12 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
>          goto cleanup;
>      }
> 
> +    if (btrfs_root_refs(&inode_root->root_item) == 0) {
> +        WARN_ON(1);
> +        ret = -ENOENT;
> +        goto cleanup;
> +    }
> +

Funnily enough, I just added this check back in a different commit.  Now that I
look at the reasoning, though, this cleanup patch was wrong.  We do check whether
root_refs is 0 in btrfs_read_fs_root_no_name, but only if the root isn't already
in cache.  If it is in cache, we happily return it without checking.  So either
we add the extra check for the in-cache case (probably a good idea), or we
go back and restore all of these checks in the callers.  Thanks,

Josef
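
The cache-hit gap described above can be illustrated with a small userspace
model.  This is only a sketch under stated assumptions, not the kernel code:
the toy_* names, the fixed-size array standing in for the root cache, and the
in_cache flag are all hypothetical, and the real btrfs_read_fs_root_no_name
works on very different data structures.  The first function checks the
reference count only on a cache miss, as the message describes; the second
adds the in-cache check suggested as the first option.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a btrfs root; only the ref count matters here. */
struct toy_root {
    unsigned long long objectid;
    unsigned int refs;   /* 0 models a root whose snapshot is being deleted */
    int in_cache;        /* nonzero once a lookup has cached this root */
};

#define TOY_NROOTS 4

/* Hypothetical filesystem state: a tiny table of roots. */
struct toy_fs {
    struct toy_root roots[TOY_NROOTS];
};

static struct toy_root *toy_find(struct toy_fs *fs, unsigned long long objectid)
{
    for (size_t i = 0; i < TOY_NROOTS; i++)
        if (fs->roots[i].objectid == objectid)
            return &fs->roots[i];
    return NULL;
}

/*
 * Models the behaviour described in the message: the refs check runs only
 * when the root is not yet cached.  A cached root is returned as-is.
 */
static int toy_read_root(struct toy_fs *fs, unsigned long long objectid,
                         struct toy_root **out)
{
    struct toy_root *root = toy_find(fs, objectid);

    assert(out != NULL);
    *out = NULL;
    if (!root)
        return -ENOENT;
    if (root->in_cache) {
        *out = root;          /* cache hit: no refs check */
        return 0;
    }
    if (root->refs == 0)
        return -ENOENT;       /* cache miss: dead roots are rejected */
    root->in_cache = 1;
    *out = root;
    return 0;
}

/* Variant with the extra check applied to the in-cache case as well. */
static int toy_read_root_checked(struct toy_fs *fs,
                                 unsigned long long objectid,
                                 struct toy_root **out)
{
    int ret = toy_read_root(fs, objectid, out);

    if (ret == 0 && (*out)->refs == 0) {
        *out = NULL;
        return -ENOENT;
    }
    return ret;
}
```

In this model, a root that is looked up once while refs is nonzero and then
drops to refs == 0 is still returned by toy_read_root on the next lookup,
while toy_read_root_checked returns -ENOENT.  That mirrors why the caller-side
checks in the diff above were not redundant.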