On Mon, Jun 01, 2015 at 05:56:43PM +0100, Filipe David Manana wrote:
> On Sat, May 30, 2015 at 9:59 AM, Omar Sandoval <osandov@xxxxxxxxxxx> wrote:
> > Since commit bafc9b754f75 ("vfs: More precise tests in d_invalidate"),
> > mounted subvolumes can be deleted because d_invalidate() won't fail.
> > However, we run into problems when we attempt to delete the default
> > subvolume while it is mounted as the root filesystem:
> >
> > # btrfs subvol list /
> > ID 257 gen 306 top level 5 path rootvol
> > ID 267 gen 334 top level 5 path snap1
> > # btrfs subvol get-default /
> > ID 267 gen 334 top level 5 path snap1
> > # btrfs inspect-internal rootid /
> > 267
> > # mount -o subvol=/ /dev/vda1 /mnt
> > # btrfs subvol del /mnt/snap1
> > Delete subvolume (no-commit): '/mnt/snap1'
> > ERROR: cannot delete '/mnt/snap1' - Operation not permitted
> > # findmnt /
> > findmnt: can't read /proc/mounts: No such file or directory
> > # ls /proc
> > #
> >
> > Markus reported that this same scenario simply led to a kernel oops.
> >
> > This happens because in btrfs_ioctl_snap_destroy(), we call
> > d_invalidate() before we check may_destroy_subvol(), which means that we
> > detach the submounts and drop the dentry before erroring out. Instead,
> > we should only invalidate the dentry once we know that we're going
> > through with the deletion.
> >
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Fixes: bafc9b754f75 ("vfs: More precise tests in d_invalidate")
> > Reported-by: Markus Schauler <mschauler@xxxxxxxxx>
> > Signed-off-by: Omar Sandoval <osandov@xxxxxxxxxxx>
> > ---
> > The other fix for preventing all mounted subvolumes from being deleted
> > would preclude this, but it sounded like we were leaning towards
> > enforcing that in userspace once subvolume info becomes available in
> > /proc/mounts, so this should be fixed separately.
> >
> > fs/btrfs/ioctl.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 1c22c6518504..8edb8544088b 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -2413,14 +2413,14 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
> > goto out_unlock_inode;
> > }
> >
> > - d_invalidate(dentry);
> > -
> > down_write(&root->fs_info->subvol_sem);
> >
> > err = may_destroy_subvol(dest);
> > if (err)
> > goto out_up_write;
> >
> > + d_invalidate(dentry);
> > +
>
> Any reason why not calling d_invalidate() only if the call
> btrfs_unlink_subvol() succeeds? Not seeing a reason why we should
> invalidate before doing the actual deletion successfully (before that
> metadata reservation can fail or failure to start/join a transaction,
> etc).
Good point, it's probably best to put it here:
----
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 1c22c6518504..5a225cd0af65 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2413,8 +2413,6 @@ static noinline int btrfs_ioctl_snap_destroy(struct file *file,
goto out_unlock_inode;
}
- d_invalidate(dentry);
-
down_write(&root->fs_info->subvol_sem);
err = may_destroy_subvol(dest);
@@ -2508,6 +2506,7 @@ out_up_write:
out_unlock_inode:
mutex_unlock(&inode->i_mutex);
if (!err) {
+ d_invalidate(dentry);
shrink_dcache_sb(root->fs_info->sb);
btrfs_invalidate_inodes(dest);
d_delete(dentry);
----
I also can't figure out what that shrink_dcache_sb() is doing there.
d_invalidate() already prunes the dentry cache under the deleted
subvolume, but this clears the dcache for the whole filesystem, which
could incur unnecessary overhead. The call was added by efefb1438be2
("Btrfs: remove negative dentry when deleting subvolumne"), which fixes
a problem in btrfs_dentry_delete(), but the commit message doesn't
explain what shrink_dcache_sb() had to do with it. I'll send in an
updated version with d_invalidate() moved and shrink_dcache_sb() removed
and see if anyone can enlighten me.
> Also, would you consider making an xfstest for this?
No problem.
> thanks
Thanks for the review!
--
Omar
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html