On Tue, Mar 26, 2019 at 12:17 PM Nikolay Borisov <nborisov@xxxxxxxx> wrote:
>
>
>
> On 26.03.19 г. 12:49 ч., fdmanana@xxxxxxxxxx wrote:
> > From: Filipe Manana <fdmanana@xxxxxxxx>
> >
> > When a filesystem is mounted with the nologreplay mount option, which
> > requires it to be mounted in RO mode as well, we can not allow discard on
> > free space inside block groups, because log trees refer to extents that
> > are not pinned in a block group's free space cache (pinning the extents is
> > precisely the first phase of replaying a log tree).
> >
> > So do not allow the fitrim ioctl to do anything when the filesystem is
> > mounted with the nologreplay option, because later it can be mounted RW
> > without that option, which causes log replay to happen and result in
> > either a failure to replay the log trees (leading to a mount failure), a
> > crash or some silent corruption.
> >
> > Reported-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>
>
> Does it make sense to make the check a bit more specific and only return
> EROFS when NOLOGREPLAY and the log tree has non-null generation?
It would make sense to check whether there's actually a log tree as well.
Neither xfs nor ext4 (which is already in Linus' tree) does an
equivalent check, and the proposed fstests test case doesn't make sure a
journal/log exists either.
I'm not against it, but this isn't a common use case either.
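
For illustration only, the narrower check being discussed could be modelled
in user space roughly as below. This is a sketch, not kernel code and not
part of the patch: fitrim_check_model() and both of its parameters are
hypothetical names, and it assumes that a nonzero log root recorded in the
superblock is what signals that a log tree exists (the suggestion above
mentions the log tree generation instead, which may be the better test).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the stricter check; not kernel code.
 *
 * nologreplay: stands in for btrfs_test_opt(fs_info, NOLOGREPLAY).
 * log_root:    stands in for the log root address stored in the
 *              superblock; 0 is assumed to mean "no log tree to replay".
 */
int fitrim_check_model(bool nologreplay, uint64_t log_root)
{
	/*
	 * An unreplayed log tree may reference extents that are not yet
	 * pinned, so they still look like free space to discard.
	 */
	if (nologreplay && log_root != 0)
		return -EROFS;

	/* No pending log tree, or it was replayed at mount: trim is safe. */
	return 0;
}
```

With this model, only the combination of nologreplay and an existing log
tree is rejected; the patch as posted rejects every nologreplay mount, which
is simpler and covers the same failure.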
>
> In any case:
>
> Reviewed-by: Nikolay Borisov <nborisov@xxxxxxxx>
>
> > ---
> > fs/btrfs/ioctl.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> > index 494f0f10d70e..01808934d21f 100644
> > --- a/fs/btrfs/ioctl.c
> > +++ b/fs/btrfs/ioctl.c
> > @@ -501,6 +501,16 @@ static noinline int btrfs_ioctl_fitrim(struct file *file, void __user *arg)
> >  	if (!capable(CAP_SYS_ADMIN))
> >  		return -EPERM;
> >
> > +	/*
> > +	 * If the fs is mounted with nologreplay, which requires it to be
> > +	 * mounted in RO mode as well, we can not allow discard on free space
> > +	 * inside block groups, because log trees refer to extents that are not
> > +	 * pinned in a block group's free space cache (pinning the extents is
> > +	 * precisely the first phase of replaying a log tree).
> > +	 */
> > +	if (btrfs_test_opt(fs_info, NOLOGREPLAY))
> > +		return -EROFS;
> > +
> >  	rcu_read_lock();
> >  	list_for_each_entry_rcu(device, &fs_info->fs_devices->devices,
> >  				dev_list) {
> >