On Tue, Nov 26, 2019 at 11:00:24AM +0800, Qu Wenruo wrote:
>
>
> On 2019/11/25 10:40 PM, Josef Bacik wrote:
> > inc_block_group_ro does a calculation to see if we would still have enough
> > room left over after marking this block group as read only, in order to
> > decide whether it's ok to do so.
> >
> > The problem is this calculation _only_ works for data, where our used is
> > always less than our total. For metadata we will overcommit, so this
> > will almost always fail for metadata.
> >
> > Fix this by exporting btrfs_can_overcommit, and then see if we have
> > enough space to remove the remaining free space in the block group we
> > are trying to mark read only. If we do then we can mark this block
> > group as read only.
>
> This patch is indeed much better than my naive RFC patches.
>
> Instead of reducing the overcommit threshold, this just switches the check
> to can_overcommit(), brilliant.
>
> However, some small nitpicks below.
> >
> > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> > ---
> > fs/btrfs/block-group.c | 37 ++++++++++++++++++++++++++-----------
> > fs/btrfs/space-info.c | 19 ++++++++++---------
> > fs/btrfs/space-info.h | 3 +++
> > 3 files changed, 39 insertions(+), 20 deletions(-)
> >
> > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> > index 3ffbc2e0af21..7b1f6d2b9621 100644
> > --- a/fs/btrfs/block-group.c
> > +++ b/fs/btrfs/block-group.c
> > @@ -1184,7 +1184,6 @@ static int inc_block_group_ro(struct btrfs_block_group *cache, int force)
> Now the @force parameter is not used any more.
>
> We can have a cleanup patch for it.
>
> > {
> > struct btrfs_space_info *sinfo = cache->space_info;
> > u64 num_bytes;
> > - u64 sinfo_used;
> > int ret = -ENOSPC;
> >
> > spin_lock(&sinfo->lock);
> > @@ -1200,19 +1199,38 @@ static int inc_block_group_ro(struct btrfs_block_group *cache, int force)
> >
> > num_bytes = cache->length - cache->reserved - cache->pinned -
> > cache->bytes_super - cache->used;
> > - sinfo_used = btrfs_space_info_used(sinfo, true);
> >
> > /*
> > - * sinfo_used + num_bytes should always <= sinfo->total_bytes.
> > - *
> > - * Here we make sure if we mark this bg RO, we still have enough
> > - * free space as buffer.
> > + * Data never overcommits, even in mixed mode, so do just the straight
> > + * check of left over space in how much we have allocated.
> > */
> > - if (sinfo_used + num_bytes <= sinfo->total_bytes) {
> > + if (sinfo->flags & BTRFS_BLOCK_GROUP_DATA) {
> > + u64 sinfo_used = btrfs_space_info_used(sinfo, true);
> > +
> > + /*
> > + * sinfo_used + num_bytes should always <= sinfo->total_bytes.
> > + *
> > + * Here we make sure if we mark this bg RO, we still have enough
> > + * free space as buffer.
> > + */
> > + if (sinfo_used + num_bytes <= sinfo->total_bytes)
>
> The same code copied from patch 3, I guess?
> The same typo.
>
> > + ret = 0;
> > + } else {
> > + /*
> > + * We overcommit metadata, so we need to do the
> > + * btrfs_can_overcommit check here, and we need to pass in
> > + * BTRFS_RESERVE_NO_FLUSH to give ourselves the most amount of
> > + * leeway to allow us to mark this block group as read only.
> > + */
> > + if (btrfs_can_overcommit(cache->fs_info, sinfo, num_bytes,
> > + BTRFS_RESERVE_NO_FLUSH))
> > + ret = 0;
> > + }
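
FWIW, the overcommit check boils down to roughly "used + bytes < total +
some slice of the unallocated device space", and if I remember the shifts
right, NO_FLUSH (really anything other than FLUSH_ALL) gets the biggest
slice, which is the "most leeway" bit above.  Here's a stand-alone toy
version of the math, with made up numbers and the raid factor ignored,
just to show why the metadata branch can pass where the strict data check
fails (this is not the kernel code, purely an illustration):

#include <stdint.h>
#include <stdio.h>

/* Toy model only, not the kernel implementation. */
static int toy_can_overcommit(uint64_t used, uint64_t total,
			      uint64_t unallocated, uint64_t bytes,
			      int no_flush)
{
	/* NO_FLUSH style callers may eat into more of the unallocated space */
	uint64_t avail = no_flush ? unallocated / 2 : unallocated / 8;

	return used + bytes < total + avail;
}

int main(void)
{
	uint64_t total = 1024ULL << 20;		/* 1GiB of metadata chunks */
	uint64_t used = 900ULL << 20;		/* used + reserved + pinned + ... */
	uint64_t unallocated = 4096ULL << 20;	/* unallocated space on the disks */
	uint64_t num_bytes = 200ULL << 20;	/* free space in the bg we want RO */

	/* the strict data style check fails here ... */
	printf("strict:     %d\n", used + num_bytes <= total);
	/* ... but the overcommit check lets us mark the bg read only */
	printf("overcommit: %d\n", toy_can_overcommit(used, total, unallocated,
						      num_bytes, 1));
	return 0;
}
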
>
> For metadata chunks, allocating a new chunk won't change how we do the
> overcommit calculation.
>
> I guess we can skip the chunk allocation for metadata chunks in
> btrfs_inc_block_group_ro()?

I _think_ so?  But you're working/testing in this area right now, so try it
and see how it works out.  I _think_ we'll still need to pre-allocate for
data, and I feel like relocation is an area where we may not be able to
allocate new chunks at certain times, so pre-allocating the metadata chunk
may still be useful.  System is another one I would be iffy on.
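
If we did end up skipping it for metadata, I'd picture it looking something
like this in btrfs_inc_block_group_ro() (completely untested, going from
memory on the surrounding code, just to make the idea concrete):

	ret = inc_block_group_ro(cache, 0);
	if (!ret)
		goto out;

	/*
	 * Allocating another chunk wouldn't change the overcommit math for
	 * metadata, so only fall back to the forced chunk allocation for
	 * data (and probably system) block groups.
	 */
	if (cache->flags & BTRFS_BLOCK_GROUP_METADATA)
		goto out;

	alloc_flags = btrfs_get_alloc_profile(cache->fs_info,
					      cache->space_info->flags);
	ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
	if (ret < 0)
		goto out;
	ret = inc_block_group_ro(cache, 0);
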
If we're going to do that, I would way rather it be a separate thing, and
only after somebody has tested it to death to make sure it's not going to
bite us in the ass.  Thanks,
Josef