Re: [PATCH 4/4] btrfs: use btrfs_can_overcommit in inc_block_group_ro

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 26, 2019 at 11:00:24AM +0800, Qu Wenruo wrote:
> 
> 
> On 2019/11/25 下午10:40, Josef Bacik wrote:
> > inc_block_group_ro does a calculation to see if we have enough room left
> > over if we mark this block group as read only in order to see if it's ok
> > to mark the block group as read only.
> > 
> > The problem is this calculation _only_ works for data, where our used is
> > always less than our total.  For metadata we will overcommit, so this
> > will almost always fail for metadata.
> > 
> > Fix this by exporting btrfs_can_overcommit, and then see if we have
> > enough space to remove the remaining free space in the block group we
> > are trying to mark read only.  If we do then we can mark this block
> > group as read only.
> 
> This patch is indeed much better than my naive RFC patches.
> 
> Instead of reducing over commit threshold, this just increase the check
> to can_overcommit(), brilliant.
> 
> However some small nitpicks below.
> > 
> > Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
> > ---
> >  fs/btrfs/block-group.c | 37 ++++++++++++++++++++++++++-----------
> >  fs/btrfs/space-info.c  | 19 ++++++++++---------
> >  fs/btrfs/space-info.h  |  3 +++
> >  3 files changed, 39 insertions(+), 20 deletions(-)
> > 
> > diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> > index 3ffbc2e0af21..7b1f6d2b9621 100644
> > --- a/fs/btrfs/block-group.c
> > +++ b/fs/btrfs/block-group.c
> > @@ -1184,7 +1184,6 @@ static int inc_block_group_ro(struct btrfs_block_group *cache, int force)
> Now @force parameter is not used any more.
> 
> We can have a cleanup patch for it.
> 
> >  {
> >  	struct btrfs_space_info *sinfo = cache->space_info;
> >  	u64 num_bytes;
> > -	u64 sinfo_used;
> >  	int ret = -ENOSPC;
> >  
> >  	spin_lock(&sinfo->lock);
> > @@ -1200,19 +1199,38 @@ static int inc_block_group_ro(struct btrfs_block_group *cache, int force)
> >  
> >  	num_bytes = cache->length - cache->reserved - cache->pinned -
> >  		    cache->bytes_super - cache->used;
> > -	sinfo_used = btrfs_space_info_used(sinfo, true);
> >  
> >  	/*
> > -	 * sinfo_used + num_bytes should always <= sinfo->total_bytes.
> > -	 *
> > -	 * Here we make sure if we mark this bg RO, we still have enough
> > -	 * free space as buffer.
> > +	 * Data never overcommits, even in mixed mode, so do just the straight
> > +	 * check of left over space in how much we have allocated.
> >  	 */
> > -	if (sinfo_used + num_bytes + sinfo->total_bytes) {
> > +	if (sinfo->flags & BTRFS_BLOCK_GROUP_DATA) {
> > +		u64 sinfo_used = btrfs_space_info_used(sinfo, true);
> > +
> > +		/*
> > +		 * sinfo_used + num_bytes should always <= sinfo->total_bytes.
> > +		 *
> > +		 * Here we make sure if we mark this bg RO, we still have enough
> > +		 * free space as buffer.
> > +		 */
> > +		if (sinfo_used + num_bytes + sinfo->total_bytes)
> 
> The same code copied from patch 3 I guess?
> The same typo.
> 
> > +			ret = 0;
> > +	} else {
> > +		/*
> > +		 * We overcommit metadata, so we need to do the
> > +		 * btrfs_can_overcommit check here, and we need to pass in
> > +		 * BTRFS_RESERVE_NO_FLUSH to give ourselves the most amount of
> > +		 * leeway to allow us to mark this block group as read only.
> > +		 */
> > +		if (btrfs_can_overcommit(cache->fs_info, sinfo, num_bytes,
> > +					 BTRFS_RESERVE_NO_FLUSH))
> > +			ret = 0;
> > +	}
> 
> For metadata chunks, allocating a new chunk won't change how we do over
> commit calculation.
> 
> I guess we can skip the chunk allocation for metadata chunks in
> btrfs_inc_block_group_ro()?

I _think_ so?  But you are working/testing in this area right now, try and see
how it works out?  I _think_ we'll still need to pre-allocate for data, and I
feel like relocation is an area where we may not be able to allocate new chunks
at certain times, so pre-allocating the metadata chunk may be useful.  System is
another one I would be iffy on.

If we're going to do that I would way rather it be a separate thing and only
after somebody has tested it to death to make sure it's not going to bite us in
the ass.  Thanks,

Josef



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux