Re: [PATCH 0/4] Introduce per-profile available space array to avoid over-confident can_overcommit()

On 2020/1/2 3:47 PM, Qu Wenruo wrote:
> There are several bug reports of ENOSPC errors in
> btrfs_run_delalloc_range().
> 
> With some extra info from one reporter, it turns out that
> can_overcommit() calculates allocatable metadata space incorrectly.
> 
> The most typical case would look like:
>   devid 1 unallocated:  1G
>   devid 2 unallocated:  10G
>   metadata profile:     RAID1
> 
> In the above case, we can allocate at most a 1G chunk for metadata, due
> to the unbalanced free space across the disks.
> But the current can_overcommit() uses a factor-based calculation, which
> never considers how free space is balanced across disks.
> 
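To make the gap concrete, here is a minimal userspace sketch, not the kernel code. The function names are made up for illustration, and the factor-based model is a simplification (total unallocated space divided by the number of copies) rather than the exact can_overcommit() logic.

```c
#include <stdint.h>

#define GiB (1024ULL * 1024 * 1024)

/*
 * Simplified factor-based estimate (hypothetical helper): sum the
 * unallocated space of all devices and divide by the number of copies
 * the profile keeps. This ignores how the space is spread over devices.
 */
static uint64_t factor_based_estimate(const uint64_t *unallocated,
				      int ndevs, unsigned int factor)
{
	uint64_t total = 0;

	for (int i = 0; i < ndevs; i++)
		total += unallocated[i];
	return total / factor;
}

/*
 * Two-device RAID1 (hypothetical helper): every chunk needs the same
 * amount of space on both devices, so the smaller device is the limit.
 */
static uint64_t raid1_two_dev_estimate(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}
```

With 1G and 10G unallocated, factor_based_estimate() with factor 2 reports 5.5G, while raid1_two_dev_estimate() reports the 1G that can really be allocated.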
> 
> To address this problem, here comes the per-profile available space
> array, which gets updated every time a chunk is allocated/removed or a
> device is grown or shrunk.
> 
> This provides a quick way for hot paths like can_overcommit() to grab
> an estimate of how many bytes they can over-commit.
> 
> The per-profile available space calculation tries to mimic the behavior
> of the chunk allocator, thus it can handle uneven disks pretty well.
> 
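A rough userspace model of "mimic the chunk allocator" for a mirrored profile could look like the sketch below. It is not the code from the patches: the function name, the max_stripe cap and the in-place array are all assumptions made for illustration. Each round picks the devices with the most unallocated space, carves one virtual chunk out of them, and repeats until too few devices have space left.

```c
#include <stdint.h>
#include <stdlib.h>

#define GiB (1024ULL * 1024 * 1024)

/* qsort comparator: largest unallocated space first. */
static int cmp_desc(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x < y) - (x > y);
}

/*
 * Hypothetical model: estimate usable bytes for a mirrored profile that
 * needs `copies` devices per chunk (2 for RAID1). `max_stripe` is an
 * assumed cap on the per-device size of one virtual chunk.
 * The `unallocated` array is modified in place.
 */
static uint64_t virtual_alloc_mirrored(uint64_t *unallocated, size_t ndevs,
				       size_t copies, uint64_t max_stripe)
{
	uint64_t usable = 0;

	if (copies == 0 || copies > ndevs || max_stripe == 0)
		return 0;

	for (;;) {
		uint64_t stripe;

		/* Prefer the devices with the most free space. */
		qsort(unallocated, ndevs, sizeof(*unallocated), cmp_desc);

		/* The smallest of the chosen devices limits the chunk. */
		stripe = unallocated[copies - 1];
		if (stripe == 0)
			break;	/* not enough devices with space left */
		if (stripe > max_stripe)
			stripe = max_stripe;

		for (size_t i = 0; i < copies; i++)
			unallocated[i] -= stripe;

		/* Mirrored: one stripe of usable space per chunk. */
		usable += stripe;
	}
	return usable;
}
```

For the 1G + 10G RAID1 case this returns 1G. For three devices with 1G, 2G and 3G unallocated it returns 3G, since the allocator keeps switching to whichever two devices have the most space left.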
> Although the per-profile array is not clever enough to handle estimation
> when both data and metadata chunks need to be considered, its virtual
> chunk infrastructure is flexible enough to handle such a case.
> 
> So for statfs(), we also re-use the virtual chunk allocator to handle
> available data space, with metadata over-commit space considered.
> This brings an unexpected advantage: now we can handle RAID5/6 pretty OK
> in statfs().
> 
> Changelog:
> v1:
> - Fix a bug where we forgot to update the per-profile array after
>   allocating a chunk.
>   To avoid an ABBA deadlock, this introduces a small window at the end
>   of __btrfs_alloc_chunk(). It's not elegant but should be good enough
>   before we rework the chunk and device list mutexes.
My insistence on using device_list_mutex turned out to be a bad idea.
It causes a deadlock in btrfs/124.

I'll rework the locking part to fix this.

Thanks,
Qu

>   
> - Make statfs() use the virtual chunk allocator for better estimation.
>   Now statfs() can not only report a more accurate result, but can also
>   handle RAID5/6 better.
> 
> Qu Wenruo (4):
>   btrfs: Introduce per-profile available space facility
>   btrfs: Update per-profile available space when device size/used space
>     get updated
>   btrfs: space-info: Use per-profile available space in can_overcommit()
>   btrfs: statfs: Use virtual chunk allocation to calculation available
>     data space
> 
>  fs/btrfs/space-info.c |  15 ++-
>  fs/btrfs/super.c      | 190 +++++++++++++----------------------
>  fs/btrfs/volumes.c    | 223 ++++++++++++++++++++++++++++++++++++++----
>  fs/btrfs/volumes.h    |  14 +++
>  4 files changed, 293 insertions(+), 149 deletions(-)
> 



