Re: [PATCH v2 1/4] btrfs: Introduce per-profile available space facility

On 1/2/20 6:27 AM, Qu Wenruo wrote:
[PROBLEM]
There are some locations in btrfs requiring accurate estimation on how
many new bytes can be allocated on unallocated space.

We have two types of estimation:
- Factor based calculation
   Just use all unallocated space, divided by the profile factor
   (see the sketch after this list).
   One obvious user is can_overcommit().

- Chunk allocator like calculation
   This will emulate the chunk allocator behavior, to get a proper
   estimation.
   The only user is btrfs_calc_avail_data_space(), utilized by
   btrfs_statfs().
   The problem is that the function is not generic enough and can't
   handle things like RAID5/6.
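
For reference, the factor-based estimation boils down to a single
division by the profile factor. A minimal sketch (the helper name and
the explicit factor argument are made up for illustration, this is not
the real can_overcommit() code):

   /*
    * Sketch only: divide all unallocated bytes by the profile factor,
    * e.g. factor 2 for RAID1/DUP, factor 1 for SINGLE/RAID0.
    */
   static u64 factor_based_avail(u64 total_unallocated, u32 factor)
   {
   	return div_u64(total_unallocated, factor);
   }

For a RAID1 layout with 1T + 10T unallocated this reports
(1T + 10T) / 2 = 5.5T, which is exactly the over-estimation the example
below shows.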

The current factor-based calculation can't handle the following case:
   devid 1 unallocated:	1T
   devid 2 unallocated:	10T
   metadata type:	RAID1

Using the factor, we would report (1T + 10T) / 2 = 5.5T of free space
for metadata.
But in fact we can only get 1T of free space, as RAID1 is limited by
the smallest device.

[SOLUTION]
This patch introduces the skeleton of the per-profile available space
calculation, which more or less matches what the chunk allocator would do.

The difference between it and the chunk allocator is mostly in rounding
and the handling of the [0, 1M) reserved space, which shouldn't have any
practical impact.

The newly introduced per-profile available space facility calculates the
available space for each profile, using a chunk-allocator-like
calculation.
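
Conceptually the facility is just an array of available bytes indexed by
raid type, attached to the fs devices and recalculated whenever the
device layout changes. Roughly along these lines (the exact field name
and placement here are a sketch, see the patch itself for the real
definition):

   /* Sketch: somewhere in struct btrfs_fs_devices or similar */
   	/*
   	 * Available bytes for each profile, calculated with the
   	 * chunk-allocator-like virtual allocation described below.
   	 * Indexed by enum btrfs_raid_types.
   	 */
   	u64 per_profile_avail[BTRFS_NR_RAID_TYPES];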

With that facility, for the above device layout we get the full
available-space array:
   RAID10:	0  (not enough devices)
   RAID1:	1T
   RAID1C3:	0  (not enough devices)
   RAID1C4:	0  (not enough devices)
   DUP:		5.5T
   RAID0:	2T
   SINGLE:	11T
   RAID5:	1T
   RAID6:	0  (not enough devices)

Or for a more complex example:
   devid 1 unallocated:	1T
   devid 2 unallocated:  1T
   devid 3 unallocated:	10T

We will get an array of:
   RAID10:	0  (not enough devices)
   RAID1:	2T
   RAID1C3:	1T
   RAID1C4:	0  (not enough devices)
   DUP:		6T
   RAID0:	3T
   SINGLE:	12T
   RAID5:	2T
   RAID6:	0  (not enough devices)

And for each profile, we do a chunk-allocator-level calculation. The
pseudo-code looks like:

   clear_virtual_used_space_of_all_rw_devices();
   do {
   	/*
   	 * Just like the chunk allocator, except that besides the real
   	 * used space we also take virtual used space into account.
   	 */
   	sort_device_with_virtual_free_space();

   	/*
   	 * Unlike the chunk allocator, we don't need to bother with
   	 * hole/stripe size, so we use the smallest device to make sure
   	 * we can allocate as many stripes as the regular chunk
   	 * allocator would.
   	 */
   	stripe_size = device_with_smallest_free->avail_space;
   	stripe_size = min(stripe_size, to_alloc / ndevs);

   	/*
   	 * Allocate a virtual chunk; the allocated virtual chunk
   	 * increases the virtual used space, allowing the next
   	 * iteration to properly emulate the chunk allocator behavior.
   	 */
   	ret = alloc_virtual_chunk(stripe_size, &allocated_size);
   	if (ret == 0)
   		avail += allocated_size;
   } while (ret == 0);

As the stripe size is always limited by the device with the least free
space among the selected ones (just like the chunk allocator), for the
above 1T + 10T devices we will allocate a 1T virtual chunk in the first
iteration, then run out of devices in the next iteration.

Thus we only get 1T of free space for the RAID1 profile, just like what
the chunk allocator would do.
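
To make the behavior concrete, here is a small user-space toy model of
the loop above (plain C, not the kernel code; the profile parameters,
helper names and the simplified stripe math are mine, and it ignores the
chunk size limits and alignment the real allocator applies):

   /*
    * User-space toy model of the virtual chunk allocation loop above.
    * Not kernel code: names and parameters are made up for illustration.
    * Build with: gcc -std=c99 -o avail avail.c
    */
   #include <stdio.h>
   #include <stdlib.h>
   #include <stdint.h>
   #include <inttypes.h>

   #define T (1024ULL * 1024 * 1024 * 1024)

   static int cmp_desc(const void *a, const void *b)
   {
   	uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
   	return x < y ? 1 : (x > y ? -1 : 0);
   }

   /*
    * devs_min: minimum devices the profile needs (2 for RAID1/RAID5, ...)
    * devs_max: devices used per chunk (2 for RAID1, 0 == as many as possible)
    * ncopies:  copies of each data byte (2 for RAID1/DUP)
    * nparity:  parity stripes per chunk (1 for RAID5, 2 for RAID6)
    */
   static uint64_t profile_avail(uint64_t *unalloc, int ndevs, int devs_min,
   			      int devs_max, int ncopies, int nparity)
   {
   	uint64_t avail = 0;

   	for (;;) {
   		int used = 0;

   		/* Sort devices by virtual free space, largest first. */
   		qsort(unalloc, ndevs, sizeof(*unalloc), cmp_desc);

   		/* Count devices that still have unallocated space. */
   		while (used < ndevs && unalloc[used])
   			used++;
   		if (used < devs_min)
   			return avail;
   		if (devs_max && used > devs_max)
   			used = devs_max;

   		/* Stripe size is limited by the smallest chosen device. */
   		uint64_t stripe = unalloc[used - 1];

   		/* "Allocate" the virtual chunk on the chosen devices. */
   		for (int i = 0; i < used; i++)
   			unalloc[i] -= stripe;

   		/* Account the data bytes this virtual chunk provides. */
   		avail += stripe * (used - nparity) / ncopies;
   	}
   }

   int main(void)
   {
   	uint64_t d1[] = { 1 * T, 10 * T };
   	uint64_t d2[] = { 1 * T, 1 * T, 10 * T };

   	/* RAID1: devs_min = 2, devs_max = 2, ncopies = 2, nparity = 0 */
   	printf("RAID1 on 1T+10T:    %" PRIu64 "T\n",
   	       profile_avail(d1, 2, 2, 2, 2, 0) / T);	/* -> 1T */
   	printf("RAID1 on 1T+1T+10T: %" PRIu64 "T\n",
   	       profile_avail(d2, 3, 2, 2, 2, 0) / T);	/* -> 2T */
   	return 0;
   }

With the right parameters the same loop reproduces the other rows of the
tables above (DUP, RAID0, RAID5, ...).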

This patch is just the skeleton; we only do the per-profile available
space calculation at mount time.

Later commits will update the per-profile available space at the other
appropriate times.

Suggested-by: Josef Bacik <josef@xxxxxxxxxxxxxx>
Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>

Reviewed-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

Thanks,

Josef


