Re: [PATCH] btrfs: Remove the duplicated and sometimes too cautious btrfs_can_relocate()

On 18.07.19 г. 8:48 ч., Qu Wenruo wrote:
> [BUG]
> The following script can easily cause unexpected ENOSPC:
>   umount $dev &> /dev/null
>   umount $mnt &> /dev/null
> 
>   mkfs.btrfs -b 1G -m single -d single $dev -f > /dev/null
> 
>   mount $dev $mnt -o enospc_debug
> 
>   for i in $(seq -w 0 511); do
>   	xfs_io -f -c "pwrite 0 1m" $mnt/inline_$i > /dev/null
>   done
>   sync
> 
>   btrfs balance start --full $mnt || return 1
>   sync
> 
>   # This will report -ENOSPC
>   btrfs balance start --full $mnt || return 1
>   umount $mnt
> 
> btrfs/156 can also fail due to ENOSPC.
> 
> [CAUSE]
> The ENOSPC is reported by btrfs_can_relocate().
> 
> btrfs_can_relocate() performs the following checks:
> - If the block group is empty
>   If it is empty, we can definitely relocate this block group.
> - If we are not the only block group and we have enough space
>   Then we can relocate this block group.
> 
> The above two checks are completely OK, although I could argue they
> don't make much sense. The following check, however, is vague and
> sometimes too cautious, which causes the ENOSPC:
> - If we can allocate a new block group as large as the current one.
>   If we failed the previous two checks, we must pass this one to
>   relocate this block group.

btrfs_can_relocate() requires min_free to be allocatable. min_free is
defined as the used space in the block group being relocated, which I
think is correct. I also find the logic which adjusts min_free and
dev_min to be correct. Finally, the function checks whether the device's
free space is fragmented by trying to find a device chunk of the
appropriate size. The question is: can we really have a device that has
enough free space, yet is so fragmented that find_free_dev_extent()
fails and the allocation fails with it? I think the answer is no, since
we allocate in chunk granularity. What am I missing?
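
To make the question concrete, here is a toy userspace model (not kernel
code; struct free_hole and the helper names are made up purely for
illustration). It separates "enough free space in total" from "one hole
large enough". Whether such a layout can actually arise on a btrfs device
is exactly what I am asking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one free hole on a device, in bytes. */
struct free_hole {
	uint64_t start;
	uint64_t len;
};

/* Sum of all free holes: the "device has enough free space" view. */
static uint64_t total_free(const struct free_hole *holes, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += holes[i].len;
	return sum;
}

/* Largest single hole: what a contiguous extent search would need. */
static uint64_t largest_hole(const struct free_hole *holes, size_t n)
{
	uint64_t max = 0;

	for (size_t i = 0; i < n; i++)
		if (holes[i].len > max)
			max = holes[i].len;
	return max;
}

/* True when the total suffices but no single hole can hold 'want' bytes. */
static bool fragmented_for(const struct free_hole *holes, size_t n,
			   uint64_t want)
{
	assert(holes != NULL || n == 0);
	return total_free(holes, n) >= want && largest_hole(holes, n) < want;
}
```

With two 64M holes, fragmented_for() is true for a 100M request and false
for a 64M one. The model says nothing about how the holes got there.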


OTOH, in btrfs_inc_block_group_ro() we only allocate a chunk if:

a) we are changing raid profiles, or
b) inc_block_group_ro() fails for our block group.

And for b) I'm a bit puzzled as to what the code is supposed to mean. We have:

	num_bytes = cache->key.offset - cache->reserved - cache->pinned -
		    cache->bytes_super - btrfs_block_group_used(&cache->item);
	sinfo_used = btrfs_space_info_used(sinfo, true);

	if (sinfo_used + num_bytes + min_allocable_bytes <=
	    sinfo->total_bytes) {
		/* set ro */
	}

This means: if the free space in the block group plus the used space in
the space info (plus min_allocable_bytes) does not exceed the total space
in the space info, make this block group RO. What's the rationale behind
that?
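
For reference, the arithmetic of that check can be spelled out in a small
userspace sketch. The struct and function names below are hypothetical
stand-ins for the kernel fields, and any numbers fed to it are made up. It
only restates the condition and does not explain why it was chosen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the fields used in the quoted check. */
struct toy_block_group {
	uint64_t length;	/* cache->key.offset, the block group size */
	uint64_t reserved;	/* cache->reserved */
	uint64_t pinned;	/* cache->pinned */
	uint64_t bytes_super;	/* cache->bytes_super */
	uint64_t used;		/* btrfs_block_group_used(&cache->item) */
};

/* num_bytes in the quoted code: bytes of the block group not accounted for. */
static uint64_t bg_unused_bytes(const struct toy_block_group *bg)
{
	uint64_t accounted = bg->reserved + bg->pinned +
			     bg->bytes_super + bg->used;

	assert(accounted <= bg->length);
	return bg->length - accounted;
}

/* Same comparison as the quoted if (): true means "set ro". */
static bool can_set_ro(const struct toy_block_group *bg, uint64_t sinfo_used,
		       uint64_t sinfo_total, uint64_t min_allocable_bytes)
{
	uint64_t num_bytes = bg_unused_bytes(bg);

	return sinfo_used + num_bytes + min_allocable_bytes <= sinfo_total;
}
```

For a 256M block group with 64M used, num_bytes is 192M. With a 1G space
info and min_allocable_bytes of 1M, the check passes when sinfo_used is
512M (705M <= 1024M) and fails when sinfo_used is 900M (1093M > 1024M).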





