Re: btrfs balance start -musage=0 / eats drive space

On 2020/2/5 11:25 AM, Qu Wenruo wrote:
> 
> 
> On 2020/2/5 11:18 AM, Matt Corallo wrote:
>> Hmm? My understanding is that that issue was only visible via stat
>> calls, not in actual behavior. In this case, if you have a lot of
>> in-flight writes from the write cache, balance fails after allocating
>> blocks (so I guess balance relies on stat()?).
>>
>> Also, this is all on a kernel with your previous patch "btrfs: super:
>> Make btrfs_statfs() work with metadata over-commiting" applied.
> 
> Oh, sorry, I misread something.
> 
> Then it's going to be fixed by a patchset:
> https://patchwork.kernel.org/project/linux-btrfs/list/?series=229013

Oh, wrong patchset. That one solves a different problem, not the one you hit.

This is the correct patch set:
https://patchwork.kernel.org/project/linux-btrfs/list/?series=229979

Sorry for the inconvenience.

Thanks,
Qu
> 
> It's the relocation space calculation being overly paranoid.
> 
> Thanks,
> Qu
>>
>> Thanks,
>> Matt
>>
>> On 2/5/20 1:03 AM, Qu Wenruo wrote:
>>>
>>>
>>> On 2020/2/5 2:17 AM, Matt Corallo wrote:
>>>> This appears to be some kind of race when there are a lot of pending
>>>> metadata writes in flight.
>>>>
>>>> I unmounted and remounted again (after waiting about 30 minutes while
>>>> ~5MB/s of writes flushed out an rsync of a ton of tiny files), and
>>>> after the remount the issue went away again. So I can only presume the
>>>> issue appears only when there are a million or so tiny files pending
>>>> write.
>>>
>>> Known bug. The upstream fix is d55966c4279b ("btrfs: do not zero
>>> f_bavail if we have available space"), and it has been backported to
>>> stable kernels.
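>>>
>>> (For anyone hitting this before the fix reaches their kernel: the
>>> symptom is that statfs() reports f_bavail as zero even though free
>>> space exists. A quick check, assuming the filesystem is mounted at
>>> /bigraid as in this thread:
>>>
>>>   # stat -f --format='avail=%a free=%f' /bigraid
>>>
>>> On an affected kernel this prints avail=0 while free stays nonzero.)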
>>>
>>> I guess downstream kernels will soon get updated to fix it.
>>>
>>> Thanks,
>>> Qu
>>>>
>>>> Matt
>>>>
>>>> On 2/4/20 3:41 AM, Matt Corallo wrote:
>>>>> Things settled a tiny bit after an unmount (see my last email for the
>>>>> errors it generated) and remount, and a balance -mconvert,soft worked:
>>>>>
>>>>> [268093.588482] BTRFS info (device dm-2): balance: start
>>>>> -mconvert=raid1,soft -sconvert=raid1,soft
>>>>> ...
>>>>> [288405.946776] BTRFS info (device dm-2): balance: ended with status: 0
>>>>>
>>>>> However, the enospc issue still appears, and it seems tied to a few of
>>>>> the previously-allocated metadata block groups:
>>>>>
>>>>> # btrfs balance start -musage=0 /bigraid
>>>>> ...
>>>>>
>>>>> [289714.420418] BTRFS info (device dm-2): balance: start -musage=0 -susage=0
>>>>> [289714.508411] BTRFS info (device dm-2): 64 enospc errors during balance
>>>>> [289714.508413] BTRFS info (device dm-2): balance: ended with status: -28
>>>>>
>>>>> # cd /sys/fs/btrfs/e2843f83-aadf-418d-b36b-5642f906808f/allocation/ &&
>>>>> grep -Tr .
>>>>> metadata/raid1/used_bytes:	255838797824
>>>>> metadata/raid1/total_bytes:	441307889664
>>>>> metadata/disk_used:	511677595648
>>>>> metadata/bytes_pinned:	0
>>>>> metadata/bytes_used:	255838797824
>>>>> metadata/total_bytes_pinned:	999424
>>>>> metadata/disk_total:	882615779328
>>>>> metadata/total_bytes:	441307889664
>>>>> metadata/bytes_reserved:	4227072
>>>>> metadata/bytes_readonly:	65536
>>>>> metadata/bytes_may_use:	433502945280
>>>>> metadata/flags:	4
>>>>> system/raid1/used_bytes:	1474560
>>>>> system/raid1/total_bytes:	33554432
>>>>> system/disk_used:	2949120
>>>>> system/bytes_pinned:	0
>>>>> system/bytes_used:	1474560
>>>>> system/total_bytes_pinned:	0
>>>>> system/disk_total:	67108864
>>>>> system/total_bytes:	33554432
>>>>> system/bytes_reserved:	0
>>>>> system/bytes_readonly:	0
>>>>> system/bytes_may_use:	0
>>>>> system/flags:	2
>>>>> global_rsv_reserved:	536870912
>>>>> data/disk_used:	13645423230976
>>>>> data/bytes_pinned:	0
>>>>> data/bytes_used:	13645423230976
>>>>> data/single/used_bytes:	13645423230976
>>>>> data/single/total_bytes:	13661217226752
>>>>> data/total_bytes_pinned:	0
>>>>> data/disk_total:	13661217226752
>>>>> data/total_bytes:	13661217226752
>>>>> data/bytes_reserved:	117518336
>>>>> data/bytes_readonly:	196608
>>>>> data/bytes_may_use:	15064711168
>>>>> data/flags:	1
>>>>> global_rsv_size:	536870912
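>>>>>
>>>>> (Note the metadata numbers above: bytes_used 255838797824 (~238 GiB)
>>>>> plus bytes_may_use 433502945280 (~404 GiB) add up to well over
>>>>> total_bytes 441307889664 (~411 GiB), so from the reservation system's
>>>>> point of view metadata is fully committed even though only ~58% of it
>>>>> is actually used; presumably that is where the -28 (ENOSPC) comes
>>>>> from.)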
>>>>>
>>>>>
>>>>> Somewhat more frightening, this also happens with the system block groups:
>>>>>
>>>>> [288405.946776] BTRFS info (device dm-2): balance: ended with status: 0
>>>>> [289589.506357] BTRFS info (device dm-2): balance: start -musage=5 -susage=5
>>>>> [289589.905675] BTRFS info (device dm-2): relocating block group
>>>>> 9676759498752 flags system|raid1
>>>>> [289590.807033] BTRFS info (device dm-2): found 89 extents
>>>>> [289591.300212] BTRFS info (device dm-2): 16 enospc errors during balance
>>>>> [289591.300216] BTRFS info (device dm-2): balance: ended with status: -28
>>>>>
>>>>> Matt
>>>>>
>>>>> On 2/3/20 9:40 PM, Chris Murphy wrote:
>>>>>> A developer might find it useful to see this reproduced with the
>>>>>> enospc_debug mount option (see the remount example below), and, soon
>>>>>> after the enospc, the output from:
>>>>>>
>>>>>>  cd /sys/fs/btrfs/UUID/allocation/ && grep -Tr .
>>>>>>
>>>>>> yep, space then dot at the end
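>>>>>>
>>>>>> To enable the debug output, assuming the filesystem is already
>>>>>> mounted at /bigraid, something like:
>>>>>>
>>>>>>   # mount -o remount,enospc_debug /bigraid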
>>>>>>
>>>
> 
