Re: btrfs dev sta not updating

On 24.06.20 г. 14:39 ч., Zygo Blaxell wrote:
> On Tue, Jun 23, 2020 at 02:13:04PM +0300, Nikolay Borisov wrote:
>>
>>
>> On 23.06.20 г. 12:48 ч., Russell Coker wrote:
>>> On Tuesday, 23 June 2020 6:17:00 PM AEST Nikolay Borisov wrote:
>>>>> In this case I'm getting application IO errors and lost data, so if the
>>>>> error count is designed to not count recovered errors then it's still not
>>>>> doing the right thing.
>>>>
>>>> In this case yes, however this was utterly not clear from your initial
>>>> email. In fact it seems you have omitted quite a lot of information. So
>>>> let's step back and start afresh. So first give information about your
>>>> current btrfs setup by giving the output of:
>>>>
>>>> btrfs fi usage /path/to/btrfs
>>>
>>> # btrfs fi usa .
>>> Overall:
>>>     Device size:                  62.50GiB
>>>     Device allocated:             19.02GiB
>>>     Device unallocated:           43.48GiB
>>>     Device missing:                  0.00B
>>>     Used:                         16.26GiB
>>>     Free (estimated):             44.25GiB      (min: 22.51GiB)
>>>     Data ratio:                       1.00
>>>     Metadata ratio:                   2.00
>>>     Global reserve:               17.06MiB      (used: 0.00B)
>>>
>>> Data,single: Size:17.01GiB, Used:16.23GiB (95.43%)
>>>    /dev/sdc1      17.01GiB
>>>
>>> Metadata,DUP: Size:1.00GiB, Used:17.19MiB (1.68%)
>>>    /dev/sdc1       2.00GiB
>>>
>>> System,DUP: Size:8.00MiB, Used:16.00KiB (0.20%)
>>>    /dev/sdc1      16.00MiB
>>>
>>> Unallocated:
>>>    /dev/sdc1      43.48GiB
>>
>> Do you use compression on this filesystem, i.e. have you mounted with
>> the -o compress= option?
>>
>> Based on this data alone it's evident that you don't really have mirrors
>> of the data, in this case having experienced the checksum errors should
>> have indeed resulted in error counters being incremented. I'll look into
>> this.
> 
> In commit 0cc068e6ee59 "btrfs: don't report readahead errors and don't
> update statistics" we stopped counting errors if they occur during
> readahead.  If there's a mirror available, we do still correct errors
> in that case.  Errors in readahead are fairly common, e.g. there are
> usually a few during lvm pvmove operations, so it maybe makes sense
> not to count them; however, if the errors are not counted, they should
> also not be repaired.  Instead, they should be repaired only during
> non-readahead reads (i.e. when the repairs will be counted in dev stats).
> Repairing errors without counting is bad because it hides an important
> indicator of device failure.
> 
> This thread might be a different issue since there aren't any mirrors
> with single data, but if you're looking at dev stats correctness anyway...

Turns out this is a genuine bug: error stats are only ever updated in
btrfs_end_bio, which happens well before checksums are checked. In fact,
at the point where we check checksums
(end_bio_extent_readpage->readpage_end_io_hook, i.e.
btrfs_readpage_end_io_hook) we don't (currently) have enough context to
increment the error counters. I'm currently testing a tentative fix for this.
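
To make the ordering problem concrete, here is a toy model in Python. It is
not kernel code and not the fix under test: the names Bio, DevStats, end_bio,
end_io_hook and end_io_hook_with_dev are hypothetical stand-ins, and
zlib.crc32 merely substitutes for the filesystem's real checksum. It only
shows why a silently corrupted block leaves the counters at zero when the
checksum stage has no device in scope, and what changes if the device is
carried along to that stage.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DevStats:
    # Two of the counters reported by `btrfs device stats`.
    read_io_errs: int = 0
    corruption_errs: int = 0


@dataclass
class Bio:
    # Hypothetical stand-in for a completed read request.
    dev_id: int
    data: bytes
    io_error: bool = False


def end_bio(bio, stats):
    """Stage 1: I/O completion. The device is known here, so
    transport-level errors can be counted."""
    if bio.io_error:
        stats[bio.dev_id].read_io_errs += 1
    # Only the payload moves on; the device context is dropped.
    return bio.data


def end_io_hook(data, expected_csum):
    """Stage 2: checksum verification. No device is in scope,
    so a mismatch cannot be attributed to any counter."""
    return zlib.crc32(data) == expected_csum


def end_io_hook_with_dev(data, expected_csum, dev_id, stats):
    """Assumed variant: the device id is passed down, so a
    mismatch can bump corruption_errs."""
    ok = zlib.crc32(data) == expected_csum
    if not ok:
        stats[dev_id].corruption_errs += 1
    return ok


def demo():
    stats = {1: DevStats()}
    expected = zlib.crc32(b"payload")
    # Silent corruption: the read "succeeds" but the data is wrong.
    bio = Bio(dev_id=1, data=b"pAyload")
    data = end_bio(bio, stats)
    ok = end_io_hook(data, expected)
    before = stats[1].corruption_errs
    end_io_hook_with_dev(data, expected, 1, stats)
    after = stats[1].corruption_errs
    return ok, before, after
```

Calling demo() in this model shows the checksum failing while
corruption_errs is unchanged after the context-free hook, and incremented
only once the device id reaches the verification step.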

> 
>> <snip>
> 


