On Tue, Jun 23, 2020 at 02:13:04PM +0300, Nikolay Borisov wrote:
>
>
> On 23.06.20 г. 12:48 ч., Russell Coker wrote:
> > On Tuesday, 23 June 2020 6:17:00 PM AEST Nikolay Borisov wrote:
> >>> In this case I'm getting application IO errors and lost data, so if the
> >>> error count is designed to not count recovered errors then it's still not
> >>> doing the right thing.
> >>
> >> In this case yes, however this was utterly not clear from your initial
> >> email. In fact it seems you have omitted quite a lot of information. So
> >> let's step back and start afresh. So first give information about your
> >> current btrfs setup by giving the output of:
> >>
> >> btrfs fi usage /path/to/btrfs
> >
> > # btrfs fi usa .
> > Overall:
> >     Device size:                  62.50GiB
> >     Device allocated:             19.02GiB
> >     Device unallocated:           43.48GiB
> >     Device missing:                  0.00B
> >     Used:                         16.26GiB
> >     Free (estimated):             44.25GiB      (min: 22.51GiB)
> >     Data ratio:                       1.00
> >     Metadata ratio:                   2.00
> >     Global reserve:               17.06MiB      (used: 0.00B)
> >
> > Data,single: Size:17.01GiB, Used:16.23GiB (95.43%)
> >    /dev/sdc1      17.01GiB
> >
> > Metadata,DUP: Size:1.00GiB, Used:17.19MiB (1.68%)
> >    /dev/sdc1       2.00GiB
> >
> > System,DUP: Size:8.00MiB, Used:16.00KiB (0.20%)
> >    /dev/sdc1      16.00MiB
> >
> > Unallocated:
> >    /dev/sdc1      43.48GiB
>
> Do you use compression on this filesystem i.e have you mounted with
> -ocompression= option ?
>
> Based on this data alone it's evident that you don't really have mirrors
> of the data, in this case having experienced the checksum errors should
> have indeed resulted in error counters being incremented. I'll look into
> this.

In commit 0cc068e6ee59 "btrfs: don't report readahead errors and don't
update statistics" we stopped counting errors if they occur during
readahead. If there's a mirror available, we do still correct errors
in that case.

Errors in readahead are fairly common, e.g. there are usually a few during
lvm pvmove operations, so it maybe makes sense not to count them; however,
if the errors are not counted, they should also not be repaired. Instead,
they should be repaired only during non-readahead reads (i.e. when the
repairs will be counted in dev stats). Repairing errors without counting
them is bad because it hides an important indicator of device failure.

This thread might be a different issue since there aren't any mirrors
with single data, but if you're looking at dev stats correctness anyway...

> <snip>
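
To make the proposal above concrete, here is a toy model of the policy in Python. It is only a sketch and not kernel code: `DevStats`, `handle_csum_error` and the `readahead` / `has_mirror` flags are names made up for illustration, and the single counter merely stands in for the corruption counter shown by dev stats. The point is that the "count" and "repair" decisions are taken together, so a repair can never happen without a matching counter increment.

```python
from dataclasses import dataclass


@dataclass
class DevStats:
    # Hypothetical stand-in for the per-device corruption counter.
    corruption_errs: int = 0


def handle_csum_error(stats: DevStats, *, readahead: bool, has_mirror: bool) -> bool:
    """Handle one checksum failure; return True if the block gets repaired.

    Illustrative policy only: readahead failures are neither counted nor
    repaired, so the later non-readahead read still sees the bad block
    and both counts and repairs it.
    """
    if readahead:
        # Speculative read: leave the bad copy alone and do not touch stats.
        return False
    # Real read: always count the error...
    stats.corruption_errs += 1
    # ...and repair only if another copy exists (e.g. DUP or RAID1).
    return has_mirror


stats = DevStats()

# Readahead trips over a bad block: ignored entirely.
handle_csum_error(stats, readahead=True, has_mirror=True)

# The application then reads the same block: counted and repaired.
repaired = handle_csum_error(stats, readahead=False, has_mirror=True)

print(repaired, stats.corruption_errs)
```

With single data, as in this thread, `has_mirror` would be false: the non-readahead read is still counted, but nothing can be repaired and the application sees the IO error.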
