"Corrected" errors persist after scrubbing

Below (and also attached because of formatting) is an example of `btrfs
scrub` incorrectly reporting that errors have been corrected.

In this example, /dev/md127 is the device created by running:
mdadm --build /dev/md0 --level=faulty --raid-devices=1 /dev/loop0

The filesystem is btrfs RAID1 across /dev/md127 and /dev/loop1.

# mdadm --grow /dev/md0 --layout=rp400
layout for /dev/md0 set to 12803
# btrfs scrub start -Bd /mnt/tmp
scrub device /dev/md127 (id 1) done
        scrub started at Fri May  5 19:23:54 2017 and finished after 00:00:01
        total bytes scrubbed: 200.47MiB with 8 errors
        error details: read=8
        corrected errors: 8, uncorrectable errors: 0, unverified errors: 248
scrub device /dev/loop1 (id 2) done
        scrub started at Fri May  5 19:23:54 2017 and finished after 00:00:01
        total bytes scrubbed: 200.47MiB with 0 errors
WARNING: errors detected during scrubbing, corrected
# ### But the errors haven't really been corrected, they're still there:
# mdadm --grow /dev/md0 --layout=clear # Stop producing additional errors
layout for /dev/md0 set to 31
# btrfs scrub start -Bd /mnt/tmp
scrub device /dev/md127 (id 1) done
        scrub started at Fri May  5 19:24:24 2017 and finished after 00:00:00
        total bytes scrubbed: 200.47MiB with 8 errors
        error details: read=8
        corrected errors: 8, uncorrectable errors: 0, unverified errors: 248
scrub device /dev/loop1 (id 2) done
        scrub started at Fri May  5 19:24:24 2017 and finished after 00:00:00
        total bytes scrubbed: 200.47MiB with 0 errors
WARNING: errors detected during scrubbing, corrected
#
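
The comparison between the two runs above can be scripted. The following
is only a rough sketch: the function name scrub_total_errors is made up
for illustration, and it assumes the "total bytes scrubbed: ... with N
errors" line format shown in the transcript, which may differ in other
btrfs-progs versions.

```shell
#!/bin/sh
# Sum the per-device error counts from `btrfs scrub start -Bd` output
# read on stdin. Hypothetical helper; it relies on the
# "total bytes scrubbed: ... with N errors" lines shown in the transcript.
scrub_total_errors() {
    awk '/total bytes scrubbed:/ {
             # The error count is the field that follows the word "with".
             for (i = 1; i < NF; i++)
                 if ($i == "with") total += $(i + 1)
         }
         END { print total + 0 }'
}

# Example use (needs root and a mounted filesystem, so left commented out):
#   btrfs scrub start -Bd /mnt/tmp >/dev/null        # first pass "corrects"
#   again=$(btrfs scrub start -Bd /mnt/tmp | scrub_total_errors)
#   [ "$again" -eq 0 ] || echo "errors persist after a 'corrected' scrub" >&2
```

Fed the output of the second scrub above, this would print 8, so the
check would flag the errors as still present.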

Since scrub exists to find read errors, I would expect it to read back
any correction it writes before reporting the error as corrected.
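
To illustrate what I mean by reading back, here is a minimal sketch using
plain files and dd. It is not how btrfs scrub is implemented; the
function name repair_and_verify and the default 4096-byte block size are
assumptions made for the example.

```shell
#!/bin/sh
# Overwrite one block of TARGET with the same block from GOOD, then read
# the block back and only report success if the contents match.
# Hypothetical helper: arguments are target, good copy, block number and
# an optional block size (assumed default: 4096 bytes).
repair_and_verify() {
    target=$1
    good=$2
    block=$3
    bs=${4:-4096}

    # Write the known-good block over the bad one without truncating.
    dd if="$good" of="$target" bs="$bs" skip="$block" seek="$block" \
       count=1 conv=notrunc 2>/dev/null || return 1

    # Read the block back from both copies and compare checksums.
    # On a real device the read-back would have to bypass the page cache
    # (for example with direct I/O), otherwise it proves nothing.
    want=$(dd if="$good" bs="$bs" skip="$block" count=1 2>/dev/null | cksum)
    got=$(dd if="$target" bs="$bs" skip="$block" count=1 2>/dev/null | cksum)

    # "Corrected" only if the read-back matches; otherwise report failure.
    [ "$want" = "$got" ]
}
```

A caller would count the block as corrected only when the function
returns 0, and as uncorrectable otherwise.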

I understand that HDDs have a pool of non-LBA-addressable spare sectors
set aside to remap bad physical sectors, but the size of this pool is
fixed by the manufacturer (who makes money from sales of new drives).

However, I don't believe it is sufficient to blindly trust that the
underlying HDD still has spare reallocatable sectors, or that the
hardware will always write data correctly, given that the purpose of
scrub is to verify and repair data.

At a minimum, shouldn't these 8 "corrected errors" be listed as
"uncorrectable errors" to inform the sysadmin that data integrity has
degraded (e.g. in this RAID1 example the data is no longer duplicated)?

Ideally, I would hope that the blocks with uncorrectable errors are
marked as bad and fresh blocks are used to maintain integrity.

-- 
Regards,

Tom Hale
