On Wed, Dec 11, 2019 at 04:11:05PM +0300, Cerem Cem ASLAN wrote:
> This is the second time after a year that the server's disk throws
> "INPUT OUTPUT ERROR" and "btrfs scrub" finds some uncorrectable errors
> along with some corrected errors. However, "smartctl -x" displays
> "SMART overall-health self-assessment test result: PASSED".
>
> Should we interpret "btrfs scrub"'s "uncorrectable error count" as
> "time to replace the disk" or are those unrelated events?

"btrfs scrub" operates on a higher layer, and can detect more errors, some
of which may have a cause elsewhere.

For example, dodgy memory very often corrupts data this way; you can retry
the scrub to see if the corruption happened during write (so the data is
lost) or during read (so retrying should work).  In that case, you may want
to test and/or replace your memory, motherboard, processor, etc.

Or, the disk's firmware may fail to detect errors.  It's supposed to verify
the disk's internal checksums, but detecting errors is another place where a
dodgy manufacturer can shave some costs -- either intentionally, or by
neglecting testing.

Or, some buggy software (which may even include btrfs itself, albeit
unlikely) might scribble on the wrong areas of the disk.

Or...

Anyway, all you know for sure is that you have _some_ breakage, which a
filesystem without data checksums would fail to detect, allowing silent data
corruption.  Finding the cause is another story.


Meow!
-- 
⢀⣴⠾⠻⢶⣦⠀ A MAP07 (Dead Simple) raspberry tincture recipe: 0.5l 95% alcohol,
⣾⠁⢠⠒⠀⣿⡁ 1kg raspberries, 0.4kg sugar; put into a big jar for 1 month.
⢿⡄⠘⠷⠚⠋⠀ Filter out and throw away the fruits (can dump them into a cake,
⠈⠳⣄⠀⠀⠀⠀ etc), let the drink age at least 3-6 months.
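
To make the "retry the scrub" advice concrete, here is a minimal sketch of how the results of two scrub runs could be compared. Everything in it is an assumption for illustration: the function name interpret_rescrub is hypothetical, and the inputs are simply whatever block identifiers you collected yourself for the uncorrectable errors of each run (for example, addresses noted from the kernel log). It does not call btrfs or parse its output. The "new" bucket goes beyond what the mail says and is only a guess at a useful extra signal.

```python
def interpret_rescrub(first_run_bad, second_run_bad):
    """Compare the blocks reported uncorrectable by two scrub runs.

    Both arguments are iterables of block identifiers gathered by hand;
    how they are collected is outside the scope of this sketch.
    """
    first, second = set(first_run_bad), set(second_run_bad)
    return {
        # Bad in both runs: the bad data is really on disk, e.g. it was
        # corrupted during write, so that copy is lost.
        "persistent": sorted(first & second),
        # Bad only in the first run: the read itself went wrong, and
        # retrying worked. Suspect memory, motherboard, processor, etc.
        "transient": sorted(first - second),
        # Bad only in the second run (assumption, not from the mail):
        # something may still be actively corrupting data.
        "new": sorted(second - first),
    }


if __name__ == "__main__":
    # Made-up identifiers, purely to show the call.
    result = interpret_rescrub({4096, 8192, 12288}, {8192, 16384})
    for kind, blocks in result.items():
        print(kind, blocks)
```

Either way, a non-empty result only confirms that there is _some_ breakage; it does not tell you which component is at fault.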
