On Thu, May 12, 2016 at 11:49 AM, Richard A. Lochner <lochner@xxxxxxxxxx> wrote:

> I suspected, and I still suspect that the error occurred upon a
> metadata update that corrupted the checksum for the file, probably due
> to silent memory corruption. If the checksum was silently corrupted,
> it would be simply written to both drives causing this type of error.

Metadata is checksummed independently of data. So if the data isn't
updated, its checksum doesn't change; only the metadata checksum
changes.

>
> btrfs dmesg(s):
>
> [16510.334020] BTRFS warning (device sdb1): checksum error at logical
> 3037444042752 on dev /dev/sdb1, sector 4988789496, root 259, inode
> 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
> [16510.334043] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd
> 0, flush 0, corrupt 5, gen 0
> [16510.345662] BTRFS error (device sdb1): unable to fixup (regular)
> error at logical 3037444042752 on dev /dev/sdb1
>
> [17606.978439] BTRFS warning (device sdb1): checksum error at logical
> 3037444042752 on dev /dev/sdc1, sector 4988750584, root 259, inode
> 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
> [17606.978460] BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd
> 13, flush 0, corrupt 4, gen 0
> [17606.989497] BTRFS error (device sdb1): unable to fixup (regular)
> error at logical 3037444042752 on dev /dev/sdc1

This is confusing. Are these from the same boot? The later time has a
lower corrupt count. Can you just run 'dd if=sda4.img of=/dev/null' and
report all (new) messages in dmesg? It seems to me the errors for both
devices should show up at pretty much the same monotonic time.
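If it helps to line the new messages up per device, here is a rough
Python sketch that groups the btrfs checksum warnings and error
counters by device. The regular expressions are only assumptions based
on the two message formats quoted above, and the name summarize() is
made up for illustration. It expects one message per line, as dmesg
prints them, not the wrapped form mail clients produce.

```python
import re
from collections import defaultdict

# Assumed format of the "checksum error at logical ..." warnings quoted above.
CSUM_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] BTRFS warning \(device \S+\): "
    r"checksum error at logical (?P<logical>\d+) on dev (?P<dev>\S+), "
    r"sector (?P<sector>\d+)"
)
# Assumed format of the "bdev ... errs:" counter lines quoted above.
ERRS_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] BTRFS error \(device \S+\): "
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def summarize(dmesg_text):
    """Group checksum warnings and error counters by device path."""
    per_dev = defaultdict(lambda: {"csum": [], "counters": []})
    for line in dmesg_text.splitlines():
        m = CSUM_RE.search(line)
        if m:
            # (timestamp, logical address, sector on that device)
            per_dev[m["dev"]]["csum"].append(
                (float(m["ts"]), int(m["logical"]), int(m["sector"]))
            )
            continue
        m = ERRS_RE.search(line)
        if m:
            # (timestamp, read errors, corruption errors)
            per_dev[m["dev"]]["counters"].append(
                (float(m["ts"]), int(m["rd"]), int(m["corrupt"]))
            )
    return dict(per_dev)


# The messages from this thread, unwrapped to one per line.
SAMPLE = """\
[16510.334020] BTRFS warning (device sdb1): checksum error at logical 3037444042752 on dev /dev/sdb1, sector 4988789496, root 259, inode 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
[16510.334043] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
[17606.978439] BTRFS warning (device sdb1): checksum error at logical 3037444042752 on dev /dev/sdc1, sector 4988750584, root 259, inode 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
[17606.978460] BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 13, flush 0, corrupt 4, gen 0
"""

if __name__ == "__main__":
    for dev, info in sorted(summarize(SAMPLE).items()):
        print(dev, info)
```

Feeding it the output of 'dmesg' after the dd run should make it easy
to see whether both devices report the same logical address and how
far apart the timestamps are.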

Also, what do you get for these, for each device:

smartctl -l scterc /dev/sdX
cat /sys/block/sdX/device/timeout

-- 
Chris Murphy
