cc: linux-btrfs

---------- Forwarded message ----------
From: Shilong Wang <wangshilong1991@xxxxxxxxx>
Date: 2013/12/1
Subject: Re: 2 errors when scrubbing - but I don't know what they mean
To: Sebastian Ochmann <ochmann@xxxxxxxxxxxxxxxxxxxxxx>

Hello Sebastian,

2013/11/30 Sebastian Ochmann <ochmann@xxxxxxxxxxxxxxxxxxxxxx>:
> Hello,
>
> thank you for your input. I didn't know that btrfs keeps the error counters
> over mounts/reboots, but that's nice.
>
> I'm still trying to figure out how such a generation error may occur in the
> first place. One thing I noticed looking at the btrfs code is that the
> generation error counter will only get incremented in the actual scrubbing
> code (either in "scrub_checksum_super" or in "scrub_handle_errored_block",
> both in scrub.c - please correct me if I'm wrong, I'm not a btrfs dev).

Right, scrub reads the superblock with a bio rather than through the page
cache. This means we reread the superblock from disk. If a checksum mismatch
happens, it can be for one of the following reasons:

1. Some read error happened while scrubbing, while the superblocks are
   actually good.
2. During the last transaction, when we were writing the superblocks to
   disk, some silent corruption happened.
3. Some unexpected operation wrote data to the superblocks directly, for
   example 'dd if=/dev/zero of=/dev/ bs=1 seek=65536 count=4k' or something
   like this.

Actually, at boot time the superblock should be fine, because we verify its
checksum when we first use it. If the checksum mismatches, we refuse to
mount. After mounting, the superblocks are cached in memory until you
unmount the filesystem.

So the ideal case is that your disk is fine and the superblocks get
rewritten during the next transaction. After the next unmount, you can then
mount the filesystem again successfully!

However, if you see such superblock checksum mismatches very often during
scrub, there may be something wrong with the disk!
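
For illustration, the superblock checks discussed above can be approximated
offline. The following is a minimal Python sketch, not the kernel's scrub
code. It assumes the primary superblock sits at byte offset 65536, is 4096
bytes long, and stores a little-endian CRC32C of bytes 32..4095 in its first
4 bytes (the default checksum type). The names crc32c, make_super,
check_super and read_super are hypothetical helpers made up for this
example, and make_super builds a synthetic block for testing rather than a
real superblock.

```python
import struct

SUPER_OFFSET = 64 * 1024   # assumed offset of the primary superblock
SUPER_SIZE = 4096          # assumed size of one superblock copy
CSUM_SIZE = 32             # checksum field; CRC32C uses the first 4 bytes
MAGIC = b"_BHRfS_M"        # btrfs magic, at byte 64 of the superblock


def _make_table():
    # Table for the reflected CRC-32C (Castagnoli) polynomial.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data):
    """Standard CRC-32C: initial value and final XOR are both 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def make_super(generation, bytenr=SUPER_OFFSET):
    """Build a synthetic, mostly empty superblock (hypothetical test helper)."""
    block = bytearray(SUPER_SIZE)
    struct.pack_into("<Q", block, 48, bytenr)      # where this copy lives
    block[64:72] = MAGIC
    struct.pack_into("<Q", block, 72, generation)  # transaction generation
    struct.pack_into("<I", block, 0, crc32c(bytes(block[CSUM_SIZE:])))
    return bytes(block)


def check_super(block, expected_bytenr=SUPER_OFFSET, expected_generation=None):
    """Return a list of problems found in one superblock copy."""
    if len(block) != SUPER_SIZE:
        return ["short read"]
    problems = []
    stored_csum = struct.unpack_from("<I", block, 0)[0]
    if stored_csum != crc32c(block[CSUM_SIZE:]):
        problems.append("checksum mismatch")
    if block[64:72] != MAGIC:
        problems.append("bad magic")
    if struct.unpack_from("<Q", block, 48)[0] != expected_bytenr:
        problems.append("bytenr mismatch")
    generation = struct.unpack_from("<Q", block, 72)[0]
    if expected_generation is not None and generation != expected_generation:
        problems.append("generation mismatch")
    return problems


def read_super(path):
    """Read the primary superblock from a device or image file (read-only)."""
    with open(path, "rb") as f:
        f.seek(SUPER_OFFSET)
        return f.read(SUPER_SIZE)
```

Under these assumptions, check_super(read_super(path)) returns an empty list
for a healthy primary superblock, and a block that has been overwritten with
zeros is reported as damaged.
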

> Also, the dmesg errors I saw were not there at boot time, but about 10
> minutes after boot which was about the time when I started the scrub so I'm
> pretty sure that it was the scrub that detected the errors.
>
> The question remains what can cause superblock/gen errors. Sure it could be
> "some" read error, but I'd really like to make sure that it's not a
> systematic error. I wasn't able to reproduce it yet though.

You can reproduce this by running
'dd if=/dev/zero of=/dev/sd* bs=1 seek=65536 count=4k' before the btrfs
scrub.

Thanks,
Wang

>
> Best
> Sebastian

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
