Just wiping the slate clean to summarize:

1. We have a consistent (~1 in 3, maybe 1 in 2), reproducible corruption of *data extent* parity during a scrub with raid5. Goffredo and I have both reproduced it. It's a big bug. It might still be useful if someone else can reproduce it too.

Goffredo, can you file a bug at bugzilla.kernel.org and reference your bug thread? I don't know if the key developers know about this; it might be worth pinging them on IRC once the bug is filed.

Unknown if it affects balance, or raid6. And if it affects raid6, is p or q corrupted, or both? Unknown how this manifests on the metadata raid5 profile (only data raid5 was tested).

Presumably if there is metadata corruption that's fixed during a scrub, and its parity is overwritten with corrupt parity, then the next time there's a degraded state the file system would face plant somehow. And we've seen quite a few degraded raid5s (and even raid6s) face plant in inexplicable ways and we just kinda go, shit. Which is what the fs is doing when it encounters a pile of csum errors. It treats the csum errors as a signal to disregard the fs rather than maybe only being suspicious of the fs. Could it turn out that these file systems were recoverable, just that Btrfs wasn't tolerating any csum error and wouldn't proceed further?

2. The existing scrub code computes parity on-the-fly, compares it with what's on-disk, and overwrites the on-disk parity if there's a mismatch. If there's a mismatch, there's no message anywhere. It's a feature request to get a message on parity mismatches. An additional feature request would be to get a parity_error counter along the lines of the other error counters we have for scrub stats and dev stats.

3. I think it's a more significant change to get parity checksums stored somewhere. Right now the csum tree holds item type EXTENT_CSUM, but parity is not an extent; it's also not data, it's a variant of data. So it seems to me we'd need a new item type PARITY_CSUM to get it into the existing csum tree.
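
Going back to point 2 for a moment, the check-and-overwrite behaviour plus the requested counter could be modelled roughly like this. It's a toy Python sketch, not the kernel code: ScrubStats, xor_parity and scrub_stripe are made-up names, the data strips are assumed to be already verified against their csums, and only raid5 (plain XOR parity) is modelled.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class ScrubStats:
    # Hypothetical counter; this is the feature request, it does not exist today.
    parity_errors: int = 0


def xor_parity(data_strips):
    """Return the raid5 parity strip: the byte-wise XOR of all data strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_strips))


def scrub_stripe(data_strips, parity_on_disk, stats):
    """Recompute parity and compare it with the on-disk copy.

    On a mismatch, count it, log it, and return the recomputed parity
    (standing in for the overwrite). Otherwise return the on-disk parity.
    """
    expected = xor_parity(data_strips)
    if expected != parity_on_disk:
        stats.parity_errors += 1
        print(f"parity mismatch, rewriting parity (total: {stats.parity_errors})")
        return expected
    return parity_on_disk


# Two data strips of one stripe, with deliberately wrong parity on disk.
stats = ScrubStats()
strips = [b"\x01\x02", b"\x04\x08"]
repaired = scrub_stripe(strips, b"\x00\x00", stats)
```

The only difference from what scrub does now (as described in point 2) is the counter bump and the message; the comparison and the overwrite are unchanged.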

I'm also not sure what incompatibility a new item type would bring; presumably older kernels could mount such a volume ro safely but shouldn't write to it, and that includes btrfs check --repair, which should probably fail.

Chris Murphy
