Hello everyone,

I've just done a fairly large push to btrfs-unstable. The new code can recover from checksum errors on data and metadata as long as there is a second copy (either mirrored on another device or duplicated on the same spindle).

A big part of that was getting rid of the bogus checksum failure messages for metadata blocks. With the current code, if you get a checksum failure, the block really should be bad.

But, for leaves and nodes bigger than the page size, I need proper fine-grained locking before recovery from another mirror can work correctly. So, I've pushed a changeset to btrfs-progs that sets the leaf size and node size to the page size on mkfs for now. If you aren't testing the mirroring code, you don't have to worry about this. v0.14 will probably ship with the smaller leaf size by default.

All checksum verification has been moved to async worker threads. For data blocks it was previously done at interrupt time, and the new code is dramatically faster on large IO subsystems. My 4 drive array was CPU bound on streaming reads before and only ran at 100MB/s. With the new code, I get 200MB/s (disk speed).

Still todo: properly mirror the super blocks. If you're testing the mirror code, don't overwrite anything before 20k into the drive; the super blocks don't get mirrored today. Also, btrfs-progs doesn't retry IO on different mirrors yet, only the kernel does this. I'll tackle both problems tomorrow.

A simple test:

mkfs.btrfs --data=raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt
(copy some data over)
umount /mnt
dd if=/dev/zero of=/dev/sdc seek=1M count=500
mount /dev/sdb /mnt
(see if the data is still there)

Bonus points for anyone willing to write a program or kernel patch that randomly corrupts blocks from mirrored chunks.

-chris
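
On the bonus-points item above, here is a minimal sketch of what a userspace block corrupter could look like, in Python. Treat it as an illustration under assumptions rather than anything from btrfs-progs: the function name corrupt_random_blocks, its parameters, the 4k block size and the 1MB skip default are all hypothetical. It also knows nothing about the chunk layout, so it picks blocks anywhere past the skipped region instead of only from mirrored chunks; the skip exists only to stay clear of the unmirrored super block area mentioned above.

```python
import os
import random


def corrupt_random_blocks(path, count, block_size=4096, skip_bytes=1 << 20, seed=None):
    """Overwrite `count` randomly chosen blocks of `path` with random bytes.

    Hypothetical helper: `path` may be a block device or an image file.
    Blocks that start before `skip_bytes` are never touched.
    Returns the sorted byte offsets of the blocks that were overwritten.
    """
    rng = random.Random(seed)  # a fixed seed makes a run reproducible
    with open(path, "r+b") as dev:
        size = dev.seek(0, os.SEEK_END)       # total size in bytes
        first = -(-skip_bytes // block_size)  # first block past the skipped area, rounded up
        last = size // block_size             # one past the last whole block
        if last - first < count:
            raise ValueError("not enough blocks to corrupt")
        # Sample without replacement so every chosen block is distinct.
        blocks = sorted(rng.sample(range(first, last), count))
        for block in blocks:
            dev.seek(block * block_size)
            dev.write(os.urandom(block_size))
        dev.flush()
        os.fsync(dev.fileno())
    return [block * block_size for block in blocks]
```

Something like corrupt_random_blocks("/dev/sdc", 100) run against one drive of the unmounted test pair could stand in for the dd step in the test above; the returned offsets tell you which blocks to expect checksum failures on. Only point it at a scratch device.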
