On 03/14/2018 08:27 PM, Austin S. Hemmelgarn wrote:
> On 2018-03-14 14:39, Goffredo Baroncelli wrote:
>> On 03/14/2018 01:02 PM, Austin S. Hemmelgarn wrote:
>> [...]
>>>>
>>>> In btrfs, a checksum mismatch creates an -EIO error during the
>>>> reading. In a conventional filesystem (or a btrfs filesystem w/o
>>>> datasum) there is no checksum, so this problem doesn't exist.
>>>>
>>>> I am curious how ZFS solves this problem.
>>> It doesn't support disabling COW or the O_DIRECT flag, so it just
>>> never has the problem in the first place.
>>
>> I would like to perform some tests; however, I think that you are
>> right. If you take a "double buffering" approach (copy the data into
>> the page cache, compute the checksum, then write the data to disk),
>> the mismatch should not happen. Of course this is incompatible with
>> O_DIRECT, but disabling O_DIRECT is only a prerequisite for the
>> "double buffering"; alone it might not be sufficient. What about
>> mmap? Are we sure that it does double buffering?
> There's a whole lot of applications that would be showing some pretty
> serious issues if checksumming didn't work correctly with mmap(), so I
> think it does work correctly given that we don't have hordes of angry
> users and sysadmins beating down the doors.

I tried updating a page and writing it out in parallel from different
threads, and I was unable to reproduce a checksum mismatch, so it seems
that mmap is safe from this point of view.

>>
>> I would prefer that btrfs doesn't allow O_DIRECT with the COW files.
>> I prefer this to the checksum mismatch bug.
> This is only reasonable if you are writing to the files. Checksums
> appear to be checked on O_DIRECT reads, and outside of databases and
> VMs, read-only access accounts for a significant percentage of
> O_DIRECT usage, partly because it is needed for AIO support (nginx for
> example can serve files using AIO and O_DIRECT and gets a pretty
> serious performance boost on heavily loaded systems by doing so).
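
To make the double-buffering argument concrete, here is a toy model of
the race in Python. It is only a sketch under my own assumptions, not
kernel code: zlib.crc32 stands in for the btrfs data checksum (btrfs
uses crc32c by default), the "concurrent" modification is simulated by
a callback that runs between the checksum and the write, and the names
write_in_place, write_double_buffered, verify and flip_first_byte are
hypothetical.

```python
import zlib


def write_in_place(buf, mutate):
    """O_DIRECT-like path: checksum the caller's buffer, then write that same buffer."""
    csum = zlib.crc32(buf)   # checksum is computed first
    mutate(buf)              # the application modifies the buffer while the I/O is in flight
    on_disk = bytes(buf)     # the device then reads the modified buffer
    return csum, on_disk


def write_double_buffered(buf, mutate):
    """Buffered path: copy the data first, then checksum and write the private copy."""
    snapshot = bytes(buf)    # private copy (the role played by the page cache)
    csum = zlib.crc32(snapshot)
    mutate(buf)              # later changes touch only the caller's buffer
    on_disk = snapshot       # what reaches the disk is exactly what was checksummed
    return csum, on_disk


def verify(csum, on_disk):
    """Read-side check: a mismatch here is what would surface as -EIO."""
    return zlib.crc32(on_disk) == csum


def flip_first_byte(buf):
    """Simulated concurrent writer: change one byte of the buffer."""
    buf[0] ^= 0xFF


if __name__ == "__main__":
    csum, data = write_in_place(bytearray(b"A" * 4096), flip_first_byte)
    print("in-place write verifies:", verify(csum, data))

    csum, data = write_double_buffered(bytearray(b"A" * 4096), flip_first_byte)
    print("double-buffered write verifies:", verify(csum, data))
```

In this model the in-place write fails verification because the data
that lands on disk is not the data that was checksummed, while the
double-buffered write always verifies, since the copy is taken before
the checksum is computed.
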

So O_DIRECT should be unsupported/ignored only for writes? It could be
a good compromise...

BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
