On 08/14/2017 09:08 AM, Qu Wenruo wrote:
>
>> Supposing we log, for each BTRFS transaction, which "data NOCOW blocks" will be updated, together with their checksums: if a transaction is interrupted, we know which blocks have to be checked and can verify whether the checksum matches and correct any mismatch. Logging the checksum as well could help to identify whether:
>> - the data is old
>> - the data is updated
>> - the updated data is correct
>>
>> The same approach could also be used to solve the issue related to the infamous RAID5/6 write hole: by logging which blocks are updated, in case of an aborted transaction you know which parity has to be rebuilt.
>
> Indeed Liu is using a journal to solve the RAID5/6 write hole.
>
> But to address the lack-of-journal nature of btrfs, he introduced a journal device to handle it: since btrfs metadata is either written or trashed, we can't rely on the existing btrfs metadata to handle a journal.

Liu's solution is a lot heavier: with it, you need to write both the data and the parity twice. I am only suggesting to track which blocks are about to be updated, and only for the stripes involved in a RMW cycle. That is far less data to write (8 bytes vs 4 KiB).

> PS: This reminds me why ZFS is still using a journal (called the ZFS intent log) rather than the mandatory metadata CoW of btrfs.

From a theoretical point of view, a "pure" CoW filesystem doesn't need a journal. Unfortunately a RAID5/6 stripe update is a RMW cycle, so you need a journal to keep it in sync. The same is true for NOCOW files (and their checksums).

>
> Thanks,
> Qu

--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
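P.S. To make the size argument concrete, below is a purely hypothetical sketch in plain C (not actual btrfs code; all names such as intent_record, intent_log_add and intent_log_replay are invented for illustration). The idea is only that each block about to be rewritten in place (a stripe going through a RMW cycle, or a NOCOW data block) appends one small record to an intent log, and recovery after an interrupted transaction re-checks just the listed blocks instead of replaying full data and parity copies.

/*
 * Hypothetical sketch of the "track only the blocks to update" idea,
 * NOT actual btrfs code.  Before an in-place rewrite (RMW stripe update
 * or NOCOW data write) a small intent record is appended; on recovery
 * only the listed blocks are re-checked (parity rebuilt / checksum
 * verified).
 */
#include <stdint.h>
#include <stdio.h>

struct intent_record {
	uint64_t logical;	/* 8 bytes: block/stripe to re-check */
	uint32_t new_csum;	/* e.g. crc32c of the data being written */
};

#define LOG_CAPACITY 256

struct intent_log {
	struct intent_record rec[LOG_CAPACITY];
	unsigned int count;
};

/* record "this block is about to be overwritten in place" */
static int intent_log_add(struct intent_log *log, uint64_t logical,
			  uint32_t new_csum)
{
	if (log->count >= LOG_CAPACITY)
		return -1;	/* log full: flush it first */
	log->rec[log->count].logical = logical;
	log->rec[log->count].new_csum = new_csum;
	log->count++;
	return 0;
}

/* after an interrupted transaction: walk the log and re-check only
 * the listed blocks instead of the whole filesystem */
static void intent_log_replay(const struct intent_log *log,
			      void (*check_block)(uint64_t, uint32_t))
{
	for (unsigned int i = 0; i < log->count; i++)
		check_block(log->rec[i].logical, log->rec[i].new_csum);
}

static void check_block_stub(uint64_t logical, uint32_t expect_csum)
{
	printf("re-check block %llu (expected csum 0x%08x)\n",
	       (unsigned long long)logical, expect_csum);
}

int main(void)
{
	struct intent_log log = { .count = 0 };

	/* a RMW update touching two stripes logs two ~12-byte records,
	 * instead of writing the data and parity twice */
	intent_log_add(&log, 123456, 0xdeadbeef);
	intent_log_add(&log, 123457, 0xcafebabe);
	intent_log_replay(&log, check_block_stub);
	return 0;
}

The only point of the sketch is the record size: on the order of 8-16 bytes per touched stripe, versus a full journal that writes every 4 KiB data and parity block twice.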
