On 12/11/2019 16.13, Hubert Tonneau wrote:
> Hi,
>
> In order to close the RAID5 write hole, I propose to add a mount option that would change RAID5 (and RAID6) behaviour:
>
> . When overwriting a RAID5 stripe, first convert it to RAID1 (convert it to RAID1C3 if it was RAID6)
You can't overwrite and convert an existing stripe in place, for two kinds of reasons:

1) you still have to protect the overwrite of the stripe from the write hole
2) depending on the layout, a raid1 stripe consumes more space than a raid5 stripe with equal "capacity"

So you have to write the data (temporarily) to another place. This is not different from what Qu proposed a few years ago:

https://www.mail-archive.com/linux-btrfs@xxxxxxxxxxxxxxx/msg66472.html
[Btrfs: Add journal for raid5/6 writes]

where he added a device for logging the writes. Unfortunately, this means doubling the writes, which for a COW filesystem (which already suffers from this kind of issue) would be a big performance penalty.

Instead, I would like to investigate the idea of COW-ing the stripe: instead of updating the stripe in place, why not write the new stripe to another place and then update the data extent to point to the new data? Of course, this would work only for the data and not for the metadata.

Pros: the data is written only once
Cons: the pressure on the metadata would increase; the fragmentation would increase
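
To make the difference concrete, here is a toy sketch in Python. It is not btrfs code: the two-data-block stripe, the `stripes` dictionary, the `extent` pointer and the helper names are all assumptions made up for illustration. It only shows why an interrupted in-place update leaves stale parity, while a COW-ed stripe stays consistent whether the crash happens before or after the pointer switch.

```python
from functools import reduce


def xor_parity(blocks):
    """Bytewise XOR of equally sized blocks (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def rebuild(data_blocks, parity, lost):
    """Reconstruct data block `lost` from the surviving blocks and the parity."""
    survivors = [b for i, b in enumerate(data_blocks) if i != lost]
    return xor_parity(survivors + [parity])


# Toy stripe: two data blocks plus one parity block (hypothetical layout).
old = [b"AAAA", b"BBBB"]
old_parity = xor_parity(old)

# In-place update: block 0 is rewritten, then the system crashes before
# the parity is rewritten. The parity on disk is now stale.
torn = [b"CCCC", old[1]]
# If the disk holding block 1 fails later, the rebuild returns garbage,
# although block 1 was never touched by the write: this is the write hole.
hole_result = rebuild(torn, old_parity, lost=1)

# COW-ed stripe: the new stripe goes to a free location, the old one is
# left untouched, and only then is the extent pointer switched.
stripes = {"loc0": (old, old_parity)}
extent = "loc0"

new = [b"CCCC", old[1]]
stripes["loc1"] = (new, xor_parity(new))

# Crash before the pointer switch: the extent still refers to the intact
# old stripe, so a rebuild gives the old (correct) data.
data, parity = stripes[extent]
cow_before = rebuild(data, parity, lost=1)

# After the pointer switch the extent refers to a fully written stripe.
extent = "loc1"
data, parity = stripes[extent]
cow_after = rebuild(data, parity, lost=0)
```

In this model the pointer switch stands in for the metadata update, which is already COW; that is where the extra metadata pressure listed under the cons comes from.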
> . Have a background process that converts RAID1 stripes to RAID5 (RAID1C3 to RAID6)
>
> Expected advantages are:
> . the low-level feature set basically remains the same
> . the filesystem format remains the same
> . old kernels and btrfs-progs would not be disturbed
>
> The end result would be a mixed filesystem where active parts are RAID1 and archived ones are RAID5.
>
> Regards,
> Hubert Tonneau
-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
