On 3/28/20 8:35 PM, Zygo Blaxell wrote:
> On Fri, Mar 27, 2020 at 11:29:52AM +0100, Holger Hoffstätte wrote:
>> On 3/26/20 11:21 PM, Hans van Kranenburg wrote:
>>> 2) Metadata "cluster allocator" write behavior:
>>>
>>> *empty_cluster = SZ_64K # nossd
>>> *empty_cluster = SZ_2M # ssd
>>>
>>> This happens in extent-tree.c.
>>
>> 2M used to be a common erase block size on SSDs. Or maybe it's just
>> a nice round number.. ¯\(ツ)/¯
>
> As a side-effect, 2M write clusters close the write hole on raid5/6 if you
> have an array that is a power of 2 data disks wide. This capability is
> wasted when it's only available through the 'ssd' mount option.

Search for SSD_SPREAD in free-space-cache.c. There's this cont1_bytes
which is a fallback, so you'll have to run full SSD_SPREAD mode for this
to happen IINM.

See https://www.spinics.net/lists/linux-btrfs/msg70624.html for a huge
braindump.

While running Linux 4.9 back then, I had to actually use 'ssd_spread'
for metadata (not for data, possible thanks to that 'bug') to prevent
metadata writes from running around in circles while writing the extent
tree. With 4.19, I can just use 'ssd', and TBH I have no idea what change
in between got rid of that insane amount of write overhead.

So, I never continued researching the behavior of the different options
(empty_cluster, cont1_bytes combinations).

> The behavior could be quite useful if it was properly integrated with
> the raid5/6 stuff: set *empty_cluster = block group data width, make
> sure it's aligned to raid5/6 stripe boundaries, and use it for both data
> and metadata.
>
> It works by effectively making partially-filled clusters read-only.
> If we can guarantee that clusters are aligned to raid5/6 data/parity block
> boundaries, then btrfs can't allocate new data in partially filled raid5/6
> stripes, so it won't break the parity relation and won't have write hole.
>
>> cheers,
>> Holger
>>
>> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=08635bae0b4ceb08fe4c156a11c83baec397d36d
>>
>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ba8a9d07954397f0645cf62bcc1ef536e8e7ba24
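
To put numbers on the alignment argument above, here is a rough sketch of the
arithmetic only, not of anything in the kernel. It assumes a 64K per-device
stripe element (BTRFS_STRIPE_LEN) and that clusters start on a full-stripe
boundary; the function names are made up for illustration. Under those
assumptions a 2M cluster covers only whole stripes when the number of data
disks divides 32, which includes the power-of-2 widths up to 32.

```python
SZ_64K = 64 * 1024
SZ_2M = 2 * 1024 * 1024

# Assumption: 64K per-device stripe element, as BTRFS_STRIPE_LEN in the kernel.
STRIPE_LEN = SZ_64K


def full_stripe_width(data_disks: int, stripe_len: int = STRIPE_LEN) -> int:
    """Bytes of data (parity excluded) in one full raid5/6 stripe."""
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return data_disks * stripe_len


def cluster_covers_whole_stripes(cluster_bytes: int, data_disks: int,
                                 stripe_len: int = STRIPE_LEN) -> bool:
    """True if a stripe-aligned cluster never ends inside a stripe."""
    return cluster_bytes % full_stripe_width(data_disks, stripe_len) == 0


def aligned_cluster_size(data_disks: int, minimum: int = SZ_2M,
                         stripe_len: int = STRIPE_LEN) -> int:
    """Smallest multiple of the full stripe width that is >= minimum."""
    width = full_stripe_width(data_disks, stripe_len)
    return -(-minimum // width) * width  # ceiling division, then scale back


if __name__ == "__main__":
    for disks in (2, 3, 4, 5, 8):
        print(disks,
              cluster_covers_whole_stripes(SZ_2M, disks),
              aligned_cluster_size(disks))
```

With 3 data disks the stripe is 192K wide, so a 2M cluster ends partway into a
stripe; rounding the cluster up to a multiple of the stripe width is the
"*empty_cluster = block group data width" idea in its simplest form.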
