On 10/05/2018 06:40 AM, Duncan wrote:
> Wilson, Ellis posted on Thu, 04 Oct 2018 21:33:29 +0000 as excerpted:
>
>> Hi all,
>>
>> I'm attempting to understand a roughly 30% degradation in BTRFS RAID0
>> for large read I/Os across six disks compared with ext4 atop mdadm
>> RAID0.
>>
>> Specifically, I achieve performance parity with BTRFS in terms of
>> single-threaded write and read, and multi-threaded write, but poor
>> performance for multi-threaded read. The relative discrepancy appears
>> to grow as one adds disks.
>
> [...]
>
>> Before I dive into the BTRFS source or try tracing in a different way, I
>> wanted to see if this was a well-known artifact of BTRFS RAID0 and, even
>> better, if there's any tunables available for RAID0 in BTRFS I could
>> play with. The man page for mkfs.btrfs and btrfstune in the tuning
>> regard seemed...sparse.
>
> This is indeed well known for btrfs at this point, as it hasn't been
> multi-read-thread optimized yet. I'm personally more familiar with the
> raid1 case, where which one of the two copies gets the read is simply
> even/odd-PID-based, but AFAIK raid0 isn't particularly optimized either.
>
> The recommended workaround is (as you might expect) btrfs on top of
> mdraid. In fact, while it doesn't apply to your case, btrfs raid1 on top
> of mdraid0s is often recommended as an alternative to btrfs raid10, as
> that gives you the best of both worlds -- the data and metadata integrity
> protection of btrfs checksums and fallback (with writeback of the correct
> version) to the other copy if the first copy read fails checksum
> verification, with the much better optimized mdraid0 performance. So it
> stands to reason that the same recommendation would apply to raid0 --
> just do single-mode btrfs on mdraid0, for better performance than the as
> yet unoptimized btrfs raid0.

Thank you very much, Duncan.
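For anyone following along, Duncan's suggested setup can be sketched roughly as below. The device names and mount point are illustrative placeholders, not anything from this thread:

```shell
# Assemble six disks into an mdraid RAID0 array
# (device names are illustrative placeholders).
mdadm --create /dev/md0 --level=0 --raid-devices=6 \
      /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg

# Put btrfs on top in single mode for both data and metadata,
# so btrfs does no striping of its own and mdraid handles all
# the RAID0 work.
mkfs.btrfs -d single -m single /dev/md0
mount /dev/md0 /mnt
```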
I failed to mention that I'd tried this before as well, but I was hoping to avoid it as it felt like a kludge, and since it didn't give me the big jump I expected I forgot about it. I retested, and btrfs on mdraid in a six-wide RAID0 does improve performance slightly -- I typically see 990MB/s, and up to around 1.1GB/s in the best case, with the same options to fio as in my original email. That's still a long way from ext4 (which admittedly may be cheating a bit, since it seems to detect the md0 underneath it and adjust its stride length accordingly, though I may be over-representing its intelligence about this).

The I/O sizes improve greatly, to parity with ext4 atop mdraid, but the queue depth is still fairly low -- even with many processes it rarely exceeds 5 or 6. This is true whether I run fio with or without the aio ioengine. Is there any tuning in BTRFS that limits the number of outstanding reads at a time to a small single-digit number, or something else that could be behind the small queue depths? I can't otherwise imagine what the difference would be on the read path between ext4 and btrfs when both are on mdraid.

Thanks again for your insights,
ellis
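For readers without the original email, a fio job along these lines would exercise the multi-threaded large-read path being discussed. The exact options aren't included in this message, so numjobs, bs, iodepth, size, and the directory are illustrative assumptions, not the values actually used:

```ini
; Hypothetical fio job for multi-threaded sequential reads.
; numjobs, bs, iodepth, size, and directory are assumptions;
; the original message's exact options are not shown here.
[global]
ioengine=libaio
direct=1
rw=read
bs=1M
iodepth=16
runtime=60
time_based

[readers]
directory=/mnt
numjobs=8
size=4G
```

Watching the array with something like `iostat -x 1 /dev/md0` during the run is one way to see the achieved queue depth alongside throughput.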
