On Fri, Nov 02, 2018 at 05:06:07PM +0800, Ethan Lien wrote:
> Snapshot is expected to be fast. But if there are writers steadily
> creating dirty pages in our subvolume, the snapshot may take a very
> long time to complete. To fix the problem, we use tagged writepage for
> the snapshot flusher, as is done in the generic write_cache_pages():
> we quickly tag all dirty pages with a TOWRITE tag, then do the hard
> work of writepage only on those pages with the TOWRITE tag, so we omit
> pages dirtied after the snapshot command.
>
> We did a simple snapshot speed test on an Intel D-1531 box:
>
> fio --ioengine=libaio --iodepth=32 --bs=4k --rw=write --size=64G
> --direct=0 --thread=1 --numjobs=1 --time_based --runtime=120
> --filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5;
> time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio
>
> original: 1m58sec
> patched:  6.54sec
>
> This is the best case for this patch, since for a sequential write
> workload we omit nearly all pages dirtied after the snapshot command.
>
> For a multi-writer, random write test:
>
> fio --ioengine=libaio --iodepth=32 --bs=4k --rw=randwrite --size=64G
> --direct=0 --thread=1 --numjobs=4 --time_based --runtime=120
> --filename=/mnt/sub/testfile --name=job1 --group_reporting & sleep 5;
> time btrfs sub snap -r /mnt/sub /mnt/snap; killall fio
>
> original: 15.83sec
> patched:  10.35sec
>
> The improvement is smaller than in the sequential write case, since we
> omit only half of the pages dirtied after the snapshot command.
>
> Signed-off-by: Ethan Lien <ethanlien@xxxxxxxxxxxx>

Added to misc-next, with an updated comment and a paragraph in the
changelog about the semantics, based on the discussion under v1. Thanks.
