On 6/23/16 5:24 PM, Holger Hoffstätte wrote:
> On 06/23/16 21:26, jeffm@xxxxxxxx wrote:
>> From: Jeff Mahoney <jeffm@xxxxxxxx>

Hi Holger -

>> While running xfstests generic/291, which creates a single file populated

Whoops, this should've been generic/297.

>> with reflinks to the same extent, I found that fsck had been running for
>> hours. perf top led me to find_data_backref as the culprit, and a little
>> more digging made it clear: for every extent record we add, we iterate
>> the entire list first. My test case had ~2M records. That math doesn't
>> go well.
>>
>> This patchset converts the extent_backref list to an rbtree. The test
>> that used to run for more than 8 hours without completing now takes
>> less than 20 seconds.
>
> Very interesting. Time for science!
>
> unpatched:
>
> holger> time btrfs check /dev/sdc1
> Checking filesystem on /dev/sdc1
> ..
> btrfs check /dev/sdc1  17.03s user 3.79s system 25% cpu 1:22.82 total
>
> patched:
>
> holger> time btrfs check /dev/sdc1
> Checking filesystem on /dev/sdc1
> ..
> btrfs check /dev/sdc1  17.03s user 3.74s system 24% cpu 1:23.24 total
>
> This is a 1TB disk with ~850GB data in 4 subvolumes, ~2 snapshots each.
> I guess it only starts to matter (relative to the necessary I/O cost per
> extent) when the level of sharing is higher, i.e. many more snapshots?

Thanks for testing. This is exactly the result I wanted to see -- that
the impact on the regular case is minimal. Once the number of reflinks
to a single extent rises significantly, the improvement is clear.

My test case was an 8 GB file with every block referencing the first
block of the file. You're welcome to try that test case as well, but I
will warn you *not* to run filefrag on it or you'll either have to wait
a really long time for it to complete or reboot your system.

> OTOH it doesn't explode, so that's good. :)

Hey, I'll take that feedback. :)

-Jeff

--
Jeff Mahoney
SUSE Labs
