Re: Interesting btrfs csum and tree-checker performance penalty analyse

On 3.04.19 at 11:54, Qu Wenruo wrote:
> Hi,
> 
> Recently the Intel LKP performance test has been reporting a regression
> in btrfs performance.
> 
> It points to the tree-checker code, and since I've been poking around
> with bcc/eBPF, I spent some time taking a closer look at the
> performance penalty of both btrfs csum and tree-checker.
> 
> The code base is David's misc-next, which contains both the write-time
> tree checker and the enhanced code to handle fuzzed images.
> 
> The tool can be found in my gist:
> https://gist.github.com/adam900710/b5542f2e52ed4687986cf41f64b85253

So you are essentially trying to figure out the average run time of 3
functions. This could have been made simpler by using the funclatency
bcc tool from the iovisor repo:

https://github.com/iovisor/bcc/blob/master/tools/funclatency.py


Running this tool will show you a latency histogram, which makes it
easier to spot latency outliers. An average value doesn't mean much
without more context, e.g. the standard deviation.
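
To illustrate the point about averages, here is a minimal Python sketch.
The sample values are made up for illustration (they are not
measurements), and the helper names log2_histogram and summarize are
hypothetical. It puts two latency sets with the same average into
power-of-two buckets, in the spirit of the histogram funclatency prints:

```python
from collections import Counter
from statistics import mean, pstdev


def log2_histogram(samples_ns):
    """Count samples per power-of-two bucket [2**k, 2**(k+1))."""
    buckets = Counter()
    for s in samples_ns:
        # bit_length() - 1 is floor(log2(s)) for positive integers.
        k = max(int(s), 1).bit_length() - 1
        buckets[k] += 1
    return dict(sorted(buckets.items()))


def summarize(samples_ns):
    """Return count, average, population stddev and the log2 histogram."""
    return {
        "nr": len(samples_ns),
        "avg": mean(samples_ns),
        "stddev": pstdev(samples_ns),
        "hist": log2_histogram(samples_ns),
    }


# Hypothetical latencies in ns: same average, very different shape.
steady = [5000] * 10
spiky = [1000] * 9 + [41000]

if __name__ == "__main__":
    for name, data in (("steady", steady), ("spiky", spiky)):
        s = summarize(data)
        print(f"{name}: nr={s['nr']} avg={s['avg']} stddev={s['stddev']:.0f}")
        for k, count in s["hist"].items():
            print(f"  [{2**k}, {2**(k + 1)}) ns: {count}")
```

Both sets report the same average, but only the histogram and the
standard deviation reveal the single slow call in the second one.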


> 
> To use the tool, one needs the bcc Python bindings and a kernel
> configured for eBPF. The default Arch kernel has all the needed config
> options, so anyone can try it on Arch.
> 
> The workload is:
>  mkfs.btrfs -n 4K $DEV
>  mount $DEV $MNT
>  fsstress -n 10000 -w -d $MNT
>  umount $MNT
> 
>  ## start my script ##
>  mount $DEV $MNT
>  ls -R $MNT > /dev/null # To read all fs tree blocks
>  fsstress -n 1000 -w -d $MNT # Trigger enough writes
>  umount $MNT
>  ## stop my script ##
> 
> 
> The results are very interesting.
> The basic numbers are:
> CSUM_TREE_BLOCK: nr=2311 total=10000612 avg=4327
> TREE_CHECKER_READ: nr=461 total=41911553 avg=90914
> TREE_CHECKER_WRITE: nr=1575 total=5783330 avg=3671

Definitely something worth looking at.
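
As a quick sanity check, the averages and ratios can be recomputed from
the quoted nr/total counters. This is a minimal sketch: it assumes the
totals are in nanoseconds (implied by the microsecond figures in the
quoted text, not stated outright), and the helper names avg_ns and ratio
are hypothetical.

```python
# (nr, total) pairs copied from the quoted results.
# Assumption: totals are in nanoseconds.
results = {
    "CSUM_TREE_BLOCK": (2311, 10000612),
    "TREE_CHECKER_READ": (461, 41911553),
    "TREE_CHECKER_WRITE": (1575, 5783330),
}


def avg_ns(name):
    """Integer average per call, matching the avg= column."""
    nr, total = results[name]
    return total // nr


def ratio(slow, fast):
    """How many times larger the average of `slow` is than that of `fast`."""
    nr_s, total_s = results[slow]
    nr_f, total_f = results[fast]
    return (total_s / nr_s) / (total_f / nr_f)


if __name__ == "__main__":
    for name in results:
        print(f"{name}: avg={avg_ns(name)} ns ({avg_ns(name) / 1000:.1f} us)")
    print(f"read checker / csum: "
          f"{ratio('TREE_CHECKER_READ', 'CSUM_TREE_BLOCK'):.1f}x")
    print(f"read checker / write checker: "
          f"{ratio('TREE_CHECKER_READ', 'TREE_CHECKER_WRITE'):.1f}x")
```

Keep in mind that these are still plain averages over a few hundred to
a few thousand calls, so they say nothing about how the individual
latencies are distributed.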

> 
> So looking just at the averages, the csum calculation only costs
> 3~5μs. And surprisingly, the write-time tree checker is even faster
> than the checksum!
> 
> Also surprisingly, the read-time tree checker takes nearly 100μs,
> more than 20 times slower than the csum or the write-time tree checker.
> 
> So we have a new direction to enhance tree-checker performance.
> BTW, bcc/ebpf is really awesome!
> 
> Thanks,
> Qu
> 


