On 27.05.20 at 22:48, Filipe Manana wrote:
> On Wed, May 27, 2020 at 12:42 PM Nikolay Borisov <nborisov@xxxxxxxx> wrote:
>>
>> The flag simply replicates whether btrfs_inode::delallocs_inodes list
>> is empty or not. Just defer this check to the list management functions
>> (btrfs_add_delalloc_inodes/__btrfs_del_delalloc_inode) which are
>> always called under btrfs_root::delalloc_lock.
>
> The flag is there to avoid taking the root's delalloc_lock spinlock
> every time a range is marked for delalloc for any inode of the
> subvolume.
> Have you measured performance with very high concurrency of buffered
> writes against files in the same subvolume?
>
> Thanks.

I performed the following test on a 16-core VM (the host has 12 physical
cores):

fio --direct=0 --ioengine=sync --thread --directory=/media/scratch/ --invalidate=1 --group_reporting=1 \
    --fallocate=posix --name=RandomWrites-async-64512-4k-4 --new_group --rw=randwrite --size=50m --numjobs=200 \
    --bs=4k --fsync_on_close=0 --fallocate=none --end_fsync=0 --filename_format=FioWorkloads.\$jobnum

And here's what /proc/lock_stat reports:

With BTRFS_INODE_IN_DELALLOC_LIST:

class name             con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&root->delalloc_lock:          245          245          0.08          4.14           88.10          0.36        62168        122055          0.05         60.41        32721.41          0.27

Fio output:
WRITE: bw=43.9MiB/s (45.0MB/s), 43.9MiB/s-43.9MiB/s (45.0MB/s-45.0MB/s), io=9.77GiB (10.5GB), run=228044-228044msec

Without BTRFS_INODE_IN_DELALLOC_LIST:

class name             con-bounces  contentions  waittime-min  waittime-max  waittime-total  waittime-avg  acq-bounces  acquisitions  holdtime-min  holdtime-max  holdtime-total  holdtime-avg
&root->delalloc_lock:         8824         8838          0.05        210.92         3451.03          0.39      2542011       2685019          0.03        301.63       451369.98          0.17

Fio output:
WRITE: bw=33.8MiB/s (35.5MB/s), 33.8MiB/s-33.8MiB/s (35.5MB/s-35.5MB/s), io=9.77GiB (10.5GB), run=295770-295770msec

So yes, the flag has a noticeable effect: it massively reduces
contention on delalloc_lock (245 vs. 8838 contentions, 122055 vs.
2685019 acquisitions), although the average hold time, i.e. the length
of the critical section, is higher with it (0.27 vs. 0.17). The
improvement in throughput and the reduction in acquisitions and
contentions are indisputable, so this patch should be dropped.

Thanks for spotting this.
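
For anyone who wants to compare the two runs without eyeballing the raw columns, here is a small sketch that derives the ratios from the lock_stat and fio figures quoted above. The dictionary keys and the helper name `ratio` are made up for illustration; only the numbers come from the measurements.

```python
# Figures copied from the lock_stat and fio output quoted in this mail.
# Key names are illustrative only, not lock_stat or fio field names.
with_flag = {
    "contentions": 245,
    "acquisitions": 122055,
    "holdtime_total": 32721.41,
    "bw_mib_s": 43.9,
    "runtime_ms": 228044,
}
without_flag = {
    "contentions": 8838,
    "acquisitions": 2685019,
    "holdtime_total": 451369.98,
    "bw_mib_s": 33.8,
    "runtime_ms": 295770,
}


def ratio(key):
    """How many times larger the value is without the flag."""
    return without_flag[key] / with_flag[key]


contention_ratio = ratio("contentions")
acquisition_ratio = ratio("acquisitions")
holdtime_total_ratio = ratio("holdtime_total")

# Relative bandwidth loss and runtime increase when the flag is removed.
bw_drop_pct = (
    (with_flag["bw_mib_s"] - without_flag["bw_mib_s"]) / with_flag["bw_mib_s"] * 100
)
runtime_increase_pct = (
    (without_flag["runtime_ms"] - with_flag["runtime_ms"])
    / with_flag["runtime_ms"]
    * 100
)

print(f"contentions:     {contention_ratio:.1f}x")
print(f"acquisitions:    {acquisition_ratio:.1f}x")
print(f"total hold time: {holdtime_total_ratio:.1f}x")
print(f"bandwidth drop:  {bw_drop_pct:.1f}%")
print(f"runtime growth:  {runtime_increase_pct:.1f}%")
```

Without the flag the lock sees roughly 36 times the contentions and 22 times the acquisitions, and bandwidth drops by about 23%. The lower average hold time without the flag does not compensate, since the total time spent holding the lock still grows by more than an order of magnitude.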
