On Wed, May 6, 2020 at 3:04 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, May 5, 2020 at 11:48 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >
> > btrfs-progs-5.6-1.fc32.x86_64
> > kernel-5.6.8-300.fc32.x86_64
> >
> > I've created a new file system, filled it with ~600G, set the seed
> > flag, and I'm creating a sprout by adding a 2nd device and removing
> > the 1st. This used to go at device speeds. Today it's running at
> > ~35M/s where I expect 95-120M/s. I don't know that it's related to
> > send/receive performance.
>
> Same brand new file system, and a second new file system, doing
> send/receive with a single snapshot (no other snapshots or subvolumes
> related to it) and I'm seeing this same abysmal performance. ~5MB/s
> send. The instant I do 'perf top -g -U' on the send process, the send
> performance increases 10x to over 50MB/s. As soon as I quit perf top,
> abysmal performance. Both of these drives scrub at ~100M/s. I'm not
> sure what's going on or how to get more information but it's a
> regression.
>
> I'll attach a few perf top text files to the next email.
[chris@fnuc ~]$ sudo perf top -g -U -p 263438 -d 10
The pid is for the 'btrfs send' process. Both btrfs send and receive
processes are using nearly 100% CPU which itself just seems
excessive.
--
Chris Murphy
Samples
Children Self Shared Object Symbol
+ 96.31% 0.02% [kernel] [k] entry_SYSCALL_64_after_hwframe
+ 96.19% 0.00% [kernel] [k] do_syscall_64
+ 72.48% 0.00% libc-2.31.so [.] splice
+ 71.99% 0.01% [kernel] [k] __x64_sys_splice
+ 71.93% 6.54% [kernel] [k] do_splice
+ 34.57% 33.41% [kernel] [k] mutex_unlock
+ 27.52% 0.00% libc-2.31.so [.] __GI___ioctl
+ 24.20% 0.00% [kernel] [k] __x64_sys_ioctl
+ 24.20% 0.00% [kernel] [k] ksys_ioctl
+ 24.20% 0.00% [kernel] [k] rpc_populate.constprop.0
+ 24.20% 0.00% [kernel] [k] _btrfs_ioctl_send
+ 24.20% 0.03% [kernel] [k] btrfs_ioctl_send
+ 23.96% 0.03% [kernel] [k] process_extent
+ 21.72% 3.73% [kernel] [k] send_extent_data
+ 19.06% 12.52% [kernel] [k] mutex_lock
+ 17.62% 3.83% [kernel] [k] pipe_double_lock
+ 8.65% 0.06% [kernel] [k] send_cmd
+ 8.52% 0.02% [kernel] [k] kernel_write
+ 8.49% 0.05% [kernel] [k] vfs_write
+ 8.12% 0.07% [kernel] [k] new_sync_write
+ 8.05% 0.28% [kernel] [k] pipe_write
+ 6.50% 3.82% [kernel] [k] _cond_resched
+ 5.70% 0.27% [kernel] [k] __mutex_lock.constprop.0
+ 5.39% 0.02% [kernel] [k] __do_page_cache_readahead
+ 5.34% 5.24% [kernel] [k] mutex_spin_on_owner
+ 5.15% 5.00% [kernel] [k] memcpy_erms
+ 4.13% 0.00% [kernel] [k] read_pages.isra.0
+ 4.00% 0.05% [kernel] [k] extent_readpages
+ 3.40% 0.11% [kernel] [k] copy_page_from_iter
+ 3.32% 0.12% [kernel] [k] _copy_from_iter
+ 2.97% 0.24% [kernel] [k] __alloc_pages_nodemask
+ 2.79% 2.13% [kernel] [k] rcu_all_qs
+ 2.50% 1.92% [kernel] [k] pipe_unlock
+ 2.31% 1.07% [kernel] [k] get_page_from_freelist
+ 1.95% 0.02% [kernel] [k] tlv_put
+ 1.90% 0.23% [kernel] [k] __do_readpage
+ 1.78% 0.10% [kernel] [k] __schedule
+ 1.77% 0.00% [kernel] [k] iterate_extent_inodes
+ 1.74% 0.03% [kernel] [k] schedule
+ 1.48% 0.01% [kernel] [k] apic_timer_interrupt
Press '?' for help on key bindings
Samples
Children Self Shared Objec Symbol
- 1.64% tlv_put
1.62% memcpy_erms
- 0.75% iterate_extent_inodes
+ 96.66% 0.01% [kernel] [k] do_syscall_64
+ 74.70% 0.01% [kernel] [k] __x64_sys_splice
- 74.64% 7.46% [kernel] [k] do_splice
- 52.97% do_splice
32.61% mutex_unlock
- 14.18% pipe_double_lock
- 9.97% mutex_lock
- 3.59% _cond_resched
1.30% rcu_all_qs
- 3.62% __mutex_lock.constprop.0
3.45% mutex_spin_on_owner
- 9.01% mutex_lock
- 3.12% _cond_resched
1.15% rcu_all_qs
- 1.59% __mutex_lock.constprop.0
1.56% mutex_spin_on_owner
1.51% pipe_unlock
- 7.46% splice
entry_SYSCALL_64_after_hwframe
do_syscall_64
__x64_sys_splice
- do_splice
- 33.71% 32.42% [kernel] [k] mutex_unlock
- 32.39% splice
entry_SYSCALL_64_after_hwframe
do_syscall_64
__x64_sys_splice
- do_splice
31.67% mutex_unlock
0.95% mutex_unlock
+ 25.11% 0.00% libc-2.31.so [.] splice
+ 21.95% 0.02% [kernel] [k] btrfs_ioctl_send
+ 21.74% 0.03% [kernel] [k] process_extent
+ 21.28% 13.95% [kernel] [k] mutex_lock
+ 19.60% 3.29% [kernel] [k] send_extent_data
+ 19.29% 4.38% [kernel] [k] pipe_double_lock
+ 9.27% 0.00% libc-2.31.so [.] __GI___ioctl
Press '?' for help on key bindings
Samples
Children Self Shared Objec Symbol
- 96.85% 0.02% [kernel] [k] entry_SYSCALL_64_after_hwframe
- 30.54% entry_SYSCALL_64_after_hwframe
- 30.53% do_syscall_64
- 29.04% __x64_sys_splice
- 60.72% do_splice
32.47% mutex_unlock
- 14.37% pipe_double_lock
- 10.21% mutex_lock
- 3.72% _cond_resched
1.33% rcu_all_qs
- 3.39% __mutex_lock.constprop.0
3.20% mutex_spin_on_owner
- 9.35% mutex_lock
- 3.25% _cond_resched
1.16% rcu_all_qs
2.14% pipe_unlock
- 1.58% __mutex_lock.constprop.0
1.56% mutex_spin_on_owner
- 7.32% __x64_sys_ioctl
ksys_ioctl
rpc_populate.constprop.0
_btrfs_ioctl_send
- btrfs_ioctl_send
- 9.39% process_extent
- 8.81% send_extent_data
- 2.69% send_cmd
- 2.64% kernel_write
- 2.71% vfs_write
- 2.68% new_sync_write
- 3.14% pipe_write
- 1.18% copy_page_from_iter
- 2.88% _copy_from_iter
2.72% memcpy_erms
- 0.93% __alloc_pages_nodemask
- 0.98% get_page_from_freelist
- 1.75% __do_page_cache_readahead
- 1.33% read_pages.isra.0
- 1.49% extent_readpages
- 0.90% __do_readpage
- 0.66% __alloc_pages_nodemask
Press '?' for help on key bindings