Dave Hansen wrote on 2016/03/18 09:33 -0700:
On 03/17/2016 06:02 PM, Qu Wenruo wrote:
Dave Hansen wrote on 2016/03/17 09:36 -0700:
On 03/16/2016 06:36 PM, Qu Wenruo wrote:
Dave Hansen wrote on 2016/03/16 13:53 -0700:
I have a medium-sized multi-device btrfs filesystem (4 disks, 16TB
total) running under 4.5.0-rc5. I recently added a disk and needed to
rebalance. I started a rebalance operation three days ago. It was on
the order of 20% done after those three days. :)
...
Data, RAID1: total=4.53TiB, used=4.53TiB
System, RAID1: total=32.00MiB, used=720.00KiB
Metadata, RAID1: total=17.00GiB, used=15.77GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
Considering the size and the amount of metadata, even a quota
rescan will be quite slow.
Would you please try a quota rescan and check the CPU/IO usage?
I did a quota rescan. It uses about 80% of one CPU core, but also has
some I/O wait time and pulls 1-20MB/s of data off the disk (the balance
with quotas on was completely CPU-bound, and had very low I/O rates).
It would seem that the "quota rescan" *does* have the same issue as the
balance with quotas on, but to a much smaller extent than what I saw
with the "balance" operation.
This is quite expected. Most of the CPU time would be consumed by
find_all_roots(), as your subvolume layout is just the worst case for
find_all_roots().
That is to say, removing unneeded snapshots and keeping their number
limited would be the best case.
Btrfs snapshots are superfast to create, but the design implies a lot of
overhead for some minor operations, especially backref lookup.
And unfortunately, quota relies heavily on it.
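To illustrate why the number of snapshots matters here, the following is a minimal toy sketch, not btrfs code. It assumes every snapshot shares every extent with the original subvolume and that resolving the roots of one extent visits one backref per referencing root. The names build_backrefs, find_all_roots_toy and rescan_cost are hypothetical, and the real find_all_roots() walks tree blocks and is considerably more involved.

```python
from collections import defaultdict


def build_backrefs(num_snapshots, num_extents):
    """Toy model: each extent is referenced by the subvolume and all its snapshots."""
    # Maps extent id -> set of root ids referencing it (root 0 is the subvolume).
    backrefs = defaultdict(set)
    for extent in range(num_extents):
        for root in range(num_snapshots + 1):
            backrefs[extent].add(root)
    return backrefs


def find_all_roots_toy(backrefs, extent):
    """Return the roots referencing one extent and how many backrefs were visited."""
    visited = 0
    roots = []
    for root in backrefs[extent]:
        visited += 1  # one unit of work per referencing root
        roots.append(root)
    return sorted(roots), visited


def rescan_cost(num_snapshots, num_extents):
    """Total backrefs visited when every extent has its roots resolved once."""
    backrefs = build_backrefs(num_snapshots, num_extents)
    total = 0
    for extent in range(num_extents):
        _, visited = find_all_roots_toy(backrefs, extent)
        total += visited
    return total


if __name__ == "__main__":
    for snapshots in (0, 9, 99):
        print(snapshots, "snapshots:", rescan_cost(snapshots, 1000), "backrefs visited")
```

In this toy model the work grows linearly with the number of snapshots sharing the data, which is the intuition behind keeping the snapshot count limited.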
The only difference that I didn't expect is the IO, as I expected rescan
to behave much the same as balance, with little IO.
Did you run rescan/balance after dropping all the caches?
For balance, though, I would add a patch to make it bypass quota
accounting, as that seems to be OK.
But for the rescan case, AFAIK that's the case.
BTW, although it's quite risky, I hope you can run an older kernel and
see the performance difference for rescan and balance.
By older kernel, I mean 4.1, which is the latest kernel that doesn't use
the new quota framework.
I'm very interested to see whether the old but wrong code would have
better or worse performance on balance or rescan.
Thank you very much for all these quota-related reports.
Qu
I have a full profile recorded from the "quota rescan", but the most
relevant parts are pasted below. Basically btrfs_search_slot() and
radix tree lookups are eating all the CPU time, but they're still doing
enough I/O to see _some_ idle time on the processor.
74.55% 3.10% kworker/u8:0 [btrfs] [k] find_parent_nodes
|
---find_parent_nodes
|
|--99.95%-- __btrfs_find_all_roots
| btrfs_find_all_roots
| btrfs_qgroup_rescan_worker
| normal_work_helper
| btrfs_qgroup_rescan_helper
| process_one_work
| worker_thread
| kthread
| ret_from_fork
--0.05%-- [...]
32.14% 4.16% kworker/u8:0 [btrfs] [k] btrfs_search_slot
|
---btrfs_search_slot
|
|--87.90%-- find_parent_nodes
| __btrfs_find_all_roots
| btrfs_find_all_roots
| btrfs_qgroup_rescan_worker
| normal_work_helper
| btrfs_qgroup_rescan_helper
| process_one_work
| worker_thread
| kthread
| ret_from_fork
|
|--11.70%-- btrfs_search_old_slot
| __resolve_indirect_refs
| find_parent_nodes
| __btrfs_find_all_roots
| btrfs_find_all_roots
| btrfs_qgroup_rescan_worker
| normal_work_helper
| btrfs_qgroup_rescan_helper
| process_one_work
| worker_thread
| kthread
| ret_from_fork
--0.39%-- [...]
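As a rough reading aid for the profile above, the caller percentages can be converted into shares of total samples. This is only a sketch, and it assumes the call-graph percentages are relative to their parent entry (which the 99.95% figure under a 74.55% entry suggests). The helper share_of_total is a hypothetical name, not part of perf.

```python
def share_of_total(symbol_pct, caller_pct):
    """Convert a caller's relative percentage into a share of all samples."""
    return symbol_pct * caller_pct / 100.0


# btrfs_search_slot accounts for 32.14% of samples (children included).
SEARCH_SLOT_PCT = 32.14

# Called directly from find_parent_nodes (87.90% of that entry).
direct = share_of_total(SEARCH_SLOT_PCT, 87.90)

# Called via btrfs_search_old_slot / __resolve_indirect_refs (11.70%).
indirect = share_of_total(SEARCH_SLOT_PCT, 11.70)

if __name__ == "__main__":
    print(f"direct from find_parent_nodes: {direct:.2f}% of all samples")
    print(f"via btrfs_search_old_slot:     {indirect:.2f}% of all samples")
    print(f"both paths combined:           {direct + indirect:.2f}% of all samples")
```

Both paths end up under find_parent_nodes, so under this assumption nearly all of the btrfs_search_slot time belongs to the backref walk done by the rescan worker.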
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html