Re: Very slow balance / btrfs-transaction

At 02/07/2017 12:09 AM, Goldwyn Rodrigues wrote:

Hi Qu,

On 02/05/2017 07:45 PM, Qu Wenruo wrote:


At 02/04/2017 09:47 AM, Jorg Bornschein wrote:
February 4, 2017 1:07 AM, "Goldwyn Rodrigues" <rgoldwyn@xxxxxxx> wrote:

<snipped>



Quota support was indeed active -- and it warned me that the qgroup
data was inconsistent.

Disabling quotas had an immediate impact on balance throughput -- it's
*much* faster now!
From a quick glance at iostat I would guess it's at least a factor of 100
faster.


Should quota support generally be disabled during balances? Or did I
somehow push my fs into a weird state where it triggered a slow-path?



Thanks!

   j

Would you please provide the kernel version?

v4.9 introduced a bad fix for qgroup balance, which not only fails to
completely fix the qgroup byte leak, but also hugely slows down the
balance process:

commit 62b99540a1d91e46422f0e04de50fc723812c421
Author: Qu Wenruo <quwenruo@xxxxxxxxxxxxxx>
Date:   Mon Aug 15 10:36:51 2016 +0800

    btrfs: relocation: Fix leaking qgroups numbers on data extents

Sorry for that.

And in v4.10, a better method is applied to fix the byte leaking
problem, which should be a little faster than the previous one.

commit 824d8dff8846533c9f1f9b1eabb0c03959e989ca
Author: Qu Wenruo <quwenruo@xxxxxxxxxxxxxx>
Date:   Tue Oct 18 09:31:29 2016 +0800

    btrfs: qgroup: Fix qgroup data leaking by using subtree tracing


However, balance with qgroup enabled is still slower than balance without
qgroup; the root fix requires us to rework the current backref iteration.


This patch has made btrfs balance performance worse. The balance
task has become more CPU-intensive than before and takes longer
to complete, besides hogging resources. While correctness is important,
we need to figure out how this can be made more efficient.

The cause is already known.

It's find_parent_node(), which takes most of the time, finding all referencers of an extent.

And it's also the cause of the FIEMAP soft lockup (fixed in a recent release by quitting early).

The biggest problem is that the current find_parent_node() uses a list to iterate, which is quite slow, especially since it's done in a loop.
In the real world, find_parent_node() is about O(n^3).
We can either improve find_parent_node() by using an rb_tree, or introduce some cache for find_parent_node().
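
To illustrate the list-versus-tree point, here is a minimal sketch in Python. It is not the kernel code: the function names are hypothetical, the refs are plain integers, and a hash set stands in for an rb_tree purely to show the effect of a fast lookup. It only counts how much work deduplicating references costs when every insert rescans a list, compared with one indexed lookup per insert.

```python
# Illustrative sketch only: this is NOT the btrfs implementation.
# All names here are hypothetical, and a Python set stands in for an
# rb_tree just to model "lookup that does not scan every element".


def collect_refs_list(refs):
    """Deduplicate refs by scanning a plain list on every insert."""
    seen = []
    comparisons = 0
    for ref in refs:
        found = False
        for existing in seen:  # linear scan, repeated for each ref
            comparisons += 1
            if existing == ref:
                found = True
                break
        if not found:
            seen.append(ref)
    return seen, comparisons


def collect_refs_indexed(refs):
    """Deduplicate refs using an index, so each ref costs one lookup."""
    seen = []
    index = set()
    lookups = 0
    for ref in refs:
        lookups += 1
        if ref not in index:
            index.add(ref)
            seen.append(ref)
    return seen, lookups


if __name__ == "__main__":
    # Every ref appears twice, so half of the inserts are duplicates.
    refs = list(range(2000)) + list(range(2000))
    by_list, list_cost = collect_refs_list(refs)
    by_index, index_cost = collect_refs_indexed(refs)
    assert by_list == by_index
    print(f"list scan comparisons: {list_cost}")
    print(f"indexed lookups:       {index_cost}")
```

The list version does work proportional to the square of the number of refs for a single pass; when such a pass is itself run inside further loops, the total cost grows even faster, which is the general shape of the problem described above.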


IIRC the SUSE guys (maybe Jeff?) are working on it using the first method, but I haven't heard anything about it recently.

Thanks,
Qu





