On 02/03/2014 01:28 PM, Johannes Hirte wrote:
On Thu, 23 Jan 2014 13:07:52 -0500
Josef Bacik <jbacik@xxxxxx> wrote:
On one of our gluster clusters we noticed some pretty big lag
spikes. This turned out to be because our transaction commit was
taking like 3 minutes to complete. This is because we have like 30
gigs of metadata, so our global reserve would end up being the max
which is like 512 MB. So our throttling code would allow a
ridiculous amount of delayed refs to build up and then they'd all get
run at transaction commit time, and for a cold mounted file system
that could take up to 3 minutes to run. So fix the throttling to be
based on both the size of the global reserve and how long it takes us
to run delayed refs. This patch tracks the time it takes to run
delayed refs and then only allows 1 second's worth of outstanding
delayed refs at a time. This way it will auto-tune itself from cold
cache up to when everything is in memory and it no longer has to go
to disk. This makes our transaction commits take much less time to
run. Thanks,
Signed-off-by: Josef Bacik <jbacik@xxxxxx>
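
For readers following along, the time-based half of the throttling described above can be sketched roughly as follows. This is a user-space illustration only, not the code from the patch: the struct, the function names, and the 3/4 old to 1/4 new smoothing weights are all assumptions made for the example, and the global reserve check that the commit message also mentions is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical user-space state; these are not the btrfs structures. */
struct ref_throttle {
    uint64_t avg_runtime_ns; /* smoothed cost of running one delayed ref */
    uint64_t pending_refs;   /* delayed refs queued but not yet run */
};

/*
 * Fold one measured batch into the running average.
 * The 3/4 old, 1/4 new weighting is an assumption for this sketch.
 */
static void throttle_record_batch(struct ref_throttle *t,
                                  uint64_t batch_ns, uint64_t refs_run)
{
    assert(t != NULL);
    if (refs_run == 0)
        return;

    uint64_t per_ref_ns = batch_ns / refs_run;

    if (t->avg_runtime_ns == 0)
        t->avg_runtime_ns = per_ref_ns; /* first sample seeds the average */
    else
        t->avg_runtime_ns = (t->avg_runtime_ns * 3 + per_ref_ns) / 4;
}

/*
 * Return true once the queued refs are estimated to need at least one
 * second to run. The product is not guarded against overflow here.
 */
static bool throttle_should_flush(const struct ref_throttle *t)
{
    assert(t != NULL);
    return t->pending_refs * t->avg_runtime_ns >= NSEC_PER_SEC;
}
```

With a cold cache each ref is slow, so the estimate reaches one second after relatively few queued refs. As the measured per-ref time drops, the same one second budget admits more outstanding refs, which is the auto-tuning behaviour the commit message describes.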
This one breaks my system. Shortly after boot the btrfs-freespace
thread goes up to 100% CPU usage and the system is nearly unresponsive.
I first saw it with the full pull request for 3.14-rc1 and was able
to track it down to this patch.
Could you turn on the softlockup timer and see if you can get a
backtrace of where it is stuck? In the meantime I will go through and
see if I can pinpoint where it may be happening. Thanks,
Josef