On 05/22/2012 07:33 PM, Jeff King wrote:
On Tue, May 22, 2012 at 03:28:32PM +0200, Michael Haggerty wrote:When I try it with both 'next' and v1.7.10, I see that the latter is much faster. I did my tests with a warm cache, but the interesting number is the CPU time, which is quite different.I cannot reproduce anything as big as the performance regression that you see. I did find a regression 9.5 s -> 10.1 s caused by 5fa044184 find_containing_dir(): use strbuf in implementation of this function It is fixed by the patch that I just sent to the mailing list [1], which sizes the strbuf in that function to strlen(refname) instead of PATH_MAX. Since your experiments suggest that the performance regression is related to the size of the repository contents, it could be that the same test produces more memory pressure on your system and therefore a larger effect. Please try the patch and tell me if it fixes the problem for you.That patch drops about a second off of the slow case, but the main problem still remains. Just to be sure we are both doing the exact same thing, here is a complete reproduction recipe: GIT=/path/to/your/git/checkout RAILS=/path/to/unpacked/rails.git cd $GIT&& git checkout 432ad41e60cedb87ceec446ab034d46a53f5f9d8^&& make&& cd $RAILS&& time $GIT/bin-wrappers/git fetch . refs/*:refs/*&& cd $GIT&& git checkout 432ad41e60cedb87ceec446ab034d46a53f5f9d8&& make&& cd $RAILS&& time $GIT/bin-wrappers/git fetch . refs/*:refs/* produces: [before] real 0m9.128s user 0m9.369s sys 0m0.976s [after] real 0m15.926s user 0m16.181s sys 0m0.984s I don't think memory pressure is involved. The git process maxes out at ~300M, and this machine has 7.5G available. I wonder why we are getting different results. Could it be compilation options? As I mentioned, I compile with -O0, but I got similar results with -O3.
Thanks for the idiot-proof reproduction steps. I don't know what I was doing differently last time, but now I can reproduce your slowdown.
The thing that triggers the problem is a reference directory ("refs/remotes/8514/pull/") with 3810 sub-namespaces ("refs/remotes/8514/pull/1091", "refs/remotes/8514/pull/1092", etc), each subnamespace containing only one or two references. It wouldn't be a problem having so many *references* in a namespace, because references are just added to the end of the directory unsorted, and typically only sorted once after all of them have been added. But every time that a sub-namespace is accessed, the namespace has to be searched to see if that sub-namespace already exists. The searching requires the namespace to be sorted. So, for example, when adding the following sequence of references:
1. refs/remotes/8514/pull/1091/head 2. refs/remotes/8514/pull/1091/merge 3. refs/remotes/8514/pull/1092/head 4. refs/remotes/8514/pull/1092/merge 5. refs/remotes/8514/pull/1093/head 6. refs/remotes/8514/pull/1093/mergeAt step 1, sub-namespace "refs/remotes/8514/pull/1091/" is added to "refs/remotes/8514/pull/". This makes the code think that "refs/remotes/8514/pull/" is unsorted.
Step 2 is not problematic; the new references is added to "refs/remotes/8514/pull/1091/", but adding a reference doesn't require the ref_dir to be sorted.
At step 3, namespace "refs/remotes/8514/pull/" is first checked to see if sub-namespace "refs/remotes/8514/pull/1092/" already exists. This search requires namespace "refs/remotes/8514/pull/" to be sorted because step 1 caused it to be considered unsorted.
Again at step 5, namespace "refs/remotes/8514/pull/" needs to be sorted, and so on every time a subnamespace is added.
I will submit a patch shortly. Michael -- Michael Haggerty mhagger@xxxxxxxxxxxx http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
[Newbies FAQ] [Linux Kernel Development] [Free Online Dating] [Gcc Help] [IETF Annouce] [DCCP] [Netdev] [Networking] [Security] [V4L] [Bugtraq] [Free Online Dating] [Photo] [Yosemite] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Linux SCSI] [Fedora Users] [Linux Resources]