Re: Borrowing objects from nearby repositories

On Mar 24, 2014, at 5:21 PM, Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
> On Wed, Mar 12, 2014 at 4:37 AM, Andrew Keller <andrew@xxxxxxxxxxxxxx> wrote:
>> Hi all,
>> 
>> I am considering developing a new feature, and I'd like to poll the group for opinions.
>> 
>> Background: A couple years ago, I wrote a set of scripts that speed up cloning of frequently used repositories.  The scripts utilize a bare Git repository located at a known location, and automate providing a --reference parameter to `git clone` and `git submodule update`.  Recently, some coworkers of mine expressed an interest in using the scripts, so I published the current version of my scripts, called `git repocache`, described at the bottom of <https://github.com/andrewkeller/ak-git-tools>.
>> 
>> Slowly, it has occurred to me that this feature, or something similar to it, may be worth adding to Git, so I've been thinking about the best approach.  Here's my best idea so far:
>> 
>> 1)  Introduce '--borrow' to `git-fetch`.  This would behave similarly to '--reference', except that it operates on a temporary basis, and does not assume that the reference repository will exist after the operation completes, so any used objects are copied into the local objects database.  In theory, this mechanism would be distinct from '--reference', so if both are used, some objects would be copied, and some objects would be accessible via a reference repository referenced by the alternates file.
> 
> Isn't this the same as git clone --reference <path> --no-hardlinks <url> ?

Not quite.  '--reference' adds an entry to 'objects/info/alternates'.  When an object is looked up, any objects folder listed in that file is considered to be an extension of the local objects folder.  So when fetch, for example, decides whether or not it already has a blob locally, it may decide "yes" and not download the blob at all, because the blob already exists in one of the reference repositories.  If I clone one of my 80 GB repositories over SSH using a reference repository, the resulting clone is only about 175 KB, because it assumes the reference repository will exist going forward, so it doesn't actually own any objects itself.
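
To make the lookup concrete, here is a rough Python sketch of how an alternates file extends the local objects folder.  It is only an illustration, not Git's actual code: the function names are made up, it only looks at loose objects (packs are ignored), and it does not follow the alternates of an alternate.

```python
import os
import tempfile


def object_dirs(objects_dir):
    """Return objects_dir plus every folder listed in its info/alternates."""
    dirs = [objects_dir]
    alternates = os.path.join(objects_dir, "info", "alternates")
    if os.path.isfile(alternates):
        with open(alternates) as fh:
            for line in fh:
                line = line.strip()
                # Skip blank lines and comment lines.
                if not line or line.startswith("#"):
                    continue
                # Relative entries are resolved against the objects folder.
                if not os.path.isabs(line):
                    line = os.path.normpath(os.path.join(objects_dir, line))
                dirs.append(line)
    return dirs


def has_loose_object(objects_dir, sha):
    """True if a loose object named sha exists locally or in an alternate."""
    rel = os.path.join(sha[:2], sha[2:])
    return any(os.path.isfile(os.path.join(d, rel))
               for d in object_dirs(objects_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        local = os.path.join(tmp, "local", "objects")
        reference = os.path.join(tmp, "reference", "objects")
        sha = "ab" + "0" * 38  # made-up object name, for illustration only
        os.makedirs(os.path.join(local, "info"))
        os.makedirs(os.path.join(reference, sha[:2]))
        # Put an empty stand-in for a loose object in the reference folder.
        open(os.path.join(reference, sha[:2], sha[2:]), "w").close()

        print(has_loose_object(local, sha))  # False: no alternates yet
        with open(os.path.join(local, "info", "alternates"), "w") as fh:
            fh.write(reference + "\n")
        print(has_loose_object(local, sha))  # True: found via the alternate
```

The local folder never receives a copy of the object; the answer flips to "yes" purely because of the alternates entry, which is why the clone stays so small and why it breaks if the reference repository goes away.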

The '--no-hardlinks' option applies only when hard linking is available in the first place, i.e., when cloning from one local folder to another on the same filesystem (assuming the filesystem supports hard links).  It has no effect on a clone over SSH.
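
As a rough illustration of the hard link point, the sketch below places one object file at a new location the way a local clone might: hard link when allowed, plain copy otherwise.  The helper name `link_or_copy` and its `no_hardlinks` parameter are hypothetical, and this is not Git's implementation.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst, no_hardlinks=False):
    """Place src at dst; return "link" or "copy" to say what was done."""
    if not no_hardlinks:
        try:
            # Only works when src and dst are on the same filesystem
            # and that filesystem supports hard links.
            os.link(src, dst)
            return "link"
        except OSError:
            pass  # fall through to a regular copy
    shutil.copy2(src, dst)
    return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "object")
        with open(src, "w") as fh:
            fh.write("blob contents")
        print(link_or_copy(src, os.path.join(tmp, "linked")))
        print(link_or_copy(src, os.path.join(tmp, "copied"), no_hardlinks=True))
```

When the source is on the other end of an SSH connection there is no local file to link to, so the choice never comes up.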

Thanks,
 - Andrew
