Note:
This page is primarily intended for developers of Mercurial.
Note:
This page is no longer relevant but is kept for historical purposes.
This page does not meet our wiki style guidelines. Please help improve this page by cleaning up its formatting.
Problem: on large repos, it would be useful to be able to do a partial clone and/or checkout
Proposal: add a new file like .hgignore to the .hg/ directory that specifies which files should be ignored for clone/checkout.
Issue105 in the BTS covers partial checkouts and suggests using -X/-I on clone or checkout.
To implement:
- pick a sensible name for this file (BryanOSullivan suggests ignoremissing)
- teach localrepo to use this filtering where appropriate (checkout, pull)
- teach changegroup to choke if someone attempts to pull a changeset from us with a file we don't have
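The filtering described in the list above could look roughly like the following sketch. It is not Mercurial code: the pattern file format (one glob per line, `#` for comments) and the names `parse_patterns`, `filter_files` and `check_outgoing` are assumptions made for illustration, and Python's `fnmatch` stands in for Mercurial's own matchers.

```python
from fnmatch import fnmatchcase


def parse_patterns(text):
    """Return the non-empty, non-comment lines of a pattern file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def filter_files(files, patterns):
    """Split files into (kept, skipped) according to the glob patterns."""
    kept, skipped = [], []
    for path in files:
        if any(fnmatchcase(path, pat) for pat in patterns):
            skipped.append(path)
        else:
            kept.append(path)
    return kept, skipped


def check_outgoing(changed_files, patterns):
    """Refuse to serve a changeset that touches files we never stored."""
    _, missing = filter_files(changed_files, patterns)
    if missing:
        raise ValueError(
            "changeset touches files missing locally: %s"
            % ", ".join(sorted(missing))
        )
```

Here `filter_files` models what checkout and pull would skip, while `check_outgoing` models the changegroup refusing a request it cannot satisfy. Note that `fnmatch`'s `*` also matches `/`, so `data/art/*` covers the whole subtree.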
Not to be confused with TrimmingHistory, which trims in the history tree, not the directory tree.
Note that sub-repositories provide the ability to segment a repository into modules and so achieve most of the desired functionality of this feature. However, this requires some foresight to set up the modules, i.e. sub-repositories do not provide the ad-hoc sub-directory cloning that this feature is after. Those who think they are interested in partial clones should read up on sub-repositories first to see if those will fulfill their needs.
Comments from Matt Mackall:
> > How hard would this be to implement?
On a scale of 1 to 10, I'd rate it about a 7.
The simplest approach is probably "allow cloning of subdirectories". Here's how that would work:
- add a file called "subdir" to .hg/store containing the subdirectory we're cloning
- do a clone as usual, but don't write out revlogs that aren't in subdir
- adjust the functions dealing with paths (localrepo.wjoin, etc.) appropriately
- teach various things (checkout, status, etc.) to skip files outside the subdirectory
- teach merge not to allow merging when files outside the subdirectory conflict
Merging is the tricky part, and I expect there's a gotcha or two buried in there.
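The path handling and the merge restriction from the steps above might be sketched as follows. This is an illustration only: the helper names `in_subdir`, `to_working_path` and `check_merge` are hypothetical, and the sketch assumes the working directory holds just the contents of the cloned subdirectory, which the comment above does not specify.

```python
import posixpath


def in_subdir(path, subdir):
    """True if the repo-relative path lies inside subdir ('' means whole repo)."""
    subdir = subdir.strip("/")
    return subdir == "" or path.startswith(subdir + "/")


def to_working_path(root, subdir, path):
    """Map a repo-relative path to its location in the working directory.

    Assumption: the working directory contains only the cloned subdirectory,
    so the subdir prefix is dropped from the path.
    """
    subdir = subdir.strip("/")
    if not in_subdir(path, subdir):
        raise ValueError("%s is outside cloned subdirectory %s" % (path, subdir))
    rel = path[len(subdir) + 1:] if subdir else path
    return posixpath.join(root, rel)


def check_merge(conflicting_files, subdir):
    """Refuse a merge when any conflicting file is outside the subdirectory."""
    outside = sorted(f for f in conflicting_files if not in_subdir(f, subdir))
    if outside:
        raise ValueError(
            "cannot merge, conflicts outside subdirectory: %s" % ", ".join(outside)
        )
```

The prefix test uses `subdir + "/"` so that a clone of `lib` does not accidentally claim files under `library/`. The sketch does not attempt to cover the merge gotchas mentioned above.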
RyanF: I would also like to see a filtering option or partial clone, but with support for include and exclude patterns (-I, -X). We are writing an open source Mavenesque (but much simpler) tool that manages very large products and dependencies involving multiple repositories. It requires a Dependencies.lookup in the root of any repos module, but it does not seem possible for us to check out a single file with hg.
It would be fine for our needs if an interim solution checked out the subset of files in a read-only mode.
trisk: ConvertExtension appears to already provide the ability to perform partial clones by generating a new full repository, as described under "Converting from Mercurial".
- You can use this, today, without waiting for additional subdir support to arrive
- For the same reason, it probably doesn't handle merging
jasonrohrer: Unfortunately, ConvertExtension doesn't really create a clone, since it doesn't remember where the repository came from. So after the convert, you cannot pull or push back to the source of the clone. Great for creating new subdir repositories that will operate in isolation from the source, but not for creating subdirs that actually work with the other, full repositories. Thus, I'd say:
- You CANNOT use this, today, without waiting for additional subdir support to arrive
(Though I might be missing something!)
sairon: We're coming from svn (as, it would seem, are a lot of others), where there is, for example, mod_dav_svn for controlling repository access levels. Consider the following setup:
- Artists: Create artwork in different forms for projects, source data stored in data/art.
- Sound designers: Ditto but for sound / music, stored in data/sound.
- Coders: src/ .
- etc etc
Under bin/ we have folders for all projects where the build process creates the latest build.
Under this scheme you want to control access on a role basis as well as on a project basis. When a new actor enters the fray he gets an account and is added to the relevant group; when he's assigned to a new project we just assign him access rights, after which he does a root update that updates his directory structure. This has a number of advantages which seem to be very hard to achieve with mercurial:
- Updates / commits, everything is faster because we don't need to consider the entire repository. Faster in every regard, bandwidth usage, computationally, disc access etc.
- We know for a fact that person X cannot fuck up stuff in directory branch Y, because he simply does not have access.
- If someone's computer is compromised, the damage is limited.
Using pattern matching with groups, as in mod_dav_svn, makes setting up such a system easy. As access needs to be controlled in more dimensions, it's enough to just add a new pattern. As this extension is already trying to fix the root problem in mercurial for this type of setup, it would seem the only thing needed in addition is for it to work with the ACL extension.