Corporate Mercurial
This page details how to use Mercurial in a corporate environment. This implies the following constraints which may not be present in home or FOSS projects:
- A strongly enforced official central server.
- An emphasis on security. This ends up having lots of trickle-down effects on your use of Mercurial.
- A workflow that is slower to evolve. Some workflow models that perhaps are better suited to a DVCS like Mercurial can't just be implemented overnight.
Additionally, you must understand that Mercurial is a fantastic version control system, but that's all it is. When you set up version control in a big corporate environment, you need user management with users assigned to groups. You need access control on an individual and group basis that also takes operations into account (e.g. perhaps only admins can tag, or some users are read-only while others are read-write); perhaps your access control rules are hierarchical, or perhaps they are first-match. You need a server to serve over SSH or HTTP. You need any number of other little and big things. While some of these may be provided by the Mercurial community to a greater or lesser degree, they're not part of Mercurial, and the earlier you recognize that the better.
Note: Some of the missing features might be provided by the associated Kallithea project. Missing features for support of your workflow can be implemented on top of it.
This page mostly does not provide answers. Instead, it gives you the right questions to ask so that you can start solving your particular issues in the right way.
1. Security
1.1. Authorizing Users
See AuthorizingUsers. This page also contains information on authorizing committers vs. pushers and on pushlogging.
1.2. Implications of security on code sharing
Sharing code among designers while maintaining security is hard work. And one of the key benefits of a DVCS like Mercurial is the ability to share code. So in order to realize these benefits, you really need to do your homework in this area.
- To share directly from desktops, you have some different options.
  - hg serve has no authentication of its own, so it's out.
  - Configuring every designer's desktop to have a secure Apache installation is probably not feasible.
  - SSH among designers is possible, but you need some sort of existing centralized login mechanism, e.g. NIS, LDAP with full login extensions, etc. Search for "nss_ldap", "linux pam ldap", or 'nss "compatibility mode" OR "compat mode" linux'. (Or use public key authentication -- ThomasArendsenHein 2012-10-04 08:40:33. Exchanging keys will be an n² operation, so this only works in small offices.)
- To share via a centralized server you need
  - Scripts on the central server that let users create, delete, and update cloned workspaces for private use. You may code these yourself, or use a third-party front end like Kallithea (see PublishingRepositories for a more complete list) to provide this. Kallithea does not appear to have a facility to logically separate "official" central repositories and user clones ("forks").
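To make the quadratic cost of exchanging public keys between desktops concrete, here is a small arithmetic sketch. It assumes one key pair and one desktop per designer, and that every designer must authorize every other designer's key; the function name is made up for illustration.

```shell
# key_installs N: number of public-key installations needed when each of
# N designers must add the keys of the other N-1 designers (assumption:
# one key pair and one desktop per designer).
key_installs() {
    echo $(( $1 * ($1 - 1) ))
}

for n in 5 20 100; do
    echo "$n designers -> $(key_installs "$n") key installations"
done
```

Five designers already need 20 installations and a hundred need 9900, which is why a centralized login mechanism becomes attractive quickly.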
2. Workflow
Here are some questions you'll want to think about.
- Mercurial allows you to commit often and incrementally. But having designers make a dozen little commits, which eventually have comments like "fixed another bug with limits", and then push all of that to the central repo might not make for a readable history. So do you just allow this? Do you encourage the use of the CollapseExtension? Do you encourage the use of MqExtension?
- By default, when you pull in changes from the central repo, you must merge them with any commits you've already made. You end up with two branches in parallel and a merge commit. This can lead to some very complicated branch history. For the non-Mercurial-expert, even telling which commits were present in a particular build might be non-obvious. So do you encourage the use of the RebaseExtension?
- To enforce standard commit checks, you'll want to use the ProjrcExtension. (See below.)
- If you clone a workspace locally, it no longer “points at” the default repository. It points at the other local clone. A designer may end up pushing somewhere unexpected! You may need to provide wrappers for creating and managing local clones.
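A wrapper for creating local clones could look roughly like the sketch below. It reads the source workspace's default path and writes it into the new clone's hgrc, so that a later hg push goes to the same upstream. The function name hg_local_clone is hypothetical, and the sketch overwrites whatever hgrc the clone starts with.

```shell
# hg_local_clone SRC DEST (hypothetical helper, not part of Mercurial):
# clone a local workspace, then point the clone's default path at the
# upstream that SRC itself uses instead of at SRC.
hg_local_clone() {
    local src=$1 dest=$2 upstream
    # "hg paths default" prints the URL configured as "default" and
    # fails if no such path is defined.
    upstream=$(hg -R "$src" paths default) || return 1
    hg clone "$src" "$dest" || return 1
    # Replace the hgrc written by "hg clone" (which names SRC as default).
    printf '[paths]\ndefault = %s\n' "$upstream" > "$dest/.hg/hgrc"
}
```

A designer would then run something like hg_local_clone ~/work/main ~/work/experiment instead of a plain hg clone. If you need other per-clone settings in hgrc, append to the file rather than overwriting it.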
2.1. Named Branches
Named branches are forever. Barring heroic efforts, they cannot be purged from the system. So even if developers use named branches in their private repos for their own development, if they ever want to push to the central server, either you've got to let them create that branch on the central server, or you've got to give them a means to use the ConvertExtension to move their changes out of a branch.
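As a rough sketch of the second option, the convert extension accepts a branch map file whose lines have the form "original_branch_name new_branch_name". The helper below, whose name is made up for illustration, maps one private branch onto default. It assumes a branch name without whitespace. Note that convert writes a new repository with new changeset IDs, so the result is not interchangeable with the original clone.

```shell
# flatten_branch SRC DEST BRANCH (hypothetical helper): convert SRC into a
# new repository DEST in which changesets on BRANCH are recorded on default.
flatten_branch() {
    local src=$1 dest=$2 branch=$3 mapfile rc
    mapfile=$(mktemp) || return 1
    # One mapping per line: "<old branch> <new branch>".
    printf '%s default\n' "$branch" > "$mapfile"
    # Enable the bundled convert extension just for this invocation.
    hg --config extensions.convert= convert --branchmap "$mapfile" "$src" "$dest"
    rc=$?
    rm -f "$mapfile"
    return $rc
}
```

For example, flatten_branch ~/work/private ~/work/private-flat my-feature would produce a repository whose history no longer mentions my-feature.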
2.2. Major Release Branches
- Do you use branching by named branches, or branching by cloning?
2.3. Minor Development Branches
Do you allow named branches (given the above issues)? Bookmarks are a good idea but have their own problems that you must fully understand. Perhaps producing clones for minor development branches is the way to go.
2.4. Extensions
You’ll want to at least look at the following extensions. I’m not going to list ones like "color" that are not specific to a centralized application.
Projrc – Allows you to define both important hooks and workspace-specific metadata. You probably have some local policies involving commit checks. Whether it’s that BugIds must be in the comments, or that you must use tabs rather than spaces, you’ll need these enforced on the designer PC so they don’t get a surprise when pushing to the repo. How do you set those up? The Projrc Extension.
ACL – If only to get inspiration. See also the page on Authentication.
Largefiles – If you have a number of large files with many revisions. Especially if they’re binary and not particularly compressible, this can save space in designers’ repositories. Also note that while remote backups and/or mirroring still fully work with Largefiles, the behavior is different and you should understand it.
- Extensions that address various parts of the "Mercurial history can get ugly and complicated, and it's immutable to boot" issue. This relates to large FOSS projects as much as large corporations. The issue is that if you have lots of developers (e.g. in the hundreds) you may need to enforce some strict rules about what gets pushed.
Collapse – Lets designers commit often and develop/test incrementally but then collapse their commits into one changeset to push to the central repo. Some people consider this dangerous.
MQ – Because Mercurial's history is immutable, store your un-pushed changes in a separate Mercurial repository and only commit them when you're ready. (Some people say that you're 'pushing deltas into a queue' which is somehow different than using a second Mercurial repository. This is just semantics.)
Rebase – Provides various options to reapply your un-pushed changesets to the tip so that you don't have merges in your history.
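The kind of commit check mentioned under Projrc above can be sketched as an external pretxncommit hook. This is only an illustration: the BUG-1234 format and the function names are assumptions, and an in-process Python hook would be faster. Mercurial exports the pending changeset's ID to external hooks as $HG_NODE.

```shell
#!/bin/sh
# message_has_bug_id MSG: succeed if MSG mentions an ID such as BUG-1234.
# (The "BUG-<digits>" convention is an assumed local policy.)
message_has_bug_id() {
    printf '%s\n' "$1" | grep -Eq 'BUG-[0-9]+'
}

# check_commit: reject the pending commit unless its message has a bug ID.
check_commit() {
    local msg
    msg=$(hg log -r "$HG_NODE" --template '{desc}') || return 1
    if ! message_has_bug_id "$msg"; then
        echo "commit rejected: message must reference a bug ID, e.g. BUG-1234" >&2
        return 1
    fi
}

# Only act when Mercurial invoked us as a hook (HG_NODE is set).
if [ -n "${HG_NODE:-}" ]; then
    check_commit || exit 1
fi
```

The script would be wired up with an hgrc entry along the lines of pretxncommit.bugid = /path/to/check-bugid.sh in the [hooks] section, where the path is a placeholder. A non-zero exit status makes Mercurial roll the commit back.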
3. The devil is in the details
Some things that you expect to be hard are going to be trivially easy with Mercurial. Almost anything that involves examining history can be accomplished with hg log and revsets. And then the most unexpected things are going to be hard.
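As an illustration of the easy part, the two helpers below wrap hg log with revsets to answer common build questions. The helper names and the example tag names are made up; the revset functions ancestors(), descendants() and tag() are standard. The sketch assumes tag names and changeset IDs that contain no single quotes.

```shell
# changes_between_builds OLD NEW: list changesets that are part of build
# tag NEW but were not part of build tag OLD.
changes_between_builds() {
    hg log -r "ancestors('$2') - ancestors('$1')" \
        --template '{rev}:{node|short} {author|user}: {desc|firstline}\n'
}

# builds_containing REV: list tagged changesets that include changeset REV,
# i.e. tagged descendants of REV (REV itself included if it is tagged).
builds_containing() {
    hg log -r "descendants('$1') and tag()" \
        --template '{tags} ({node|short})\n'
}
```

For example, changes_between_builds build-100 build-101 shows what went into the newer nightly build, and builds_containing 1a2b3c4d5e6f shows which tagged builds already contain a given fix.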
3.1. So get a Python Expert
Mercurial's native (in-process) hooks are all written in Python. Anything you want done fast (e.g. grepping through all committed files to verify they meet corporate standards) will need to be done in Python. You don't want to be fumbling with learning both Mercurial and Python at once. So get a Python expert on board.
3.2. Example devil
For example, tagging nightly builds. Suppose you have a continuous integration system that builds nightly and tags that build with an internal revision number for tracking purposes. Tagging, that's easy, right? Well, a tag is just a commit, and so it's got to be merged with the latest before committing etc. That kind of thing is easy for a person, but needs some pretty specific instruction for an automated CI system. In order to guarantee error-free operation, we ended up with this monstrosity.
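# Tag the current changeset and push the tag to $HGROOT.
# The tag operation is done in a scratchpad repo so that the main
# repo's changeset is never altered.
function vcs_tag () {
    tag_name=$1
    id=$(hg id -i | sed -e 's/+$//')
    local rc pushed=0
    _hg_scratchpad_start
    echo "tagging changeset $id with $tag_name"
    hg tag -r "$id" "$tag_name"
    rc=$?    # capture now; the [ ] test below would overwrite $?
    if [ $rc -ne 0 ] ; then
        echo "ERROR: Tag Failed. rc=$rc" 1>&2
        _hg_scratchpad_end
        return 1
    fi
    # If the push loses a race, rebase the tag commit onto the new tip
    # and try again, at most $HG_RETRY_MAX times.
    for i in $(seq "$HG_RETRY_MAX")
    do
        if hg push "$HGROOT" ; then
            pushed=1
            break
        fi
        hg pull --rebase -u "$HGROOT"
    done
    # Use a flag: a push that succeeds on the last attempt is not a failure.
    if [ $pushed -eq 0 ] ; then
        echo "ERROR: Tag Failed." 1>&2
        hg revert --all
        _hg_scratchpad_end
        return 1
    fi
    _hg_scratchpad_end
    return 0
}

# Create the scratchpad clone next to the workspace if needed, then
# enter it and bring it up to date with $HGROOT.
function _hg_scratchpad_start () {
    wsroot=$(hg root)
    scratchpad=$wsroot/../hg_scratchpad
    default_path=$(hg paths default)
    if [ ! -d "$scratchpad" ] ; then
        hg clone "$wsroot" "$scratchpad"
        printf '[paths]\ndefault = %s\n' "$default_path" > "$scratchpad/.hg/hgrc"
    fi
    pushd "$scratchpad"
    hg pull -u "$HGROOT"
    hg revert --all
}

# Leave the scratchpad and return to the original directory.
function _hg_scratchpad_end () {
    popd
}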
Less monstrous alternative: use bookmarks, which do not need to be merged. Except that a bookmark creates a new head, and designers may then accidentally update to it.
Differently monstrous alternative, but without needing $HG_RETRY_MAX: let the CI system operate on its own named branch and merge in the new changes with
hg pull && hg merge --tool internal:local default && hg revert -q --no-backup -a -r default && hg ci -m 'merged default' && hg tag -r default $tag_name && hg push
(you could even manually put the tag in the same commit as the merge if you want). If you're already using named branches and have different CI systems running in parallel, each monitoring a named branch, you will need a named tag branch per named development branch. -- ThomasArendsenHein 2012-10-05 07:45:36