Differences between revisions 5 and 70 (spanning 65 versions)
Revision 5 as of 2005-09-09 00:37:30
Size: 5291
Editor: mpm
Comment:
Revision 70 as of 2010-04-14 23:13:28
Size: 26594
Editor: PaulBoddie
Comment: Minor wording edit.
Line 1: Line 1:
(this page ought to be merged with ServerInstall)

== Publishing Mercurial repositories ==

The easiest way to share changes with other people using Mercurial is to publish them on the Web. Mercurial lets people pull changes using HTTP.

There are four (!) ways to publish a repository over HTTP, of which the two below are most worth considering:

 * Use the `hgweb.cgi` CGI script. This is a simply way to quickly publish a single repository.
 * Use the `hgwebdir.cgi` CGI script. This lets you publish multiple repositories easily, but initial setup can be a bit of work.
= Publishing Mercurial Repositories =
<<TableOfContents(2)>>

== Choosing A Publishing Method ==
There are a variety of different ways to publish your Mercurial repositories. Some are more powerful than others but may require more effort to set up and administer. See below for some general recommendations.
|| ||'''Solution''' ||'''Mechanism''' ||'''Push?''' ||'''Browsable''' ||'''Advantages''' ||'''Disadvantages''' ||
||<#cccccc style="text-align: center;" |6>'''Public''' ||[[hgserve|hg serve]] ||HTTP ||off by default ||yes ||built-in ||push has no authentication, so can only be used on trusted internal networks ||
||[[StaticHTTP|static HTTP]] ||<style="text-align: center;" |5>HTTP/HTTPS ||no ||no ||does not require hg or CGI support on the server ||very slow ||
||[[#single|hgweb]] {*} ||off by default ||yes ||can use existing web server (CGI, WSGI, mod_python), including authentication ||web server config can be hard to debug ||
||[[#multiple|hgwebdir]] {*} ||off by default ||yes ||can use existing web server (CGI, WSGI, [[http://www.aventinesolutions.nl/mediawiki/index.php/Quick_Tip:_Getting_Started_with_Mercurial#hgwebdir.py|mod_python]]), including authentication, '''supports multiple repositories''' ||slightly more work to set up than hgweb ||
||[[HgServeNginx|hg serve behind a proxy (Nginx)]] ||yes ||yes ||multiple repos, permits authentication, no CGI ||requires Nginx, slower than CGI ||
||[[MercurialHosting|third-party hosting]] ||yes ||yes ||minimal setup ||not locally administered, may have fees ||
||<#999999 style="text-align: center;" |3>'''Private/internal''' ||ssh ||SSH ||yes ||no ||no additional setup ||requires Unix server, per-user accounts and repositories ||
||[[SharedSSH|mercurial-server]] ||SSH (but with shared accounts) ||yes ||no ||easy key management, fine-grained permissions ||requires Unix server, not built in ||
||shared disk ||NFS/Samba etc. ||yes ||no ||can use existing setup ||generally restricted to intranets ||




(!) See also this [[HgWebDirStepByStep|comprehensive guide]] to acquiring Mercurial and configuring hgwebdir.

(!) If using a recent Apache, use [[modwsgi|mod_wsgi]] instead of mod_python or CGI as it has better performance.

(!) See also [[#SeeAlso|various other guides]] about configuring Mercurial's Web interface using various technologies.

{*} Recommended solutions are described below.

=== Quick Recommendations ===
The easiest way to share changes with other people using Mercurial is to publish them on the Web. The following two methods are the most recommended for publishing repositories over HTTP:

 * Use the `hgweb.cgi` script. This is a simple way to quickly [[#single|publish a single repository]].
 * Use the `hgwebdir.cgi` script. This lets you [[#multiple|publish multiple repositories]] easily, but initial setup can be a bit of work.
Line 14: Line 40:
 * Use the {{{hg serve}}} command. This is single threaded, and not recommended except for temporary situations where you need to publish a repository for a few minutes, for example to pull changes from a laptop.
 * Make the plain repository available. This uses a much slower, less reliable protocol, called `old-http`. We won't cover it here.

== Publishing a single repository ==

The `hgweb.cgi` CGI script is simple to use. You will find it in the root of your Mercurial tree. Copy it to a directory that your web server will handle CGI scripts in, rename it to `index.cgi` if you want, then edit its contents, and you're done.
 * Use the `hg serve` command. This is Mercurial's [[hgserve|built-in Web server]]. It is not really recommended except for temporary situations where you need to publish a repository for a few minutes, for example to pull changes from a laptop.
 * Make the plain repository available. This uses a much slower 'serverless' protocol called `static-http`. We won't cover it here, see [[StaticHTTP]] instead.


For private or restricted-access repositories, aside from the solutions explicitly marked as "private/internal" in the table above, authentication measures (certificates, logins) can be applied to many of the "public" solutions in order to restrict access.

----
<<Anchor(introduction)>>

== Introduction and Prerequisites ==
In this document we assume that repositories reside in the `/home/user/hg` directory. For example, a repository called `myproject` would reside in `/home/user/hg/myproject`.

To implement the mechanisms described in this document, you will need the following:

 * Some control over the behaviour of the Web server you use.
 * (Optional) control over the DNS domain you use.

With control over DNS, such as that provided with various Web hosting service control panels, you should be able to set up a subdomain; this makes the URL of your repositories a little tidier, so that `http://hg.example.com/myproject` can be used instead of `http://www.example.com/hgwebdir.cgi/myproject`, for example.

Such an approach, known as virtual hosting, is entirely optional. To implement it for the fictional `example.com`, a CNAME record for `hg.example.com` would be defined for the same address as that already used by the Web server.

<<Anchor(single)>>

== Publishing a Single Repository ==
To publish a single repository, perform the following steps:

 1. Find the `hgweb.cgi` script in the root of your Mercurial source tree.
 1. Copy it to a directory where your Web server can access it.
 1. Rename it to `index.cgi` if you want, then edit its contents.
 1. Make sure the Web server is configured and can execute the script.

If Mercurial is not installed system-wide, uncomment and edit the Python path in `hgweb.cgi` (or `index.cgi`) as indicated:

{{{#!python numbers=disable
# adjust python path if not a system-wide install:
import sys
sys.path.insert(0, "/home/user/lib")
}}}
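
Before editing the CGI script, it can help to confirm that the chosen directory really makes the `mercurial` package importable. The following is a minimal Python 3 sketch of such a check; the `module_available` helper is hypothetical and not part of Mercurial, and the directory passed to it is whatever you would put in the `sys.path.insert` call above.

```python
import importlib.util
import sys


def module_available(name, extra_path=None):
    """Return True if the top-level module `name` can be found.

    If extra_path is given, it is put at the front of sys.path first,
    mirroring the sys.path.insert(0, ...) line in the CGI script.
    """
    if extra_path is not None and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    # find_spec only locates a top-level module; it does not import it.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "/home/user/lib" is the example directory used in this document.
    print(module_available("mercurial", "/home/user/lib"))
```

If this prints `False`, the path given in the script probably does not point at the directory that contains the `mercurial` package.
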
You will need to edit the call to `hgweb.hgweb` as indicated in the following example:

{{{#!python numbers=disable
h = hgweb.hgweb("/home/user/hg/myproject", "My Project")
}}}
Line 23: Line 92:
== Publishing multiple repositories ==

The `hgwebdir.cgi` CGI script takes some work to initially set up, but once it's working, it lets you publish new repositories easily and cheaply. Its advantage is that to publish a repository, you simply place a clone in a particular directory, then add one line to a config file to tell the CGI script that it is allowed to publish that repository.

I will describe the setup that BryanOSullivan (me) and ThomasAH use. To mimic our configurations, you will need the following:

 * Control over the web server you use.
 * (Optional) control over the DNS domain you use.

=== Is virtual hosting necessary? ===

Using virtual hosting is entirely optional, and not worth it in the majority of cases. It simply makes URLs a little tidier.

For example, I can serve repositories at http://hg.serpentine.com/mercurial/hg instead of http://www.serpentine.com/hg/hgwebdir.cgi/mercurial/hg.

To do this, I simply have {{{hg.serpentine.com}}} CNAMEd to my web server.
<<Anchor(multiple)>>

== Publishing Multiple Repositories ==
The `hgwebdir.cgi` script takes some work to initially set up, but once it's working, it lets you publish new repositories easily and cheaply. Its advantage is that to publish a repository, you simply place a clone in a particular directory, then add one line to a config file to tell the CGI script that it is allowed to publish that repository.
Line 41: Line 98:

Choose a directory you want to publish from. On my systems, I use {{{/home/bos/hg/share}}}, which you will see below.

Copy `hgwebdir.cgi` in there. Read and edit it, then create a file called `hgweb.config`.

=== Setting up the hgweb.config file ===

Here are the contents of my `hgweb.config` file:
To publish multiple repositories, perform the following steps:

 1. Find the `hgwebdir.cgi` script in the root of your Mercurial source tree.
 1. Copy it to a directory where your Web server can access it. This will be illustrated below using `/home/user/webdir` as this directory.
 1. Edit the contents of the file.
 1. Make sure the Web server is configured and can execute the script.

If Mercurial is not installed system-wide, uncomment and edit the Python path in `hgwebdir.cgi` as indicated:

{{{#!python numbers=disable
# adjust python path if not a system-wide install:
import sys
sys.path.insert(0, "/home/user/lib")
}}}
=== Setting up the hgweb.config File ===
In the `/home/user/webdir` directory, create a file called `hgweb.config`. Here are the contents of an `hgweb.config` file:
Line 52: Line 121:
mercurial/bos = mercurial/bos
mercurial/hg = mercurial/hg
mercurial/crew = mercurial/crew
mercurial/tah = mercurial/tah
}}}

The duplication above looks a little silly, but it's telling the CGI script which repositories it is allowed to publish.

=== Configuring Apache ===

Here's the Apache config I use. Description below.
myproject = /home/user/hg/myproject
otherproject = /home/user/hg/otherproject
}}}

The `paths` setting in this file tells the CGI script which repositories it is allowed to publish and how their published locations map to their actual locations in the filesystem.

 * The keys (on the left) are ''URL'' paths which can incorporate `/` characters - these paths appear as part of the URL used to access a repository
 * The values (on the right) are ''filesystem'' paths which can be relative to the CGI directory - note that it is typically preferable to keep the repositories ''outside'' the CGI directory
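
As a rough illustration of this mapping, the following Python 3 sketch reads a `[paths]` section and returns the pairs of URL path and filesystem path. It uses the standard library's `configparser` as a stand-in for Mercurial's own configuration reader, which is an assumption made purely for illustration; the `read_paths` name is hypothetical.

```python
import configparser

SAMPLE_CONFIG = """\
[paths]
myproject = /home/user/hg/myproject
otherproject = /home/user/hg/otherproject
"""


def read_paths(text):
    """Return a dict mapping URL paths (keys) to filesystem paths (values)."""
    # Interpolation is disabled so that a '%' in a path is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep keys exactly as written instead of lower-casing them.
    parser.optionxform = str
    parser.read_string(text)
    return dict(parser.items("paths"))


if __name__ == "__main__":
    for url_path, fs_path in read_paths(SAMPLE_CONFIG).items():
        print("/%s -> %s" % (url_path, fs_path))
```
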

{i} If you are publishing a fixed set of repositories, using the `paths` setting should be sufficient, and you should not need to consider the `collections` setting described below.

==== Working with Collections ====

Where many repositories are being served, it can be preferable to refer to each directory holding such collections of repositories instead of listing each and every repository as is done above. The above `hgweb.config` file could be rewritten as follows:

{{{
[collections]
/home/user/hg = /home/user/hg
}}}

The `collections` setting in this file tells the CGI script where to look for repositories.

 * The keys (on the left) and the values (on the right) are both ''filesystem'' paths
 * The keys should be prefixes of the values and are "subtracted" from the values in order to generate the ''URL'' paths to each repository

Consider two repository collections given in the `hgweb.config` file:

{{{
[collections]
/home/user/private = /home/user/private
/home/user/official = /home/user/official
}}}

Where both `/home/user/private` and `/home/user/official` contain repositories using the same names (`someproject` and `otherproject`), a combined list will then be shown by the Web interface containing two entries for each of `someproject` and `otherproject`: one from the `private` collection and one from the `official` collection. This may be confusing to the end-user, so we may modify the configuration file as follows:

{{{
[collections]
/home/user = /home/user
}}}

This will now produce entries for `private/someproject`, `private/otherproject`, `official/someproject` and `official/otherproject`. Unfortunately, it will also find other repositories outside the `private` and `official` directories. It is therefore recommended that repositories are located in suitably organised directory hierarchies if exported in this way.
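
The prefix "subtraction" described above can be sketched in Python 3 as follows. This is only an approximation of what the CGI script does, not Mercurial's actual code: the function names are hypothetical, and the sketch assumes that a repository is any directory containing a `.hg` directory and that repositories nested inside another repository are ignored.

```python
import os


def find_repositories(root):
    """Yield every directory under root that contains a .hg directory."""
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".hg" in dirnames:
            yield dirpath
            # Assumption: do not search inside a repository for more.
            dirnames[:] = []


def url_paths(prefix, root):
    """Map each repository under root to its URL path by removing prefix."""
    result = {}
    for repo in find_repositories(root):
        relative = os.path.relpath(repo, prefix)
        result[relative.replace(os.sep, "/")] = repo
    return result


if __name__ == "__main__":
    # Corresponds to the line "/home/user = /home/user" in [collections].
    for url_path in sorted(url_paths("/home/user", "/home/user")):
        print(url_path)
```

With `/home/user` as both prefix and root, any repository anywhere below the home directory is listed, which is the side effect mentioned above.
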

== Configuring Apache ==
There are many ways of configuring Apache to run CGI scripts, and a few of the possibilities are provided below. Where the main configuration files are mentioned, you should use the appropriate conventions for your system in defining such files in the `conf.d` and/or `sites-available` directories.

In each example, `hgwebdir.cgi` is mentioned, but the same principles apply to directories hosting the `hgweb.cgi` script.

{i} To ensure that a script is executable by the Web server, the following command is typically used:

{{{
chmod u+x hgwebdir.cgi
}}}

{i} The preferred mechanism for persuading Apache to use updated configuration information can vary from platform to platform and from distribution to distribution. Please consult your distribution's documentation, if appropriate, or the more general Apache documentation (for example, the [[http://httpd.apache.org/docs/2.2/programs/apachectl.html|apachectl documentation]]) for details.

<<Anchor(script)>>
=== Publishing the CGI Script Directly ===
''This example requires access to the main configuration files.''

The easiest way to serve the `hgwebdir.cgi` script is to use a `ScriptAlias` directive:

{{{
ScriptAlias /hg "/home/user/webdir/hgwebdir.cgi"
}}}
This actually exports the repository browser at the URL path `/hg` (for example, `http://www.example.com/hg`) and doesn't expose the name of the script at all.

{i} See the [[http://httpd.apache.org/docs/2.2/mod/mod_alias.html#scriptalias|Apache httpd documentation]] for `ScriptAlias`.

<<Anchor(directory)>>

=== Using a Simple CGI Directory ===
''This example requires access to the main configuration files.''

If the directory containing the script is supposed to hold this and other CGI programs, such a CGI directory can be configured as follows:

{{{
ScriptAlias /hg "/home/user/webdir"
}}}
This should permit URLs like `http://www.example.com/hg/hgwebdir.cgi` to show the repository browser, although to hide the `hgwebdir.cgi` script name in URLs, more work is required (and is mentioned below).

{i} See the [[http://httpd.apache.org/docs/2.2/mod/mod_alias.html#scriptalias|Apache httpd documentation]] for `ScriptAlias`, especially for information about the pitfalls of putting CGI directories inside existing Web-accessible directories.

<<Anchor(htaccess)>>

=== Using an .htaccess File ===
''This example can be used with pre-configured CGI directories.''

If you may not change the main Apache configuration files, you may still be able to use an `.htaccess` file to make URLs nicer, as suggested in the previous section. Here is an `.htaccess` file which sits in the published `webdir` directory on the Web server and redirects `http://www.example.com/hg/*` URLs to the `hgwebdir.cgi` script inside that folder. As a result, it would no longer be necessary to mention the CGI script name in URLs: one could use `http://www.example.com/hg/myproject` instead of `http://www.example.com/hg/hgwebdir.cgi/myproject`.

{{{
# Taken from http://www.pmwiki.org/wiki/Cookbook/CleanUrls#samedir
# Used at http://ggap.sf.net/hg/
Options +ExecCGI
RewriteEngine On
# Set the rewrite base according to where the base URL lives.
RewriteBase /hg
RewriteRule ^$ hgwebdir.cgi [L]
# Send requests for files that exist to those files.
RewriteCond %{REQUEST_FILENAME} !-f
# Send requests for directories that exist to those directories.
RewriteCond %{REQUEST_FILENAME} !-d
# Send requests to hgwebdir.cgi, appending the rest of url.
RewriteRule (.*) hgwebdir.cgi/$1 [QSA,L]
}}}
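
To make the effect of these rules easier to follow, here is a small Python 3 sketch that mimics them for a path given relative to `/hg/`. It is a simplified model rather than a description of how mod_rewrite works internally: the `rewrite` function and its `exists` callback are hypothetical, and query strings (the `QSA` flag) are left out.

```python
def rewrite(relative_path, exists=lambda path: False):
    """Return the path Apache would serve for a path relative to /hg/.

    `exists` stands in for the -f and -d tests: it should return True
    when the path names an existing file or directory.
    """
    # RewriteRule ^$ hgwebdir.cgi [L]
    if relative_path == "":
        return "hgwebdir.cgi"
    # The two RewriteCond lines: leave existing files and directories alone.
    if exists(relative_path):
        return relative_path
    # RewriteRule (.*) hgwebdir.cgi/$1 [QSA,L]
    return "hgwebdir.cgi/" + relative_path


if __name__ == "__main__":
    on_disk = {"hgwebdir.cgi", "hgweb.config"}
    for path in ("", "myproject", "myproject/file/tip", "hgwebdir.cgi"):
        print(repr(path), "->", rewrite(path, on_disk.__contains__))
```
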
A corresponding change in `hgweb.config` can be made to make sure that the nicer URLs are used in the HTML produced by the CGI scripts:

{{{
[web]
baseurl = /hg
}}}
Where the CGI scripts are made to appear at the server root (for example, `http://hg.example.com/`), leave the `baseurl` setting blank:

{{{
[web]
baseurl =
}}}
Generally, the value specified should not end with a `/` character.

=== Adding Authentication ===
''The following configurations require access to the main configuration files. They can be combined with the [[#script|script]] or [[#directory|directory]] declarations to impose authentication and access restrictions on repositories.''

''To use these configurations with the [[#htaccess|pre-configured CGI directories]], the `Location` directive start and end tags can be omitted, leaving the bare authentication-related directives.''

==== Restrict to Known Users ====
This configuration restricts access to a known set of users as defined in the `/home/user/hg/hgusers` password file:

{{{
<Location /hg>
    AuthType Basic
    AuthName "Mercurial repositories"
    AuthUserFile /home/user/hg/hgusers
    Require valid-user
</Location>
}}}
<!> Since the `AuthType` directive is set to `Basic`, passwords are communicated as plain text, and it is therefore recommended that this only be used with a server configured for HTTPS. See the [[http://httpd.apache.org/docs/2.2/ssl/|Apache SSL documentation]] for more information.
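
The warning above is easy to demonstrate: with basic authentication, the `Authorization` header carries the username and password in Base64 encoding only, which anyone who can read the traffic can reverse. The Python 3 sketch below shows the encoding and decoding; the function names and the example credentials are made up for illustration.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    pair = "%s:%s" % (username, password)
    token = base64.b64encode(pair.encode("utf-8")).decode("ascii")
    return "Basic " + token


def decode_basic_auth_header(value):
    """Recover (username, password) from a Basic Authorization value."""
    _scheme, token = value.split(" ", 1)
    pair = base64.b64decode(token).decode("utf-8")
    username, password = pair.split(":", 1)
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("frodo", "example-password")
    print(header)
    # No secret is needed to get the password back.
    print(decode_basic_auth_header(header))
```
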

{i} To set up the password file, use the `htpasswd` tool as described in the relevant [[http://httpd.apache.org/docs/2.2/programs/htpasswd.html|Apache documentation]].

This alternative configuration employs digest authentication and thus offers an alternative to basic authentication and HTTPS:

{{{
<Location /hg>
    AuthType Digest
    AuthName "Mercurial repositories"
    AuthDigestProvider file
    AuthUserFile /home/user/hg/hgusers
    Require valid-user
</Location>
}}}
{i} To set up the password file, use the `htdigest` tool as described in the relevant [[http://httpd.apache.org/docs/2.2/programs/htdigest.html|Apache documentation]].

{i} See the [[http://httpd.apache.org/docs/2.2/mod/mod_auth_digest.html|Apache mod_auth_digest documentation]] for more information on digest authentication and its limitations.
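
For reference, each line in a file written by `htdigest` has the form `user:realm:hash`, where the hash is the MD5 digest of `user:realm:password`. The realm has to match the `AuthName` value used in the configuration. The Python 3 sketch below builds such a line; it is meant to show the file format, not to replace the `htdigest` tool, and the `htdigest_line` name and example password are hypothetical.

```python
import hashlib


def htdigest_line(username, realm, password):
    """Return one password-file line in the user:realm:hash format."""
    secret = "%s:%s:%s" % (username, realm, password)
    digest = hashlib.md5(secret.encode("utf-8")).hexdigest()
    return "%s:%s:%s" % (username, realm, digest)


if __name__ == "__main__":
    # The realm is the AuthName from the examples above.
    print(htdigest_line("frodo", "Mercurial repositories", "example-password"))
```
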

<<Anchor(pushing)>>

==== Restrict Pushing to Known Users ====
To exercise finer control and to provide global read-only access to the repositories, but require authentication for pushing, a `LimitExcept` directive can be added. Here are the previous examples with such a directive in use. First with basic authentication:

{{{
<Location /hg>
    AuthType Basic
    AuthName "Mercurial repositories"
    AuthUserFile /home/user/hg/hgusers
    <LimitExcept GET>
        Require valid-user
    </LimitExcept>
</Location>
}}}
And with digest authentication:

{{{
<Location /hg>
    AuthType Digest
    AuthName "Mercurial repositories"
    AuthDigestProvider file
    AuthUserFile /home/user/hg/hgusers
    <LimitExcept GET>
        Require valid-user
    </LimitExcept>
</Location>
}}}
{i} See the [[http://httpd.apache.org/docs/2.2/mod/core.html#limitexcept|Apache documentation]] for `LimitExcept` for more information.

Now consult the instructions on [[#push|allowing the push operation]] in Mercurial to complete this configuration task.

==== Using Groups ====
Apache also provides support for user groups through the `AuthGroupFile` directive. Here it is in context:

{{{
    AuthUserFile /home/user/hg/hgusers
    AuthGroupFile /home/user/hg/hggroups
}}}
Instead of a `Require` directive involving users, the following directive can be used in its place. Here it is in context:

{{{
    <LimitExcept GET>
        Require group hobbits
    </LimitExcept>
}}}
Here, the `hobbits` group is defined in the nominated file as described in the relevant [[http://httpd.apache.org/docs/2.2/mod/mod_authz_groupfile.html#authgroupfile|Apache documentation]] for `AuthGroupFile`, connecting users who will authenticate themselves with groups such as `hobbits`.

=== Using Virtual Hosts ===
''This example requires access to the main configuration files.''

Here is an example Apache configuration for publishing repositories at `http://hg.example.com/`:

{{{
<VirtualHost *>
  ServerName hg.example.com

  ServerAdmin webmaster@example.com
  CustomLog logs/access_log.example combined
  ErrorLog logs/error_log.example

  ScriptAlias / "/home/user/webdir/hgwebdir.cgi"
</VirtualHost>
}}}
The `ScriptAlias` directive is taken from the [[#script|script]] example; all other directives support the virtual host `hg.example.com`.

{i} Note that the `CustomLog` and `ErrorLog` directives may need to be changed to refer to files in standard locations such as `/var/log/apache2` or `/var/log/httpd`, depending on how Apache is configured.

Here is a more complicated example using rewrite rules and explicit `Directory` directives. A description follows the example.
Line 66: Line 346:
  ServerName hg.serpentine.com

  ServerAdmin webmaster@serpentine.com
  CustomLog logs/access_log.serpentine combined
  ErrorLog logs/error_log.serpentine
  ServerName hg.example.com

  ServerAdmin webmaster@example.com
  CustomLog logs/access_log.example combined
  ErrorLog logs/error_log.example
Line 73: Line 353:
  RewriteRule (.*) /home/bos/hg/share/hgwebdir.cgi$1

  <Directory "/home/bos/hg/share/">
  RewriteRule (.*) /home/user/webdir/hgwebdir.cgi/$1

  # Or we can use mod_alias for starting CGI script and making URLs "nice":
  # ScriptAliasMatch ^(.*) /home/user/webdir/hgwebdir.cgi/$1

  <Directory "/home/user/webdir/">
Line 84: Line 367:

The {{{ServerName}}} directive matches the hostname I configured earlier.

The next section is just administrative cruft.

The rewrite-related directives tell Apache to turn URIs like {{{/mercurial/hg}}} into {{{/home/bos/hg/share/hgwebdir.cgi/mercurial/hg}}}. This causes Apache to fire up the CGI script, giving it the remainder of the URI as an argument.

Finally, the {{{Directory}}} section lets Apache know that we have a CGI script to look at.

=== Putting useful information in the index page ===
The directives in the above example have the following purposes:

 * The `ServerName` directive matches the hostname configured for the domain.
 * The next section (`ServerAdmin` and so on) is just administrative cruft.
 * The rewrite-related directives (`RewriteEngine` and `RewriteRule`) tell Apache to turn URIs ending in `/myproject` into `/home/user/webdir/hgwebdir.cgi/myproject`. This causes Apache to fire up the CGI script, giving it the remainder of the URI as an argument.
 * Finally, the `Directory` section lets Apache know that we have a CGI script to look at.


{i} See the [[http://httpd.apache.org/docs/2.2/vhosts/|Apache virtual hosts documentation]] for more information.

<<Anchor(push)>>

== Allowing Push ==
Make sure that your repository is writeable by the user running the Apache server (such as `www-data`), and that the repository's `.hg/hgrc` file (or the `/home/user/.hgrc` file) contains the allowed users:

{{{
[web]
allow_push = frodo, sam
}}}
This would allow pushing for `frodo` and `sam`. You can allow pushing for everyone with the following:

{{{
[web]
allow_push = *
}}}
By default, pushing is only allowed via HTTPS. To permit HTTP pushing you have to add this to your repository's `.hg/hgrc` file (or to the `/home/user/.hgrc` file):

{{{
[web]
push_ssl = false
}}}
Now consult the instructions on configuring Apache to [[#pushing|restrict pushing]] in order to set up the authentication/authorisation infrastructure.
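
The way the `allow_push` and `push_ssl` settings described above combine can be summarised with a short Python 3 sketch. This is a simplified reading of the behaviour described in this section, not Mercurial's actual implementation: the `push_permitted` function is hypothetical and any other settings that influence pushing are ignored.

```python
def push_permitted(user, scheme, allow_push, push_ssl=True):
    """Decide whether `user` may push, given the [web] settings.

    user       -- authenticated username, or None if not authenticated
    scheme     -- "http" or "https"
    allow_push -- the raw setting value, for example "frodo, sam" or "*"
    push_ssl   -- True unless "push_ssl = false" has been set
    """
    # By default, pushing is only allowed via HTTPS.
    if push_ssl and scheme != "https":
        return False
    allowed = [name.strip() for name in allow_push.split(",") if name.strip()]
    # "*" allows everyone; otherwise the user must be listed.
    return "*" in allowed or user in allowed


if __name__ == "__main__":
    print(push_permitted("frodo", "https", "frodo, sam"))
    print(push_permitted("frodo", "http", "frodo, sam"))
    print(push_permitted("frodo", "http", "frodo, sam", push_ssl=False))
```
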

==== Defining User Credentials ====
To define credentials for the allowed users, use the `htpasswd` tool. For example:

{{{
htpasswd -c /home/user/hg/hgusers frodo
}}}
You will need to enter the desired password for the username `frodo`. Later, you can add more usernames without the `-c` option:

{{{
htpasswd /home/user/hg/hgusers sam
}}}
{i} See the relevant [[http://httpd.apache.org/docs/2.2/programs/htpasswd.html|Apache documentation]] for `htpasswd`.

== Troubleshooting ==
Mercurial is executed on the server by Apache and therefore runs as the Apache user and group. If you experience flaky behavior, the CGI script may be failing because it lacks sufficient permissions. In that case, you should check the log files, but you can also make some common-sense permissions checks.

There are two ways that problems primarily manifest themselves on the server: either you won't see any repositories at all (indicating missing read or execute permissions) or you won't be able to push to the server (indicating missing write permissions), which gives you the error message:

{{{
abort: ‘http://foo/bar’ does not appear to be an hg repository!
}}}
Another common error often related to permissions:

{{{
abort: HTTP Error 500: Internal Server Error
}}}
The best way to solve permissions problems is to grant the required permissions to the Apache group (the `www-data` group on Debian). You should have some familiarity with assigning permissions under Linux/Unix before attempting the following.

Suppose your main user is `john`, your web server runs as `www-data` and your repositories are in `/home/john/repositories`. Then, execute the following commands to change the group for all files in your repositories on the server and make the files writable to the server process as well as make the home directory readable.

{{{
chown -R john:www-data /home/john/repositories
chmod -R g+rw /home/john/repositories
chmod g+x /home/john/repositories
}}}
For each repository, you will have to make both the repository folder and the `.hg` folder executable as well:

{{{
chmod g+x /home/john/repositories/rep1
chmod g+x /home/john/repositories/rep1/.hg
}}}
If all of your repositories are subdirectories of a main folder which contains only repositories (such as `/var/www/html/hg/repos`, with underlying repositories `/var/www/html/hg/repos/repo1`, `/var/www/html/hg/repos/repo2`, and so on), you may find it easier to script the setting of these permissions. Write the following at the prompt to create a new executable shell script:

{{{
cat <<EOM >permission.sh
#!/bin/sh
chown -R john:www-data repos
chmod -R g+rw repos
chmod g+x repos
chmod g+x repos/*
chmod g+x repos/*/.hg
EOM
sudo chown root:root permission.sh
sudo chmod u+x permission.sh
}}}
This script assumes your username is `john`, the Apache server user's group is `www-data`, and the folder containing your repositories is called `repos`. Now you can update permissions for all of your repositories by navigating to this containing directory and issuing a single command:

{{{
sudo ./permission.sh
}}}
Before you start crawling through logs to find out why your Mercurial server isn't letting you pull, push, or authenticate, run this command and see if it solves your issue.

It is important to note that the entire repository tree must be accessible by the web server. For example, the tree `/home/john/source/repos/hg/repo1` requires `john`, `source`, `repos`, and `hg` to be executable by the webserver.
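
A quick way to spot a directory in the chain that blocks the Web server is to list every ancestor of a repository and check its execute bits. The Python 3 sketch below does this under a simplifying assumption: it only looks at whether group or others have the execute bit set and does not work out which user or group the server really runs as. The function names are hypothetical.

```python
import os
import stat


def ancestors(path):
    """Return path and all of its parent directories, outermost first."""
    path = os.path.abspath(path)
    chain = []
    while True:
        chain.append(path)
        parent = os.path.dirname(path)
        if parent == path:
            break
        path = parent
    return list(reversed(chain))


def not_traversable(path, mask=stat.S_IXGRP | stat.S_IXOTH):
    """Return the directories in the chain with none of the mask bits set."""
    return [d for d in ancestors(path) if not os.stat(d).st_mode & mask]


if __name__ == "__main__":
    # Example tree from the paragraph above.
    for directory in not_traversable("/home/john/source/repos/hg/repo1"):
        print("no execute permission for group/others:", directory)
```
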

== Putting Useful Information in the Index Page ==
Line 103: Line 480:

=== What can go wrong? ===

If you are trying to publich multiple repositories, and you haven't configured Apache to force all accesses to go through the `hgwebdir.cgi` script, you will not be able to access any of the repositories you have published unless you set up a `hgweb.cgi` script in each published repository. Clearly, this defeats the whole point of using `hgwebdir.cgi` in thie first place, as you're not saving any effort.

Whatever mechanism you are trying to use, the important thing is to ensure that all accesses go through `hgwebdir.cgi`, so that Apache can pass the rest of the path to it using the `REQUEST_INFO` environment variable.
== Allowing Archive Downloads ==
Make sure that your repository's `.hg/hgrc` file (or the `/home/user/.hgrc` file) contains the `allow_archive` setting:

{{{
[web]
allow_archive = gz, zip, bz2
}}}
This example illustrates how gzip, zip and bzip2 archive formats can be supported. As a result, links should appear in the Web interface corresponding to these archive types.

== What Can Go Wrong? ==
If the version of `hgwebdir.cgi` is newer than the version of Mercurial you have installed, you may experience strange results. This could happen if you use a binary installer for Mercurial, and manually fetch `hgwebdir.cgi` from a source repository. Newer versions of Mercurial support older versions of the CGI scripts, so you usually do not have to upgrade all your CGI installations, though it might be useful.

If you are trying to publish multiple repositories, and you haven't configured Apache to force all accesses to go through the `hgwebdir.cgi` script, you will not be able to access any of the repositories you have published unless you set up a `hgweb.cgi` script in each published repository. Clearly, this defeats the whole point of using `hgwebdir.cgi` in the first place, as you're not saving any effort.

Whatever mechanism you are trying to use, the important thing is to ensure that all accesses go through `hgwebdir.cgi`, so that Apache can pass the rest of the path to it using the `PATH_INFO` environment variable.
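
To illustrate why `PATH_INFO` matters, the Python 3 sketch below shows one way a script could use it to choose a repository from the `[paths]` mapping. It is an illustrative approximation, not the logic of `hgwebdir.cgi` itself; the `resolve_repository` function is hypothetical.

```python
def resolve_repository(path_info, paths):
    """Find the repository addressed by PATH_INFO.

    paths maps URL paths (such as "myproject") to filesystem paths.
    Returns (filesystem_path, remainder), or (None, path_info) when no
    configured URL path matches.
    """
    parts = path_info.strip("/").split("/")
    # Try the longest candidate first so that keys containing "/" win.
    for end in range(len(parts), 0, -1):
        key = "/".join(parts[:end])
        if key in paths:
            return paths[key], "/" + "/".join(parts[end:])
    return None, path_info


if __name__ == "__main__":
    paths = {"myproject": "/home/user/hg/myproject"}
    # With a request for /hg/myproject/file/tip and the script mapped to
    # /hg, Apache would pass "/myproject/file/tip" as PATH_INFO.
    print(resolve_repository("/myproject/file/tip", paths))
```

If requests bypass the script, no `PATH_INFO` is passed to it at all, which is why the repositories appear to be missing.
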

== Theming ==
The hgweb interface is completely themable. See the [[Theming]] page for additional instructions on customizing the look of your site.

<<Anchor(SeeAlso)>>
== See Also ==
 * [[modwsgi]] to do the same using WSGI and Apache
 * [[http://www.aventinesolutions.nl/mediawiki/index.php/Quick_Tip:_Getting_Started_with_Mercurial|Quick Tip: Getting Started with Mercurial]] describes Mercurial installation and repository serving using mod_python
 * [[http://vampirebasic.blogspot.com/2009/06/running-mercurial-on-windows.html|Vampire Basic Blog]] has a very good walkthrough on how to set up hgwebdir on IIS
 * [[http://www.jeremyskinner.co.uk/mercurial-on-iis7/|Setting up a Mercurial server under IIS7 on Windows Server 2008 R2]] covers IIS and includes pictures to illustrate the process

----
CategoryWeb CategoryHowTo CategoryTipsAndTricks

Publishing Mercurial Repositories

1. Choosing A Publishing Method

There are a variety of different ways to publish your Mercurial repositories. Some are more powerful than others but may require more effort to set up and administer. See below for some general recommendations.

Solution

Mechanism

Push?

Browsable

Advantages

Disadvantages

Public

hg serve

HTTP

off by default

yes

built-in

push has no authentication, so can only be used on trusted internal networks

static HTTP

HTTP/HTTPS

no

no

does not require hg or CGI support on the server

very slow

hgweb {*}

off by default

yes

can use existing web server (CGI, WSGI, mod_python), including authentication

web server config can be hard to debug

hgwebdir {*}

off by default

yes

can use existing web server (CGI, WSGI, mod_python), including authentication, supports multiple repositories

slightly more work to setup than hgweb

hg serve behind a proxy (Nginx)

yes

yes

multiple repos, permits authentication, no CGI

requires Nginx, slower than CGI

third-party hosting

yes

yes

minimal setup

not locally administered, may have fees

Private/internal

ssh

SSH

yes

no

no additional setup

requires Unix server, per-user accounts and repositories

mercurial-server

SSH (but with shared accounts)

yes

no

easy key management, fine-grained permissions

requires Unix server, not built in

shared disk

NFS/Samba etc.

yes

no

can use existing setup

generally restricted to intranets

(!) See also this comprehensive guide to acquiring Mercurial and configuring hgwebdir.

(!) If using a recent Apache, use mod_wsgi instead of mod_python or CGI as it has better performance.

(!) See also various other guides about configuring Mercurial's Web interface using various technologies.

{*} Recommended solutions are described below.

1.1. Quick Recommendations

The easiest way to share changes with other people using Mercurial is to publish them on the Web. The following two methods are the most recommended for publishing repositories over HTTP:

Less desirable are the following:

  • Use the hg serve command. This is Mercurial's built-in Web server. It is not really recommended except for temporary situations where you need to publish a repository for a few minutes, for example to pull changes from a laptop.

  • Make the plain repository available. This uses a much slower 'serverless' protocol called static-http. We won't cover it here; see StaticHTTP instead.

For private or restricted-access repositories, aside from the solutions explicitly marked as "private/internal" in the table above, authentication measures (certificates, logins) can be applied to many of the "public" solutions in order to restrict access.


2. Introduction and Prerequisites

In this document we assume that repositories reside in the /home/user/hg directory. For example, a repository called myproject would reside in /home/user/hg/myproject.

To implement the mechanisms described in this document, you will need the following:

  • Some control over the behaviour of the Web server you use.
  • (Optional) control over the DNS domain you use.

With control over DNS, such as that provided with various Web hosting service control panels, you should be able to set up a subdomain; this makes the URL of your repositories a little tidier, so that http://hg.example.com/myproject can be used instead of http://www.example.com/hgwebdir.cgi/myproject, for example.

Such an approach, known as virtual hosting, is entirely optional. To implement it for the fictional example.com, a CNAME record for hg.example.com would be defined for the same address as that already used by the Web server.

3. Publishing a Single Repository

To publish a single repository, perform the following steps:

  1. Find the hgweb.cgi script in the root of your Mercurial source tree.

  2. Copy it to a directory where your Web server can access it.
  3. Rename it to index.cgi if you want, then edit its contents.

  4. Make sure the Web server is configured and can execute the script.

If Mercurial is not installed system-wide, uncomment and edit the Python path in hgweb.cgi (or index.cgi) as indicated:

# adjust python path if not a system-wide install:
import sys
sys.path.insert(0, "/home/user/lib")
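Before involving the Web server, it can help to check that the edited path really makes Mercurial importable. The following is a minimal sketch and not part of hgweb.cgi: the function name mercurial_importable is hypothetical, and /home/user/lib is simply the example path used above.

```python
import importlib.util
import sys


def mercurial_importable(extra_path=None):
    """Return True if the 'mercurial' package can be found on the module search path."""
    if extra_path is not None and extra_path not in sys.path:
        # Same adjustment that the CGI script makes for installs that are not system-wide.
        sys.path.insert(0, extra_path)
    return importlib.util.find_spec("mercurial") is not None


if __name__ == "__main__":
    # Hypothetical path taken from the example above; adjust to your installation.
    print(mercurial_importable("/home/user/lib"))
```

Run it with the same Python interpreter that the Web server uses for CGI scripts; a result of False suggests that the path in the script still needs adjusting.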

You will need to edit the call to hgweb.hgweb as indicated in the following example:

h = hgweb.hgweb("/home/user/hg/myproject", "My Project")

While you could use this mechanism to publish multiple repositories, it requires a little work to configure each copy of the script to have slightly different paths.

4. Publishing Multiple Repositories

The hgwebdir.cgi script takes some work to initially set up, but once it's working, it lets you publish new repositories easily and cheaply. Its advantage is that to publish a repository, you simply place a clone in a particular directory, then add one line to a config file to tell the CGI script that it is allowed to publish that repository.

4.1. Setting up the CGI script

To set up the hgwebdir.cgi script for publishing multiple repositories, perform the following steps:

  1. Find the hgwebdir.cgi script in the root of your Mercurial source tree.

  2. Copy it to a directory where your Web server can access it. This will be illustrated below using /home/user/webdir as this directory.

  3. Edit the contents of the file.
  4. Make sure the Web server is configured and can execute the script.

If Mercurial is not installed system-wide, uncomment and edit the Python path in hgwebdir.cgi as indicated:

# adjust python path if not a system-wide install:
import sys
sys.path.insert(0, "/home/user/lib")

4.2. Setting up the hgweb.config File

In the /home/user/webdir directory, create a file called hgweb.config. Here are the contents of an hgweb.config file:

[paths]
myproject = /home/user/hg/myproject
otherproject = /home/user/hg/otherproject

The paths setting in this file tells the CGI script which repositories it is allowed to publish and how their published locations map to their actual locations in the filesystem.

  • The keys (on the left) are URL paths which can incorporate / characters - these paths appear as part of the URL used to access a repository

  • The values (on the right) are filesystem paths which can be relative to the CGI directory - note that it is typically preferable to keep the repositories outside the CGI directory

{i} If you are publishing a fixed set of repositories, using the paths setting should be sufficient, and you should not need to consider the collections setting described below.
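To illustrate the mapping that the paths setting describes, here is a small sketch that reads the example file with Python's standard configparser module and lists each URL path with its filesystem location. It only approximates the file format (Mercurial uses its own configuration reader), and the names SAMPLE and read_paths are made up for this example.

```python
import configparser

# The example hgweb.config contents from above.
SAMPLE = """\
[paths]
myproject = /home/user/hg/myproject
otherproject = /home/user/hg/otherproject
"""


def read_paths(config_text):
    """Return {url_path: filesystem_path} from the [paths] section."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep keys as written instead of lowercasing them
    parser.read_string(config_text)
    if not parser.has_section("paths"):
        return {}
    return dict(parser.items("paths"))


for url_path, fs_path in read_paths(SAMPLE).items():
    # Keys become part of the URL; values say where the repository lives on disk.
    print(f"/{url_path} -> {fs_path}")
```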

4.2.1. Working with Collections

Where many repositories are being served, it can be preferable to name the directories holding collections of repositories instead of listing each and every repository as is done above. The above hgweb.config file could be rewritten as follows:

[collections]
/home/user/hg = /home/user/hg

The collections setting in this file tells the CGI script where to look for repositories.

  • The keys (on the left) and the values (on the right) are both filesystem paths

  • The keys should be prefixes of the values and are "subtracted" from the values in order to generate the URL paths to each repository

Consider two repository collections given in the hgweb.config file:

[collections]
/home/user/private = /home/user/private
/home/user/official = /home/user/official

Where both /home/user/private and /home/user/official contain repositories using the same names (someproject and otherproject), a combined list will then be shown by the Web interface containing two entries for each of someproject and otherproject: one from the private collection and one from the official collection. This may be confusing to the end-user, so we may modify the configuration file as follows:

[collections]
/home/user = /home/user

This will now produce entries for private/someproject, private/otherproject, official/someproject and official/otherproject. Unfortunately, it will also find other repositories outside the private and official directories. It is therefore recommended that repositories are located in suitably organised directory hierarchies if exported in this way.
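The "subtraction" of keys from values can be sketched as follows. This is a simplified model under stated assumptions, not the code used by hgwebdir.cgi: a repository is taken to be any directory containing a .hg subdirectory, the search does not look inside repositories, and the names collection_url_paths and demo are hypothetical.

```python
import os
import tempfile


def collection_url_paths(prefix, root):
    """Return {url_path: filesystem_path} for repositories found under root.

    prefix plays the role of the key and root the role of the value
    in a [collections] entry.
    """
    found = {}
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".hg" in dirnames:
            # "Subtract" the prefix from the repository location to get the URL path.
            url_path = os.path.relpath(dirpath, prefix).replace(os.sep, "/")
            found[url_path] = dirpath
            dirnames[:] = []  # assumption: do not search inside a repository
    return found


def demo():
    """Recreate the private/official layout in a temporary directory."""
    with tempfile.TemporaryDirectory() as user_dir:
        for collection in ("private", "official"):
            for project in ("someproject", "otherproject"):
                os.makedirs(os.path.join(user_dir, collection, project, ".hg"))
        # Equivalent of a single entry whose key and value are both user_dir.
        return sorted(collection_url_paths(user_dir, user_dir))


print(demo())
```

With the key and value both set to the common parent directory, the collection names remain in the URL paths, which is what keeps the private and official entries apart.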

5. Configuring Apache

There are many ways of configuring Apache to run CGI scripts, and a few of the possibilities are provided below. Where the main configuration files are mentioned, you should use the appropriate conventions for your system in defining such files in the conf.d and/or sites-available directories.

In each example, hgwebdir.cgi is mentioned, but the same principles apply to directories hosting the hgweb.cgi script.

{i} To ensure that a script is executable by the Web server, the following command is typically used:

chmod u+x hgwebdir.cgi

{i} The preferred mechanism for persuading Apache to use updated configuration information can vary from platform to platform and from distribution to distribution. Please consult your distribution's documentation, if appropriate, or the more general Apache documentation (for example, the apachectl documentation) for details.

5.1. Publishing the CGI Script Directly

This example requires access to the main configuration files.

The easiest way to serve the hgwebdir.cgi script is to use a ScriptAlias directive:

ScriptAlias /hg "/home/user/webdir/hgwebdir.cgi"

This actually exports the repository browser at the URL path /hg (for example, http://www.example.com/hg) and doesn't expose the name of the script at all.

{i} See the Apache httpd documentation for ScriptAlias.

5.2. Using a Simple CGI Directory

This example requires access to the main configuration files.

If the directory containing the script is supposed to hold this and other CGI programs, such a CGI directory can be configured as follows:

ScriptAlias /hg "/home/user/webdir"

This should permit URLs like http://www.example.com/hg/hgwebdir.cgi to show the repository browser, although to hide the hgwebdir.cgi script name in URLs, more work is required (and is mentioned below).

{i} See the Apache httpd documentation for ScriptAlias, especially for information about the pitfalls of putting CGI directories inside existing Web-accessible directories.

5.3. Using an .htaccess File

This example can be used with pre-configured CGI directories.

If you may not change the main Apache configuration files, you may still be able to use an .htaccess file to make URLs nicer, as suggested in the previous section. Here is an .htaccess file which sits in the published webdir directory on the Web server and redirects http://www.example.com/hg/* URLs to the hgwebdir.cgi script inside that folder. As a result it would no longer be necessary to mention the CGI script name in URLs: one could use http://www.example.com/hg/myproject instead of http://www.example.com/hg/hgwebdir.cgi/myproject.

# Taken from http://www.pmwiki.org/wiki/Cookbook/CleanUrls#samedir
# Used at http://ggap.sf.net/hg/
Options +ExecCGI
RewriteEngine On
#write base depending on where the base url lives
RewriteBase /hg
RewriteRule ^$ hgwebdir.cgi  [L]
# Send requests for files that exist to those files.
RewriteCond %{REQUEST_FILENAME} !-f
# Send requests for directories that exist to those directories.
RewriteCond %{REQUEST_FILENAME} !-d
# Send requests to hgwebdir.cgi, appending the rest of url.
RewriteRule (.*) hgwebdir.cgi/$1  [QSA,L]
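The effect of these rules on a URL path can be modelled roughly as below. This is a deliberately simplified sketch and not how Apache evaluates rewrite rules internally; the function rewrite and its exists parameter (a stand-in for the file and directory checks made by the RewriteCond lines) are invented for illustration.

```python
def rewrite(url_path, base="/hg", exists=lambda rel: False):
    """Return the path that a request for url_path would be sent to."""
    if url_path != base and not url_path.startswith(base + "/"):
        return url_path  # outside the directory holding the .htaccess file
    rel = url_path[len(base):].lstrip("/")
    if rel == "":
        # RewriteRule ^$ hgwebdir.cgi
        return base + "/hgwebdir.cgi"
    if exists(rel):
        # The RewriteCond lines: existing files and directories are served as they are.
        return url_path
    # RewriteRule (.*) hgwebdir.cgi/$1
    return base + "/hgwebdir.cgi/" + rel


print(rewrite("/hg/myproject"))
```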

A corresponding change in hgweb.config can be made to make sure that the nicer URLs are used in the HTML produced by the CGI scripts:

[web]
baseurl = /hg

Where the CGI scripts are made to appear at the server root (for example, http://hg.example.com/), leave the baseurl setting blank:

[web]
baseurl =

Generally, the value specified should not end with a / character.

5.4. Adding Authentication

The following configurations require access to the main configuration files. They can be combined with the script or directory declarations to impose authentication and access restrictions on repositories.

To use these configurations with the pre-configured CGI directories, the Location directive start and end tags can be omitted, leaving the bare authentication-related directives.

5.4.1. Restrict to Known Users

This configuration restricts access to a known set of users as defined in the /home/user/hg/hgusers password file:

<Location /hg>
    AuthType Basic
    AuthName "Mercurial repositories"
    AuthUserFile /home/user/hg/hgusers
    Require valid-user
</Location>

<!> Since the AuthType directive is set to Basic, passwords are communicated as plain text, and it is therefore recommended that this only be used with a server configured for HTTPS. See the Apache SSL documentation for more information.

{i} To set up the password file, use the htpasswd tool as described in the relevant Apache documentation.

This alternative configuration uses digest authentication, which avoids sending passwords as plain text and can thus be used instead of basic authentication over HTTPS:

<Location /hg>
    AuthType Digest
    AuthName "Mercurial repositories"
    AuthDigestProvider file
    AuthUserFile /home/user/hg/hgusers
    Require valid-user
</Location>

{i} To set up the password file, use the htdigest tool as described in the relevant Apache documentation.
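For orientation, each line of a digest password file has the form user:realm:hash, where the hash is the MD5 digest of "user:realm:password" and the realm has to match the AuthName value ("Mercurial repositories" above). The sketch below only shows that format; htdigest_line is a hypothetical helper, the password is a placeholder, and the htdigest tool remains the recommended way to maintain the file.

```python
import hashlib


def htdigest_line(user, realm, password):
    """Build one line in the user:realm:MD5(user:realm:password) format."""
    digest = hashlib.md5(f"{user}:{realm}:{password}".encode("utf-8")).hexdigest()
    return f"{user}:{realm}:{digest}"


# The realm must be the same string as the AuthName directive.
print(htdigest_line("frodo", "Mercurial repositories", "example-password"))
```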

{i} See the Apache mod_auth_digest documentation for more information on digest authentication and its limitations.

5.4.2. Restrict Pushing to Known Users

To exercise finer control and to provide global read-only access to the repositories, but require authentication for pushing, a LimitExcept directive can be added. Here are the previous examples with such a directive in use. First with basic authentication:

<Location /hg>
    AuthType Basic
    AuthName "Mercurial repositories"
    AuthUserFile /home/user/hg/hgusers
    <LimitExcept GET>
        Require valid-user
    </LimitExcept>
</Location>

And with digest authentication:

<Location /hg>
    AuthType Digest
    AuthName "Mercurial repositories"
    AuthDigestProvider file
    AuthUserFile /home/user/hg/hgusers
    <LimitExcept GET>
        Require valid-user
    </LimitExcept>
</Location>

{i} See the Apache documentation for LimitExcept for more information.

Now consult the instructions on allowing the push operation in Mercurial to complete this configuration task.

5.4.3. Using Groups

Apache also provides support for user groups through the AuthGroupFile directive. Here it is in context:

    AuthUserFile /home/user/hg/hgusers
    AuthGroupFile /home/user/hg/hggroups

The following directive can then be used in place of a Require directive involving users. Here it is in context:

    <LimitExcept GET>
        Require group hobbits
    </LimitExcept>

Here, the hobbits group is defined in the nominated file, which maps groups such as hobbits to the users who will authenticate themselves; see the relevant Apache documentation for AuthGroupFile.

5.5. Using Virtual Hosts

This example requires access to the main configuration files.

Here is an example Apache configuration for publishing repositories at http://hg.example.com/:

<VirtualHost *>
  ServerName hg.example.com

  ServerAdmin webmaster@example.com
  CustomLog logs/access_log.example combined
  ErrorLog logs/error_log.example

  ScriptAlias / "/home/user/webdir/hgwebdir.cgi"
</VirtualHost>

The ScriptAlias directive is taken from the script example; all other directives support the virtual host hg.example.com.

{i} Note that the CustomLog and ErrorLog directives may need to be changed to refer to files in standard locations such as /var/log/apache2 or /var/log/httpd, depending on how Apache is configured.

Here is a more complicated example using rewrite rules and explicit Directory directives. A description follows the example.

<VirtualHost *:80>
  ServerName hg.example.com

  ServerAdmin webmaster@example.com
  CustomLog logs/access_log.example combined
  ErrorLog logs/error_log.example

  RewriteEngine on
  RewriteRule (.*) /home/user/webdir/hgwebdir.cgi/$1

  # Or we can use mod_alias for starting CGI script and making URLs "nice":
  # ScriptAliasMatch ^(.*) /home/user/webdir/hgwebdir.cgi/$1

  <Directory "/home/user/webdir/">
    Order allow,deny
    Allow from all
    AllowOverride All
    Options ExecCGI
    AddHandler cgi-script .cgi
  </Directory>
</VirtualHost>

The directives in the above have the following purposes:

  • The ServerName directive matches the hostname configured for the domain.

  • The next section (ServerAdmin and so on) is just administrative cruft.

  • The rewrite-related directives (RewriteEngine and RewriteRule) tell Apache to turn URIs ending in /myproject into /home/user/webdir/hgwebdir.cgi/myproject. This causes Apache to fire up the CGI script, giving it the remainder of the URI as an argument.

  • Finally, the Directory section lets Apache know that we have a CGI script to look at.

{i} See the Apache virtual hosts documentation for more information.

6. Allowing Push

Make sure that your repository is writeable by the user running the Apache server (such as www-data), and that the repository's .hg/hgrc file (or the /home/user/.hgrc file) contains the allowed users:

[web]
allow_push = frodo, sam

This would allow pushing for frodo and sam. You can allow pushing for everyone with the following:

[web]
allow_push = *
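The way an allow_push value is interpreted can be sketched as follows. This is a simplified model rather than Mercurial's own code: it assumes that names are separated by commas or whitespace, that * matches everyone, and that an empty list allows nobody (consistent with push being off by default). The names parse_list and may_push are hypothetical.

```python
import re


def parse_list(value):
    """Split a setting such as 'frodo, sam' into ['frodo', 'sam']."""
    return [item for item in re.split(r"[,\s]+", value.strip()) if item]


def may_push(user, allow_push):
    """Return True if user is covered by the allow_push setting."""
    allowed = parse_list(allow_push)
    return bool(allowed) and ("*" in allowed or user in allowed)


print(may_push("frodo", "frodo, sam"))
print(may_push("gollum", "frodo, sam"))
```

The usernames compared here are the ones established by the Web server's authentication, so the setting only has an effect once authentication has been configured.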

By default, pushing is only allowed via HTTPS. To permit HTTP pushing you have to add this to your repository's .hg/hgrc file (or to the /home/user/.hgrc file):

[web]
push_ssl = false

Now consult the instructions on configuring Apache to restrict pushing in order to set up the authentication/authorisation infrastructure.

6.0.1. Defining User Credentials

To define credentials for the allowed users, use the htpasswd tool. For example:

htpasswd -c /home/user/hg/hgusers frodo

You will need to enter the desired password for the username frodo. Later, you can add more usernames without the -c option:

htpasswd /home/user/hg/hgusers sam

{i} See the relevant Apache documentation for htpasswd.

7. Troubleshooting

Mercurial is executed on the server by Apache and therefore runs as the Apache user and group. Flaky behavior may be caused by the CGI script failing for lack of permissions. In that case, you should check the log files, but you can also make some common-sense permissions checks.

There are two ways that problems primarily manifest themselves on the server: either you won't see any repositories at all (indicating missing read or execute permissions) or you won't be able to push to the server (indicating missing write permissions), which gives you the error message:

abort: 'http://foo/bar' does not appear to be an hg repository!

Another common error often related to permissions:

abort: HTTP Error 500: Internal Server Error

The best way to solve permissions problems is to grant the required permissions to the Apache group (the www-data group on Debian). You should have some familiarity with assigning permissions under Linux/Unix before attempting the following.

Suppose your main user is john, your web server runs as www-data and your repositories are in /home/john/repositories. Then, execute the following commands to change the group for all files in your repositories on the server and make the files writable to the server process as well as make the home directory readable.

chown -R john:www-data /home/john/repositories
chmod -R g+rw /home/john/repositories
chmod g+x /home/john/repositories

For each repository, you will have to make both the repository folder and the .hg folder executable as well:

chmod g+x /home/john/repositories/rep1
chmod g+x /home/john/repositories/rep1/.hg

If all of your repositories are subdirectories of a main folder which contains only repositories (such as /var/www/html/hg/repos, with underlying repositories /var/www/html/hg/repos/repo1, /var/www/html/hg/repos/repo2, and so on), you may find it easier to script the setting of these permissions. Enter the following at the prompt, in the directory containing repos, to create a new executable shell script:

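cat <<'EOM' >permission.sh
#!/bin/sh
# Give the Web server's group (www-data) access to every repository under repos.
chown -R john:www-data repos
chmod -R g+rw repos
# Directories need the execute bit before the server can enter them.
chmod g+x repos
chmod g+x repos/*
chmod g+x repos/*/.hg
EOM
sudo chown root:root permission.sh
sudo chmod u+x permission.sh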

These commands assume that your username is john, that the Apache user's group is www-data, and that the folder containing your repositories is called repos. Now you can update the permissions for all of your repositories by navigating to the directory containing repos and issuing a single command:

sudo ./permission.sh

Before you start crawling through logs to find out why your Mercurial server isn't letting you pull, push, or authenticate, run this command and see if it solves your issue.

It is important to note that the entire directory tree leading to a repository must be accessible to the web server. For example, for the repository /home/john/source/repos/hg/repo1, the directories john, source, repos and hg must all be executable (that is, searchable) by the web server.
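A quick way to spot a directory that blocks access is to walk up from the repository and look at the permission bits of each parent. The sketch below checks only the group execute bit by default, which matters only where the directory's group is the web server's group (www-data in the examples above); pass stat.S_IXOTH to check the bit for other users instead. The function name dirs_missing_execute is hypothetical.

```python
import os
import stat


def dirs_missing_execute(path, mask=stat.S_IXGRP):
    """Return path and those of its parent directories that lack the given execute bit."""
    missing = []
    current = os.path.abspath(path)
    while True:
        if os.path.isdir(current) and not (os.stat(current).st_mode & mask):
            missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return missing


if __name__ == "__main__":
    # Example tree from the text above; adjust to your own repository.
    for directory in dirs_missing_execute("/home/john/source/repos/hg/repo1"):
        print("not executable by group:", directory)
```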

8. Putting Useful Information in the Index Page

If you get everything working properly, pointing a browser at the CGI script directly should give a list of the repositories you've published. This will be a table containing four columns. Let's say you have published a repository named lord/rings. To fill out the first three columns of the index entry for that repository, you will need to edit its .hg/hgrc file, and add a new section:

[web]
contact = Bilbo Baggins
description = My precious!
name = lord/rings

9. Allowing Archive Downloads

Make sure that your repository's .hg/hgrc file (or the /home/user/.hgrc file) contains the allow_archive setting:

[web]
allow_archive = gz, zip, bz2

This example illustrates how gzip, zip and bzip2 archive formats can be supported. As a result, links should appear in the Web interface corresponding to these archive types.

10. What Can Go Wrong?

If the version of hgwebdir.cgi is newer than the version of Mercurial you have installed, you may experience strange results. This could happen if you use a binary installer for Mercurial and manually fetch hgwebdir.cgi from a source repository. Newer versions of Mercurial support older versions of the CGI scripts, so you usually do not have to upgrade all your CGI installations, though it might be useful.

If you are trying to publish multiple repositories, and you haven't configured Apache to force all accesses to go through the hgwebdir.cgi script, you will not be able to access any of the repositories you have published unless you set up a hgweb.cgi script in each published repository. Clearly, this defeats the whole point of using hgwebdir.cgi in the first place, as you're not saving any effort.

Whatever mechanism you are trying to use, the important thing is to ensure that all accesses go through hgwebdir.cgi, so that Apache can pass the rest of the path to it using the PATH_INFO environment variable.
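To show why PATH_INFO matters, here is a toy lookup that picks a repository from the part of the URL that follows the script name. It is not the logic used by hgwebdir.cgi; PATHS mirrors the example [paths] section from earlier, and resolve_repository is a hypothetical name.

```python
# URL path -> filesystem path, as in the example [paths] section.
PATHS = {
    "myproject": "/home/user/hg/myproject",
    "otherproject": "/home/user/hg/otherproject",
}


def resolve_repository(environ, paths=PATHS):
    """Return (repository_path, rest_of_path), or None if no repository matches."""
    path_info = environ.get("PATH_INFO", "").strip("/")
    # Try longer keys first, since keys may themselves contain "/" characters.
    for url_path in sorted(paths, key=len, reverse=True):
        if path_info == url_path or path_info.startswith(url_path + "/"):
            remainder = path_info[len(url_path):].lstrip("/")
            return paths[url_path], remainder
    return None


print(resolve_repository({"PATH_INFO": "/myproject/file/tip"}))
```

If the Web server does not route a request through the script, no PATH_INFO reaches it and a lookup of this kind has nothing to work with, which is why the repositories then appear to be missing.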

11. Theming

The hgweb interface is completely themable. See the Theming page for additional instructions on customizing the look of your site.

12. See Also


CategoryWeb CategoryHowTo CategoryTipsAndTricks

PublishingRepositories (last edited 2020-12-06 23:19:24 by PaulBoddie)