Differences between revisions 1 and 25 (spanning 24 versions)
Revision 1 as of 2008-10-22 13:51:50
Size: 3164
Editor: abuehl
Comment: new page
Revision 25 as of 2014-02-19 22:09:21
Size: 8141
Editor: mpm
Comment: spam
Line 1: Line 1:
'''fncache''' is a new repository layout (or format) for Mercurial that solves the following issues: #pragma section-numbers 2
Line 3: Line 3:
 * http://www.selenic.com/mercurial/bts/issue839 - Hg local store creates paths too long for Windows
 * http://www.selenic.com/mercurial/bts/issue793 - Can't clone repos that use Windows reserved names in paths
= 'fncache' Repository Format =
Line 6: Line 5:
Current version of the patch: http://marc.info/?l=mercurial-devel&m=122445566407446&w=2 '''fncache''' is a [[Repository|repository]] layout (or format) for Mercurial that
reorganized the revlog data file ''names'' (and directory names) inside the store to
work around some nasty file name limitations of Windows (long path names, names
reserved by Windows).
Line 8: Line 10:
Status: [http://marc.info/?l=mercurial-devel&m=122452614713724&w=2 queued] by mpm <<TableOfContents>>
Line 10: Line 12:
With this patch, all new repositories on all platforms will be fncache
repositories. You don't have to do anything (besides using a version
of Mercurial containing this patch, which is not yet the case, as
the patch is not yet applied to any official repo).
== Details and usage ==

The fncache layout was first released with Mercurial 1.1 and bugfixed in
Mercurial 1.1.1 (see [[WhatsNew]]).

The "format" is not really new, since the contents of the files in the
store did not change. We just changed ''where'' we store the bits - not ''how''
we store them.

Since this was a backwards-incompatible change in the way the files in
the store are named, we introduced a new format specifier ('fncache')
in the [[RequiresFile|requires file]], thus telling old versions of Mercurial
that they should keep their hands off new 'fncache' repositories (since
those old versions of Mercurial won't know how to find the files in
the store).

With this change, all newly created repositories on all platforms will
be fncache repositories. You don't have to do anything (besides using a
version of Mercurial containing this change).

''The new layout does not affect the wire (or bundle) protocol(s)
in any way.'' So you can push/pull/clone over the wire to/from
any repo being in any layout using any Mercurial version on both
ends.
Line 18: Line 40:
For example, if you have a current non-fncache repo and you do
a local 'clone --pull' you will end up with an fncache repo.
If you do a plain local clone (without '--pull') of an existing
For example, if you have a (pre 1.1) non-fncache repo and you do
a local `clone --pull` you will end up with an fncache repo.
If you do a plain local clone (without `--pull`) of a
Line 24: Line 46:
In short, use clone --pull to convert repos (in case you In short, use `clone --pull` to convert repos (in case you
Line 30: Line 52:
with an old version of Mercurial it will abort with: with a version of Mercurial prior to 1.1 it will abort with:
Line 36: Line 58:
Which tells you that the repo at hand requires knowledge which tells you that the repo at hand requires knowledge
Line 40: Line 62:
repo becomes corrupted, you can do a clone --pull to re-
build it. The fncache file contains a list of all revlog files
repo becomes corrupted, you can do a `clone --pull` to rebuild
it. The fncache file contains a list of all revlog files
Line 44: Line 66:
Existing non-fncache repositories will remain as they are,
as Mercurial will still be able to write and read non-fncache
repositories with this patch.
Existing non-fncache repositories, that is, repositories created with
Mercurial 1.0 (or older),
will remain as they are,
as Mercurial will still be able to read and write non-fncache
repositories.
Line 48: Line 71:
In current Mercurial there is already a hgrc option
'[format] usesstore' [1], which enables the current 'store' format,
which is the default in current Mercurial.
The fncache repo format can be disabled with
Line 52: Line 73:
The store format encodes filenames with uppercase chars
"X" as "_x". If you disable that, you will have to make
sure that the repo is only used on a platform that does not
fold case (that is, don't use or copy it to/on Windows).
The fncache repo layout is a descendant of the store
format, so if you disable the store format you implicitly
disable the fncache layout.
{{{
[format]
usefncache = False
}}}
Line 60: Line 78:
With the patch as it is, there is currently no option to
disable the fncache layout for new repos (as a hackaround,
you can manually remove the 'fncache' entry in the requires
file after hg init). You can only disable the 'store' format,
which implicitly disables fncache too. But there is no
separate option to only disable 'fncache' and keep 'store'.
in the hgrc (see http://www.selenic.com/mercurial/hgrc.5.html#format) or with
{{{--config format.usefncache=0}}} on the command line. For example, the
command
Line 67: Line 82:
The new layout does not affect the wire (or bundle) protocol(s)
in any way. So you can push/pull/clone over the wire to/from
any repo being in any layout using any Mercurial version on both
ends.
{{{
hg --config format.usefncache=0 clone --pull A B
}}}
Line 72: Line 86:
[1] http://www.selenic.com/mercurial/hgrc.5.html#format converts the local fncache repo A to non-fncache repo B.

== New entry 'fncache' in the requires file ==

Mercurial writes a file named {{{'requires'}}} in the {{{.hg}}} directory when creating a new repository (see [[RequiresFile]]). For an fncache repository, the requires file contains:

{{{
revlogv1
store
fncache
}}}

In a pre-fncache repository, the entry {{{'fncache'}}} in the requires file is missing.

== Encoding of Windows reserved names ==

Path elements whose name up to the first dot is a Windows reserved name are now
encoded using {{{~xx}}} where {{{xx}}} is the two digit ASCII hex code
of the third character of that reserved name. For example "{{{aux}}}"
is encoded as "{{{au~78}}}".

Windows reserved names are: {{{'con', 'prn', 'aux', 'nul', 'com1'..'com9'}}} and {{{'lpt1'..'lpt9'}}}.

For example the path
{{{
data/aux.bla/bla.aux/prn/PRN/lpt/com3/nul/coma/foo.NUL/normal.c.i
}}}
is encoded as
{{{
data/au~78.bla/bla.aux/pr~6e/_p_r_n/lpt/co~6d3/nu~6c/coma/foo._n_u_l/normal.c.i
}}}

Note that {{{'aux.bla'}}} needs to be encoded, but not {{{'bla.aux'}}}.

== Hashing of long paths ==

Paths inside the store that would be longer than 120 chars are now
hash encoded.

For the encoding used see the function {{{mercurial.store.hybridencode}}}.

Some encoding examples for paths that are hashed (A1&rarr;B1, A2&rarr;B2, ...):

{{{
(A1) data/AUX/SECOND/X.PRN/FOURTH/FI:FTH/SIXTH/SEVENTH/EIGHTH/NINETH/TENTH/ELEVENTH/LOREMIPSUM.TXT.i
(B1) dh/au~78/second/x.prn/fourth/fi~3afth/sixth/seventh/eighth/nineth/tenth/loremia20419e358ddff1bf8751e38288aff1d7c32ec05.i

(A2) data/enterprise/openesbaddons/contrib-imola/corba-bc/netbeansplugin/wsdlExtension/src/main/java/META-INF/services/org.netbeans.modules.xml.wsdl.bindingsupport.spi.ExtensibilityElementTemplateProvider.i
(B2) dh/enterpri/openesba/contrib-/corba-bc/netbeans/wsdlexte/src/main/java/org.net7018f27961fdf338a598a40c4683429e7ffb9743.i

(A3) data/AUX.THE-QUICK-BROWN-FOX-JU:MPS-OVER-THE-LAZY-DOG-THE-QUICK-BROWN-FOX-JUMPS-OVER-THE-LAZY-DOG.TXT.i
(B3) dh/au~78.the-quick-brown-fox-ju~3amps-over-the-lazy-dog-the-quick-brown-fox-jud4dcadd033000ab2b26eb66bae1906bcb15d4a70.i
}}}

All paths that are hashed are stored in the directory {{{'dh'}}} inside {{{'.hg/store'}}}. Non-hashed paths
are stored inside {{{'.hg/store/data'}}}.

The hashing used is the sha1 digest (40 characters) of the direncoded path below {{{'.hg/store'}}}, as pre-encoded by {{{mercurial.filelog.encodedir}}}.

For the hashencoded path, the first eight characters of the first n directory levels are taken (converted to lowercase), where n
is adapted slightly to use more levels if space allows (see {{{store.hybridencode}}}). If space allows, the position before the
hash value is filled with the leading characters of the input path's filename, converted to lowercase.

As you can see, the path encoding done may fold multiple files originating from different input path directories
into the same encoded path directory. The sha1 digest part of the filename ensures that the filenames
are distinct and no name clashes occur.

== The fncache file ==

For the fncache repository format Mercurial maintains a new file {{{'fncache'}}} (thus the name of the format)
inside {{{'.hg/store'}}}. The fncache file contains the paths of all filelog files in the store as encoded
by {{{mercurial.filelog.encodedir}}}. The paths are separated by {{{'\n'}}} (LF).

The fncache file is used to enumerate all filelog files in the store, for example when doing a {{{clone --uncompressed}}}. The fncache file may contain duplicates or nonexistent entries (this can happen when using
the {{{strip}}} or {{{rollback}}} commands).

During a {{{clone --uncompressed}}} or a {{{hg verify}}}
the fncache file is read and rewritten '''if''' duplicates or entries with missing filelog files are detected,
so even operations that don't modify the history of the repository may lead to modifying the
fncache file (this was a deliberate design decision as discussed with mpm).

The fncache file is not read by a {{{hg clone --pull}}}, so that command may be used to resurrect a damaged
fncache file, since {{{hg clone --pull}}} rewrites the fncache file from the information found in all manifest
revisions. That's also the reason why it is basically cached information.

The {{{verify}}} command checks the fncache file and removes non-existent or duplicate entries. If a filelog file
referenced in a manifest revision is not found in the fncache file, {{{hg verify}}} reports an error.

== Bug tracker reference ==

|| ''Issue'' || ''Title'' || ''Fixed in release'' ||
|| [[http://www.selenic.com/mercurial/bts/issue839|839]] || Hg local store creates paths too long for Windows || 1.1 ||
|| [[http://www.selenic.com/mercurial/bts/issue793|793]] || Can't clone repos that use Windows reserved names in paths || 1.1 ||
|| [[http://www.selenic.com/mercurial/bts/issue1417|1417]] || 'maximum recursion depth exceeded' exception when cloning to fncache layout on windows || 1.1.1 ||

'fncache' Repository Format

fncache is a repository layout (or format) for Mercurial that reorganized the revlog data file names (and directory names) inside the store to work around some nasty file name limitations of Windows (long path names, names reserved by Windows).

1. Details and usage

The fncache layout was first released with Mercurial 1.1 and bugfixed in Mercurial 1.1.1 (see WhatsNew).

The "format" is not really new, since the contents of the files in the store did not change. We just changed where we store the bits - not how we store them.

Since this was a backwards-incompatible change in the way the files in the store are named, we introduced a new format specifier ('fncache') in the requires file, thus telling old versions of Mercurial that they should keep their hands off new 'fncache' repositories (since those old versions of Mercurial won't know how to find the files in the store).

With this change, all newly created repositories on all platforms will be fncache repositories. You don't have to do anything (besides using a version of Mercurial containing this change).

The new layout does not affect the wire (or bundle) protocol(s) in any way. So you can push/pull/clone over the wire to/from any repo being in any layout using any Mercurial version on both ends.

New repositories are created, for example, when you clone an existing repo without hardlinks or when you clone over the wire (http, ssh).

For example, if you have a (pre 1.1) non-fncache repo and you do a local clone --pull you will end up with an fncache repo. If you do a plain local clone (without --pull) of a non-fncache repo, you will get a non-fncache repo with hardlinks to the existing repo.

In short, use clone --pull to convert repos (in case you want to convert repos to the fncache repo format, which will almost never be needed).

Of course old versions of Mercurial will not be able to read fncache repos. If you try to access an fncache repo with a version of Mercurial prior to 1.1 it will abort with:

abort: requirement 'fncache' not supported!

which tells you that the repo at hand requires knowledge of the fncache repo format in Mercurial.

(BTW, if, for whatever reason, the fncache file in the repo becomes corrupted, you can do a clone --pull to rebuild it. The fncache file contains a list of all revlog files in the repo).

Existing non-fncache repositories, that is, repositories created with Mercurial 1.0 (or older), will remain as they are, as Mercurial will still be able to read and write non-fncache repositories.

The fncache repo format can be disabled with

[format]
usefncache = False

in the hgrc (see http://www.selenic.com/mercurial/hgrc.5.html#format) or with --config format.usefncache=0 on the command line. For example, the command

hg --config format.usefncache=0 clone --pull A B

converts the local fncache repo A to non-fncache repo B.

2. New entry 'fncache' in the requires file

Mercurial writes a file named 'requires' in the .hg directory when creating a new repository (see RequiresFile). For an fncache repository, the requires file contains:

revlogv1
store
fncache

In a pre-fncache repository, the entry 'fncache' in the requires file is missing.
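The check described above can be sketched in a few lines of Python. This is only an illustration, not Mercurial's own code: the helper names `read_requirements` and `uses_fncache` are hypothetical, and treating a missing requires file as "no requirements" is an assumption of this sketch.

```python
import os


def read_requirements(repo_root):
    """Return the set of entries found in <repo_root>/.hg/requires.

    Hypothetical helper: a missing file is treated as an empty set,
    which is an assumption of this sketch.
    """
    path = os.path.join(repo_root, ".hg", "requires")
    try:
        with open(path, encoding="ascii") as fh:
            # One requirement per line; ignore blank lines.
            return {line.strip() for line in fh if line.strip()}
    except FileNotFoundError:
        return set()


def uses_fncache(repo_root):
    """True if the repository declares the 'fncache' requirement."""
    return "fncache" in read_requirements(repo_root)
```

Calling `uses_fncache("/path/to/repo")` on a repository whose requires file lists `revlogv1`, `store` and `fncache` returns `True`; for a pre-fncache repository it returns `False`.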

3. Encoding of Windows reserved names

Path elements whose name up to the first dot is a Windows reserved name are now encoded using ~xx where xx is the two digit ASCII hex code of the third character of that reserved name. For example "aux" is encoded as "au~78".

Windows reserved names are: 'con', 'prn', 'aux', 'nul', 'com1'..'com9' and 'lpt1'..'lpt9'.

For example the path

data/aux.bla/bla.aux/prn/PRN/lpt/com3/nul/coma/foo.NUL/normal.c.i

is encoded as

data/au~78.bla/bla.aux/pr~6e/_p_r_n/lpt/co~6d3/nu~6c/coma/foo._n_u_l/normal.c.i

Note that 'aux.bla' needs to be encoded, but not 'bla.aux'.
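A minimal Python sketch of the two steps visible in the example above: uppercase characters are first rewritten as an underscore plus the lowercase letter (the existing store encoding), then reserved names are escaped. The function names are hypothetical, the doubling of literal underscores is stated here as an assumption about the store encoding, and the escaping of other special characters (such as ':') is deliberately left out.

```python
# Reserved names as listed in the text.
WINDOWS_RESERVED = (
    {"con", "prn", "aux", "nul"}
    | {"com%d" % i for i in range(1, 10)}
    | {"lpt%d" % i for i in range(1, 10)}
)


def encode_case(path):
    """Rewrite 'X' as '_x'; a literal '_' becomes '__' (assumed)."""
    out = []
    for ch in path:
        if ch == "_":
            out.append("__")
        elif "A" <= ch <= "Z":
            out.append("_" + ch.lower())
        else:
            out.append(ch)
    return "".join(out)


def encode_reserved(path):
    """Escape the third character of reserved path elements."""
    parts = []
    for name in path.split("/"):
        # Only the part before the first dot decides whether the
        # element is reserved: 'aux.bla' is, 'bla.aux' is not.
        base = name.split(".")[0]
        if base in WINDOWS_RESERVED:
            name = name[:2] + "~%02x" % ord(name[2]) + name[3:]
        parts.append(name)
    return "/".join(parts)


def encode_store_path(path):
    """Apply case encoding first, then reserved-name encoding."""
    return encode_reserved(encode_case(path))
```

Because case encoding runs first, 'PRN' has already become '_p_r_n' by the time reserved names are checked, which is why it is not escaped a second time.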

4. Hashing of long paths

Paths inside the store that would be longer than 120 chars are now hash encoded.

For the encoding used see the function mercurial.store.hybridencode.

Some encoding examples for paths that are hashed (A1→B1, A2→B2, ...):

(A1) data/AUX/SECOND/X.PRN/FOURTH/FI:FTH/SIXTH/SEVENTH/EIGHTH/NINETH/TENTH/ELEVENTH/LOREMIPSUM.TXT.i
(B1) dh/au~78/second/x.prn/fourth/fi~3afth/sixth/seventh/eighth/nineth/tenth/loremia20419e358ddff1bf8751e38288aff1d7c32ec05.i

(A2) data/enterprise/openesbaddons/contrib-imola/corba-bc/netbeansplugin/wsdlExtension/src/main/java/META-INF/services/org.netbeans.modules.xml.wsdl.bindingsupport.spi.ExtensibilityElementTemplateProvider.i
(B2) dh/enterpri/openesba/contrib-/corba-bc/netbeans/wsdlexte/src/main/java/org.net7018f27961fdf338a598a40c4683429e7ffb9743.i

(A3) data/AUX.THE-QUICK-BROWN-FOX-JU:MPS-OVER-THE-LAZY-DOG-THE-QUICK-BROWN-FOX-JUMPS-OVER-THE-LAZY-DOG.TXT.i
(B3) dh/au~78.the-quick-brown-fox-ju~3amps-over-the-lazy-dog-the-quick-brown-fox-jud4dcadd033000ab2b26eb66bae1906bcb15d4a70.i

All paths that are hashed are stored in the directory 'dh' inside '.hg/store'. Non-hashed paths are stored inside '.hg/store/data'.

The hashing used is the sha1 digest (40 characters) of the direncoded path below '.hg/store', as pre-encoded by mercurial.filelog.encodedir.

For the hashencoded path, the first eight characters of the first n directory levels are taken (converted to lowercase), where n is adapted slightly to use more levels if space allows (see store.hybridencode). If space allows, the position before the hash value is filled with the leading characters of the input path's filename, converted to lowercase.

As you can see, the path encoding done may fold multiple files originating from different input path directories into the same encoded path directory. The sha1 digest part of the filename ensures that the filenames are distinct and no name clashes occur.
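The shape of a hashed path can be sketched as follows. This is a simplified illustration and not the code of mercurial.store.hybridencode: it only lowercases the path (reserved-name and special-character escaping are omitted), it hashes the input path as given, and the 68-character budget for the shortened directory part is an assumption. The name `hashed_store_path` is hypothetical, and the caller is expected to have decided already that the path is too long.

```python
import hashlib
import posixpath

MAX_STORE_PATH_LEN = 120  # limit stated in the text
DIR_PREFIX_LEN = 8        # characters kept per directory level
MAX_SHORT_DIRS_LEN = 68   # assumed budget for the shortened directories


def hashed_store_path(path):
    """Build a 'dh/...' name for a path that starts with 'data/'."""
    if not path.startswith("data/"):
        raise ValueError("expected a path below 'data/'")
    digest = hashlib.sha1(path.encode("utf-8")).hexdigest()

    lowered = path[len("data/"):].lower()
    ext = posixpath.splitext(lowered)[1]
    parts = lowered.split("/")
    basename = parts[-1]

    # Keep up to eight characters of each directory level for as
    # many levels as fit into the directory budget.
    short_dirs = []
    for part in parts[:-1]:
        d = part[:DIR_PREFIX_LEN]
        if d and d[-1] in ". ":
            d = d[:-1] + "_"  # avoid a trailing dot or space
        if len("/".join(short_dirs + [d])) > MAX_SHORT_DIRS_LEN:
            break
        short_dirs.append(d)
    dirs = "/".join(short_dirs) + "/" if short_dirs else ""

    # Use any remaining space for the start of the filename.
    space_left = MAX_STORE_PATH_LEN - len("dh/" + dirs + digest + ext)
    filler = basename[:space_left] if space_left > 0 else ""
    return "dh/" + dirs + filler + digest + ext
```

Applied to path (A2) above, the sketch produces the same directory prefix and filename filler as (B2); whether the digest matches as well depends on the hash input being exactly what Mercurial uses, which is not verified here.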

5. The fncache file

For the fncache repository format Mercurial maintains a new file 'fncache' (thus the name of the format) inside '.hg/store'. The fncache file contains the paths of all filelog files in the store as encoded by mercurial.filelog.encodedir. The paths are separated by '\n' (LF).

The fncache file is used to enumerate all filelog files in the store, for example when doing a clone --uncompressed. The fncache file may contain duplicates or nonexistent entries (this can happen when using the strip or rollback commands).

During a clone --uncompressed or a hg verify the fncache file is read and rewritten if duplicates or entries with missing filelog files are detected, so even operations that don't modify the history of the repository may lead to modifying the fncache file (this was a deliberate design decision as discussed with mpm).

The fncache file is not read by a hg clone --pull, so that command may be used to resurrect a damaged fncache file, since hg clone --pull rewrites the fncache file from the information found in all manifest revisions. That's also the reason why it is basically cached information.

The verify command checks the fncache file and removes non-existent or duplicate entries. If a filelog file referenced in a manifest revision is not found in the fncache file, hg verify reports an error.
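The cleanup described in this section (dropping duplicate entries and entries whose filelog file is missing) can be sketched as below. The function names are hypothetical and this is not Mercurial's implementation; in particular, the `exists` callback stands in for whatever maps an fncache entry to its encoded on-disk name and checks for the file.

```python
def parse_fncache(text):
    """Split fncache content into entries (separated by LF)."""
    return [line for line in text.split("\n") if line]


def clean_fncache(entries, exists):
    """Drop duplicates and entries for which exists(entry) is false.

    Returns (kept_entries, changed). 'changed' tells the caller
    whether the file would need to be rewritten.
    """
    seen = set()
    kept = []
    for entry in entries:
        if entry in seen or not exists(entry):
            continue
        seen.add(entry)
        kept.append(entry)
    return kept, kept != list(entries)
```

A caller would rewrite the file only when `changed` is true, which mirrors the behaviour described above: the fncache file is rewritten only if duplicates or stale entries are detected.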

6. Bug tracker reference

|| Issue || Title || Fixed in release ||
|| 839 || Hg local store creates paths too long for Windows || 1.1 ||
|| 793 || Can't clone repos that use Windows reserved names in paths || 1.1 ||
|| 1417 || 'maximum recursion depth exceeded' exception when cloning to fncache layout on windows || 1.1.1 ||

fncacheRepoFormat (last edited 2014-02-19 22:09:21 by mpm)