Note:
This page is primarily intended for developers of Mercurial.
Path separator handling
How Mercurial handles path separators
1. Overview
This page is intended to collect information about Mercurial's current path separator behavior and help formulate coherent rules for when paths should be converted.
2. Basics
There are two basic path formats:
- portable (with a forward slash)
- local (slash on Unix, backslash on Windows)
OS-specific observations:
- Windows conventionally uses backslash, but internally it accepts slash in place of backslash
- Backslashes are valid non-separator characters on Unix, which we should be able to store (though the repo may not be usable on Windows)
- On Windows, ui.slash will cause (most? very few!) commands to output slashes
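The first two observations can be checked from any platform with Python's standard `ntpath` and `posixpath` modules, which implement Windows and Unix path rules respectively. This is only an illustration of the separator semantics; it does not use any Mercurial code, and the sample path is made up.

```python
import ntpath     # Windows path rules, importable on any platform
import posixpath  # Unix path rules, importable on any platform

# A made-up path that mixes both separator characters.
mixed = "dir/sub\\file.txt"

# Under Windows rules, both '/' and '\' separate components.
windows_split = ntpath.split(mixed)

# Under Unix rules, only '/' separates; '\' is an ordinary filename character.
unix_split = posixpath.split(mixed)

print(windows_split)  # ('dir/sub', 'file.txt')
print(unix_split)     # ('dir', 'sub\\file.txt')
```

The second result is why a file whose name contains a backslash can be stored on Unix but may not check out cleanly on Windows.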
Basic rules:
- paths stored in changelog and manifest should always be in portable format
- paths read from user (command line, config files) may be in either portable or local format
- URLs should always use portable format
APIs:
- convert from native to portable with util.pconvert()
- convert from portable to native with util.localpath()
- util.normpath() converts to portable
- repo.pathto() and dirstate.pathto() convert to local (honor ui.slash)
- util.pathto() converts to local (ignores ui.slash)
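A minimal sketch of the two conversion directions follows. The functions `to_portable`, `to_local`, and `to_display` are hypothetical stand-ins written for this page, not Mercurial's implementations of util.pconvert(), util.localpath(), or pathto(); the separator is passed explicitly (an assumption made so the example behaves the same on every platform), and only separator handling is modeled, not relative-path computation.

```python
def to_portable(path, sep):
    """Hypothetical stand-in for util.pconvert(): local -> portable."""
    if sep == "/":
        # Already portable; a backslash here is a literal filename character.
        return path
    return path.replace(sep, "/")


def to_local(path, sep):
    """Hypothetical stand-in for util.localpath(): portable -> local."""
    if sep == "/":
        return path
    return path.replace("/", sep)


def to_display(path, sep, slash=False):
    """Hypothetical display helper: local format unless a ui.slash-like
    flag asks for forward slashes."""
    return path if slash else to_local(path, sep)


# Simulate a Windows-style separator regardless of the host platform.
WIN = "\\"
print(to_portable("src\\lib\\a.py", WIN))           # src/lib/a.py
print(to_local("src/lib/a.py", WIN))                # src\lib\a.py
print(to_display("src/lib/a.py", WIN, slash=True))  # src/lib/a.py
```

Passing `"/"` as the separator leaves paths untouched in both directions, which matches the rule that backslashes on Unix are stored as ordinary characters.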
3. Audit of current usage
As of Mercurial 2.4, we have the following behavior:
| command | output | comments |
| hg add/copy/forget/rename/remove/revert | local | |
| hg diff/export | portable | |
| hg locate | portable | unusual for working directory command |
| hg manifest | portable | |
| hg log | portable | |
| hg resolve | portable | unusual for working directory command |
| hg status | local | even if not checking working directory |
| hg clone/push/pull/in/out | local | shows normalized source |
| hg rebase/strip | local | reports path of backup bundle |
Inferred rules:
- commands for inspecting history should use portable format
- commands for working directory should use local format
- commands that default to the working directory (e.g. status) should use local format
- diffs should use portable format... for portability
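One way to see where the audit departs from these inferred rules is to encode both and compare them. The sketch below is illustrative only: the category assigned to each command is an assumption made for this example (the table's comments support treating locate and resolve as working directory commands; treating manifest as a history command is a guess), and the network and bundle commands are left out because the inferred rules do not cover them.

```python
# Inferred rule per command category, taken from the list above.
RULES = {
    "history": "portable",
    "working directory": "local",
    "diff": "portable",
}

# (assumed category, observed output format from the 2.4 audit table)
AUDIT = {
    "add": ("working directory", "local"),
    "diff": ("diff", "portable"),
    "export": ("diff", "portable"),
    "locate": ("working directory", "portable"),
    "manifest": ("history", "portable"),
    "log": ("history", "portable"),
    "resolve": ("working directory", "portable"),
    "status": ("working directory", "local"),
}

# Commands whose observed output format disagrees with the inferred rule.
exceptions = sorted(
    cmd for cmd, (category, observed) in AUDIT.items()
    if RULES[category] != observed
)
print(exceptions)  # ['locate', 'resolve']
```

Under these assumptions the only outliers are the two commands the table already flags as unusual.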
4. Tests
As of 2.4, the Mercurial test suite uses the '(glob)' filter to hide Windows-specific path changes. This has several downsides:
- Unix developers usually fail to mark these paths when writing tests
- accidental regressions go undetected
- the markers are tedious to maintain
Some possible improvements:
- a new filter '(path)' that normalizes local paths but complains about non-local paths
- more complete check-code rules to warn Unix devs about non-portable constructs
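The proposed '(path)' filter does not exist, so the following is only one possible reading of it: the expected line is written in portable form, output in local form is accepted, and output that matches only in portable form on a backslash platform is reported. The function name `check_path_line`, its return values, and the explicit `sep` argument are all hypothetical, and the sketch naively converts every slash on the line rather than identifying which parts are paths.

```python
def check_path_line(expected, actual, sep):
    """Compare one line of test output against a '(path)'-marked expectation.

    expected: the line as written in the test, using forward slashes
    actual:   the line the command really printed
    sep:      the local path separator of the platform under test
    Returns "ok", "non-local", or "mismatch".
    """
    local_expected = expected.replace("/", sep)
    if actual == local_expected:
        return "ok"
    if actual == expected:
        # Output used forward slashes where local separators were expected.
        return "non-local"
    return "mismatch"


print(check_path_line("A dir/file", "A dir\\file", "\\"))  # ok
print(check_path_line("A dir/file", "A dir/file", "\\"))   # non-local
print(check_path_line("A dir/file", "A dir/file", "/"))    # ok
```

Unlike a glob that accepts either separator, a check of this shape would surface a command that silently switches from local to portable output on Windows.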
5. Questions
Considering how inconsistently Mercurial uses portable vs local slashes: Do Windows users actually rely on Mercurial emitting backslash? What are the use cases?
How much time does Mercurial spend normalizing paths and converting slashes over and over?