Note:
This page is primarily intended for developers of Mercurial.
Better Matcher API and File Patterns Plan
Status: Project
Main proponents: YourNameHere
This is a speculative project and does not represent any firm decisions on future behavior.
Add a short summary of the idea here.
1. Goal
- Short term: add non-recursive globs ?
- Long term: extensible matcher API ?
2. Detailed description
2.1. Sprint Notes
Non-recursive globs (Rodrigo, spectral, Durham, ...):
- Issue is that * is sometimes recursive
- matcher API is a mess
- Should we re-write match.py or just add fileglob?
- Suggestion: add fileglob via a new, cleaner API, then migrate others over time
- Possible FB use case: pick parts of a tree to include and exclude (would add ordering dependency instead of excludes always trumping includes?)
- matcher API should be extensible
- matcher composition: anyof, allof, negate, per-file-type, etc.
- Inconsistencies in pattern behavior between hgignore, --include/--exclude, etc.
- FB: conversion between matchers and watchman expressions
- Proposal: wiki page, first group to have a use case proposes the initial API
2.2. Current Status
pattern type | root-ed | cwd-ed | any-of-path |
wildcard | --- | glob | relglob |
regexp | re | --- | relre |
raw string | path | relpath | --- |
If a rule is read from a file (e.g. .hgignore):
- "glob" is treated as "relglob"
- "re" is treated as "relre"
This is mentioned in "hg help patterns" and "hg help hgignore", but the syntax names "relglob" and "relre" themselves are not explained there.
Matching is performed:
- non-recursively for glob/relglob as PATTERN (e.g. an argument on the command line), but
- recursively for glob/relglob as INCLUDES/EXCLUDES, or for other pattern types
For example, file "foo/bar/baz" is:
- not matched by: hg files glob:foo/bar
- not matched by: hg files -I 'set:"glob:foo/bar"'
- but matched by: hg files -I glob:foo/bar
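The difference between the first and last command can be illustrated with a toy model. This is only a sketch: `matches` is a hypothetical helper, not part of match.py, and it handles only literal patterns without wildcards.

```python
def matches(pattern_path, filename, recursive):
    """Toy model of a literal glob pattern (no wildcards).

    Hypothetical helper for illustration; not Mercurial's implementation.
    """
    if filename == pattern_path:
        return True
    # Recursive matching also accepts anything below the named directory.
    return recursive and filename.startswith(pattern_path + "/")


f = "foo/bar/baz"
as_pattern = matches("foo/bar", f, recursive=False)  # like: hg files glob:foo/bar
as_include = matches("foo/bar", f, recursive=True)   # like: hg files -I glob:foo/bar
print(as_pattern, as_include)
```

The same pattern text gives different answers depending only on how it was passed in, which is the inconsistency described above.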
The latter seems to cause the issue mentioned by Rodrigo in "match: adding non-recursive directory matching".
2.3. Proposal by foozy
2.3.1. Control start point of matching arbitrarily
How about introducing new systematic names, as shown below, to reorganize the current complicated mapping between names and matching behavior?
pattern type | root-ed | cwd-ed | any-of-path |
wildcard | rootglob | cwdglob | anyglob |
regexp | rootre | cwdre | anyre |
raw string | rootpath | cwdpath | anypath |
- new "glob"/"re" families match recursively, fully according to the specified pattern
- each of existing pattern types will be internally treated as an alias of types above
- recursion of "glob"/"relglob" aliases is treated specially, for backward compatibility
With these newly introduced pattern types, both "start point" and "recursion" of matching can be fully controlled arbitrarily via existing command I/F (as PATTERN, or via -I/-X).
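The alias idea could look roughly like the sketch below. The mapping is read off by comparing the "Current Status" table with the proposed table. `ALIASES`, `NEW_KINDS` and `normalize` are hypothetical names, and the special recursion handling for the "glob"/"relglob" aliases is deliberately not modeled.

```python
# Hypothetical alias table: existing pattern type -> proposed systematic name.
ALIASES = {
    "glob": "cwdglob",
    "relglob": "anyglob",
    "re": "rootre",
    "relre": "anyre",
    "path": "rootpath",
    "relpath": "cwdpath",
}

# The nine proposed names: 3 start points x 3 base types.
NEW_KINDS = {
    start + base
    for start in ("root", "cwd", "any")
    for base in ("glob", "re", "path")
}


def normalize(pattern):
    """Split 'kind:body' and resolve old kinds to the proposed names.

    Returns (None, pattern) when there is no recognized kind prefix.
    """
    kind, sep, body = pattern.partition(":")
    if not sep:
        return None, pattern
    if kind in ALIASES:
        return ALIASES[kind], body
    if kind in NEW_KINDS:
        return kind, body
    return None, pattern


print(normalize("relglob:*.orig"))
```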
2.3.2. Control recursion of matching arbitrarily
With current Mercurial (at least, 4.0 or earlier), the recursion of each pattern type can be controlled as follows:
type | for recursive matching | for non-recursive matching |
glob | using "**" | using "*" |
re | omitting "$" | appending "$" |
path | always | --- |
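The "glob" and "re" rows can be illustrated with a simplified model. The glob translation here is an assumption made for this sketch ("**" crosses "/", "*" does not) and is not the code in match.py; `glob_to_regex`, `glob_match` and `re_match` are hypothetical names.

```python
import re


def glob_to_regex(glob):
    """Simplified translation: '**' may cross '/', a single '*' may not."""
    out = []
    i = 0
    while i < len(glob):
        if glob.startswith("**", i):
            out.append(".*")
            i += 2
        elif glob[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return "".join(out)


def glob_match(glob, path):
    # The whole path has to be consumed by the translated pattern.
    return re.fullmatch(glob_to_regex(glob), path) is not None


def re_match(regex, path):
    # Anchored at the start only, so omitting '$' also accepts longer paths.
    return re.match(regex, path) is not None


print(glob_match("foo/*", "foo/bar/baz"), glob_match("foo/**", "foo/bar/baz"))
print(re_match("foo/bar", "foo/bar/baz"), re_match("foo/bar$", "foo/bar/baz"))
```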
Users cannot control the recursion of matching with a "path" type pattern, because it matches both directories and files.
Therefore, how about introducing two additional pattern types, "file" and "dir"?
type | for recursive | for non-recursive |
file | --- | always |
dir | always(*) | --- |
(*) "dir" matches against only directory.
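One possible reading of these two types is sketched below. The exact semantics are an assumption: "file" is taken to match only the named file itself, and "dir" only entries below the named directory. The function names are hypothetical.

```python
def file_match(name, path):
    # "file": only the exact file, never anything below it (non-recursive).
    return path == name


def dir_match(name, path):
    # "dir": only entries below the directory (recursive),
    # not a file that has exactly that name.
    return path.startswith(name.rstrip("/") + "/")


def path_match(name, path):
    # "path" matches both a file and a directory of that name.
    return file_match(name, path) or dir_match(name, path)


for p in ("foo/bar", "foo/bar/baz"):
    print(p, file_match("foo/bar", p), dir_match("foo/bar", p), path_match("foo/bar", p))
```

Under this reading, "path" is the union of "file" and "dir", which is why its recursion cannot be switched off.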
After adding these types, there are 5 (base types) x 3 (start points) = 15 types:
base type | root-ed | cwd-ed | any-of-path |
wildcard | rootglob | cwdglob | anyglob |
regexp | rootre | cwdre | anyre |
raw path | rootpath | cwdpath | anypath |
raw file name | rootfile | cwdfile | anyfile |
raw dir name | rootdir | cwddir | anydir |
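Because the names are systematic, the full set can be generated instead of listed by hand. A minimal sketch, with `START_POINTS`, `BASE_TYPES` and `names` as illustrative identifiers only:

```python
from itertools import product

START_POINTS = ("root", "cwd", "any")
BASE_TYPES = ("glob", "re", "path", "file", "dir")

# Base type varies slowest, so the order follows the table rows.
names = [start + base for base, start in product(BASE_TYPES, START_POINTS)]

print(len(names))
print(names)
```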
2.4. Proposal by Rodrigo
Add rootglob: to get over the issue of -I/-X patterns.
https://patchwork.mercurial-scm.org/patch/17311/
3. Roadmap