Differences between revisions 1 and 5 (spanning 4 versions)
Revision 1 as of 2016-11-05 03:45:40
Size: 3203
Comment: copy paste of sprint notes, foozy's email, etc.
Revision 5 as of 2016-11-18 18:27:56
Size: 5087
Comment: add explanation about "Control start point of matching arbitrarily"
Line 53: Line 53:
"end of name" matching is required: Matching is examined:
Line 55: Line 55:
 * for glob/relglob as PATTERN (e.g. argument in command line), but
 * not for glob/relglob as INCLUDES/EXCLUDES, or other pattern syntaxes
 * '''non'''-recursively for glob/relglob as PATTERN (e.g. argument in command line), but
 * recursively for glob/relglob as INCLUDES/EXCLUDES, or other pattern types
Line 60: Line 60:
 * not matched at "hg files glob:foo/bar"
 * but matched at "hg file -I glob:foo/bar"
 * not matched at: `hg files glob:foo/bar`
 * not matched at: `hg files -I 'set:"glob:foo/bar"'`
 * but matched at: `hg files -I glob:foo/bar`
Line 63: Line 64:
This isn't mentioned in any help document :-<, and the latter seems
to cause the issue mentioned in this patch series.
The latter seems to cause the issue mentioned by Rodrigo in "[[https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-October/089003.html|match: adding non-recursive directory matching]]".
Line 68: Line 68:
How about introducing new systematic names like below to re-organize
current complicated mapping between names and matching ? (and enable
"end of name" matching by "-eon" suffix or so)
==== Control start point of matching arbitrarily ====

How about introducing new systematic names like below to
re-organize current complicated mapping between names and
matching ?
Line 77: Line 79:
Of course, we should take care of backward compatibility of .hgignore
or so (e.g. config knob to warn/abort for new syntax name in .hgignore).
  * new "glob"/"re" families match recursively, fully according to the specified pattern
  * each of existing pattern types will be internally treated as an alias of types above
    * recursion of "glob"/"relglob" aliases is treated specially, for backward compatibility

With these newly introduced pattern types, both "start point" and "recursion" of matching can be fully controlled
arbitrarily via existing command I/F (as PATTERN, or via -I/-X).

==== Control recursion of matching arbitrarily ====

With current Mercurial (at least, 4.0 or earlier), recursion of
each pattern types can be controlled by:

||'''type''' ||'''for recursive matching''' ||'''for non-recursive matching''' ||
||glob ||using "**" || using "*" ||
||re ||omitting "$" || appending "$" ||
||path ||always || --- ||

User can't control recursion of matching with "path" type pattern
arbitrarily (it matches against both directory and file).

Therefore, how about introducing two more additional pattern
types "file" and "dir" ?

||'''type''' ||'''for recursive''' ||'''for non-recursive''' ||
||file ||--- ||always ||
||dir ||always(*) ||--- ||

(*) "dir" matches against only directory.

After adding these types, there are 5 (base types) x 3 (start points) = 15 types

||'''base type''' ||'''root-ed''' ||'''cwd-ed''' ||'''any-of-path''' ||
||wildcard ||rootglob ||cwdglob ||anyglob ||
||regexp ||rootre ||cwdre ||anyre ||
||raw path ||rootpath ||cwdpath ||anypath ||
||raw file name ||rootfile ||cwdfile ||anyfile ||
||raw dir name ||rootdir ||cwddir ||anydir ||

Note:

This page is primarily intended for developers of Mercurial.

Better Matcher API and File Patterns Plan

Status: Project

Main proponents: YourNameHere

/!\ This is a speculative project and does not represent any firm decisions on future behavior.

{X} Add a short summary of the idea here.

1. Goal

  • Short term: add non-recursive globs?
  • Long term: extensible matcher API?

2. Detailed description

2.1. Sprint Notes

Non-recursive globs (Rodrigo, spectral, Durham, :
    Issue is that * is sometimes recursive
    matcher API is a mess
    Should we re-write match.py or just add fileglob?
    Suggestion: add fileglob via a new, cleaner API, then migrate others over time
    Possible FB use case: pick parts of a tree to include and exclude (would add ordering dependency instead of excludes always trumping includes?)
    matcher API should be extensible
    matcher composition: anyof, allof, negate, per-file-type, etc.
    Inconsistencies in pattern behavior between hgignore, --include/--exclude, etc.
    FB: conversion between matchers and watchman expressions
    Proposal: wiki page, first group to have a use case proposes the initial API

2.2. Current Status

||'''pattern type''' ||'''root-ed''' ||'''cwd-ed''' ||'''any-of-path''' ||
||wildcard ||--- ||glob ||relglob ||
||regexp ||re ||--- ||relre ||
||raw string ||path ||relpath ||--- ||

If a rule is read from a file (e.g. .hgignore):

  • "glob" is treated as "relglob"
  • "re" is treated as "relre"

This is mentioned in "hg help patterns" and "hg help hgignore", but the syntax names "relglob" and "relre" themselves aren't explained.
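The aliasing rule for rules read from a file can be sketched as a small lookup. This is only an illustration of the behavior described above: the names `FILE_ALIASES` and `effective_kind` are hypothetical and are not taken from Mercurial's match.py.

```python
# Hypothetical sketch of the aliasing described above; not Mercurial code.
FILE_ALIASES = {"glob": "relglob", "re": "relre"}


def effective_kind(kind, from_file=False):
    """Return the pattern type actually used for matching.

    Rules read from a file such as .hgignore have "glob" and "re"
    replaced by their any-of-path counterparts; other types are kept.
    """
    if from_file:
        return FILE_ALIASES.get(kind, kind)
    return kind
```

For example, `effective_kind("glob", from_file=True)` gives "relglob", while the same type given on the command line stays "glob".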

Matching is performed:

  • non-recursively for glob/relglob as PATTERN (e.g. argument in command line), but

  • recursively for glob/relglob as INCLUDES/EXCLUDES, or other pattern types

For example, file "foo/bar/baz" is:

  • not matched at: hg files glob:foo/bar

  • not matched at: hg files -I 'set:"glob:foo/bar"'

  • but matched at: hg files -I glob:foo/bar

The last case (recursive matching via -I) seems to cause the issue mentioned by Rodrigo in "match: adding non-recursive directory matching".
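The difference between the two behaviors in the "foo/bar/baz" example can be modeled with a toy function. This is a sketch under a strong simplification (the pattern is a literal path without wildcards), and `matches` is a hypothetical helper, not part of Mercurial.

```python
def matches(pattern_path, filename, recursive):
    """Toy model: does a literal path pattern match a tracked file?"""
    if filename == pattern_path:
        return True
    # Recursive matching also accepts anything below the named directory.
    return recursive and filename.startswith(pattern_path + "/")


tracked = "foo/bar/baz"
print(matches("foo/bar", tracked, recursive=False))  # False: like "hg files glob:foo/bar"
print(matches("foo/bar", tracked, recursive=True))   # True: like "hg files -I glob:foo/bar"
```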

2.3. Proposal by foozy

2.3.1. Control start point of matching arbitrarily

How about introducing new systematic names, as below, to reorganize the current complicated mapping between names and matching behavior?

||'''pattern type''' ||'''root-ed''' ||'''cwd-ed''' ||'''any-of-path''' ||
||wildcard ||rootglob ||cwdglob ||anyglob ||
||regexp ||rootre ||cwdre ||anyre ||
||raw string ||rootpath ||cwdpath ||anypath ||

  • the new "glob"/"re" families match recursively, fully according to the specified pattern
  • each existing pattern type will be treated internally as an alias of one of the types above
    • recursion of the "glob"/"relglob" aliases is treated specially, for backward compatibility

With these newly introduced pattern types, both the "start point" and the "recursion" of matching can be fully controlled via the existing command interface (as PATTERN, or via -I/-X).
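Reading the "Current Status" table against the proposed names suggests the following alias mapping. This is an inference from the two tables, not something the proposal spells out, and `ALIASES` and `split_new_kind` are hypothetical names. The special handling of recursion for the "glob"/"relglob" aliases is not modeled here.

```python
# Alias mapping inferred from the two tables; an assumption, not a spec.
ALIASES = {
    "glob": "cwdglob",
    "relglob": "anyglob",
    "re": "rootre",
    "relre": "anyre",
    "path": "rootpath",
    "relpath": "cwdpath",
}


def split_new_kind(kind):
    """Resolve an alias, then split the name into (start point, base type)."""
    kind = ALIASES.get(kind, kind)
    for prefix in ("root", "cwd", "any"):
        if kind.startswith(prefix):
            return prefix, kind[len(prefix):]
    raise ValueError("unknown pattern type: %s" % kind)
```

With this reading, `split_new_kind("relglob")` returns `("any", "glob")` and `split_new_kind("path")` returns `("root", "path")`.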

2.3.2. Control recursion of matching arbitrarily

With current Mercurial (at least 4.0 and earlier), the recursion of each pattern type can be controlled as follows:

||'''type''' ||'''for recursive matching''' ||'''for non-recursive matching''' ||
||glob ||using "**" || using "*" ||
||re ||omitting "$" || appending "$" ||
||path ||always || --- ||
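The glob and re rows of the table can be illustrated with Python's `re` module. This is a simplified sketch, not Mercurial's translation code: `glob_to_re`, `glob_match` and `re_match` are hypothetical helpers, only "*" and "**" are handled, and the PATTERN versus -I/-X difference described under "Current Status" is ignored.

```python
import re


def glob_to_re(glob):
    """Translate a simplified glob: '**' crosses '/', '*' does not."""
    out = []
    i = 0
    while i < len(glob):
        if glob.startswith("**", i):
            out.append(".*")
            i += 2
        elif glob[i] == "*":
            out.append("[^/]*")
            i += 1
        else:
            out.append(re.escape(glob[i]))
            i += 1
    return "".join(out)


def glob_match(glob, path):
    """Match the whole path against the translated glob."""
    return re.fullmatch(glob_to_re(glob), path) is not None


def re_match(pattern, path):
    """Anchor at the start only, so a pattern without '$' also
    matches longer paths below the named one."""
    return re.match(pattern, path) is not None
```

Under these assumptions, `glob_match("foo/*", "foo/bar/baz")` is false while `glob_match("foo/**", "foo/bar/baz")` is true, and `re_match("foo/bar", "foo/bar/baz")` is true while `re_match("foo/bar$", "foo/bar/baz")` is false.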

Users can't control the recursion of matching with a "path" type pattern (it matches both directories and files).

Therefore, how about introducing two additional pattern types, "file" and "dir"?

||'''type''' ||'''for recursive''' ||'''for non-recursive''' ||
||file ||--- ||always ||
||dir ||always(*) ||--- ||

(*) "dir" matches only directories.
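One possible reading of the "file" and "dir" types is sketched below. The proposal does not define their exact semantics, so this is an assumption: names are taken as root-ed literal paths, and `file_match` and `dir_match` are hypothetical helpers.

```python
def file_match(name, path):
    """'file' type (assumed): matches only the file with exactly this name."""
    return path == name


def dir_match(name, path):
    """'dir' type (assumed): matches only files below the named directory."""
    return path.startswith(name.rstrip("/") + "/")
```

With this reading, "foo/bar" as a "file" pattern matches "foo/bar" but not "foo/bar/baz", and as a "dir" pattern it matches "foo/bar/baz" but not a file named "foo/bar".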

After adding these types, there are 5 (base types) x 3 (start points) = 15 types:

||'''base type''' ||'''root-ed''' ||'''cwd-ed''' ||'''any-of-path''' ||
||wildcard ||rootglob ||cwdglob ||anyglob ||
||regexp ||rootre ||cwdre ||anyre ||
||raw path ||rootpath ||cwdpath ||anypath ||
||raw file name ||rootfile ||cwdfile ||anyfile ||
||raw dir name ||rootdir ||cwddir ||anydir ||
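The 15 names follow mechanically from combining each start point prefix with each base type suffix, which a short sketch can confirm. The variable names are illustrative only.

```python
from itertools import product

START_POINTS = ("root", "cwd", "any")
BASE_TYPES = ("glob", "re", "path", "file", "dir")

# Iterate base types in the outer position so names come out row by row.
NEW_KINDS = [start + base for base, start in product(BASE_TYPES, START_POINTS)]

print(len(NEW_KINDS))  # 15
print(NEW_KINDS[:3])   # ['rootglob', 'cwdglob', 'anyglob']
```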

2.4. Proposal by Rodrigo

Add a "rootglob:" pattern type to work around the issue with -I/-X patterns.

https://patchwork.mercurial-scm.org/patch/17311/

3. Roadmap

{X}

4. See Also


CategoryDeveloper CategoryNewFeatures

FileNamePatternsPlan (last edited 2016-12-05 13:08:40 by YuyaNishihara)