Differences between revisions 4 and 10 (spanning 6 versions)
Revision 4 as of 2016-05-10 23:32:03
Size: 8797
Comment: revise "outline of issue" and "steps to make cache validation exact"
Revision 10 as of 2017-06-09 02:07:09
Size: 18286
Comment: update current status, and add additional topics
Line 7: Line 7:
'''Status: need discussion''' '''Status: In progress'''
Line 16: Line 16:
== Status of tasks ==

Almost all of main tasks to make cache validation exact were finished, and changes were released as a part of 3.9 (mentioned in "[[Release3.9|overview of new features]]" page).

There were/are some issues found after 3.9 release.

 * ambiguity around truncation (./) by "[[#make_revlog-style_files_aware_of_ambiguity_around_truncation|make revlog-style files aware of ambiguity around truncation]]"
 * abortion for EPERM at rollback (./) by "[[#ignore_EPERM_at_advancing_mtime_of_non-privileged_files|ignore EPERM at advancing mtime of non-privileged files]]"
 * rewinding mtime at rollback, in progress as "[[#avoid_rewinding_mtime_at_rollback|avoid rewinding mtime at rollback]]"

Optional tasks are:

  * make scmutil.filecache use "new cachestat" {X}
  * make hg.cachedlocalrepo use "new cachestat" {X}
  * make util.atomictempfile ready for context manager (./) (by [[https://selenic.com/repo/hg/rev/6d96658a22b0|Martijn Pieters]], and released as a part of 3.9)
Line 18: Line 34:
This comes from issue4368 https://bz.mercurial-scm.org/show_bug.cgi?id=4368#c10 , and was once RFC-ed in http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/85394 This comes from issue4368 https://bz.mercurial-scm.org/show_bug.cgi?id=4368#c10 , and was once RFC-ed in https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/075434.html
Line 35: Line 51:
Base idea of the solution for this issue was suggested by Matt in
[[http://thread.gmane.org/gmane.comp.version-control.mercurial.devel/85394/focus=85437]]

But simple "advance mtime only if size, ctime and mtime are same"
might cause "fluttering for mtime" like below:
Root cause of this issue is that timestamp in second doesn't have
enough resolution to detect multiple changes "at same second".

Base idea of the solution for this issue is "advance mtime, if changed
at same second", which was suggested by
[[https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/075477.html|Matt]]

"S[N]" below means stat of a file at N-th change:

  * S[n-1].ctime < S[n].ctime: can detect change of a file
  * S[n-1].ctime == S[n].ctime
    * S[n-1].ctime < S[n].mtime: means natural advancing (*1)
    * S[n-1].ctime == S[n].mtime: is ambiguous (*2)
    * S[n-1].ctime > S[n].mtime: never occurs naturally (*3)
  * S[n-1].ctime > S[n].ctime: never occurs naturally

Case (*2) above means that a file was changed twice or more at same
second (= S[n-1].ctime), and comparison of timestamp is ambiguous.

But advancing mtime only in case (*2) doesn't work as expected,
because naturally advanced S[n].mtime in case (*1) might be equal to
manually advanced S[n-1 or less].mtime.

Therefore, all "S[n-1].ctime == S[n].ctime" cases should be treated as
ambiguous regardless of mtime, to avoid overlooking by confliction
between such mtime.

Advancing mtime in such case ensures "S[n-1].mtime != S[n].mtime"
always at change of a file.

Even though meaning of "ctime" itself is different on each platforms
("change" time on POSIX, but "creation" time on Windows), this
solution should work as expected, because it is fact that file stat is
ambiguous if ctime isn't different between before and after change of
a file.

{{{#!wiki caution
Ambiguity of file stat causes issues only between multiple threads (or
processes) running in parallel in the same repository, because (1) files
are never changed by another thread and (2) in-memory filecache-ed
properties should be valid in "single thread" case.

If ambiguity of file stat seems to cause an issue while continuous
procedures in the same thread, we should investigate refreshing file
stat, writing in-memory changes out, invalidating filecache-ed
properties, and so on, at first.

For example, see [[https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-September/087872.html|the issue around manifest caching]].
}}}

BTW, "rewind mtime 1 sec, if timestamp is ambiguous" also resolves
this issue.

If we rewind S[n].mtime 1 sec in (*2) case, we don't worry about
confliction between naturally advanced S[n].mtime in case (*1) and
manually advanced S[n-1 or less].mtime, because manually rewound mtime
uses (*3) space: while a file is changed at same ctime, mtime should
be naturally equal to or greater than ctime (= (*2) or (*1)).

But:

  * it is a little complicated to explain,
  * it might confuse traditional tools by "mtime less than ctime", and
  * performance advantage of rewinding mtime is only that os.lstat() is avoided if "S[n-1].ctime < S[n].mtime" (if "S[n-1].ctime == S[n].ctime" occurs often, "S[n-1].ctime < S[n].mtime" is just a little part of such case)

Therefore, advancing mtime seems better than rewinding mtime, for this
issue.

== Ambiguity check with other than ctime ==

If other than ctime of a file is used to examine ambiguity, it might
cause overlooking changes.

For example, if "ambiguous" means "ctime and mtime are same":
Line 49: Line 134:
In addition to it, "fluttering for size" like below might occur. Then, this "fluttering" for mtime between MTIME and MTIME+1 causes
overlooking changes.

For another example, if "ambiguous" means "size and ctime are same":
Line 53: Line 141:
  3. modify FILE again with (SIZE1, CTIME, MTIME)
  4. (2) - (3) might be repeated many times while same CTIME

Root cause of this issue is that timestamp in second doesn't have
enough resolution to detect multiple changes "at same second".

Therefore, we should advance mtime of changed file, if changes occur
"at same second". If timestamps below are same, changes should occur
"at same second".

  * ctime of original file
  * ctime of changed file
  * mtime of changed file
  3. mtime is kept, because of change of size
  4. modify FILE again with (SIZE1, CTIME, MTIME)
  5. mtime is kept, because of change of size
  6. (2) - (5) might be repeated many times while same CTIME

In this situation, mtime is never advanced, and this causes
overlooking changes.

Therefore, other than ctime of a file can't be used to examine
ambiguity, even though it might cause advancing mtime often.
Line 139: Line 224:
   (self.stat.st_ctime == oldstat.stat.st_ctime and
    self.stat.st_ctime == self.stat.st_mtime
)
   (self.stat.st_ctime == oldstat.stat.st_ctime)
Line 227: Line 311:
=== make revlog-style files aware of ambiguity around truncation ===

Normally, revlog-style files are "file stat ambiguity" free,
because revlog is append only (= size should be changed always).
But strip and rollback cause irregular truncation of such files.

If steps below occurs at "the same time in sec", all of mtime, ctime
and size are same between (1) and (3).

  1. append data to revlog-style file (and close transaction)
  2. discard appended data by truncation (strip or rollback)
  3. append same size but different data to revlog-style file again

Then, cache validation doesn't work after (3) as expected.

Therefore, we should:

  * make filecache-ed revlog-style files aware of ambiguity
    * writing out at finalization of changelog
    * writing out of manifest
  * make truncation below aware of ambiguity
    * truncation at stripping
    * truncation at rollbacking

=== ignore EPERM at advancing mtime of non-privileged files ===

"Group write permission" allows users to share Mercurial repositories.

But advancing mtime described in this page causes exception at
rollback and strip, if previous transaction was executed by
another user, because POSIX specifies that advancing mtime by
utime() is allowed only for owner UID process.

 * rollback restores files from "undo.*", which were backed up at previous transaction
 * rollback and strip truncate files, which were created at previous transaction

(see [[https://bz.mercurial-scm.org/show_bug.cgi?id=5418|issue5418]] for detail)

Therefore, we should omit advancing mtime, if EPERM is detected.

=== avoid rewinding mtime at rollback ===

Previous "ignore EPERM at advancing" rewinds mtime at rollback,
and make cache validation not exact. For example:

||step ||actual time ||mtime of FILE ||mtime of undo.FILE ||action ||
||1 ||N ||N ||- ||(initial state)||
||2 || ||N+1 ||N ||commit by UID:A (*A)||
||3 || || || ||(*1)||
||4 || ||N ||- ||rollback by UID:B (*B)||
||5 || ||N+1 ||- ||updated by anyone||
||6 || || || ||(*2)||

UID:B process can't advance mtime of dirstate at (*B), because
undo.dirstate is owned by UID:A at (*A).

In this time table, a process caching FILE at (*1) above
misunderstands that FILE isn't changed between (*1) and (*2), because
of same mtime.

This rewinding makes cache stat ambiguous, and causes serious cache
validation issue at race condition. Especially, dirstate can be
updated outside transaction, and amount of rewinding mtime might be
more than 1, if transaction at N mtime is rollbacked at N+X mtime.

Ambiguity of file stat also requires
[[https://bz.mercurial-scm.org/show_bug.cgi?id=5584|issue5584]]
to be fixed with other than file stat.

==== for non append-only files ====

(this mainly focuses on "dirstate")

To avoid rewinding mtime at rollback:

 1. rename from undo.FILE to FILE
 1. advance mtime of FILE
 1. if it fails for EPERM:
  a. copy from (already renamed) FILE (back) to undo.FILE
  a. rename from undo.FILE to FILE, again
  a. now, FILE is own current process UID
  a. advance mtime of FILE

Extra overhead at "copy from FILE to undo.FILE" above seems reasonable
cost for exact cache validation, because:

 * only limited files are copied (dirstate, bookmarks, and phaseroots)
 * "hg rollback" itself has been deprecated since Mercurial 2.7
 * "sharing a repository clone via group permission" is reasonable usecase, but not ordinary for many users

==== for append-only files ====

(this focuses on files changed only inside transaction)

Applying similar "copy on truncation" approach on append-only files
implies copying all revlog format files at rollback. Is it acceptable
cost for exact cache validation ?

Another approach is:

 1. introduce ".hg/txngen" file to record timestamp of the last transaction
 2. update (= re-create) ".hg/txngen" at each transactions, and
 3. advance mtime of ".hg/txngen" (re-creation can avoid EPERM)
 4. make @filecache properties corresponded to append-only files check ".hg/txngen", too

This "generation ID" file approach was already
[[https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/075438.html|rejected at the initial RFC]], though.

Which approach should we choose ?
Line 237: Line 431:
BTW, validity of ctime/mtime fields depends on underlying
filesystem. For example, on Windows, os.lstat() of Python fills
st_ctime/st_mtime information by ftCreationTime/ftLastWriteTime of
[[https://msdn.microsoft.com/en-us//library/windows/desktop/aa363788(v=vs.85).aspx|BY_HANDLE_FILE_INFORMATION structure]]
as a result of {{{GetFileInformationByHandle()}}} API, and these
fields are zero, if underlying filesystem doesn't support

Ambiguity detection described in this page depends on validity of
ctime/mtime fields of file stat. Therefore, cachability at runtime
should be examined by "{{{stat.st_ctime != 0 and stat.st_mtime != 0}}}".

This examination might cause false negative, if "filesystem time" is
accidentally 0 at that time. But such situation should be very rare
(or intentional for specific purpose), and should disappear after 1
second or so on ordinary environment.

Note:

This page is primarily intended for developers of Mercurial.

Exact Cache Validation Plan

Status: In progress

Main proponents: KatsunoriFujiwara

This page focuses on how to validate caches exactly, even when the stat information of a file does not change as expected when the file itself changes.

1. Status of tasks

Almost all of the main tasks to make cache validation exact are finished, and the changes were released as part of 3.9 (mentioned in the "overview of new features" page).

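Some issues were found after the 3.9 release:

  • ambiguity around truncation (./) by "make revlog-style files aware of ambiguity around truncation"

  • abortion for EPERM at rollback (./) by "ignore EPERM at advancing mtime of non-privileged files"

  • rewinding mtime at rollback, in progress as "avoid rewinding mtime at rollback"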

Optional tasks are:

  • make scmutil.filecache use "new cachestat" {X}

  • make hg.cachedlocalrepo use "new cachestat" {X}

  • make util.atomictempfile ready for context manager (./) (by Martijn Pieters, and released as a part of 3.9)

2. Outline of issue

This comes from issue4368 https://bz.mercurial-scm.org/show_bug.cgi?id=4368#c10 , and was once RFC-ed in https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/075434.html

On POSIX, comparing i-node numbers for file cache validation can unexpectedly overlook changes to a file, because i-node numbers are often reused rapidly. For example, assume that the steps below all occur within the same CTIME/MTIME:

  1. create FILE with (SIZE, INO1)
  2. modify FILE with (SIZE, INO2)
  3. modify FILE with (SIZE, reused INO1)

Then a file-based property cached with the stat taken at (1) overlooks the change made at (3).

Some files may keep the same size even when their content changes (for example, dirstate, bookmarks and so on), so size cannot distinguish these states either. This causes inconsistent results.

The root cause of this issue is that a timestamp in seconds does not have enough resolution to detect multiple changes "at the same second".

The base idea of the solution is "advance mtime, if changed at the same second", which was suggested by Matt (https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/075477.html).

"S[n]" below means the stat of a file at its n-th change:

  • S[n-1].ctime < S[n].ctime: can detect change of a file

  • S[n-1].ctime == S[n].ctime
    • S[n-1].ctime < S[n].mtime: means natural advancing (*1)

    • S[n-1].ctime == S[n].mtime: is ambiguous (*2)
    • S[n-1].ctime > S[n].mtime: never occurs naturally (*3)

  • S[n-1].ctime > S[n].ctime: never occurs naturally

Case (*2) above means that a file was changed twice or more within the same second (= S[n-1].ctime), so comparing timestamps is ambiguous.

But advancing mtime only in case (*2) doesn't work as expected, because a naturally advanced S[n].mtime in case (*1) might be equal to a manually advanced mtime of S[n-1] or of an earlier stat.

Therefore, all "S[n-1].ctime == S[n].ctime" cases should be treated as ambiguous regardless of mtime, to avoid overlooking changes through a collision between such mtime values.

Advancing mtime in such cases ensures that "S[n-1].mtime != S[n].mtime" always holds when a file changes.
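The case analysis above can be written down as a small sketch. The function names classify() and mustadvancemtime() are hypothetical and exist only for this illustration; timestamps are assumed to be integer seconds.

```python
def classify(prev_ctime, cur_ctime, cur_mtime):
    """Classify the n-th change from S[n-1].ctime, S[n].ctime and S[n].mtime."""
    if prev_ctime < cur_ctime:
        return "detectable"
    if prev_ctime > cur_ctime:
        return "never occurs naturally"
    # From here on, S[n-1].ctime == S[n].ctime
    if prev_ctime < cur_mtime:
        return "natural advancing (*1)"
    if prev_ctime == cur_mtime:
        return "ambiguous (*2)"
    return "never occurs naturally (*3)"


def mustadvancemtime(prev_ctime, cur_ctime):
    """All same-ctime cases are treated as ambiguous, regardless of mtime."""
    return prev_ctime == cur_ctime
```

Note that mustadvancemtime() deliberately ignores mtime: it covers (*1), (*2) and (*3) alike, which is the conclusion reached above.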

Even though the meaning of "ctime" differs between platforms ("change" time on POSIX, but "creation" time on Windows), this solution should work as expected, because file stat is in fact ambiguous whenever ctime is the same before and after a change of a file.

Ambiguity of file stat causes issues only between multiple threads (or processes) running in parallel in the same repository, because in the "single thread" case (1) files are never changed by another thread and (2) in-memory filecache-ed properties should be valid.

If ambiguity of file stat seems to cause an issue during consecutive procedures in the same thread, we should first investigate refreshing file stat, writing in-memory changes out, invalidating filecache-ed properties, and so on.

For example, see the issue around manifest caching (https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-September/087872.html).

BTW, "rewind mtime by 1 sec, if timestamp is ambiguous" also resolves this issue.

If we rewind S[n].mtime by 1 sec in case (*2), we don't have to worry about a collision between a naturally advanced S[n].mtime in case (*1) and a manually changed mtime of S[n-1] or of an earlier stat, because a manually rewound mtime uses the (*3) space: while a file is changed at the same ctime, mtime should naturally be equal to or greater than ctime (= (*2) or (*1)).

But:

  • it is a little complicated to explain,
  • it might confuse traditional tools by "mtime less than ctime", and
  • the only performance advantage of rewinding mtime is that os.lstat() is avoided if "S[n-1].ctime < S[n].mtime" (if "S[n-1].ctime == S[n].ctime" occurs often, "S[n-1].ctime < S[n].mtime" is only a small part of such cases)

Therefore, advancing mtime seems better than rewinding mtime for this issue.

3. Ambiguity check with other than ctime

If something other than the ctime of a file is used to examine ambiguity, changes might be overlooked.

For example, if "ambiguous" means "ctime and mtime are same":

  1. create FILE with (SIZE, CTIME, MTIME)
  2. modify FILE with (SIZE, CTIME, MTIME)
  3. change mtime of FILE into MTIME+1, and
  4. stat of FILE is now (SIZE, CTIME, MTIME+1)
  5. modify FILE with (SIZE, CTIME, MTIME) again, but
  6. (SIZE, CTIME, MTIME) stat of FILE is kept, because it is different from stat at (4)
  7. (2) - (6) might be repeated many times while same CTIME

Then, this "fluttering" of mtime between MTIME and MTIME+1 causes changes to be overlooked.

For another example, if "ambiguous" means "size and ctime are same":

  1. create FILE with (SIZE1, CTIME, MTIME)
  2. modify FILE with (SIZE2, CTIME, MTIME)
  3. mtime is kept, because of change of size
  4. modify FILE again with (SIZE1, CTIME, MTIME)
  5. mtime is kept, because of change of size
  6. (2) - (5) might be repeated many times while same CTIME

In this situation, mtime is never advanced, and changes are overlooked.

Therefore, nothing other than the ctime of a file can be used to examine ambiguity, even though this might cause mtime to be advanced often.
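Both failure modes, and the ctime-only rule that avoids them, can be replayed with a small simulation. This is only a sketch under simplifying assumptions: a stat is modeled as a (size, ctime, mtime) tuple in whole seconds, every write happens within the same second, and all names (Stat, change, history and the three rule functions) are hypothetical.

```python
from collections import namedtuple

Stat = namedtuple("Stat", "size ctime mtime")


def change(old, size, now, isambig):
    """Return the stat after changing a file at second `now`."""
    new = Stat(size, now, now)  # natural stat right after the write
    if isambig(new, old):
        new = new._replace(mtime=old.mtime + 1)  # advance mtime
    return new


def by_ctime_and_mtime(new, old):
    # "ambiguous" means "ctime and mtime are same"
    return new.ctime == old.ctime and new.mtime == old.mtime


def by_size_and_ctime(new, old):
    # "ambiguous" means "size and ctime are same"
    return new.size == old.size and new.ctime == old.ctime


def by_ctime_only(new, old):
    # rule chosen in this page
    return new.ctime == old.ctime


def history(sizes, isambig, now=100):
    """Create a file and change it repeatedly, all within second `now`."""
    stats = [Stat(sizes[0], now, now)]
    for size in sizes[1:]:
        stats.append(change(stats[-1], size, now, isambig))
    return stats
```

With history([10, 10, 10], by_ctime_and_mtime) the third stat equals the first one (mtime flutters back), and with history([10, 20, 10], by_size_and_ctime) mtime never advances, so the third stat again equals the first. With by_ctime_only, every stat in either sequence is distinct.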

4. Caching for web UI

hg.cachedlocalrepo uses st_mtime and st_size of the stat of the files below to validate the repo object on ALL platforms.

  • bookmarks
  • changelog
  • phaseroots

For hgweb, nothing triggers repo.invalidate() explicitly when the repository is merely referenced. Therefore, changes might be shadowed by ambiguity of the files above (other than changelog), if the repo object is reused.

5. Performance problem

If we simply made every file change with atomictemp=True "aware of ambiguity", it might cause a performance problem.

For example, revlog code opens files in write mode with atomictemp=True. This means that all files derived from revlog would be "aware of ambiguity".

But such files aren't ambiguous, because of the "append only" revlog policy. In addition, a repository might write many filelog files at once (= getting their stat has a cost), but it doesn't cache them (= an ambiguity check isn't needed).

In conclusion, we should make util.atomictempfile aware of ambiguity only if such awareness is required.

6. Cachability of files on Windows

On Windows, files have not been cachable since 2.0 (or 2aa3e07b2f07).

On the other hand, hg.cachedlocalrepo uses st_mtime and st_size of stat to validate the repo object on ALL platforms, as described above.

According to 2aa3e07b2f07:

"the path is uncacheable" above seems to mean "there is no reliable file ID on Windows".

Therefore, using "st_mtime, st_size and so on" for cache validation itself does not seem problematic on Windows either.

7. Steps to make cache validation exact

7.1. introduce "new cachestat" class

Introduce a portable "new cachestat" class.

It mainly provides the methods below.

__eq__(self, oldstat) returns:

   (self.stat.st_size == oldstat.stat.st_size and
    self.stat.st_ctime == oldstat.stat.st_ctime and
    self.stat.st_mtime == oldstat.stat.st_mtime)

This is used to examine whether a file has changed.

On the other hand, isambig(self, oldstat) returns:

   (self.stat.st_ctime == oldstat.stat.st_ctime)

This examines whether changes occur "at the same second". If so, the stat of the changed file is "ambiguous" against that of the original file, and we should advance the mtime of the changed file.
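A minimal sketch of such a class follows. The class name CacheStat, the frompath() helper and the FakeStat tuple used for the demonstration are hypothetical, and timestamps are truncated to whole seconds as an assumption of this sketch; only the two comparisons come from the description above.

```python
import os
from collections import namedtuple


class CacheStat(object):
    """Hypothetical stand-in for the "new cachestat" class."""

    def __init__(self, st):
        self.stat = st  # an os.stat_result, or None if the file is missing

    @classmethod
    def frompath(cls, path):
        try:
            return cls(os.stat(path))
        except FileNotFoundError:
            return cls(None)

    def _key(self):
        st = self.stat
        return (st.st_size, int(st.st_ctime), int(st.st_mtime))

    def __eq__(self, oldstat):
        # Examine whether the file is changed or not
        if self.stat is None or oldstat.stat is None:
            return self.stat is oldstat.stat
        return self._key() == oldstat._key()

    def __ne__(self, oldstat):
        return not self == oldstat

    __hash__ = None  # instances are compared, never used as dict keys

    def isambig(self, oldstat):
        # Examine whether the change occurred "at the same second"
        if self.stat is None or oldstat.stat is None:
            return False
        return int(self.stat.st_ctime) == int(oldstat.stat.st_ctime)


# Demonstration with made-up stat values: same size, ctime and mtime
FakeStat = namedtuple("FakeStat", "st_size st_ctime st_mtime")
old = CacheStat(FakeStat(10, 100, 100))
new = CacheStat(FakeStat(10, 100, 100))
```

Here new == old holds even if the content differs, which is exactly the situation that new.isambig(old) is meant to flag.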

7.2. make util.atomictempfile aware of ambiguity

Make close() of util.atomictempfile examine whether the stat of the changed file is ambiguous, in the steps below.

  1. invoke original close() of file object for temporary file

  2. get "new cachestat" of original file ("oldstat")
  3. get "new cachestat" of changed file ("newstat")
  4. change mtime of changed file into "mtime of oldstat + 1", if newstat.isambig(oldstat)

For the performance reason described above, close() of util.atomictempfile examines ambiguity only if the "checkambig" optional argument is True at construction time.
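The steps above can be sketched as a standalone function. This is not the code of util.atomictempfile: atomicwrite() and intstat() are hypothetical names, and the sketch assumes that timestamps are compared as whole seconds and that the current process owns the file.

```python
import os
import tempfile


def intstat(path):
    """Return (size, ctime, mtime) of `path`, with times in whole seconds."""
    st = os.stat(path)
    return (st.st_size, int(st.st_ctime), int(st.st_mtime))


def atomicwrite(path, data, checkambig=False):
    """Atomically replace `path` with `data` (bytes) via a temporary file."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)
    # 1. write the temporary file and close it
    with os.fdopen(fd, "wb") as fp:
        fp.write(data)
    # 2. get the stat of the original file ("oldstat"), only if requested
    oldstat = None
    if checkambig and os.path.exists(path):
        oldstat = intstat(path)
    os.replace(tmp, path)
    if oldstat is not None:
        # 3. get the stat of the changed file ("newstat")
        newstat = intstat(path)
        # 4. same ctime means ambiguity: advance mtime past the old one
        if newstat[1] == oldstat[1]:
            advanced = oldstat[2] + 1
            os.utime(path, (advanced, advanced))
```

With checkambig=True, two same-sized writes within one second end up with different mtime values. With the default checkambig=False, no extra stat calls are made.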

7.3. change vfs.__call__() for ambiguity

Add a "checkambig" optional argument to vfs.__call__(), which is passed on to util.atomictempfile.

7.4. change file generation of transaction for ambiguity

Add a "checkambig" optional argument to transaction.addfilegenerator(), which is used when opening a file to write data out.

This is needed because the files below are also written out via file generation of transaction.

  • .hg/bookmarks
  • .hg/dirstate
  • .hg/phaseroots

Fortunately or unfortunately, transaction.addfilegenerator() is currently used only by classes related to the files above.

For simplicity, should we make transaction always generate files with vfs.open(FILENAME, "w", checkambig=True)?

7.5. make classes, of which instance is cached from file, aware of ambiguity

The classes below, which are cached via scmutil.filecache and might be ambiguous at change, should be aware of ambiguity.

  • bookmarks.bmstore (also for .hg/bookmarks.current)
  • dirstate.dirstate (also for .hg/branch)
  • phases.phasecache

We should make them use:

  • vfs.__call__() with checkambig=True

  • transaction.addfilegenerator() with checkambig=True

7.6. make restoring from backup file aware of ambiguity

Renaming a backup of a cached file overwrites the original file in the cases below, and this renaming might cause ambiguity, too.

  • restoring from dirstate backup file at failure in scopes below
    • transaction scope
    • dirstateguard scope
  • restoring from "undo." files at transaction rollback
    • bookmarks
    • dirstate
    • phaseroots

Therefore, we should:

  • add (optional) ambiguity check logic to vfs.rename()

  • make restoring from backup file aware of ambiguity
    • restoring dirstate at failure (transaction or dirstateguard scope)
    • restoring files at localrepository._rollback()

In addition, transaction restores the contents of files registered via addfilegenerator() from backup (journal.backup.* at failure, undo.backup.* at rollback) by util.copyfile().

Therefore, we should also:

  • add (optional) ambiguity check logic to util.copyfile()

  • make transaction._playback() aware of ambiguity

7.7. make revlog-style files aware of ambiguity around truncation

Normally, revlog-style files are free of "file stat ambiguity", because revlog is append only (= size should always change). But strip and rollback cause irregular truncation of such files.

If the steps below occur within "the same time in sec", mtime, ctime and size are all the same between (1) and (3).

  1. append data to revlog-style file (and close transaction)
  2. discard appended data by truncation (strip or rollback)
  3. append same size but different data to revlog-style file again

Then, cache validation doesn't work as expected after (3).

Therefore, we should:

  • make filecache-ed revlog-style files aware of ambiguity
    • writing out at finalization of changelog
    • writing out of manifest
  • make truncation below aware of ambiguity
    • truncation at stripping
    • truncation at rollback

7.8. ignore EPERM at advancing mtime of non-privileged files

"Group write permission" allows users to share Mercurial repositories.

But the advancing of mtime described in this page causes an exception at rollback and strip if the previous transaction was executed by another user, because POSIX specifies that advancing mtime by utime() is allowed only for a process running with the UID of the file's owner.

  • rollback restores files from "undo.*", which were backed up at previous transaction
  • rollback and strip truncate files, which were created at previous transaction

(see issue5418 for detail)

Therefore, we should skip advancing mtime if EPERM is detected.
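A minimal sketch of "skip advancing on EPERM" is shown below. The function name advancemtime() and its boolean return value are assumptions made for this illustration, not an existing Mercurial API.

```python
import errno
import os


def advancemtime(path, oldmtime):
    """Try to set mtime of `path` to `oldmtime + 1`.

    Returns True on success, and False if the process is not allowed to
    change the timestamp (EPERM), for example because another user owns
    the file in a repository shared via group write permission.
    """
    advanced = oldmtime + 1
    try:
        os.utime(path, (advanced, advanced))
    except OSError as inst:
        if inst.errno != errno.EPERM:
            raise  # any other failure is still an error
        return False
    return True
```

A caller that gets False back knows that the file stat may remain ambiguous.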

7.9. avoid rewinding mtime at rollback

The "ignore EPERM at advancing" step above makes rollback rewind mtime, and makes cache validation inexact. For example:

||step ||actual time ||mtime of FILE ||mtime of undo.FILE ||action ||
||1 ||N ||N ||- ||(initial state)||
||2 || ||N+1 ||N ||commit by UID:A (*A)||
||3 || || || ||(*1)||
||4 || ||N ||- ||rollback by UID:B (*B)||
||5 || ||N+1 ||- ||updated by anyone||
||6 || || || ||(*2)||

A UID:B process can't advance the mtime of dirstate at (*B), because undo.dirstate has been owned by UID:A since (*A).

In this timetable, a process that cached FILE at (*1) above wrongly concludes that FILE hasn't changed between (*1) and (*2), because the mtime is the same.

This rewinding makes the cached stat ambiguous, and causes a serious cache validation issue under race conditions. In particular, dirstate can be updated outside a transaction, and the amount by which mtime is rewound might be more than 1, if a transaction at mtime N is rolled back at mtime N+X.

Ambiguity of file stat also requires issue5584 (https://bz.mercurial-scm.org/show_bug.cgi?id=5584) to be fixed by something other than file stat.

7.9.1. for non append-only files

(this mainly focuses on "dirstate")

To avoid rewinding mtime at rollback:

  1. rename from undo.FILE to FILE
  2. advance mtime of FILE
  3. if it fails for EPERM:
    1. copy from (already renamed) FILE (back) to undo.FILE
    2. rename from undo.FILE to FILE, again
    3. now, FILE is owned by the UID of the current process
    4. advance mtime of FILE
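The steps above can be sketched as follows. The function name restorefromundo() is hypothetical. The sketch also assumes that "advance" means "one second past the mtime that FILE had just before rollback" and that mtime is always advanced without first checking for ambiguity; neither assumption is stated in this page.

```python
import errno
import os
import shutil


def restorefromundo(undopath, path):
    """Restore `path` from `undopath` without rewinding its mtime."""
    # Assumption: the target is the pre-rollback mtime of FILE plus one
    advanced = int(os.stat(path).st_mtime) + 1
    os.replace(undopath, path)  # 1. rename from undo.FILE to FILE
    try:
        os.utime(path, (advanced, advanced))  # 2. advance mtime of FILE
    except OSError as inst:
        if inst.errno != errno.EPERM:  # 3. only EPERM is handled
            raise
        # 3.1. copy FILE back to undo.FILE; the copy is a newly created
        #      file, so it is owned by the UID of the current process
        shutil.copyfile(path, undopath)
        os.replace(undopath, path)  # 3.2. rename again
        os.utime(path, (advanced, advanced))  # 3.4. advance mtime
```

The copy is paid only in the EPERM case, so a repository used by a single user takes the plain rename path.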

The extra overhead of "copy from FILE to undo.FILE" above seems a reasonable cost for exact cache validation, because:

  • only limited files are copied (dirstate, bookmarks, and phaseroots)
  • "hg rollback" itself has been deprecated since Mercurial 2.7
  • "sharing a repository clone via group permission" is a reasonable use case, but not a common one for many users

7.9.2. for append-only files

(this focuses on files changed only inside transaction)

Applying a similar "copy on truncation" approach to append-only files implies copying all revlog format files at rollback. Is that an acceptable cost for exact cache validation?

Another approach is:

  1. introduce ".hg/txngen" file to record timestamp of the last transaction
  2. update (= re-create) ".hg/txngen" at each transaction, and
  3. advance mtime of ".hg/txngen" (re-creation can avoid EPERM)
  4. make @filecache properties corresponding to append-only files check ".hg/txngen", too

This "generation ID" file approach was already rejected at the initial RFC (https://www.mercurial-scm.org/pipermail/mercurial-devel/2015-November/075438.html), though.

Which approach should we choose?

8. Optional tasks

8.1. make scmutil.filecache use "new cachestat"

To reliably invalidate a changed cache, make scmutil.filecache use "new cachestat".

This also improves performance on Windows, because cached information is reused :-)

BTW, validity of ctime/mtime fields depends on underlying filesystem. For example, on Windows, os.lstat() of Python fills st_ctime/st_mtime information by ftCreationTime/ftLastWriteTime of BY_HANDLE_FILE_INFORMATION structure as a result of GetFileInformationByHandle() API, and these fields are zero, if underlying filesystem doesn't support

Ambiguity detection described in this page depends on the validity of the ctime/mtime fields of file stat. Therefore, cachability at runtime should be examined by "stat.st_ctime != 0 and stat.st_mtime != 0".

This examination might cause a false negative if the "filesystem time" is accidentally 0 at that time. But such a situation should be very rare (or intentional for a specific purpose), and should disappear after 1 second or so in an ordinary environment.

8.2. make hg.cachedlocalrepo use "new cachestat"

BTW, once cache validation is exact, files are cached on all platforms, and cache validity is examined exactly (I hope so :-)).

Then, can repo.invalidate() replace recreation of an out-of-date repo object?

8.3. make util.atomictempfile ready for context manager

Currently, a file opened with atomictemp=True can't be used with the Python "with" statement.



ExactCacheValidationPlan (last edited 2017-06-09 02:07:09 by KatsunoriFujiwara)