Differences between revisions 1 and 9 (spanning 8 versions)
Revision 1 as of 2013-02-09 10:55:53
Size: 2685
Comment: moving older ChangeEvolution content here
Revision 9 as of 2014-08-14 01:00:19
Size: 3618
Comment: add proposed format.
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
Line 6: Line 5:
Changesets evolution allows for safe rewriting of Mercurial history. This has a close relationship with Phases For a user perspective have a look at the ChangesetEvolution page.
Line 8: Line 7:
    Presentation of the concept: http://hg-lab.logilab.org/doc/mutable-history/html/ == Obsstore Format ==
Line 10: Line 9:
    Related experimental extension (usable): http://bitbucket.org/marmoute/mutable-history/overview
Line 12: Line 10:
== Core principle == Markers are stored in an append-only file stored in
'.hg/store/obsstore'.
Line 14: Line 13:
 * Store an explicit obsolescence marker between new and old version of rewritten changeset.
 * This marker is *not* part of the changeset (should not alter the hash).
 * People are able to collaborate on evolving changeset
=== V1 (current) Format ===
Line 18: Line 15:
== Additional ideas == (see in line document for latest data)
Line 20: Line 17:
 * Store final delta in a real and autonomous changeset.
 * The Obsolescence markers are exchangeable without rewritten changeset.
 * Easily allow other extension to manipulate such relation (and to hook on such operation)
==== quick summary ====
Line 24: Line 19:
== Handled situations ==  * <number-of-successors(=N)><metadata-lenght(=M)><bits-field><precursor>(<successor>*N)<metadata>
Line 26: Line 21:
 * Rewriting content of a changeset
 * Deleting/killing a changeset
 * Splitting a single changeset into multiple ones
 * Collapsing/folding multiple changeset into a single one
 * Changing changeset order
 * Adding (e.g., pulling) a changeset evolution that conflicts with another one
 * Adding (or adding in general) new changesets on one which already evolved (or evolving a changeset that have descendants)
 * B, I, B, 20s, (20s*N), s*M
Line 34: Line 23:
== Changeset Obsolescence ==
Line 36: Line 24:
Obsolescence markers make it possible to mark changesets that have been deleted or superseded in a new version of the changeset. ==== longer explanation ====
Line 38: Line 26:
Unlike the previous way of handling such changes, by stripping the old changesets from the repository, obsolescence markers can be propagated between repositories. This allows for a safe and simple way of exchanging mutable history and altering it after the fact. Changeset phases are respected, such that only draft and secret changesets can be altered (see hg phases for details). The file starts with a version header:
Line 40: Line 28:
Obsolescence is tracked using "obsolescence markers", a piece of metadata that tracks which changesets have been made obsolete, potential successors for a given changeset, the moment the changeset was marked as obsolete, and the user who performed the rewriting operation. The markers are stored separately from standard changeset data can be exchanged without any of the precursor changesets, preventing unnecessary exchange of obsolescence data.  * 1 unsigned byte: version number, starting at zero.
Line 42: Line 30:
The complete set of obsolescence markers describes a history of changeset modifications that is orthogonal to the repository history of file modifications. This changeset history allows for detection and automatic resolution of edge cases arising from multiple users rewriting the same part of history concurrently.
The header is followed by the markers. Each marker is made of:

 * 1 unsigned byte: number of new changesets "N", can be zero.

 * 1 unsigned 32-bits integer: metadata size "M" in bytes.

 * 1 byte: a bit field. It is reserved for flags used in common
   obsolete marker operations, to avoid repeated decoding of metadata
   entries.

 * 20 bytes: obsoleted changeset identifier.

 * N*20 bytes: new changesets identifiers.

 * M bytes: metadata as a sequence of nul-terminated strings. Each
   string contains a key and a value, separated by a colon ':', without
   additional encoding. Keys cannot contain '\0' or ':' and values
   cannot contain '\0'.

=== V2 (current) Format ===


==== motivation ====

There is two extra information we would like to see in a second version of the format:

  * date: There is currently *always* a date in the meta data. So storing it explicitly would be more space efficient. It would also open the way to quickly access the date for sorting purpose (no use case yet but not crazy to think about it)

  * parents: When a changesets is pruned (obsoleted, no successors) we needs to records its parents. This is necessary to link the markers chain to the push/pull operation it is relevant to.

  * We may want to extend the bit field to 2 bytes. We currently use 1 and can see use case for 3-5 others (tracking the type of changes introduce by the rewriting (desc, patches, content, etc) so we are running short

==== possible change ====

'''Date''':

 * The date will be a 64bits integer (for seconds since epoch) followed by a 16 bits integer (time zone)

 * It will make sense to put the date in front of the markers. that would give markers sorting some semantic.

'''Parents''':

We have multiple option for storing parents:

 1. Having an explicite field similar to successors (one byte to know how many parents, then parents)

 2. Having an explicite field but store the number of parent in the bit fields (since we never have more than 2 parents)

 3. Using the successors field. Having negative number of successors mean it is a prune.

Option (3) is the most space saving but prevent use to store parent information for more changesets if needed in the future (We do not have a final exchange plan yet).

Option (1) and (2) takes 2 to 8 bits more than (3) but are more flexible.

'''bit field'''

If we extend the bit field to 2 Bytes, it makes sense to use option (2) for storing parent.

==== Proposed Format ====
 
 * <date><timezone><number-of-successors(=N)><metadata-lenght(=M)><bits-field+nb-parents(=P)><precursor>(<successor>*N)(<parents>*P)<metadata>

 * L, h, B, I, H, 20s, (20s*N), (20s*P), s*M

The P number would be hidden in the bit field. We need to store 4 possible values here: 0 parents, 1 parent, 2 parents, ø parents information stored. Possible assignement is 00, 01, 10, 11. this let both 0 parent and ø parent info to be 0 module 3.

Implementation Details about Changesets Evolution

/!\ This page is intended for developer

For a user perspective have a look at the ChangesetEvolution page.

1. Obsstore Format

Markers are stored in an append-only file stored in '.hg/store/obsstore'.

1.1. V1 (current) Format

(see in line document for latest data)

1.1.1. quick summary

  • <number-of-successors(=N)><metadata-lenght(=M)><bits-field><precursor>(<successor>*N)<metadata>

  • B, I, B, 20s, (20s*N), s*M

1.1.2. longer explanation

The file starts with a version header:

  • 1 unsigned byte: version number, starting at zero.

The header is followed by the markers. Each marker is made of:

  • 1 unsigned byte: number of new changesets "N", can be zero.
  • 1 unsigned 32-bits integer: metadata size "M" in bytes.
  • 1 byte: a bit field. It is reserved for flags used in common
    • obsolete marker operations, to avoid repeated decoding of metadata entries.
  • 20 bytes: obsoleted changeset identifier.
  • N*20 bytes: new changesets identifiers.
  • M bytes: metadata as a sequence of nul-terminated strings. Each
    • string contains a key and a value, separated by a colon ':', without additional encoding. Keys cannot contain '\0' or ':' and values cannot contain '\0'.

1.2. V2 (current) Format

1.2.1. motivation

There is two extra information we would like to see in a second version of the format:

  • date: There is currently *always* a date in the meta data. So storing it explicitly would be more space efficient. It would also open the way to quickly access the date for sorting purpose (no use case yet but not crazy to think about it)
  • parents: When a changesets is pruned (obsoleted, no successors) we needs to records its parents. This is necessary to link the markers chain to the push/pull operation it is relevant to.
  • We may want to extend the bit field to 2 bytes. We currently use 1 and can see use case for 3-5 others (tracking the type of changes introduce by the rewriting (desc, patches, content, etc) so we are running short

1.2.2. possible change

Date:

  • The date will be a 64bits integer (for seconds since epoch) followed by a 16 bits integer (time zone)
  • It will make sense to put the date in front of the markers. that would give markers sorting some semantic.

Parents:

We have multiple option for storing parents:

  1. Having an explicite field similar to successors (one byte to know how many parents, then parents)
  2. Having an explicite field but store the number of parent in the bit fields (since we never have more than 2 parents)
  3. Using the successors field. Having negative number of successors mean it is a prune.

Option (3) is the most space saving but prevent use to store parent information for more changesets if needed in the future (We do not have a final exchange plan yet).

Option (1) and (2) takes 2 to 8 bits more than (3) but are more flexible.

bit field

If we extend the bit field to 2 Bytes, it makes sense to use option (2) for storing parent.

1.2.3. Proposed Format

  • <date><timezone><number-of-successors(=N)><metadata-lenght(=M)><bits-field+nb-parents(=P)><precursor>(<successor>*N)(<parents>*P)<metadata>

  • L, h, B, I, H, 20s, (20s*N), (20s*P), s*M

The P number would be hidden in the bit field. We need to store 4 possible values here: 0 parents, 1 parent, 2 parents, ø parents information stored. Possible assignement is 00, 01, 10, 11. this let both 0 parent and ø parent info to be 0 module 3.

ChangesetEvolutionDevel (last edited 2020-05-29 08:03:48 by aayjaychan)