Differences between revisions 7 and 8
Revision 7 as of 2014-08-14 00:19:50
Size: 1838
Comment: add a quick summary of the V1 format
Revision 8 as of 2014-08-14 00:48:03
Size: 3146
Comment: add some idea about possible markers changes.

Implementation Details about Changesets Evolution

/!\ This page is intended for developers.

For a user perspective, have a look at the ChangesetEvolution page.

Obsstore Format

Markers are stored in an append-only file, '.hg/store/obsstore'.

V1 (current) Format

(see the inline documentation for the latest data)

quick summary

  • <number-of-successors(=N)><metadata-length(=M)><bit-field><precursor>(<successor>*N)<metadata>

  • B, I, B, 20s, (20s*N), s*M

longer explanation

The file starts with a version header:

  • 1 unsigned byte: version number, starting at zero.

The header is followed by the markers. Each marker is made of:

  • 1 unsigned byte: number of new changesets "N", can be zero.
  • 1 unsigned 32-bit integer: metadata size "M" in bytes.
  • 1 byte: a bit field. It is reserved for flags used in common obsolete marker operations, to avoid repeated decoding of metadata entries.
  • 20 bytes: obsoleted changeset identifier.
  • N*20 bytes: new changeset identifiers.
  • M bytes: metadata as a sequence of nul-terminated strings. Each string contains a key and a value, separated by a colon ':', without additional encoding. Keys cannot contain '\0' or ':' and values cannot contain '\0'.
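
The layout above maps fairly directly onto Python's struct module. The following is a minimal sketch, not the actual Mercurial code: the helper names are hypothetical, and the big-endian byte order is an assumption, since this page does not state it.

```python
import struct

# Fixed-size part of a V1 marker, matching "B, I, B, 20s" above.
# ASSUMPTION: big-endian without padding (">"); this page does not say.
FIXED_FORMAT = ">BIB20s"
FIXED_SIZE = struct.calcsize(FIXED_FORMAT)  # 1 + 4 + 1 + 20 = 26 bytes
NODE_SIZE = 20


def encode_meta(meta):
    """Encode {key: value} (bytes) as nul-terminated 'key:value' strings."""
    chunks = []
    for key, value in sorted(meta.items()):
        if b"\0" in key or b":" in key:
            raise ValueError("invalid metadata key: %r" % (key,))
        if b"\0" in value:
            raise ValueError("invalid metadata value: %r" % (value,))
        chunks.append(key + b":" + value + b"\0")
    return b"".join(chunks)


def decode_meta(data):
    """Inverse of encode_meta; values may themselves contain ':'."""
    meta = {}
    for entry in data.split(b"\0"):
        if entry:
            key, value = entry.split(b":", 1)
            meta[key] = value
    return meta


def encode_marker(precursor, successors, flags=0, meta=None):
    """Serialize one marker (hypothetical helper)."""
    nodes = [precursor] + list(successors)
    if any(len(node) != NODE_SIZE for node in nodes):
        raise ValueError("changeset identifiers must be 20 bytes")
    metadata = encode_meta(meta or {})
    fixed = struct.pack(
        FIXED_FORMAT, len(successors), len(metadata), flags, precursor
    )
    return fixed + b"".join(successors) + metadata


def decode_marker(data, offset=0):
    """Return ((precursor, successors, flags, meta), next_offset)."""
    nsucs, msize, flags, precursor = struct.unpack_from(
        FIXED_FORMAT, data, offset
    )
    offset += FIXED_SIZE
    successors = [
        data[offset + i * NODE_SIZE:offset + (i + 1) * NODE_SIZE]
        for i in range(nsucs)
    ]
    offset += nsucs * NODE_SIZE
    meta = decode_meta(data[offset:offset + msize])
    return (precursor, successors, flags, meta), offset + msize


def decode_obsstore(data):
    """Read the version byte, then every marker until the end of the data."""
    version = struct.unpack_from(">B", data, 0)[0]
    offset, markers = 1, []
    while offset < len(data):
        marker, offset = decode_marker(data, offset)
        markers.append(marker)
    return version, markers
```

Because the file is append-only, a reader only has to walk the markers in order. The metadata size "M" is what lets it skip to the next marker without parsing the metadata strings.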

V2 (proposed) Format

motivation

There are two extra pieces of information we would like to see in a second version of the format, plus a possible extension of the bit field:

  • date: There is currently *always* a date in the metadata, so storing it explicitly would be more space efficient. It would also make it possible to access the date quickly for sorting purposes (no use case yet, but it is not crazy to think about it).
  • parents: When a changeset is pruned (obsoleted, no successors) we need to record its parents. This is necessary to link the marker chain to the push/pull operation it is relevant to.
  • bit field: We may want to extend the bit field to 2 bytes. We currently use 1 flag and can see use cases for 3-5 others (tracking the type of changes introduced by the rewriting: desc, patches, content, etc.), so we are running short.

possible change

Date:

  • The date will be a 64-bit integer (seconds since epoch) followed by a 16-bit integer (time zone).
  • It would make sense to put the date at the front of each marker. That would give marker sorting some semantic meaning.
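
As a rough illustration of the date field, here is a sketch using struct. Several details are assumptions that this page does not settle: big-endian byte order, unsigned seconds, and a signed time zone offset expressed in minutes (an offset in seconds would not fit in 16 bits). The helper names are hypothetical.

```python
import struct

# ASSUMPTIONS: big-endian, unsigned 64-bit seconds since epoch (Q),
# signed 16-bit time zone offset in minutes (h).
DATE_FORMAT = ">Qh"
DATE_SIZE = struct.calcsize(DATE_FORMAT)  # 8 + 2 = 10 bytes


def encode_date(seconds, tz_minutes):
    """Pack the date field that would open each marker."""
    return struct.pack(DATE_FORMAT, seconds, tz_minutes)


def decode_date(data, offset=0):
    """Return (seconds, tz_minutes) read at the given offset."""
    return struct.unpack_from(DATE_FORMAT, data, offset)


# With the date at the front, sorting the raw records sorts them by date.
# The trailing bytes stand in for the rest of each marker.
records = [
    encode_date(1407975603, 0) + b"rest-of-marker-b",
    encode_date(1407975590, 120) + b"rest-of-marker-a",
    encode_date(1500000000, -300) + b"rest-of-marker-c",
]
ordered_seconds = [decode_date(record)[0] for record in sorted(records)]
```

The sorting property only holds under these assumptions: with a little-endian or signed encoding, comparing the raw bytes would no longer match chronological order.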

Parents:

We have multiple options for storing parents:

  1. Having an explicit field similar to successors (one byte to know how many parents, then the parents).
  2. Having an explicit field but storing the number of parents in the bit field (since we never have more than 2 parents).
  3. Using the successors field. A negative number of successors means it is a prune.

Option (3) saves the most space but prevents us from storing parent information for more changesets if needed in the future (we do not have a final exchange plan yet).

Options (1) and (2) take 2 to 8 bits more than (3) but are more flexible.
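
To make the trade-off concrete, here is a sketch of how the number of parents could be encoded under each option. All names are hypothetical, and several points are assumptions: the bit positions used for option (2), and, for option (3), the idea that the negative count is minus the number of parents, with a pruned changeset always recording 1 or 2 parents.

```python
import struct

# ASSUMPTION for option (2): bit 0 is taken by an existing flag and the
# parent count lives in bits 1-2. These positions are purely illustrative.
PARENT_SHIFT = 1
PARENT_MASK = 0b11 << PARENT_SHIFT


def option1_field(parents):
    """Option (1): one count byte, then the parents (8 extra bits)."""
    return struct.pack(">B", len(parents)) + b"".join(parents)


def option2_set(flags, nparents):
    """Option (2): store the parent count in the bit field (2 extra bits)."""
    if not 0 <= nparents <= 2:
        raise ValueError("a changeset never has more than 2 parents")
    return (flags & ~PARENT_MASK) | (nparents << PARENT_SHIFT)


def option2_get(flags):
    """Read the parent count back from the bit field."""
    return (flags & PARENT_MASK) >> PARENT_SHIFT


def option3_count(successors, parents):
    """Option (3): reuse the successor count; negative means prune."""
    if successors:
        return struct.pack(">b", len(successors))
    # ASSUMPTION: -len(parents), so a prune must record 1 or 2 parents.
    if not 1 <= len(parents) <= 2:
        raise ValueError("a prune needs 1 or 2 parents in this sketch")
    return struct.pack(">b", -len(parents))
```

Note that option (3) would turn the count into a signed byte, which halves the maximum number of successors (127 instead of 255) and leaves no room to record parents on a marker that also has successors.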

bit field

If we extend the bit field to 2 bytes, it makes sense to use option (2) for storing parents.

ChangesetEvolutionDevel (last edited 2020-05-29 08:03:48 by aayjaychan)