Differences between revisions 12 and 34 (spanning 22 versions)
Revision 12 as of 2014-03-07 23:59:45
Size: 1444
Editor: DavidSoria
Comment:
Revision 34 as of 2018-02-10 00:05:58
Size: 2056
Editor: AviKelman
Comment:
Line 1: Line 1:
<<Include(A:dev)>> {{{#!wiki caution
Line 3: Line 3:
This page describes the current plan to get a more modern and complete bundle format. (for old content of this page check [[BundleFormatHG19]]) This information was derived by reverse engineering. Some details may be incomplete. Hopefully someone with intimate familiarity with the code can improve it.}}}
Line 5: Line 5:
<<TableOfContents>> The v2 bundle file format is in practice quite similar to v1 (see BundleFormat), in that it comprises a file header followed by a changegroup, but it differs in a few significant ways.
Line 7: Line 7:
(current content is copy pasted from 2.9 sprint note) == Practical differences from v1 bundles ==
 * The file has a more verbose multi-stage ASCII header containing key:value pairs. (more below)
 * Zstandard compression (new default) also supported.
 * Uses version 2 deltagroup headers instead of version 1. (see the spec at [[Topic:internals.changegroups|help internals.changegroups]])
 * Everything after the header is shredded into N-byte chunks after it is assembled (N is a parameter defined in the source code).
Line 9: Line 13:
=== New bundle format === == Reading the header ==
Line 11: Line 15:
* lightweight
* new manifest
* general delta
* bookmarks
* phase boundaries
* obsolete markers
* >sha1 support
* pushkey
* extensible for new features (required and optional)
* progress information
* resumable?
* transaction commit markers?
    
It's possible to envision a format that sends a change, its manifest, and filenodes in each chunk rather than sending all changesets, then all manifests, etc.
capabilities
=== stage 1 ===
|| 'HG20' || Compression Chunk || rest of file ||
Line 27: Line 18:
=== New header === Compression Chunk will be either null or contain the ASCII 'Compression=XX' where XX is a code indicating which decompression to use on the rest of the file.
Line 29: Line 20:
{{{#!C
type Header struct {
    length uint32
    lNode byte
    node [lNode]byte
=== stage 2 ===
|||| rest of file from stage 1 ||
|| Parameters Chunk || shredded changegroup (and possibly other sections?) ||
Line 35: Line 24:
    // if empty (lP1 ==0) then default to previous node in the stream
    lP1 byte
    p1 [lP1]byte
Parameters Chunk contains (among possibly other things?) the fact that the file contains a changegroup ('\x0bCHANGEGROUP'), a null chunk, and then a complex nested sequence of two parameter categories. The nested sequence contains, first, indicators for how many key:value pairs are in the first category, followed by how many pairs are in the second category, followed by the length of an ASCII key, followed by the length of its ASCII value (repeated for all keys and values).
Line 39: Line 26:
    // if empty, nullrev
    lP2 byte
    p2 [lP2]byte

    // if empty, self (for changelogs)
    lLinknode byte
    linknode [lLinknode]byte

    // if empty, p1
    lDeltaParent byte
    deltaParent [lDeltaParent]byte
}
}}}
We'll modify the existing changegroup type so it can pretend to be a new changegroup that just has a variety of empty fields. Progress information fields might be optional.



----
CategoryNewFeatures
Example Parameters Chunk:
|| chunk length |||| description of contents || #section1 parameters || #section2 parameters || len(key1),len(value1) || len(key2),len(value2) || key1 || value1 || key2 || value2||
|| 4 bytes || \x0bCHANGEGROUP || 4 bytes null || \x01 || \x01 || \x07\x02 || \t\x01 || version || 02 || nbchanges || 7 ||

This information was derived by reverse engineering. Some details may be incomplete. Hopefully someone with intimate familiarity with the code can improve it.

The v2 bundle file format is in practice quite similar to v1 (see BundleFormat), in that it comprises a file header followed by a changegroup, but it differs in a few significant ways.

Practical differences from v1 bundles

  • The file has a more verbose, multi-stage ASCII header containing key:value pairs (described under "Reading the header").
  • Zstandard compression (the new default) is also supported.
  • It uses version 2 deltagroup headers instead of version 1 (see the spec at help internals.changegroups).
  • Everything after the header is shredded into N-byte chunks after it is assembled (N is a parameter defined in the source code).
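
The shredding in the last point can be illustrated with a small Python sketch. The framing used here is an assumption that this page does not state: each piece is prefixed with a 4-byte big-endian length, and an empty piece marks the end. The function names `shred` and `unshred` and the chunk size are hypothetical, chosen only for illustration.

```python
import struct


def shred(payload, n):
    """Split payload into pieces of at most n bytes.

    Assumed framing: 4-byte big-endian length before each piece,
    and a zero-length piece as terminator.
    """
    out = []
    for start in range(0, len(payload), n):
        piece = payload[start:start + n]
        out.append(struct.pack(">i", len(piece)) + piece)
    out.append(struct.pack(">i", 0))  # terminator
    return b"".join(out)


def unshred(data):
    """Reassemble the payload produced by shred()."""
    pieces = []
    pos = 0
    while True:
        (size,) = struct.unpack_from(">i", data, pos)
        pos += 4
        if size == 0:
            break
        pieces.append(data[pos:pos + size])
        pos += size
    return b"".join(pieces)
```

Under these assumptions, `unshred(shred(data, n))` returns `data` unchanged for any positive `n`. Real bundles may use length values that this sketch does not handle.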

Reading the header

stage 1

|| 'HG20' || Compression Chunk || rest of file ||

The Compression Chunk is either null or contains the ASCII string 'Compression=XX', where XX is a code indicating which decompression to apply to the rest of the file.
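
A minimal Python sketch of reading stage 1 follows. It assumes the Compression Chunk is framed as a 4-byte big-endian length followed by that many ASCII bytes, with a length of zero standing for the null chunk; this framing is inferred from the "chunk length: 4 bytes" column in the example further down rather than stated here. The function name `read_stage1` is hypothetical, and 'BZ' in the usage note is only a placeholder code.

```python
import io
import struct

MAGIC = b"HG20"


def read_stage1(stream):
    """Read the magic and the Compression Chunk from a binary stream.

    Returns the compression code (the XX in 'Compression=XX'),
    or None when the chunk is null. The stream is left positioned
    at the start of "rest of file".
    """
    magic = stream.read(4)
    if magic != MAGIC:
        raise ValueError("not a v2 bundle: %r" % magic)
    # Assumption: 4-byte big-endian length prefix.
    (length,) = struct.unpack(">i", stream.read(4))
    params = stream.read(length).decode("ascii")
    compression = None
    for item in params.split():
        key, _, value = item.partition("=")
        if key == "Compression":
            compression = value
    return compression
```

For example, a stream starting with `b"HG20"`, a length of 14, and `b"Compression=BZ"` would yield `"BZ"`, while `b"HG20"` followed by four null bytes would yield `None`.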

stage 2

|||| rest of file from stage 1 ||
|| Parameters Chunk || shredded changegroup (and possibly other sections?) ||

The Parameters Chunk records (among possibly other things?) that the file contains a changegroup ('\x0bCHANGEGROUP', where the leading \x0b appears to be the length of the name, 11), followed by a null chunk (four null bytes in the example) and then a nested sequence describing two categories of parameters. The sequence starts with two counts: the number of key:value pairs in the first category, then the number in the second. Next comes, for each pair, the length of its ASCII key followed by the length of its ASCII value. The keys and values themselves follow in the same order, as the example shows.

Example Parameters Chunk:

|| chunk length |||| description of contents || #section1 parameters || #section2 parameters || len(key1),len(value1) || len(key2),len(value2) || key1 || value1 || key2 || value2 ||
|| 4 bytes || \x0bCHANGEGROUP || 4 bytes null || \x01 || \x01 || \x07\x02 || \t\x01 || version || 02 || nbchanges || 7 ||
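
The example can be decoded with a short Python sketch. It follows the field order of the table and assumes that the chunk length is a 4-byte big-endian unsigned integer and that every count and length inside the chunk is a single byte; neither assumption is confirmed by this page. The function name `parse_parameters_chunk` and the constant `EXAMPLE_BODY` are hypothetical.

```python
import struct

# Body of the example chunk, field by field as in the table.
EXAMPLE_BODY = (
    b"\x0bCHANGEGROUP"      # name length (11) + name
    b"\x00\x00\x00\x00"     # 4 bytes null
    b"\x01\x01"             # pair counts for section 1 and section 2
    b"\x07\x02"             # len("version"), len("02")
    b"\t\x01"               # len("nbchanges"), len("7")
    b"version" b"02" b"nbchanges" b"7"
)
EXAMPLE_CHUNK = struct.pack(">I", len(EXAMPLE_BODY)) + EXAMPLE_BODY


def parse_parameters_chunk(data):
    """Return (name, section1_pairs, section2_pairs)."""
    (length,) = struct.unpack_from(">I", data, 0)
    body = data[4:4 + length]
    pos = 0
    name_len = body[pos]
    pos += 1
    name = body[pos:pos + name_len].decode("ascii")
    pos += name_len
    pos += 4  # skip the 4 null bytes
    n1, n2 = body[pos], body[pos + 1]
    pos += 2
    # One (key length, value length) byte pair per parameter.
    sizes = []
    for _ in range(n1 + n2):
        sizes.append((body[pos], body[pos + 1]))
        pos += 2
    # Keys and values follow, in the same order as the sizes.
    pairs = []
    for key_len, value_len in sizes:
        key = body[pos:pos + key_len].decode("ascii")
        pos += key_len
        value = body[pos:pos + value_len].decode("ascii")
        pos += value_len
        pairs.append((key, value))
    return name, pairs[:n1], pairs[n1:]
```

Calling `parse_parameters_chunk(EXAMPLE_CHUNK)` separates the `version` pair (section 1) from the `nbchanges` pair (section 2).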

BundleFormat2 (last edited 2018-02-10 00:05:58 by AviKelman)