=== recent idea ===

One of the big issues is the necessity to know the actual changeset graph to compute the '''"prune marker of direct children on this changeset"''' set.

A radical idea is to store parent information of the precursors in all markers. But this means a significant inflation of the size of markers. Matt Mackall does not like the idea (because of the extra size).

Pierre-Yves David suspects we could make all necessary computations by storing parent information in prune markers only:

According to the definition of relevant markers, we only care about a marker from a changeset if:

1) the successor is in the pushed set (actually pushed or already common)
2) the successor is a precursor of a marker we care about
3) this is a prune marker and we care about the parent of the pruned changeset

This means we never need parent information anywhere other than in prune markers.
By this definition we include split markers where only part of the successors are pushed. This is a crazy case that we are unsure about.


Obsolescence Markers exchange

List of cases and expected behavior when exchanging obsolescence markers

/!\ This page is intended for developers

1. Idea currently being experimented with (within the evolve extension)

1.1. Exchanged markers changes

1. All markers relevant to changesets common between source and destination are exchanged.
2. Markers relevant to a changeset are:

  • markers that use this changeset as a successor
  • prune markers of direct children of this changeset
  • recursive application of the two rules above to the precursors stored in those markers

By this definition we include split markers where only part of the successors are pushed. This is a crazy case that we are unsure about.
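
The rules above can be sketched as a small worklist traversal. This is a minimal illustration only, not the evolve extension's code: the `Marker` tuple, its field names and `relevant_markers` are hypothetical, and it assumes each prune marker records the parents of the pruned changeset.

```python
from collections import namedtuple

# Hypothetical in-memory marker:
#   prec    -> the precursor node
#   succs   -> tuple of successor nodes (empty for a prune marker)
#   parents -> tuple of parents of `prec` (assumed to be set on prune markers)
Marker = namedtuple("Marker", ["prec", "succs", "parents"])


def relevant_markers(markers, node):
    """Return the set of markers relevant to `node` under the rules above."""
    relevant = set()
    seen = set()
    pending = [node]
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        for marker in markers:
            # Rule 1: the marker uses `current` as a successor.
            is_successor = current in marker.succs
            # Rule 2: the marker prunes a direct child of `current`.
            prunes_child = not marker.succs and current in marker.parents
            if is_successor or prunes_child:
                relevant.add(marker)
                # Rule 3: apply both rules again to the marker's precursor.
                pending.append(marker.prec)
    return relevant
```

With one marker recording that A' succeeds A and one prune marker for B (a child of A), `relevant_markers(markers, "A'")` selects both, which is the exchange expected in case C.2.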

1.2. format content changes

The above definition requires knowledge of the actual changeset graph to compute the "prune marker of direct children on this changeset" set.

As changesets referenced by markers may be unknown locally, we need to encode this graph information in the markers themselves (that is, store the parent nodes of the precursors alongside the precursors).

This would significantly inflate the size of markers.

So instead of storing it for all markers, we only store parent information for prune markers (precursors with no successors). This is sufficient, as we never look at parent information anywhere other than in the pruned changeset case.
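
To get a feel for the size argument, the sketch below counts only the bytes spent on node ids in one marker. It assumes 20-byte node ids (Mercurial's SHA-1 nodes) and ignores flags, dates and metadata; `node_bytes` and its parameters are hypothetical names, not the actual on-disk format.

```python
NODE_SIZE = 20  # a Mercurial node id is a 20-byte SHA-1 digest


def node_bytes(succs, parents, store_parents_always):
    """Bytes used by node ids in one marker (all other fields ignored).

    `succs` and `parents` are sequences of node ids; a prune marker has
    an empty `succs`.
    """
    is_prune = len(succs) == 0
    keep_parents = store_parents_always or is_prune
    # One precursor, plus the successors, plus the parents when stored.
    count = 1 + len(succs) + (len(parents) if keep_parents else 0)
    return count * NODE_SIZE
```

Under these assumptions, a plain rewrite marker with one successor and one parent takes 40 bytes of node ids when parents are kept for prune markers only, against 60 bytes when parents are stored everywhere.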

1.3. Markers discovery changes

We are building an "obsolescence hash tree".

Every node is associated with a hash (called obshash from now on). This obshash is a SHA-1 hash of "obshash of the parent + all relevant obsmarkers".

We can do standard discovery (the same as the changeset one) on that.
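
A minimal sketch of such a hash tree follows. It is an illustration under assumptions, not the extension's implementation: markers are taken to be already serialized to bytes, they are hashed in sorted order so the result does not depend on storage order, and the names `compute_obshashes`, `parents_of` and `markers_of` are hypothetical.

```python
import hashlib


def compute_obshashes(topo_nodes, parents_of, markers_of):
    """Map each node to sha1(obshash of its parents + its relevant markers).

    topo_nodes -- nodes ordered so that parents come before children
    parents_of -- dict: node -> list of parent nodes
    markers_of -- dict: node -> iterable of serialized relevant markers (bytes)
    """
    hashes = {}
    for node in topo_nodes:
        digest = hashlib.sha1()
        for parent in parents_of.get(node, ()):
            digest.update(hashes[parent])
        # Sorting is an assumption made here to get a stable result.
        for raw in sorted(markers_of.get(node, ())):
            digest.update(raw)
        hashes[node] = digest.digest()
    return hashes
```

Because a node's hash feeds into the hashes of its children, any difference in the markers of one node shows up in the obshash of every descendant, while its ancestors stay comparable.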

1.3.1. Known limitations

Some of those limitations can be overcome quite easily; others will be harder to fix.

1.3.1.1. Does not detect common subset

If the destination has a superset of the source markers, this will be detected as "different markers on both sides" and the source will send everything to the destination again. No new markers will be added, the obshash will still be different, and this will happen again and again with all future exchanges.

This happens when:

  • You push to a server with more information than you.
  • You pull from a server with less information than you.

The pull case will be very common. For example, Mercurial contributors are likely to have markers that apply to changesets in selenic.com/hg/ but that have never made, and never will make, it to selenic.com.

Moreover, when you have this extra-information mismatch for one node, you'll get the same one for all its descendants, disabling the discovery benefit for a whole subtree.
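
The superset problem can be reproduced with a toy exchange on a single node. This is a sketch under assumptions: markers are plain byte strings, and `obshash` and `push` are hypothetical stand-ins for the real discovery and exchange.

```python
import hashlib


def obshash(parent_hash, markers):
    """SHA-1 over the parent obshash followed by the sorted markers."""
    digest = hashlib.sha1(parent_hash)
    for raw in sorted(markers):
        digest.update(raw)
    return digest.digest()


def push(source, dest):
    """Send every source marker whenever the two hashes differ.

    Returns the set of markers that went over the wire; `dest` is
    updated in place.
    """
    if obshash(b"", source) == obshash(b"", dest):
        return set()
    sent = set(source)
    dest |= sent
    return sent


source = {b"m1"}
dest = {b"m1", b"m2"}       # destination already holds a superset
first = push(source, dest)  # m1 is resent although dest already has it
second = push(source, dest) # and resent again: the hashes still differ
```

Nothing new reaches the destination, so the hashes never converge and every later exchange repeats the same transfer.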

1.3.1.2. Fragile to prune of children of old changesets

Chains that directly lead to a changeset (as a successor) should be fairly stable. It's possible for someone to come up with an old marker at the precursor end of the chain, but it should be fairly rare.

However, the obshash also contains data about the pruned children, and it will be much more common to see people adding markers that prune a child of an old changeset. E.g. if someone leaves a project for 6 months, they will probably prune multiple draft changesets when they come back.

This means invalidation of the obshash of an old changeset and all its descendants, leading to the resend of all markers applying to 6 months' worth of changesets.

1.3.1.3. Sending the whole chain all the time

This discovery only tells whether the whole chain is known or not, which means that each newly created changeset that adds a marker to the chain will cause everything in the chain to be resent. For changesets that get rewritten a lot, this will be an issue. For changesets that never become public (-not- recommended, but you know… users…) it will be worse.

Note that in practice the median chain length is fairly low (2 for mercurial-devel repos).
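
As a rough illustration of the cost, the sketch below counts markers transferred when a changeset is exchanged after each rewrite. It assumes every rewrite adds exactly one marker to the chain; `markers_sent` is a hypothetical helper, not a measurement.

```python
def markers_sent(rewrites, whole_chain=True):
    """Total markers transferred over `rewrites` successive exchanges.

    With whole_chain=True every exchange resends the full chain so far;
    with whole_chain=False only the newly added marker is sent.
    """
    total = 0
    for chain_length in range(1, rewrites + 1):
        total += chain_length if whole_chain else 1
    return total
```

Under these assumptions a chain of length 2 costs 3 marker transfers instead of 2, while a changeset rewritten 10 times costs 55 instead of 10.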

2. Graph Outline

    ○ ← a changeset,
    ◔ ← changeset being pushed
    ● ← changeset that exists remotely before the push.
    ◕ ← changeset that exists remotely but is not selected by the push
    ⊗ ← pruned changeset
    ø ← obsolete changeset with a precursors
    ◌ ← changeset that does not exist locally but is present in marker history
    ✕ ← changeset that does not exist locally but is pruned in marker history
    ⇠ ← obsolescence marker from that point (if not pointing to anything, this means we do not care about what it points to)

3. A. Simple Case

3.1. A.1 pushing a single head

3.1.1. A.1.1 pushing a single head

    ⇠◔ A
     |
     ●  O

Marker exist from:

  • A

Command run:

  • hg push -r A
  • hg push

Expected exchange:

  • chain from A

3.1.2. A.1.2 pushing multiple changesets into a single head

     ◔ B
     |
    ⇠◔ A
     |
     ● O

Marker exist from:

  • A

Command run:

  • hg push -r B
  • hg push

Expected exchange:

  • chain from A

3.2. A.2 Two heads

    ⇠○ B
  ⇠◔ | A
   |/
   ● O

Marker exist from:

  • A
  • B

Command run:

  • hg push -r A

Expected exchange:

  • chain from A

Expected Exclude:

  • chain from B

3.3. A.3 new branch created

  B' ○⇢ø B
     | |
     \Aø⇠◔ A'
      \|/
       ● O

Marker exist from:

  • Aø⇠○ A'

  • Bø⇠○ B'

Command run:

  • hg push -r A

Expected exchange:

  • chain from A

Expected Exclude:

  • chain from B

If A and B are remotely known, we should expect:

  • hg push will complain about the new head

  • hg push should complain about unstable history creation

3.4. A.4 Push in the middle of the obsolescence chain

(Where we show that we should not push the marker without the successors)

  B ◔
    |
  A⇠ø⇠○ A'
    |/
    ● O

Marker exist from:

  • Aø⇠○ A'

  • chain from A

Command run:

  • hg push -r B

Expected exchange:

  • Chain from A

Expected Exclude:

  • Aø⇠○ A'

3.5. A.5 partial reordering

  B ø⇠⇠
    | ⇡
  A ø⇠⇠⇠○ A'
    | ⇡/
    | ○ B'
    |/
    ● O

Marker exist from:

  • Aø⇠○ A'

  • Bø⇠○ B'

Command run:

  • hg push -r B

Expected exchange:

  • Bø⇠○ B'

Expected Exclude:

  • Aø⇠○ A'

3.6. A.6 between existing changeset

  A ◕⇠● B
    |/
    ● O

Marker exist from:

  • A◕⇠● B

Command run:

  • hg push -r B
  • hg push

Expected exchange:

  • A◕⇠● B

3.7. A.7 Non targeted common changeset

   ⇠◕ A
    |
    ● O

Marker exist from:

  • Chain from A

Command run:

  • hg push -r O

Expected exchange:

  • ø

Expected exclude:

  • Chain from A

4. B. Deletion Case

Most B case can be read with

4.1. B.1 Pruned changeset atop the pushed set

    ⊗ B
    |
    ◔ A
    |
    ● O

Marker exist from:

  • B (prune)

Command run:

  • hg push -r A
  • hg push

Expected exchange:

  • prune marker for B

4.2. B.2 Pruned changeset on head. nothing pushed

    ⊗ A
    |
    ● O

Marker exist from:

  • A (prune)

Command run:

  • hg push -r O
  • hg push

Expected exchange:

  • prune marker for A

4.3. B.3 Pruned changeset on non-pushed part of the history

  ⊗ C
  |
  ○ B
  | ◔ A
  |/
  ● O

Marker exist from:

  • C (prune)

Command run:

  • hg push -r A
  • hg push

Expected exchange:

  • ø

Expected Exclude:

  • chain from B

4.4. B.4 Pruned changeset on common part of history

  ⊗ C
  | ● B
  | |
  | ● A
  |/
  ● O

Marker exist from:

  • C (prune)

Command run:

  • hg push -r B
  • hg push

Expected exchange:

  • prune for C

4.5. B.5 Push of a child of a changeset whose successor is pruned

This case mirrors A.4, with a pruned successor changeset.

  B ◔
    |
  A⇠ø⇠⊗ A'
    |/
    ● O

Marker exist from:

  • Aø⇠○ A'

  • chain from A
  • A'

Command run:

  • hg push -r B

Expected exchange:

  • Aø⇠○ A'

  • chain from A
  • A'

Extra Note:

  • I'm not totally happy about this case and I believe some more complicated graphs can result in behavior quite confusing for the user (if some tool creates prune markers in the middle of a valid chain)

4.6. B.6 Pruned changeset with ancestors not in pushed set

  B ø⇠⊗ B'
    | |
  A ○ |
    |/
    ● O

Marker exist from:

  • Bø⇠⊗ B'

  • B' prune

Command run:

  • hg push -r O

Expected exchange:

  • Bø⇠⊗ B'

  • B' prune

4.7. B.7 Prune on non targeted common changeset

    ⊗ B
    |
    ◕ A
    |
    ● O

Marker exist from:

  • B (prune)

Command run:

  • hg push -r O

Expected exchange:

  • ø

5. C. Advanced Case

5.1. C.1 Multiple pruned changesets atop each other

  ⊗ B
  |
  ⊗ A
  |
  ● O

Marker exist from:

  • A (prune)
  • B (prune)

Command run:

  • hg push -r O
  • hg push

Expected exchange:

  • A (prune)
  • B (prune)

5.2. C.2 Pruned changeset on precursors

  B ⊗
    |
  A ø⇠◔ A'
    |/
    ● O

Marker exist from:

  • A' succeed to A
  • B (prune)

Command run:

  • hg push -r A'
  • hg push

Expected exchange:

  • A ø⇠o A'

  • B (prune)

5.3. C.3 Pruned changeset on precursors of another pruned one

  B ⊗
    |
  A ø⇠⊗ A'
    |/
    ● O

Marker exist from:

  • A' succeed to A
  • A' (prune)
  • B (prune)

Command run:

  • hg push -r A'
  • hg push

Expected exchange:

  • A ø⇠⊗ A'

  • A' (prune)
  • B (prune)

5.4. C.4 multiple successors, one is pruned

Another case where prunes are confusing? (A is killed without its successors being pushed)

(could be a split or a divergence; if a split, see the Z section)

       A
   B ○⇢ø⇠⊗ C
      \|/
       ● O

Marker exist from:

  • A ø⇠○ B

  • A ø⇠○ C

  • C (prune)

Command run:

  • hg push -r O

Expected exchange:

  • A ø⇠○ C

  • C (prune)

Expected exclude:

  • A ø⇠○ B

6. D. Partial Information Case

From here on, we have changesets missing from the repo but still referenced in obsolescence markers. This has an impact on the knowledge we have of the graph topology.

Almost any of the above cases could be reused here: just drop local knowledge of some/all obsolete changesets.

6.1. D.1 Pruned changeset based on a missing precursor of something we push

  B ⊗
    |
  A ◌⇠◔ A'
     /
    ● O

Marker exist from:

  • A' succeed to A
  • B (prune)

Command run:

  • hg push -r A'
  • hg push

Expected exchange:

  • A ø⇠o A'

  • B (prune)

6.2. D.2 missing prune target (prune in "pushed set")

  A ø⇠✕ A'
    |/
    ● O

Marker exist from:

  • A' succeed to A
  • A' (prune)

Command run:

  • hg push

Expected exchange:

  • A ø⇠o A'

  • A' (prune)
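
Case D.2 is where parent information stored in the prune marker matters: A' is absent from the local changelog, so the graph alone cannot tell that it was a child of O. A minimal sketch, with hypothetical helper names and plain dicts standing in for the changelog and the markers:

```python
def pruned_children_from_graph(node, changelog_parents, pruned):
    """Pruned direct children of `node`, using only locally known changesets."""
    return {n for n in pruned if node in changelog_parents.get(n, ())}


def pruned_children_from_markers(node, prune_parents):
    """Pruned direct children of `node`, using parents stored in prune markers."""
    return {n for n, parents in prune_parents.items() if node in parents}


# Case D.2: O and A are known locally, A' is only known through markers.
changelog_parents = {"O": (), "A": ("O",)}
pruned = {"A'"}
# Assumption: the prune marker for A' recorded O as its parent.
prune_parents = {"A'": ("O",)}

from_graph = pruned_children_from_graph("O", changelog_parents, pruned)
from_markers = pruned_children_from_markers("O", prune_parents)
```

Here `from_graph` comes back empty while `from_markers` contains A', so only the marker-based lookup lets the prune of A' be attached to O and exchanged.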

6.3. D.3 missing prune target (prune Not in "pushed set")

(this is one of the cases where it will be hard to be non-confusing)

  A ø⇠✕ A'
    | |
    | ○ B
    |/
    ● O

Marker exist from:

  • A' succeed to A
  • A' (prune)

Command run:

  • hg push -r O

(shall we account for a secret B?)

Expected exchange:

  • nothing?

6.4. D.4 Unknown changeset in between known ones

Mostly a clarification case

    ø⇠◌⇠○
    | |/
    | ◔
    |/
    ● O

Should be treated as A.3 case:

    ø⇠○
    | |
    | ◔
    |/
    ● O

6.5. D.5 Unknown changeset in between known one

7. Z. Crazy case

Cases where I'm not very sure about what we should do

7.1. Z.1 partial push of split

   D'○⇢ø D
     | | A
   B ○⇢ø⇠◔ C
      \|/
       ● O

Marker exist from:

  • A ø⇠⚭ (B,C) (split)

  • D ø⇠○ D'

Command run:

  • hg push -r C

Expected exchange:

  • We should probably send the whole marker anyway. But what about things related to B's children?
  • A ø⇠⚭ (B,C) (split)

Expected exclude:

  • D ø⇠○ D'

CEDObsmarkersExchange (last edited 2018-03-04 20:25:36 by BorisFeld)