Obsolescence Markers exchange
List of cases and expected behavior when exchanging obsolescence markers
This page is intended for developers
Contents
- Idea currently being experimented with (within the evolve extension)
- Graph Outline
- A. Simple Case
- B. Deletion Case
- B.1 Pruned changeset atop the pushed set
- B.2 Pruned changeset on head. nothing pushed
- B.3 Pruned changeset on non-pushed part of the history
- B.4 Pruned changeset on common part of history
- B.5 Push of a child of a changeset whose successor is pruned
- B.6 Pruned changeset with ancestors not in pushed set
- B.7 Prune on non targeted common changeset
- C. Advanced Case
- D. Partial Information Case
- Z. Crazy case
1. Idea currently being experimented with (within the evolve extension)
1.1. Exchanged markers changes
This discusses the behavior part of the exchange: what do we exchange, and when. After experimenting with this for two years, we are happy with it.
1. All markers relevant to the changesets common to source and destination are to be exchanged
2. Markers relevant to a changeset are:
- markers that use this changeset as a successor
- prune markers of direct children of this changeset
- recursive application of the two rules above on the precursors stored in those markers
By this definition we include split markers where only part of the successors are pushed. This is a crazy case that we are unsure about.
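The three rules above can be sketched as follows. This is a hypothetical illustration using simplified (precursors, successors, parents) tuples, not Mercurial's actual marker format; the parents field is only meaningful for prune markers (markers with no successors).

```python
def relevant_markers(markers, nodes):
    """Return the markers relevant to ``nodes`` per the three rules above.

    ``markers``: iterable of (precursors, successors, parents) tuples,
    where ``parents`` is only set for prune markers (empty successors).
    """
    by_successor = {}
    prune_by_parent = {}
    for marker in markers:
        _, successors, parents = marker
        # rule 1 index: markers using a changeset as a successor
        for succ in successors:
            by_successor.setdefault(succ, []).append(marker)
        # rule 2 index: prune markers, keyed by the pruned changeset's parents
        if not successors:
            for parent in parents or ():
                prune_by_parent.setdefault(parent, []).append(marker)
    relevant = set()
    pending = list(nodes)
    while pending:
        node = pending.pop()
        for marker in by_successor.get(node, []) + prune_by_parent.get(node, []):
            if marker not in relevant:
                relevant.add(marker)
                # rule 3: recurse on the precursors stored in the marker
                pending.extend(marker[0])
    return relevant
```

With a chain A0 → A → A2 and a pruned child B of A, asking for the markers relevant to A2 pulls in the whole chain plus the prune marker, per the recursive rule.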
1.2. format content changes
This discusses the storage format supporting the behavior above in an efficient way. This change has been implemented in core and has given satisfaction so far.
The above definition requires knowledge of the actual changeset graph to compute the "prune markers of direct children of this changeset" set.
As changesets referenced by markers may be unknown locally, we need to encode this graph information in the markers themselves (i.e. storing the parent nodes of the precursors alongside the precursors).
This would significantly inflate the size of markers.
So instead of storing it for all markers, we only store parent information for prune markers (precursors with no successors). This is sufficient, as we never look at parent information except in the pruned changeset case.
1.3. Markers discovery changes
This discusses methods to get an efficient and fast exchange of the markers, restricted to the necessary data. The method currently being experimented with is known to be too weak for our needs.
1.3.1. Current Experiment: Changeset Tree approach (needs improvement)
We are building and tracking a set of hashes from obsolescence data. This set is based on the changeset graph.
Every node is associated with a hash (called obshash from now on). The obshash of C is a sha1 hash of "obshash of C's parents + all obsmarkers relevant to C".
Because we use hashes of the parents, we can do standard discovery (the same as the changeset one) on the result.
However, this use of the parents' hashes introduces severe drawbacks, some easy to overcome, others harder to fix.
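A minimal sketch of this obshash scheme, assuming simplified dict-based graph and marker inputs (the real encoding used by the evolve extension differs):

```python
import hashlib

def obshash(node, parents, relevant, _cache=None):
    """sha1 of "obshash of node's parents + all obsmarkers relevant to node".

    ``parents``: dict node -> list of parent nodes (roots map to []).
    ``relevant``: dict node -> sorted list of encoded relevant markers.
    """
    if _cache is None:
        _cache = {}
    if node in _cache:
        return _cache[node]
    sha = hashlib.sha1()
    # fold in the parents' obshash first, then the relevant markers
    for parent in parents.get(node, []):
        sha.update(obshash(parent, parents, relevant, _cache))
    for marker in relevant.get(node, []):
        sha.update(marker)
    _cache[node] = sha.digest()
    return _cache[node]
```

Because the parents' hashes are folded in, any marker change on an ancestor changes the obshash of every descendant, which is exactly the fragility discussed below.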
1.3.1.1. Lack of Subsets Detection
If the destination has a super-set of the source's markers, this will be detected as "different markers on both sides" and the source will send everything to the destination again. No new markers will be added, the obshash will still be different, and this will happen again and again for all future exchanges.
Common cases for this to happen "perpetually" are:
- You push to a server with more information than you.
- You pull from a server with less information than you.
The pull case will be very common. For example, Mercurial contributors are likely to have markers that apply to changesets in selenic.com/hg/ but that never made and will never make it to selenic.com.
Moreover, when you have this extra information mismatch for one node, you'll get the same one for all its descendants, disabling the discovery benefit for a whole subtree.
See the sub hashes tracking idea for a lead on how to solve this.
1.3.1.2. Fragile to prunes of children of old changesets (and pushes of forgotten ancient data)
A chain that directly leads to a changeset (as successors) should be fairly stable. It's possible for someone to come up with an old marker at the precursors end of the chain, but it should be fairly rare.
However, the obshash also contains data about the pruned children, and it will be much more common to see people adding markers that prune a child of an old changeset. E.g., if someone leaves a project for 6 months, they will probably prune multiple draft changesets when they come back.
This means invalidation of the obshash of an old changeset and all its descendants, leading to the resending of all markers applying to 6 months' worth of changesets.
See the graph slicing idea for a lead on how to solve this.
1.3.1.3. Sending Whole chain all the time
This discovery only tells whether the whole chain is known or not, which means that each new changeset created, adding a new marker to the chain, will resend everything in the chain. For changesets that got rewritten a lot, this will be an issue. For changesets that will never become public (-not- recommended, but you know… users…) it'll be worse.
Note that in practice the median chain length is fairly low (2 for the mercurial-devel repo).
1.3.2. Idea for Next Step Experiment
1.3.2.1. Graph slicing
The previous experiment basically reuses the same logic as the changeset discovery: defining a subset of revisions null::REV using REV as an identifier, and comparing the obsmarker hash for this subset. However, as the model is not append-only, this is a bit too fragile. So the idea would be to have more flexibility in the definition of the subset, for example START::REV, and do a wide → narrow search of the space. This means (1) dropping parent information from the hash, and (2) requesting hashes using the START, END pair instead of just END.
This would solve the issue with pruned changesets and newly pushed data on ancient changesets
Simplified explanation
Let's explain the idea on a linear history first, for simplicity.
We have a linear history with 128 changesets. We ask the hash of all obsmarkers relevant to changesets 0::127, then 0::63 and 64::127, then 0::31, 32::63, 64::95, 96::127, etc. At any time, if the hash for a slice matches, we can stop digging deeper; otherwise we keep subdividing. Of course, requests won't be made one at a time, and multiple requests for hashes of supersets and subsets would be batched in a single round trip.
For the purpose of caching and cross-repo convergence, the scheme for subdivision should be stable. If we have 5 changesets, we probe the following slices: 0::1, 2::3, 0::3, 0::4. When we grow to 6 changesets, we want to be able to reuse the slices 0::1, 2::3, 0::3. We introduce a new one, 4::5 (that new slice is "final" and will stay in use forever). We also add 0::5, which will stay in use as long as nothing is exchanged above 5.
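The subdivision scheme for a linear history can be sketched as follows; `standard_slices` is a hypothetical helper name, and the inclusive `(start, stop)` pairs stand in for `start::stop` slices:

```python
import math

def standard_slices(start, stop):
    """List the multi-element slices probed for the linear range
    start::stop (inclusive bounds), biggest slice first."""
    size = stop - start + 1
    if size <= 1:
        return []
    slices = [(start, stop)]
    # bottom part: the largest power of two strictly smaller than size
    # (or half of size when size is itself a power of two)
    bottom = 2 ** int(math.log2(size - 1))
    slices += standard_slices(start, start + bottom - 1)
    slices += standard_slices(start + bottom, stop)
    return slices
```

With 5 changesets this probes 0::4, 0::3, 0::1, 2::3; growing to 6 changesets reuses 0::3, 0::1 and 2::3, introduces the final slice 4::5, and replaces 0::4 with the new full slice 0::5, matching the example above.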
Having a stable scheme is important for 2 reasons:
1) Local Growth: By reusing as many of the same slices as possible (independently of the probed set), we allow efficient caching of the hash of all relevant markers. This is expected to be efficient, as the set of relevant markers of each slice will be "final" in most cases (though that can never be guaranteed), so once computed, chances for invalidation are low as the repository grows.
2) Cross Repositories Stability: Repositories discovering markers from each other will have various sizes and contents (including cases where content will not converge); having the same scheme for slicing maximizes cache reuse regardless of what other repos one is talking to.
Extra complexity in Mercurial
Mercurial has branching and merging, so we cannot simply use a 2^N slicing based on indexes as shown above. We have to build a slicing algorithm able to explore branches in an efficient and stable way. Keep in mind that branching might be introduced in the repository later, or even be known by only one side of the discovery, so "branch points" cannot be treated as useful boundaries.
1.3.2.2. Obsmarker Subset caching
The second major issue we face with the current experiment is the lack of detection of a common subset. If I pull from a repository having markers A+B but I have A+B+C locally, the hashes will be different and the discovery will not detect that nothing needs to be pulled.
The idea here would be to exchange not just the hash of the relevant markers for a slice, but to also keep around and exchange a small number of hashes for frequent subsets of the relevant markers.
For example we have repo-1 with A+B (hashed as h1), repo-2 with B+C (hashed as h2) and repo-3 with A+B+C (hashed as h3).
When repo-3 pulls from repo-1, it would receive h1 and recognize it as a subset of h3. The same would go when repo-2 pushes to repo-3: repo-3 would send the data that h3 is its exact hash, but also that h1 and h2 are known subsets; repo-2 would recognize h2 as its local content and skip the push of these markers.
Keeping track of these subsets means making the obsmarker application smarter. Initially, something has to notice that, despite the fact that h1 != h3, all markers sent for h1 were known locally.
The subset relation could also be fed when adding new markers. If a cached hash is invalidated when new markers are added, the old value can be used as a known subset right away. In addition, we could compute a set of relevant markers without taking "prune" into account, improving another problematic aspect of the current approach.
To avoid unbounded growth, we need some kind of least-frequently-used cache structure, possibly with an adaptive cache size to handle various project geographies.
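A rough sketch of how advertised subset hashes could let a pusher skip markers; all names and the hashing scheme here are invented for illustration:

```python
import hashlib

def markers_hash(markers):
    """A stable, order-insensitive hash for a set of encoded markers."""
    sha = hashlib.sha1()
    for marker in sorted(markers):
        sha.update(marker)
    return sha.hexdigest()

def needs_push(local_markers, remote_exact_hash, remote_subset_hashes):
    """False when the local markers match the remote's exact hash or one
    of the subset hashes it advertised (i.e. they are already common)."""
    local_hash = markers_hash(local_markers)
    if local_hash == remote_exact_hash:
        return False
    return local_hash not in remote_subset_hashes
```

In the repo-1/repo-2/repo-3 example above, repo-3 would advertise h3 plus the subsets {h1, h2}; repo-2, hashing its local B+C to h2, would then see that nothing needs to be pushed.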
1.3.3. Proposal
See the previous sections for the constraints and ideas behind this proposal.
1.3.3.1. Stable Ordering
The order of revisions '::HEADS' is computed as follows:
- depth-first traversal of the tree starting from nullrev,
- in case of branching, we explore the branch with the lowest node value first,
- we do not iterate over a merge until we have reached it through both parents.
Properties:
- the order is stable across repositories (rev numbers are not used anywhere); this is a strict requirement for use in discovery,
- the order of revisions in ::MERGE is a strict superset of the order in ::P, where P is one of the two merge parents. This adds stability when the repository grows, and helps caching.
Constraints:
- we need an accurate children map to traverse the tree from bottom to top,
- we need an accurate children map to detect branch points.
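The traversal rules above might be sketched like this, assuming for simplicity that node values are integers that compare directly (real node hashes would compare lexicographically), with `stable_sort` as a hypothetical name:

```python
def stable_sort(parents, head):
    """Stable topological order of '::head' per the rules above.

    ``parents``: dict node -> list of parent nodes (roots map to []).
    """
    # collect the ancestors of head (head included)
    ancestors = set()
    stack = [head]
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parents[node])
    # children map restricted to '::head'
    children = {}
    for node in ancestors:
        for parent in parents[node]:
            children.setdefault(parent, []).append(node)
    result = []
    seen = {node: 0 for node in ancestors}
    # depth-first from the roots, lowest node first (stack pops lowest)
    stack = sorted((n for n in ancestors if not parents[n]), reverse=True)
    while stack:
        node = stack.pop()
        result.append(node)
        ready = []
        for child in children.get(node, []):
            seen[child] += 1
            # merges wait until reached through all their parents
            if seen[child] == len(parents[child]):
                ready.append(child)
        stack.extend(sorted(ready, reverse=True))
    return result
```

On a diamond graph the merge is emitted only after both of its parents, and on a simple branch the lowest-valued branch is fully explored first.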
1.3.3.2. Slicing
A "range" is defined as a (HEAD, START) pair. HEAD is a single revision; START is an index into the stable order of '::HEAD'.
The (HEAD, START) range contains all revisions in stablesort('::HEAD')[START:].
We then apply standard slicing to a range. The ideal standard slicing cuts a linear range of 2**N elements into 2 ranges of 2**(N-1) elements. In reality, ranges have an arbitrary number of elements, and the tree is not linear. So here is the pseudo algorithm:
/!\ This is still pseudo code: helpers like ancestors(), heads(), log2() and the range constructor are assumed to exist, and the standard-start search remains unvalidated.
def depth(rev):
    """The "depth" of a revision in the graph"""
    return len(ancestors(rev))  # includes rev

def slice(input_range, slice_idx):
    """slice a range at a given index

    return a list of ranges (two elements or more)"""
    # The input range has one head by definition,
    # so the top part is also guaranteed to have one head
    top = range(head=input_range.head, start=input_range.start + slice_idx)
    bottom_revs = input_range[:slice_idx]
    # However, the remainder might have multiple heads
    bottom_heads = heads(bottom_revs)  # keep order from input
    # if we have only one head, things are simple:
    if len(bottom_heads) == 1:
        bottom = range(head=bottom_revs[-1], start=input_range.start)
        return [bottom, top]
    # otherwise, we need to build a subrange for each head
    else:
        # note: because of the topological ordering, all heads but one
        # are guaranteed to be some merge parent
        subslices = []
        for branch_head in bottom_heads:
            branch_revs = [r for r in bottom_revs if r in ancestors(branch_head)]
            branch_start = depth(branch_revs[0]) - 1
            subslices.append(range(head=branch_head, start=branch_start))
        return subslices + [top]

def standard_subslice(input_range):
    """slice a range into its standard subranges

    return a list of ranges."""
    assert 1 < len(input_range)
    # check if the input range starts at a standard point
    start = input_range.start
    step = 2 ** int(log2(len(input_range) - 1))
    standard_start = 0
    while standard_start < start:
        standard_start += step
        step /= 2
    if start == standard_start:
        # This range is "standard", let us slice it into subslices
        #
        # We skim the top part and hopefully leave 2 ** N elements
        # in the bottom part
        target_size = 2 ** int(log2(len(input_range) - 1))
        return slice(input_range, target_size)
    else:
        # this range starts at a non-standard point,
        # let us extract the standard part of it
        return slice(input_range, standard_start - start)
Properties:
- a range of N revs can be sliced into O(nb_merge) ranges,
- recursive slicing of a range containing N revs produces O(N log(N) * nb_merge) ranges,
- the scheme focuses on reusing standard sub-ranges across multiple super-ranges.
1.3.3.3. Actual discovery
We do discovery using a "range-obshash" associated with each range: if the "range-obshash" is the same, we know the content is the same for that range. If the content is different, we query the standard subranges to pinpoint the difference.
- The "range-obshash" of a 1-changeset range is the hash of the markers relevant to it.
- The "range-obshash" of a larger range is the hash of its standard subranges' "range-obshash" values.
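The recursive definition can be sketched on a linear history (branching omitted); the helper name and dict-based inputs are hypothetical:

```python
import hashlib
import math

def range_obshash(start, stop, relevant):
    """Range-obshash of the linear range start::stop (inclusive bounds).

    ``relevant``: dict revision -> sorted list of encoded relevant markers.
    """
    sha = hashlib.sha1()
    if start == stop:
        # a 1-changeset range hashes its own relevant markers
        for marker in relevant.get(start, []):
            sha.update(marker)
    else:
        # a larger range hashes its standard subranges' range-obshash,
        # using the same power-of-two split as the standard slicing
        bottom = 2 ** int(math.log2(stop - start))
        sha.update(range_obshash(start, start + bottom - 1, relevant))
        sha.update(range_obshash(start + bottom, stop, relevant))
    return sha.digest()
```

Two repositories with the same relevant markers for a range compute the same range-obshash, while a single extra marker anywhere in the range changes it.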
1.3.3.4. Caching
Here is the list of information we could cache for this discovery:
- depth of a changeset (changeset → depth): immutable for each changeset, trivial to compute except for merges,
- standard slicing of a range (range → standard subranges): immutable for each range, trivial to compute as long as no merge is involved,
- obshash of a range (range → obshash): long-lived for each range; we need to detect when the markers relevant to a changeset change and invalidate the ranges including it (note: worst-case invalidation seems O(N log(N)) for N descendants, but after such an invalidation, not all ranges will be re-used),
- for each merge: caching the highest common ancestors and the parents' order allows doing a stable topological traversal from head to bottom.
2. Graph Outline
○ ← a changeset
◔ ← changeset being pushed
● ← changeset that exists remotely before the push
◕ ← changeset that exists remotely but is not selected by the push
⊗ ← pruned changeset
ø ← obsolete changeset with successors
◌ ← changeset that does not exist locally but is present in marker history
✕ ← changeset that does not exist locally but is pruned in marker history
⇠ ← obsolescence marker from that point (if not pointing to anything, this means we do not care about what it points to)
3. A. Simple Case
3.1. A.1 pushing a single head
3.1.1. A.1.1 pushing a single head
⇠◔ A
 |
 ● O
Markers exist from:
- A
Command run:
- hg push -r A
- hg push
Expected exchange:
- chain from A
3.1.2. A.1.2 pushing multiple changesets into a single head
◔ B
|
⇠◔ A
 |
 ● O
Markers exist from:
- A
Command run:
- hg push -r B
- hg push
Expected exchange:
- chain from A
3.2. A.2 Two heads
⇠○ B
 | ⇠◔ A
 |/
 ● O
Markers exist from:
- A
- B
Command run:
- hg push -r A
Expected exchange:
- chain from A
Expected Exclude:
- chain from B
3.3. A.3 new branch created
B' ○⇢ø B
 |  |
  \ Aø⇠◔ A'
   \|/
    ● O
Markers exist from:
Aø⇠○ A'
Bø⇠○ B'
Command run:
- hg push -r A
Expected exchange:
- chain from A
Expected Exclude:
- chain from B
If A and B are remotely known, we should expect:
hg push will complain about the new head
hg push should complain about unstable history creation
3.4. A.4 Push in the middle of the obsolescence chain
(Where we show that we should not push the marker without the successors)
B ◔
  | A⇠ø⇠○ A'
  |/
  ● O
Markers exist from:
Aø⇠○ A'
- chain from A
Command run:
- hg push -r B
Expected exchange:
- Chain from A
Expected Exclude:
Aø⇠○ A'
3.5. A.5 partial reordering
B ø⇠⇠
|   ⇡
A ø⇠⇠⇠○ A'
|   ⇡/
|  ○ B'
|/
● O
Markers exist from:
Aø⇠○ A'
Bø⇠○ B'
Command run:
- hg push -r B
Expected exchange:
Bø⇠○ B'
Expected Exclude:
Aø⇠○ A'
3.6. A.6 between existing changeset
A ◕⇠● B
|/
● O
Markers exist from:
A◕⇠● B
Command run:
- hg push -r B
- hg push
Expected exchange:
A◕⇠● B
3.7. A.7 Non-targeted common changeset
{{{
⇠◕ A
 |
 ● O
}}}
Markers exist from:
- Chain from A
Command run:
- hg push -r O
Expected exchange:
- ø
Expected exclude:
- Chain from A
4. B. Deletion Case
Most B case can be read with
4.1. B.1 Pruned changeset atop the pushed set
{{{
⊗ B
|
◔ A
|
● O
}}}
Markers exist from:
- B (prune)
Command run:
- hg push -r A
- hg push
Expected exchange:
- prune marker for B
4.2. B.2 Pruned changeset on head, nothing pushed
{{{
⊗ A
|
● O
}}}
Markers exist from:
- A (prune)
Command run:
- hg push -r O
- hg push
Expected exchange:
- prune marker for A
4.3. B.3 Pruned changeset on non-pushed part of the history
{{{
⊗ C
|
○ B
| ◔ A
|/
● O
}}}
Markers exist from:
- C (prune)
Command run:
- hg push -r A
- hg push
Expected exchange:
- ø
Expected Exclude:
- chain from B
4.4. B.4 Pruned changeset on common part of history
{{{
⊗ C
|
● B
| |
| ● A
|/
● O
}}}
Markers exist from:
- C (prune)
Command run:
- hg push -r B
- hg push
Expected exchange:
- prune for C
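Cases B.1 to B.4 all hinge on the parent information carried by prune markers (see the format-change section above): a prune marker is sent exactly when one of its recorded parents is in the pushed or common set, and no local copy of the pruned changeset is needed. A hedged sketch, with prune markers modeled as hypothetical `(precursor, successors, parents)` tuples:

```python
def prunes_to_send(markers, pushed_or_common):
    """Select prune markers whose recorded parents intersect the
    pushed-or-common set; only the marker itself is consulted, not the
    local changeset graph."""
    sent = []
    for precursor, successors, parents in markers:
        if not successors and set(parents or ()) & pushed_or_common:
            sent.append(precursor)
    return sent

markers = [
    ('B', (), ('A',)),  # B.1: B pruned atop the pushed changeset A
    ('C', (), ('B',)),  # B.3: C pruned atop B, which is not pushed
]
```

Pushing `A` (so the pushed-or-common set is `{'O', 'A'}`) sends the prune for `B` but not the one for `C`, as cases B.1 and B.3 expect.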
4.5. B.5 Push of a child of a changeset whose successor is pruned
This case mirrors A.4, but with the pruned changeset's successor.
{{{
B ◔
  |
A⇠ø⇠⊗ A'
  |/
  ● O
}}}
Markers exist from:
- Aø⇠⊗ A'
- chain from A
- A' (prune)
Command run:
- hg push -r B
Expected exchange:
- Aø⇠⊗ A'
- chain from A
- A' (prune)
Extra Note:
- I'm not totally happy about this case and I believe some more complicated graphs can result in behavior quite confusing for the user (if some tool creates prune markers in the middle of a valid chain)
4.6. B.6 Pruned changeset with ancestors not in pushed set
{{{
B ø⇠⊗ B'
|    |
A ○  |
  | /
  ● O
}}}
Markers exist from:
- Bø⇠⊗ B'
- B' (prune)
Command run:
- hg push -r O
Expected exchange:
Bø⇠⊗ B'
- B' prune
4.7. B.7 Prune on non targeted common changeset
{{{
⊗ B
|
◕ A
|
● O
}}}
Markers exist from:
- B (prune)
Command run:
- hg push -r O
Expected exchange:
- ø
5. C. Advanced Case
5.1. C.1 Multiple pruned changeset atop each other
{{{
⊗ B
|
⊗ A
|
● O
}}}
Markers exist from:
- A (prune)
- B (prune)
Command run:
- hg push -r O
- hg push
Expected exchange:
- A (prune)
- B (prune)
5.2. C.2 Pruned changeset on precursors
{{{
B ⊗
  |
A ø⇠◔ A'
  |/
  ● O
}}}
Markers exist from:
- A' succeed to A
- B (prune)
Command run:
- hg push -r A'
- hg push
Expected exchange:
A ø⇠○ A'
- B (prune)
5.3. C.3 Pruned changeset on precursors of another pruned one
{{{
B ⊗
  |
A ø⇠⊗ A'
  |/
  ● O
}}}
Markers exist from:
- A' succeed to A
- A' (prune)
- B (prune)
Command run:
- hg push -r A'
- hg push
Expected exchange:
A ø⇠⊗ A'
- A' (prune)
- B (prune)
5.4. C.4 multiple successors, one is pruned
Another case where prunes are confusing? (A is killed without its successors being pushed)
(could be a split or a divergence; if a split, see the Z section)
{{{
    A
B ○⇢ø⇠⊗ C
   \|/
    ● O
}}}
Markers exist from:
A ø⇠○ B
A ø⇠○ C
- C (prune)
Command run:
- hg push -r O
Expected exchange:
A ø⇠○ C
- C (prune)
Expected exclude:
A ø⇠○ B
6. D. Partial Information Case
From here on, some changesets are missing from the repository but still referenced in obsolescence markers. This has an impact on the knowledge we have of the graph topology.
Almost any of the above cases could be used too; just drop local knowledge of some or all obsolete changesets.
6.1. D.1 Pruned changeset based on a missing precursor of something we push
{{{
B ⊗
  |
A ◌⇠◔ A'
    /
  ● O
}}}
Markers exist from:
- A' succeed to A
- B (prune)
Command run:
- hg push -r A'
- hg push
Expected exchange:
A ø⇠○ A'
- B (prune)
6.2. D.2 missing prune target (prune in "pushed set")
{{{
A ø⇠✕ A'
  |/
  ● O
}}}
Markers exist from:
- A' succeed to A
- A' (prune)
Command run:
- hg push
Expected exchange:
A ø⇠○ A'
- A' (prune)
6.3. D.3 missing prune target (prune Not in "pushed set")
(this is one of the cases where it will be hard to be non-confusing)
{{{
A ø⇠✕ A'
|
| ○ B
|/
● O
}}}
Markers exist from:
- A' succeed to A
- A' (prune)
Command run:
- hg push -r O
(shall we account for a secret B?)
Expected exchange:
- nothing?
6.4. D.4 Unknown changeset in between known one
Mostly a clarification case
{{{
ø⇠◌⇠○
| |/
| ◔
|/
● O
}}}
Should be treated as A.3 case:
{{{
ø⇠○
| |
| ◔
|/
● O
}}}
7. Z. Crazy case
Cases where I'm not very sure about what we should do.
7.1. Z.1 partial push of split
{{{
D'○⇢ø D
|     |
|     A
| B ○⇢ø⇠◔ C
|    \|/
|   /
● O
}}}
Markers exist from:
A ø⇠⚭ (B,C) (split)
D ø⇠○ D'
Command run:
- hg push -r C
Expected exchange:
- We should probably send the whole marker anyway. But what about things related to B's children?
A ø⇠⚭ (B,C) (split)
Expected exclude:
D ø⇠○ D'
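For Z.1, the behavior described at the top of this page (split markers are included even when only part of the successors are pushed) amounts to selecting a marker as soon as any of its successors is in the pushed set. A tiny sketch of that choice, with a hypothetical `(precursors, successors)` marker:

```python
def split_selected(marker, pushed):
    """A split marker is sent when at least one successor is pushed, so
    the destination always sees the whole split, never half of it."""
    return bool(set(marker[1]) & pushed)

a_split = (('A',), ('B', 'C'))  # A split into B and C
```

Pushing only `C` (`split_selected(a_split, {'O', 'C'})`) still selects the whole `A ø⇠(B,C)` marker, including its reference to the unpushed `B`; that is exactly the open question about `B`'s children noted above.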