Note:
This page is primarily intended for developers of Mercurial.
Topics
Contents
- Problem Statement
- Current Target
- Pro and Cons
- Other questions
- Open ideas
- Current Implementation
- See also
1. Problem Statement
The Mercurial community has been struggling for years to define a nice way to handle 'topic' branches (sometimes also called 'feature' branches), especially when it comes to sharing them with other people (mainly for code review or other collaboration.)
Bookmarks are a clone of Git's refs, but they seem to work more poorly in Mercurial than refs do in Git, in part because the synchronization parts of bookmarks aren't really done. Adding the remaining bits of Git's refs to Mercurial has been controversial, and may represent enough of a behavior change that it's infeasible.
Named branches are visible forever in the revision history, which makes them unsuitable for feature branch work as the feature branch names rapidly pollute the output of things like hg branches.
2. Current Target
This section describes the target semantics and behavior for topics. Some adjustments can of course still be made.
2.1. General semantics
TL;DR: topics are extra "light-branch" data relevant to draft changesets.
A topic is a name explicitly attached to changesets.
This topic data is primarily meant to categorize draft changesets, and it fades out when they become public.
Changesets have both a topic and a 'branch'. The topic gathers related in-progress work, while the branch data refers to the long-term line of development.
Behaviors focus on ensuring that any name has a single head.
- Behaviors related to named branches mostly act as if the draft changesets with a topic are not on the branch (yet).
- Behaviors within a topic are similar (with minor, sensible differences) to those of named branches.
2.2. sub branches, namespacing and representation
(updated on 2021-10-08, as the namespacing subject is moving forward)
- given their overall behavior, topics would be seen as sub branches and displayed as such
- to help manage topics among multiple users, we could add optional namespaces to topics.
- a standard representation scheme encompassing branches, topics and topic namespaces would be useful for various display and addressing needs.
The question is then how to separate the various elements. The : character has been tempting, but its usage in revsets makes it unsuitable for a final UI.
The current plan is to use // (two consecutive forward slashes). Examples:
branch//topic -> no namespace
namespace/topic (cannot be used directly because it is ambiguous with the existing named branch scheme)
//namespace/topic -> topic with namespace on the default branch
branch// -> no topic
//namespace/ -> a topic namespace
/topic -> (a branch called "/topic")
//topic -> topic on the default branch
branch//namespace/topic -> case where all three elements are explicitly given
foo/bar//user26/feature -> branch is foo/bar, namespace is user26, topic is feature
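The notation above can be sketched as a small parser. This is only an illustration of the planned scheme, not code from Mercurial or the topic extension: the names parse_full_name and FullName are hypothetical, and the sketch assumes that branch names never contain // and that namespaces and topics never contain /.

```python
from typing import NamedTuple, Optional

DEFAULT_BRANCH = "default"


class FullName(NamedTuple):
    branch: str
    namespace: Optional[str]
    topic: Optional[str]


def parse_full_name(name: str) -> FullName:
    """Split 'branch//namespace/topic' into its three parts (hypothetical helper)."""
    branch, sep, rest = name.partition("//")
    if not sep:
        # No '//' at all: the whole string is a plain branch name,
        # even when it contains single slashes (e.g. "/topic").
        return FullName(name, None, None)
    # The last single slash, if any, separates namespace from topic.
    namespace, _slash, topic = rest.rpartition("/")
    if "/" in namespace:
        raise ValueError("namespaces and topics cannot contain '/': %r" % name)
    # An empty branch part means the default branch; empty parts become None.
    return FullName(branch or DEFAULT_BRANCH, namespace or None, topic or None)
```

For instance, parse_full_name("foo/bar//user26/feature") yields branch foo/bar, namespace user26 and topic feature, while parse_full_name("/topic") yields a branch called "/topic" with no topic.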
2.2.1. Use cases for topic namespaces
- on the client side, it could be used to sandbox operations and in particular avoid conflicting with other people's work.
A warning could be issued when an operation obsoletes a changeset having a different namespace than the currently active one, or (depending on configuration) the operation could be refused. Example: Alice works on the topic alice/foo, and Bob on bob/bar, which is stacked on alice/foo. When Alice rebases alice/foo, it should not rewrite bob/bar because that will create a content-divergence with the amendments that Bob has made in the meantime. This is a frequent source of complications with the current state of things.
- forges could map topic namespaces to their user or team concepts, and hence ultimately to their permission model. For example, people could be allowed to push (only) changesets bearing their personal namespace without extra permissions. This could solve the problem of drive-by contribution (a frequently asked feature).
- It would be easy to expand on that by adding namespace-filtered views to give the illusion of a full fledged personal fork on top of that. All major forges in the Git realm use some kind of sharing instead of fully independent clones anyway. The difference between a shared commit pool with user-specific references and a single repository with a filter is nothing but an implementation detail from a user perspective.
- namespaces could be used to narrow a user's view and exchanges to the parts that are relevant to them. Namespaces that a user does not "subscribe" to could be left unpulled and/or hidden by default. This would help keep the amount of data reasonable in large shared repositories.
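The client-side sandboxing use case could look roughly like the following sketch. The function name check_rewrite, the policy values and the (changeset, namespace) pair representation are all assumptions made for illustration; nothing here reflects an existing Mercurial API.

```python
def check_rewrite(active_namespace, rewritten, policy="warn"):
    """Decide what to do before an operation obsoletes changesets.

    rewritten: iterable of (changeset_id, namespace) pairs the operation
    would obsolete. policy: "warn" or "refuse" (hypothetical config values).
    Returns (decision, foreign_changesets).
    """
    foreign = [cs for cs, ns in rewritten if ns != active_namespace]
    if not foreign:
        return "ok", []
    if policy == "refuse":
        return "refuse", foreign
    return "warn", foreign
```

With Alice's active namespace being alice, a rebase that would also rewrite a changeset from the bob namespace is flagged: check_rewrite("alice", [("a1", "alice"), ("b1", "bob")]) returns ("warn", ["b1"]).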
2.2.2. Ambiguities related to forward slashes
A study of the actual usage of / in branch names in real-life repositories has been conducted, using the Software Heritage public archive. Here are the results:
- single forward slashes are frequent in branch names (often used for user namespacing)
there are only a couple of // occurrences in branch names, most of them apparently bogus (attempts to use hg branch with a URL, possibly due to confusion with bzr branch)
- single forward slashes have been forbidden in topics for years, preventing any ambiguity on that side. Out of caution, we shouldn't allow forward slashes in topic namespaces, at least not in the first iterations. As they wouldn't create ambiguity in the full notation, we could relax that later, if the need arises for them (an application having a clear use-case).
We should now
forbid // in new branch names in Mercurial core
forbid // in new branch names in the topic extension so that it applies to earlier Mercurial versions as well
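A minimal sketch of such a check, assuming it runs only when a new branch name is created (existing names are left alone). The function name check_new_branch_name is hypothetical.

```python
def check_new_branch_name(name):
    """Reject new branch names that contain the reserved '//' separator."""
    if "//" in name:
        raise ValueError(
            "'//' is reserved as the branch/topic separator: %r" % name
        )
    return name
```

Single slashes stay valid, so check_new_branch_name("user/feature") passes while check_new_branch_name("http://example") raises ValueError.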
2.2.3. Additional notational shortcuts for the command-line
This is not meant for revsets.
a . (period) could mean "the current", hence .//foo for topic foo (no namespace) on the current branch, or bar//. for the current topic on the bar branch
a * (star) could mean "any", hence foo//* for all topics on branch foo or *//foo for topic foo in all branches.
Caveats:
What about */*?
* is currently a valid branch name (. is not a valid topic nor branch name). We should probably forbid it as a branch name.
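The two shortcuts can be illustrated with a small matcher. This is a sketch under simplifying assumptions: namespaces are left out, the specification always contains //, and the helper names expand and matches are hypothetical.

```python
def expand(spec, current_branch, current_topic):
    """Replace '.' by the current branch or topic in a 'branch//topic' spec."""
    branch, _sep, topic = spec.partition("//")
    if branch == ".":
        branch = current_branch
    if topic == ".":
        topic = current_topic
    return branch, topic


def matches(spec, branch, topic, current_branch, current_topic):
    """Tell whether (branch, topic) is selected by spec; '*' matches anything."""
    want_branch, want_topic = expand(spec, current_branch, current_topic)
    return (want_branch in ("*", branch)) and (want_topic in ("*", topic))
```

With the working copy on branch stable, expand(".//foo", "stable", None) gives ("stable", "foo"), and matches("foo//*", "foo", "bar", "stable", None) is true for any topic on branch foo. Note that the sketch treats a branch literally named * as a wildcard, which is the caveat raised above.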
2.2.4. Implementation of namespaces and commands
Topic namespaces would be stored in the changeset in a dedicated extra field (topic_namespace perhaps?)
There will be a default topic namespace called default, so that it can be used explicitly. A total lack of topic namespace (as in changesets created prior to the introduction of namespaces) will have to be normalized to default.
A default namespace config item would also be introduced. Should it itself default to default or to the user part of the email address? The configexpress extension can be used by servers to suggest an adequate value, such as the user name on the platform.
- The logic of selection and application to a new changeset would be the same as with branches and topics: the working directory is marked, then the value is applied to subsequent changesets.
The hg topic command would also allow setting the namespace, using the namespace/topic notation. Should it also display the namespace in all cases?
the hg topic command would gain a --namespace flag to show/set the namespace.
(Alternatively, maybe dedicated command hg topic-namespace would display or set the namespace)
What should be the behavior of hg topic foo (no namespace)? Follow-up on the currently set one (if any) or get back to user-configured default?
the hg branch command should be taught to recognize an hg branch foo//bar/baz call (and maybe gain flags to list topics).
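The normalization rule for changesets without a namespace can be sketched as follows. The extra field name topic_namespace is the tentative one suggested above; the plain dict with string keys is a simplification for illustration and is not how Mercurial actually exposes changeset extras.

```python
DEFAULT_NAMESPACE = "default"


def topic_namespace(extra):
    """Return the namespace recorded in a changeset's extra mapping.

    A missing or empty field (for example in changesets created before
    namespaces existed) is normalized to the default namespace.
    """
    return extra.get("topic_namespace") or DEFAULT_NAMESPACE
```

An older changeset whose extra only holds {"topic": "foo"} is therefore reported in the default namespace, exactly like one that sets it explicitly.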
2.3. General effect on named branch
Changesets with a topic only aspire to be part of the named branch; they are not fully in that branch yet. When you have a branch foo, the name foo resolves to the heads of foo that have no topic.
Here are a couple of examples:
Multiple heads are no longer ambiguous:
The name foo resolves to C,
The name bar resolves to Y,
Partial data does not radically change the definition:
The name foo resolves to B,
The name bar resolves to Y,
Usual traversal rules apply:
The name foo resolves to B,
The name bar resolves to X,
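The resolution rule can be illustrated on a toy graph. The graph below is an assumption made up for this sketch (A, B, C on branch foo with no topic, and X, Y on topic bar growing from B), and the head computation only looks at direct parents, which is simpler than Mercurial's real traversal rules.

```python
from typing import NamedTuple, Optional, Tuple


class Changeset(NamedTuple):
    node: str
    parents: Tuple[str, ...]
    branch: str
    topic: Optional[str] = None
    public: bool = False


def effective_topic(cs):
    # Topics fade out once a changeset is public.
    return None if cs.public else cs.topic


def heads(changesets, branch, topic=None):
    """Heads of a name: members of the name without a child in the same name."""
    members = [
        cs for cs in changesets
        if cs.branch == branch and effective_topic(cs) == topic
    ]
    member_nodes = {cs.node for cs in members}
    with_child = {p for cs in members for p in cs.parents if p in member_nodes}
    return sorted(cs.node for cs in members if cs.node not in with_child)


# Hypothetical graph: A - B - C on foo, and B - X - Y on topic bar.
GRAPH = [
    Changeset("A", (), "foo", public=True),
    Changeset("B", ("A",), "foo", public=True),
    Changeset("C", ("B",), "foo"),
    Changeset("X", ("B",), "foo", topic="bar"),
    Changeset("Y", ("X",), "foo", topic="bar"),
]
```

Here heads(GRAPH, "foo") is ["C"] and heads(GRAPH, "foo", "bar") is ["Y"]: the two topological heads are not ambiguous because each name sees only one of them.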
2.4. Behavior for update
(This implies changes in hg update behavior, but they are not very significant.)
2.4.1. Case 1: active branch
Currently active branch: foo
Currently active topic: ø
Running hg update brings you to the head of the foo branch (the untopiced head).
2.4.2. Case 2: active topic
Currently active branch: foo
Currently active topic: bar
Running hg update brings you to the head of the bar topic.
2.4.3. Case 3: active topic, lagging behind
W was on topic bar, but it is now public, so bar does not apply to W. However, the working copy still has an active topic.
Currently active branch: foo
Currently active topic: bar
Running hg update brings you to the head of the bar topic.
The classic way to get into that situation is to run hg up bar, which brings you to W with bar active, then pull, which brings in B, C, X and Y and turns W public.
The intent here is to work on topic bar, and hg update should bring you up to date in the bar context.
Alternatively, we could require more input from the user.
2.4.4. Case 4: active topic, topic is closed
W was on topic bar, but it is now public so bar does not apply to W anymore.
Currently active branch: foo
Currently active topic: bar
Running hg update brings you to the head of the foo branch.
The bar topic is deactivated.
The classic way to get into that situation is to be working on topic bar when bar gets 'accepted' and becomes public. Once your former bar changesets are part of foo, people merge/rebase and publish more changesets on top of them.
In that case, inferring that your topic is "done" and that being up to date means "bring me to the latest of my branch" seems to make sense.
A possible issue: hg update; hg pull can give a different result from hg pull; hg update, in the case where new changesets on the bar topic arrive again.
Alternatively, we could refuse to update and point at a command to disable the topic.
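The four cases can be condensed into one decision function. This is a sketch of the behavior described in this section, not of hg update itself; the function name and the two dictionaries (branch name to untopiced head, topic name to head for topics that still have draft changesets) are assumptions.

```python
def update_destination(active_branch, active_topic, branch_heads, topic_heads):
    """Return (destination, topic_left_active) for a bare 'hg update'.

    branch_heads: branch name -> head of the branch, topics excluded.
    topic_heads: topic name -> head, only for topics that still have
    draft changesets (a fully published topic is absent).
    """
    if active_topic is not None and active_topic in topic_heads:
        # Cases 2 and 3: the topic still exists, stay in its context.
        return topic_heads[active_topic], active_topic
    # Case 1 (no topic) and case 4 (topic closed): go to the branch head
    # and leave no topic active.
    return branch_heads[active_branch], None
```

In case 3, where W is public but X and Y were pulled on bar, update_destination("foo", "bar", {"foo": "C"}, {"bar": "Y"}) returns ("Y", "bar"). In case 4 the topic is missing from topic_heads and the result is ("C", None).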
2.5. Behavior for push to publishing (default)
Mercurial will keep enforcing a single head for each name:
2.5.1. Case A: pushing multiple topological branches
The push is rejected as creating a new head on branch foo (Y).
Pushing the following to a publishing server would make X and Y public, fading their bar topic and making them plain members of branch foo, which creates a new head.
2.5.2. Case B: pushing a merged topic
The user can then merge/rebase.
The push is accepted; X, Y and Z all become public, and the new single head of branch foo is Z.
2.5.3. Case C: linear push but remote unsynched heads
Local:
Remote:
The push fails as Y creates a new head on branch foo.
The user has to pull and merge as usual.
2.5.4. Case D: linear push while in sync
Local:
The push succeeds and fades topic bar out.
2.5.5. Case E: new named branch
The push fails, as it would create the new named branch bli.
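The cases above share one rule that can be sketched in a few lines: on a publishing server the pushed changesets lose their topic, so only branch heads matter. The data layout (node mapped to a (parents, branch) pair), the function names and the example graphs are assumptions for illustration, and heads are computed from direct parents only.

```python
def branch_heads(changesets):
    """Map branch -> sorted heads. changesets: node -> (parents, branch)."""
    by_branch = {}
    for node, (_parents, branch) in changesets.items():
        by_branch.setdefault(branch, set()).add(node)
    result = {}
    for branch, nodes in by_branch.items():
        with_child = {p for n in nodes for p in changesets[n][0] if p in nodes}
        result[branch] = sorted(nodes - with_child)
    return result


def check_publishing_push(remote, pushed):
    """Topics are ignored: publishing makes the pushed changesets plain
    members of their branch."""
    before = branch_heads(remote)
    after = branch_heads({**remote, **pushed})
    for branch, heads in after.items():
        if branch not in before:
            return "rejected: new named branch %s" % branch
        if len(heads) > len(before[branch]):
            return "rejected: new head on branch %s" % branch
    return "accepted"


# Hypothetical remote: A - B - C on branch foo.
REMOTE = {"A": ((), "foo"), "B": (("A",), "foo"), "C": (("B",), "foo")}
```

Pushing X and Y (topic bar) that grow from B, as in case A, gives "rejected: new head on branch foo", whereas the same changesets stacked on C, as in case D, are accepted.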
2.6. Behavior for push to non-publishing repo
2.6.1. New topic
Local:
- Push succeeds
branch foo head is A
topic bar head is Y
2.6.2. New topic on a topological branch (1)
- Push succeeds
branch foo head is C
topic bar head is Y
2.6.3. New topic on a topological branch (2)
- Push succeeds
branch foo head is C
topic boo head is J
topic bar head is Y
2.6.4. New topic on a topological branch (3)
Local:
Remote:
- Push succeeds
branch foo head is C
topic boo head is J
topic bar head is Y
2.6.5. New head on topic
The push fails: new head on topic bar.
2.6.6. Conflicting unrelated topic
Local:
Remote:
The push fails: new head on topic bar.
2.6.7. New head on branch
Local:
Remote:
The push fails, as it would create a new head B on branch foo.
2.6.8. New topic on new branch
- Push succeeds
branch foo head is A
branch bli head is ø (does not exist yet),
topic bar head is Y
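On a non-publishing server topics survive the push, so the single-head rule applies per name. The sketch below keys names by (branch, topic), with topic None standing for the branch itself; that keying, the data layout (node mapped to (parents, branch, topic)) and the function names are assumptions. It deliberately does not model any separate check for new named branches.

```python
def name_heads(changesets):
    """Map (branch, topic) -> sorted heads, using direct parents only."""
    by_name = {}
    for node, (_parents, branch, topic) in changesets.items():
        by_name.setdefault((branch, topic), set()).add(node)
    result = {}
    for name, nodes in by_name.items():
        with_child = {p for n in nodes for p in changesets[n][0] if p in nodes}
        result[name] = sorted(nodes - with_child)
    return result


def check_draft_push(remote, pushed):
    """Accept the push unless some name would gain an extra head."""
    before = name_heads(remote)
    after = name_heads({**remote, **pushed})
    for (branch, topic), heads in after.items():
        allowed = max(1, len(before.get((branch, topic), [])))
        if len(heads) > allowed:
            kind = "branch %s" % branch if topic is None else "topic %s" % topic
            return "rejected: new head on %s" % kind
    return "accepted"
```

With a remote holding only A on branch foo, pushing X and Y on topic bar is accepted (the "New topic" case), while pushing a second head Z on a topic bar that already has head X on the remote is rejected with "new head on topic bar".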
2.7. Behavior for merge
(We won't discuss hg rebase, assuming it will behave the same in 3.7)
2.7.1. Two heads on topic
hg merge picks Z as the destination.
2.7.2. Topic has three heads
- Merge aborts: ambiguous destination.
2.7.3. Topic has one head and includes branch head
Local:
- Nothing to merge
2.7.4. Topic has one head and is behind compared to branch head
- hg merge picks the branch head as the destination (as the topic is already linear)
2.7.5. Topic has one head, branch has multiple heads
- hg merge aborts: ambiguous destination
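The five merge cases can be summarized as a destination-picking function. This is a sketch of the rules listed above rather than of hg merge itself; the function name, its arguments and the is_ancestor callback are assumptions, and the branch is assumed to have at least one untopiced head.

```python
def merge_destination(current, topic_heads, branch_heads, is_ancestor):
    """Pick the default merge destination while a topic is active.

    current: the working copy parent (a head of the active topic).
    topic_heads / branch_heads: lists of heads for the topic and its branch.
    is_ancestor(a, b): True when a is an ancestor of b.
    Returns the destination, or None when there is nothing to merge.
    """
    others = [h for h in topic_heads if h != current]
    if len(others) == 1:
        return others[0]  # two heads on the topic: merge with the other one
    if len(others) > 1:
        raise ValueError("ambiguous destination: topic has several other heads")
    if len(branch_heads) > 1:
        raise ValueError("ambiguous destination: branch has multiple heads")
    (branch_head,) = branch_heads
    if is_ancestor(branch_head, current):
        return None  # the topic already includes the branch head
    return branch_head  # the topic is linear but behind the branch head
```

For example, with topic heads Y and Z and the working copy on Y, the destination is Z; with a single topic head Y that does not descend from branch head C, the destination is C.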
2.8. Stacked diffs workflow
2.9. User Transition
3. Pro and Cons
4. Other questions
5. Open ideas
This is a list of ideas that emerged while brainstorming. It served as the basis for the current plan.
- Topic could be a property attached to each changeset (grouping them by similar topic)
- Topic could fade away when changesets become public (either archived or plain dropped)
A benefit of archiving them is that users can query for topics, eg you could say hg log -r topic(issue123) which would help
- Tracking could be achieved through the naming scheme. eg:
- 'default//feature-foo' would be a topic 'feature-foo' tracking the 'default' branch.
- 'stable//issue4700' would be a topic 'issue4700' tracking branch stable.
- '@//feature-bar' would be a topic 'feature-bar' tracking bookmark '@' ?
- 'stable//issue4689//issue4700' would be a topic 'issue4700' tracking the topic 'stable//issue4689'. When topic 'issue4689' fades away (because it was published), the tracking falls back to 'stable'.
Topics could be non contiguous (mpm idea) feature-foo -> fix-bar -> feature-foo. Allowing a streamlined work that is automatically split apart after that.
- Topics could be hierarchical: 'issue4700.test', 'issue4700.preparation'. Activation//reference could be done at any level, 'issue4700' or 'issue4700' (this could help handle branching and different approaches)
pushing a new head on a new topic to a non-publishing server would be allowed.
- that is, it'd be legal to have one head per topic on a non-publishing server.
- A changeset could maybe have multiple topic.
- Augie doesn't feel great about this option just because of UI complexity.
- Users can name patches (in a sense) without mq
- One of the major complaints about evolve from veteran mq users is that their patches no longer have explicit names. Topics provide a potential way to name patches again.
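The "tracking through the naming scheme" idea from the list above can be sketched as follows. This only illustrates the brainstormed fallback rule; the function name tracking_target and the set of still-live topics are assumptions, and the first element of the name is taken as-is whether it designates a branch or a bookmark.

```python
def tracking_target(name, live_topics):
    """Return what the last element of a '//'-separated name tracks.

    name: e.g. 'stable//issue4689//issue4700'.
    live_topics: topics that still have draft changesets.
    """
    base, *intermediate, _topic = name.split("//")
    # Prefer the closest intermediate topic that has not faded away yet.
    for candidate in reversed(intermediate):
        if candidate in live_topics:
            return candidate
    # Otherwise fall back to the first element (branch or bookmark).
    return base
```

While issue4689 is still draft, tracking_target("stable//issue4689//issue4700", {"issue4689"}) returns "issue4689"; once it is published and no longer live, the same call with an empty set returns "stable".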
6. Current Implementation
Assign topics to non-public changesets. A topic is like a named branch in that it is a label stored in a changeset's extra, except that topics disappear when the change moves to the public phase (the data still exists, it's just not shown).
Code is available at https://www.mercurial-scm.org/repo/topic-experiment.
6.0.1. Non-Goals
- Topics are not suitable for long term branches. We have named branches for that (and possibly also bookmarks, depending on workflow.)
- Topics are not suitable for tracking a moving point in public history. This seems to be a perfect fit for bookmarks.
6.1. Open Questions
- Right now we use changeset extra for storing the topic. That might lead to bonus divergence problems. They might be easily fixed, but should we avoid that?
- Should changesets be allowed multiple topics?
- How permissive should we be on topic names?
7. See also
bambams' (on freenode) proposal that excludes phases: https://bpaste.net/show/107d9bb1be4c