PatchHandlingUnificationRFC

Patch handling unification IRC logs and ideas

This wiki page tries to summarize ongoing IRC discussions aimed at finding a way to unify patch handling extensions and tools (mainly mq, attic, shelve and record).

Attempt at capturing the situation today

Today, we have a number of overlapping extensions for managing patches on top of Mercurial repositories: mq, shelve, attic, and pbranch. It would be desirable to prune this list, optimally down to just one. However, we likely don't want to lose any of the main functionality offered by the extensions today:

mq

Goals:

UI close to quilt (http://savannah.nongnu.org/projects/quilt/)
Maintain multiple, usually related patches.
Typically for later inclusion upstream.
Or for different local configurations (guards).

Strong points:

Patches, while applied, are normal revisions and can be emailed etc. as such.
Can go back and forth between patches.
Can reorder patches, as long as they don't interdepend.
Can fold multiple consecutive patches into one.
Has guards to conveniently enable/disable patches.
Can easily promote patches to regular changesets.
Can convert regular changesets into malleable patches.
Can be used directly in your normal work repo.
Directly supported on bitbucket.org.
Patches are stored as files, so they can be viewed and edited as such.
hg strip to strip unwanted history (dangerous, too).
Is an established extension shipped with Mercurial.
Is documented in the Mercurial book.

So-so points:

Can collaborate on patches by versioning patch files.

Weak points:

Collaboration is not straightforward (merging patches - yuck).
Base revision is not clearly defined.
Rebase is complicated (or used to be?).
Updating patches is destructive, so it's easy to shoot yourself in the foot, especially when distributing pending changes to their proper place.
Updating a patch is a special operation, so it needs special tool support (qrecord, qct).
Actual intended patch dependencies are not explicit. Always assumes incremental patches.

pbranch

Goals:

Maintain multiple, usually related patches.
Typically for later inclusion upstream.
Be better than mq for
- collaboration and long-term patch maintenance,
- traceability of changes to patches, and
- distribution of changes to patches.

Strong points:

Can go back and forth between patches.
Can fold multiple consecutive patches into one.
Maintains an explicit DAG of patch dependencies.
Each patch has a clearly recorded base it officially applies to. So you always know which patches are already rebased, and which are not.
Rebasing on upstream and on changes in base patches are normal Mercurial merges.
Can collaborate on patches using normal Mercurial commands.
Is non-destructive.
Updating a patch is a normal commit. Tools like record, crecord, TortoiseHg etc. can be used.
Can email patches directly.
Has extensive documentation (Mercurial book style).

So-so points:

Can export patches as mq patch series.
Needs sound understanding of multiple branches/heads and merge flow.

Weak points:

Leads to intractable history in the pbranch clone (all these automated merges).
"Applied" patches are not regular changesets (patches are slightly filtered diffs between branches).
Patches will likely need to be linearized before submission to avoid potential fuzz when committers try to apply them. Currently, such linearization is permanent.
Reordering patches can be non-trivial (backout or rebase needed sometimes). Better to start out with proper dependencies in the first place.
Must be used in a throw-away clone of the main repo, unless you intend to eventually strip the patch branches.
Not included with Mercurial. Tests and documentation currently not according to Mercurial's standard.
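
As a rough illustration of the explicit dependency DAG listed under the strong points, the sketch below models patch dependencies as a plain Python dict and works out which patches would need a merge after one of their bases changes. The dict layout, the patch names and the function name are assumptions made for this example; this is not pbranch's actual .hg/pgraph format or code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def patches_to_remerge(pgraph, changed):
    """Return the patches that depend on `changed`, bases first.

    `pgraph` is a hypothetical mapping: patch name -> set of base names.
    """
    # static_order() yields every node after all of its bases.
    order = TopologicalSorter(pgraph).static_order()
    affected = {changed}
    result = []
    for patch in order:
        if patch == changed:
            continue
        # A patch is affected if any of its bases is affected.
        if affected & set(pgraph.get(patch, ())):
            affected.add(patch)
            result.append(patch)
    return result


if __name__ == "__main__":
    # A, B and C are stacked; D is an independent patch on default.
    pgraph = {"A": {"default"}, "B": {"A"}, "C": {"B"}, "D": {"default"}}
    print(patches_to_remerge(pgraph, "A"))
```

In this toy graph a change to A touches B and then C, while D is left alone. Stacked patches drag their dependents along, whereas patches started in parallel do not, which is one way to read the advice about starting out with proper dependencies.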

attic

Goals:

Maintain multiple unrelated, unfinished strands of work.
Allow passing unfinished work on to colleagues.
Easily and quickly store unfinished work away so you can come back to it later.

Strong points:

Easy to use.
Can untangle multiple unrelated strands (hg shelve --interactive).
Can be used directly in your normal work repo.
When working on a patch, it is right in the working copy (no refreshing changesets and such).
Works well alongside mq/pbranch.
Is non-destructive (doesn't touch history at all).
Finishing a patch is a normal commit.

So-so points:

Can move patches back and forth to mq.
Can share the .hg/attic dir as a repo of its own, thus sharing patches.

Weak points:

Not included with Mercurial.
Can get rejects when picking up a patch when base repo has evolved.
Only deals with one patch at a time.

shelve

Goals:

Put aside a single piece of unfinished work.

Strong points:

Simple to use.
Some tool integration.
Can be used directly in your normal work repo.

Weak points:

Hunk selection always interactive.
Not included with Mercurial.
It's a subset of attic.

List of use-cases

To aid in making a good decision about the way that the functionality of these tools can be merged, several use-cases are listed here (please contribute new ones if you have one in mind that is not yet listed):

Barack is working on a single-branch repository, and realizes that he wants to try to experiment with coding feature A, without sharing any of this work yet. He signals his intent to hg, and proceeds to work on feature A, making several commits. After some time, he realizes that he won't get far until feature B is first implemented. He issues a command that starts a new branch parallel to the 'feature A' branch, and makes a few commits. When feature B is finished, Barack rebases the 'A' branch onto the tip of the 'B' branch, and continues his work on feature A. When he's happy with feature A, he cleans up the history of the B+A branch, and either makes it public, or merges it into a public branch.
Jan has a repository on his machine with various non-public feature branches in it, and wants to locally (i.e. on his machine) clone the repository so that the clone contains all of the non-public branches. He also wants to make another clone which does not contain any non-public branches.
Sarah has been working on several branches, both public and non-public, and decides to strip off a portion of one of the non-public branches. She continues working on this truncated branch, and then realizes that she stripped away some work which she really needs. She runs an hg command which launches a history browser, allowing her to see all the work she's done for the last 30 days, on both public and non-public (permanent and non-permanent) branches. She chooses a changeset, and a diff is computed against the working directory. She is given the option to apply any of the hunks/lines in the diff to her working directory.
Eduardo has a repository he uses on his work computer with various public and non-public branches. He makes this repository available via hgweb, and restricts the access so that only he and his friends Renaldo and Maria have push/pull access to the repository. Renaldo is only given access to pull/push public branches, whereas Eduardo and Maria are (via an hgrc setting) given the option to add a flag when pushing/pulling so that they can push/pull both public and private branches. When Eduardo or Maria eliminate/restructure a private branch, and push with the "both public and private" flag, the private branch on the remote repository is also eliminated/restructured.

Tentative proposal

Hidden branches/heads

Add a special kind of commits/heads/branches that don't get transferred during pulling/pushing/cloning. Let's call them 'hidden' branches.

They'd get transferred only when requested explicitly or when converted to normal heads/branches.

A possible way to mark those heads/branches is using an additional extra field, similar to how closed branches are done. Removing that extra field would "unhide" the head and the branch it tips.

In a way, these branches would be like the reverse of git topic branches, or a way to do an hg clone/pull/push -r exclusion.
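
To make the idea a bit more concrete, here is a minimal, self-contained sketch of how hidden changesets could be excluded when choosing what to push. It is a toy model and not Mercurial code: the Changeset class, the "hidden" extra key and the helper names are all assumptions made for illustration, and the sketch assumes that every changeset on a hidden branch carries the marker and that hidden changesets never have visible descendants. The only real command it refers to is hg push -r REV.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    node: str
    parents: tuple = ()
    extra: dict = field(default_factory=dict)


def is_hidden(cs):
    # Hypothetical marker, modelled on how closed branches use an extra field.
    return cs.extra.get("hidden") == "1"


def visible_heads(changesets):
    """Heads of the graph that remains once hidden changesets are dropped."""
    visible = [c for c in changesets if not is_hidden(c)]
    parents_of_visible = {p for c in visible for p in c.parents}
    return [c for c in visible if c.node not in parents_of_visible]


def push_args(changesets):
    """Build an `hg push` argument list naming only the visible heads."""
    args = ["hg", "push"]
    for head in visible_heads(changesets):
        args += ["-r", head.node]
    return args


if __name__ == "__main__":
    repo = [
        Changeset("a"),
        Changeset("b", ("a",)),
        Changeset("c", ("b",)),                    # public head
        Changeset("h1", ("b",), {"hidden": "1"}),  # hidden branch off b
        Changeset("h2", ("h1",), {"hidden": "1"}),
    ]
    print(push_args(repo))
```

Dropping the marker from h1 and h2 would make h2 show up as a second head to push, which corresponds to "unhiding" the branch. Filtering on pull is not covered here, since that would have to happen on the remote side.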

Applying hidden branches to unify mq/pbranches/attic/shelve

This new kind of branch allows for a safe playground zone without resorting to full separate clones. Because such branches can be used without propagating their changesets to other clones, they're suited to history-altering changes like strip, rebase, qfold, or just in-progress work that's not ready to be published.

With that, the mercurial queues, pbranches or attic/shelve stores could live in the main repo instead of outside repos, which would also ease rebasing or full 3-way merging. mq, pbranch or attic would be just different ways to interface with evolving changesets.

An outline of how mq, pbranches, attic or shelve could be implemented follows:

attic

attic's shelve could move the working dir changes to a commit in a 'hidden' branch, and unshelve would just merge those changes back into the working dir, optionally without committing them. Shelves could still be named and versioned (bookmarks and/or making it a named branch would do the trick). unshelve --delete would be just a strip of the hidden branch that keeps the patch.

pbranches

pbranches could also be used without cloning the main repo to keep tentative patches, since the pbranches could be of the hidden kind.

mercurial queues

mq could store its patch info in a hidden branch rooted at the point (changeset) where the queue is needed. As time passes and new changes are added, the mq branch could be rebased onto a more recent changeset, using the full rebase machinery.

Alternative approach: Overlay repositories

Instead of using hidden branches, another possibility would be using overlay repositories, with similar behaviour.

IRC logs

Edited #mercurial IRC Log 28-01-2009

*       mpm would prefer if there were one non-buggy, easy-to-use union of mq/pbranch/record/shelve/attic.
<pachi> after_fallout: I have the new status and default patch working too
<after_fallout> mpm: I don't think that will quite happen; mq and attic are entirely different directions and not very similar when you actually use them; they are actually decent compliments of each other
<after_fallout> shelve/record/attic maybe could be one
<mpm>   pbranch appears to be a superset of mq, possibly of attic as well.

<mpm>   If there's some reason why we need to have more than one way to handle "storing work as patches", then I'm open to it. But not thrilled about it.

<tonfa> mpm: I guess we at least need to keep a "quilt-like" mode

<mpm>   Yes, but there's no reason that we can generalize it to a DAG or whatever pbranch is doing. Or teach it about chunk granularity like record.
<tonfa> yeah it seems like attic and pbranch could be combined

<mpm>   I've not really looked closely at any of them aside from mq, that's just my view from a distance.

<tonfa> attic is one-level pbranch with nice interactive mode

<after_fallout> I think attic is quite a bit simpler than pbranch too
<after_fallout> it isn't meant for long term patch management
<mpm>   I appreciate that, however..
<muggs> big difference: pbranch work can be pulled/pushed around
<mpm>   It kind of sucks to have one tool for beginners and a second incompatible tool for experts.
<mpm>   (And to fix bugs in them both)
<after_fallout> I agree, I don't think we will have that though
<mpm>   No? What happens when a user that starts with attic outgrows it?

<tonfa> it sure would be nice to have just one extension that does everything, with one or two commands for beginners
<tonfa> and the others for advanced usage
<after_fallout> if the reason for having a patch around changes from "I am working on it but it is gonna sit around here for a bit while I do something else" to "I have to manage it for an extended period of time" then it is there for a different reason
<pachi> if attic could become pbranch when working in "versioned" patches mode...
<after_fallout> yeah something like that
<mgeisler>      tonfa: especially now that the three extensions have borrows a lot of code from each other
<after_fallout> a considerable amount of the shared work can be joined together
<after_fallout> attic uses the record code straight out of record
<mgeisler>      after_fallout: yeah, I'know its not that bad -- it just suggest that things should be merged like mpm says

<after_fallout> or that there is some refactoring to do
<after_fallout> for example see record.py line 68
<after_fallout> class header(object):
<after_fallout>     """patch header
<after_fallout>     XXX shoudn't we move this to mercurial/patch.py ?
<after_fallout>     """
<muggs> in order to support collaboration like pbranch does, the extension would need some way to extend the wire protocol
<tonfa> ideally we should have a hg import --interactive
<tonfa> when that happens it means the record stuff is sufficiently refactored :)

Edited IRC log, conversation between parren and pachi

<pachi> after_fallout is working on attic. I just did some minor contributions, but like it very much
<parren> I can imagine. It sounds attractive.
<parren> See, I think where attic/shelve shine is for quick juggling of several strands of work.
<pachi> though it looks like having something as easy to work with as attic, but which gets into pbranch when it gets versioned looks like an ideal solution
<pachi> yes
<parren> So you want something like a pbranch import from attic?
<pachi> but, from reading the pbranch docs I thought that the only "problem" with it is that it's 'in-history'
<parren> And that is exactly why pbranch is so much more heavyweight than both mq and attic and shelve.
<parren> Why are we discussing this in private?
<pachi> one idea (probably stupid, but that's why I'm asking more competent people like you) is if it would be feasible to tag branches that are used in pbranch like the closed branches
<pachi> using an extra tag that avoids to push/pull them
<pachi> unless tagged to work otherwise
<parren> Well, pulling/pushing these is one of the reasons I created pbranch. For when you want to collaborate on patches.
<pachi> that would get us some sort of "local-only" branches that don't propagate
<pachi> and one could strip them easily
<pachi> so, if one wants to share them, one just untags the branch and they work as normal
<pachi> but they could be kept as a local only thing meanwhile
<parren> Hmm.
<pachi> so shelve/unshelve could work like a pbranch without worry of history polution as it could be possible to reset that branch history (folding)
<pachi> conceptually it's like having a full clone that you can publish when you want
<parren> Are the new closed branches already exempt from push/pull? I doubt it somehow.
<pachi> no
<pachi> I had a look to what pull/push does, and it uses findincoming to find the tips that lack in local that are in remote, so excluding tagged heads could work
<pachi> (or even walking a branch and having a cache)
<parren> Yep, I can imagine.
<pachi> does it make sense?
<parren> Could work. Feels somewhat like git branches.
<pachi> but they're like the inverse concept, you don't add remotes but tag "local-only" branches, as, by default, hg shares all of the branches
<parren> Yes, of course.
<pachi> I think after_fallout is willing to work on something better, and even on record/mq refactoring, but your pbranches look brilliant too
<pachi> I know you're really busy, but do you think a common plan could be layed out?
<parren> What this does not yet address is mq's guards. pbranch cannot do them. And mq is better at juggling patches freely. Swapping the order of two patches with pbranch is, for example, rather hard with pbranch (which is why you should be starting them as parallel patches in the first place).
<pachi> but it looks more like an UI problem, doesn't it (the patchgraph thing IIRC)?
<pachi> or is it a problem of how to get the desired order manipulating the DAG?
<parren> No. If you have A->B->C, then C contains everything in B. Now try swapping to A->C->B. You have to first rebase C on A, then B on C. And preferably in a non-destructive way. 
<pachi> I see
<parren> And guards are even harder. Every change in guard setup would effectively cause rebases recorded in history. Not likely what people want.
<pachi> well, maybe mq still has its uses, but for special cases like those. At least a versioned version of attic/shelve would be good, even without stacked patches
<pachi> maybe extracting those bits makes mq cleaner and stacks are implemented as an addition to the stacked patches
<pachi> something like pbranch, but with an additional series file
<parren> ?
<parren> pbranch has the .hg/pgraph file, which is like .hg/patches/series.
<pachi> so the problem is how the new patches are created, using overlays instead of disjoint patches?
<parren> I like the idea of making attic more akin to starting a series of parallel pbranch patches.
<parren> You've lost me there. What do you mean by "overlay"?
<pachi> yes, that's the idea (mpm suggested that pbranch is like a superset of mq)
<parren> Like, yes. But as I said, mq still has its strong points.
<pachi> as I understand it, the problem with pbranch is that you start a patch an you later don't really want to create a new ppatch on top of it if they need to be reordered later ala mq
<pachi> s/an/and/
<parren> Right. You should start it independently. Which is a kind of bad premise because you're prone to forget.
<parren> It's like doing fixes properly at the root cause so you don't have to backport, only forward-merge. Everybody knows this, but it just does not always happen this way.
<pachi> but if mq manipulations had a separate pbranch that gets done and undone transparently but take each of the pieces of the stack from isolated pbranches, then it could perhaps work
<pachi> s/done and undone/applied and unapplied/
<parren> Only as long as all the patches are truly independent. Usually, some are, some are not.
<pachi> the problem now (as I see it) is that creating new branches that get destroyed are a bad idea because they get shared and then are there forever
<pachi> but if they were not independent wouldn't we gave the same problems than with mq?
<parren> Yes. When you work with pbranch today, you have to work in a separate clone which you plan to throw away eventually (when the patches have been merged upstream).
<pachi> *have
<parren> Yes, with mq you then get failed applies.
<pachi> if that clone were an 'in-repo but only local' branch, it wouldn't be needed
<parren> Correct. So I kind of like this idea.
<pachi> the point is... if we had throw away branches that don't get published... would it allow to do things differently?
<parren> Yes. Maybe we could even say that while you're only using this by yourself, we don't have to use hg branches at all and use bookmarks instead. And flag some bookmarks as non-push/pullable.
<pachi> I had the intutition that it would be so, as hg can do the same as git, but we need clones for strategically hiding branches and that has a cost when trying to work from a given repo, as some operations are problematic without resorting to an external clone
<pachi> yes
<pachi> all existing syntactic sugar could be applied to it
<parren> When you want to share, you migrate them to proper branches. This could be a pbranch command, committing all pbranch bookmarks to pbranch branches.
<pachi> yes, that's it
<parren> Using bookmarks you also won't run into trouble with colliding branch names (say remote has a new branch that collides with one of your private attic branches). But I think we should define a branch namespace for pbranch et al anyway.
<parren> Both bookmarks and pbranch still need the wire protocol change so they can transmit their metadata (bookmark definitions and .hg/pgraph).
<pachi> indeed, it would have it's own space (really, like what mq has) to do the bookkeeping
<parren> Well, maybe not. In this case it probably suffices if local fs->fs clones copied the stuff.
<pachi> but while that doesn't happen we'd only miss how patches relate to each other, wouldn't we?
<pachi> I mean... we'd have all the patches (if they were shared) but not how they get merged in a series
<parren> pbranch actually recreates a good idea of the relations from the recorded dependencies. See http://arrenbrecht.ch/mercurial/pbranch/collab.htm#pagetoc__1_2
<pachi> that would be still very useful, but more similar to a versioned attic
<pachi> I see
<parren> And now that we have branch closing, it could recreate an even better idea.
<pachi> maybe tonfa knows if that 'local only' or 'locked' branches is feasible. IIRC he worked on findincoming...
<parren> I've also dug into it for the shallow clones. I think it can work. After all, it really amounts to the same as `hg push -r x -r y -r z` where xyz are the heads of all non-blocked branches, no?
<pachi> yes, and I even thing (from a very superficial understanding) that it could be just removing the 'local' heads
<parren> So for push it sounds like even an extension could do it, but wrapping `hg push`.
<parren> For pull it's likely harder, since remote has to do the filtering.
<pachi> but pulling from it should work the same
<pachi> I can imagine that that's where the fun begins (how to get the subgraph)
<parren> subgraph meaning? pgraph?
<pachi> no, how to 'obliterate' the non shared parts
<pachi> the non-shared branches could be useful for many other things I suppose
<pachi> besides pbranch
<parren> Sure.
<parren> But I'd really focus on non-shared heads, not branches. More general, and works with bookmarks.
<pachi> fine
<pachi> I was thinking in the same semanthics as branch "closing"
<parren> So: attic becomes a bunch of independent pbranch patches, but based on bookmarks instead of named branches. And we block them from getting out of the repo. And we make pbranch work with both bookmarks and named branches, with an option to promote bookmarks to branches for sharing. Is that the plan?
<pachi> but 'hiding' here instead
<pachi> parren that's it  :) 
<parren> Maybe this could be tied into bookmarks directly. Flag a bookmark as hidden.
<pachi> well, those would be details that depend on the wire protocol allowing the transfering of some bookkeeping
<pachi> even if a named branch had to be used to have a name for patches while that's not available would do
<pachi> IMHO
<parren> Actually, attic would become just a bunch of hidden heads, with no pbranch meta-info needed. But you can seamlessly upgrade from a patch in the attic to making it a full-blown pbranch patch.
<pachi> handling that as metadata or inmutable information is equivalent in some way if we can throw away branches
<pachi> having the easiest UI to start from something like attic and go to a full pbranch is key, IMHO
<parren> Things is, we're talking about hiding these things. So if the things don't get propagated, then we don't need to propagate meta-info about them either, do we?

<pachi> no
<pachi> only if they get shared, that's why how storing the patch names isn't important until they get shared, and probably the bookmark wire extension is needed

<pachi> but the problem of sharing metadata is not a problem of this particular idea
<parren> Well, as soon as we share, we don't them hide anymore, do we? Which is why I say we teach pbranch to work with both bookmarks (non-shared) and named branches (shared).
<parren> So once we share, we don't hide, so there is no hiding metadata to share anymore
<pachi> I'd use only bookmarks, as mpm probably wants to solve the bookmarks sharing problem anyway
<pachi> but, that's not important ATM, IMHO
<pachi> at the very least we'd have what pbranch has now  ;) 
<parren> Right. And since pbranch already does named branches and needs to be taught to use bookmarks, we lose nothing. And once bookmarks can be shared we can maybe drop the named branch support in pbranch. So this might be the time to rename pbranch.  :) 

<parren> How about sending a summary as an RFC to the devel list? With this transcript and excerpts from the one tonfa sent appended?
<pachi> but otherwise I can try. I can edit text better than coding  ;) 
<parren> I'll see if I can find some time tonight. Got to go now. Kids and dinner time. I enjoyed this discussion. Thanks.
<pachi> ok, thanks a lot

Edited IRC log, conversation between after_fallout and pachi

<pachi> hi
<pachi> I've been talking to parren, the pbranch extension author
<pachi> about an idea to unify pbranch/attic/shelve, as mpm suggested

<after_fallout> I had the same thoughts as he does prettymuch with pbranch

<after_fallout> looking at pbranch we seem to have a couple of mostly minor issues
<after_fallout> 1. you have to know in advance when you want parallel branches
<after_fallout> 2. it does things to the history of your repository that get propagated when you push and pull
<after_fallout> mq doesn't really have those problems
<pachi> 1 and 2 can be solved with the 'local only' branches IMHO

<after_fallout> I think you can avoid them in the first place
<pachi> well, they can be avoided if some branches can be hidden from sharing
<pachi> that's at the end what mq does
<pachi> and attic, even if unversioned
<pachi> it just hides information from propagating
<after_fallout> true
<pachi> so you can alter contents without polluting history
<pachi> it just adds an outer repo to do its job, but it could be done in a branch (a specialized branch) to avoid the problems of managing it (qclone, qpush -R and so on)
<pachi> same with versioned attic
<after_fallout> ahh, bookmarks instead of named branches
<pachi> :)

<after_fallout> that would solve some of the problems

<after_fallout> right, in these secondary repos, you don't actually care about the history, only that you have the right version

<after_fallout> you need the history to be able to direct a merge, but you don't need it
<pachi> as they won't pollute anything, and could even be stripped of folded or anything without disturbing other repos
<after_fallout> to see what happened at any given time
<pachi> yes
<pachi> pbranch has a sort of pseries in the form of pgraph
<after_fallout> yeah sorta
<pachi> as I see it it's like an out of band DAG

<pachi> so the relationships between changesets for ordering are done with that instead of using hg itself, which is used to get the real thing, but not to store their relations initially
<after_fallout> more like a set of dags I think
<pachi> a set of dags and a way to relate them using another metadag that does the stacking
<pachi> well, at least that's what I understand...
<after_fallout> the way I understood it, you could start multiple pbranches on different nodes on the repository and never have them relating to each other
<pachi> yes, I think it's like that, though they can relate too, but it's not required
<after_fallout> a set of connected pbranches is a DAG, but the pgraph contains all the sets of pbranches
<after_fallout> which may or may not be all connected
<pachi> that's the problem parren saw to replicate mq behaviour, but it can be done using a new 'local' branch for stacking
<pachi> essentially, that's what mq does with the main repo it versions
<after_fallout> I guess
<pachi> use another repo to do the compositing, but we could do that in the same repo using another local branch (that can be stripped and reapplied at will) and still keep patches standalone