Use Cases

Prepending History

A company converts a repository to Mercurial but doesn't carry over the history of the original repository. For example, they copy the code from a CVS repository into a Mercurial repository as revision 0. Later, they want to convert the old history to Mercurial and expose it together with the existing Mercurial history as a single history, while preserving the existing revision nodes (so a flag day isn't required and people don't have to mass-convert their repos).

Fixing Bad History / Form of commit censorship

Someone did something wrong: they merged against the wrong parent, or they committed something they shouldn't have. People would like a mechanism to amend published history so that the bad commits are removed from it. They could do this by pointing the first good commit after the bad ones at the last good commit before them. For example:

A -> B -> C -> D -> E

Say C is bad. We could rewrite the p1 of D to point to B, omitting C from history.

A -> B -> D -> E

Proposal

The proposal is to create a mechanism to allow "fake parent" data in changelog entries. A changeset will be rewritten to refer to a different parent.

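This will invalidate the hash of the rewritten changeset, because a changeset's hash is computed over its parents as well as its content. However, all descendant changesets will still verify, because they refer to the rewritten changeset by its original node and their own manifests and parents remain valid.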
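The effect on hash verification can be sketched in a few lines of Python. This is an illustration only, not Mercurial code: it assumes the usual revlog node computation (SHA-1 over the two sorted parent nodes followed by the revision text), uses toy strings in place of real changelog entries, and the names `node_hash`, `verifies`, `texts`, `parents` and `nodes` are made up for the example.

```python
import hashlib

NULLID = b"\0" * 20  # null parent: 20 zero bytes


def node_hash(text: bytes, p1: bytes, p2: bytes = NULLID) -> bytes:
    """SHA-1 over the sorted parent nodes followed by the revision text."""
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()


# Build the linear history A -> B -> C -> D -> E with toy changelog text.
# (Real changelog text also carries the manifest node, user, date, etc.)
texts = {name: b"changelog text for " + name.encode() for name in "ABCDE"}
parents = {}  # name -> p1 node recorded for that changeset
nodes = {}    # name -> node recorded for that changeset
p1 = NULLID
for name in "ABCDE":
    parents[name] = p1
    nodes[name] = node_hash(texts[name], p1)
    p1 = nodes[name]


def verifies(name: str) -> bool:
    """Recompute the hash from the recorded parent and compare to the recorded node."""
    return node_hash(texts[name], parents[name]) == nodes[name]


# "Fake parent": point D at B, but keep D's recorded node unchanged.
parents["D"] = nodes["B"]

d_verifies = verifies("D")  # D's recorded node no longer matches its parents
e_verifies = verifies("E")  # E still hashes against D's unchanged recorded node
```

In this toy model only D fails verification after the rewrite. E is unaffected because its hash covers D's recorded node, which the rewrite leaves untouched.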

There will need to be a mechanism to describe which changesets have been rewritten, i.e. which changesets' hashes cannot be trusted. This is doable via several means. However, there are security implications. A MitM could potentially rewrite changesets during transport and mark them as rewritten, so that their hashes are not checked. They could rewrite content at the same time, subtly introducing backdoors or vulnerabilities without the existing hash verification catching it.

Requirement for Generic Parent Rewriting

We'd prefer to only allow rewriting rev 0, for the purpose of prepending history. If we did this, we could rewrite the parents of rev 0 but compute its hash as if its parents were nullid, preserving hash verification. Unfortunately, not all repos could be prepended this way. For example, mozilla-central's rev 0 introduces a .hgignore file and rev 1 is a copy of CVS. Rev 0 doesn't have the full manifest. So rewriting rev 0 would result in a massive diff between the last changeset of the prepended history and rev 0, and another between rev 0 and rev 1. This would interfere with log, blame, etc.
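For comparison, the rev-0-only variant described at the start of this section could look roughly like the sketch below. It is hypothetical: `verify_root` is not an existing Mercurial function, the texts are placeholders, and the node computation (SHA-1 over sorted parents plus text) is assumed to match the usual revlog scheme.

```python
import hashlib

NULLID = b"\0" * 20  # null parent: 20 zero bytes


def node_hash(text: bytes, p1: bytes, p2: bytes = NULLID) -> bytes:
    """SHA-1 over the sorted parent nodes followed by the revision text."""
    lo, hi = sorted((p1, p2))
    return hashlib.sha1(lo + hi + text).digest()


def verify_root(text: bytes, recorded_node: bytes, recorded_p1: bytes) -> bool:
    """Hypothetical check for a prepended rev 0.

    The recorded p1 is deliberately ignored: the hash is computed as if
    the parents were nullid, which is how rev 0 was originally hashed.
    """
    return node_hash(text, NULLID) == recorded_node


# Tip of the newly converted (prepended) history; placeholder text.
old_tip = node_hash(b"last converted CVS changeset", NULLID)

# Original rev 0 of the Mercurial repository, hashed with null parents.
rev0_text = b"initial import"
rev0_node = node_hash(rev0_text, NULLID)

# After prepending, rev 0 records old_tip as its p1.
naive_ok = node_hash(rev0_text, old_tip) == rev0_node        # ordinary check fails
root_ok = verify_root(rev0_text, rev0_node, recorded_p1=old_tip)  # special-cased check passes
```

The attraction of this variant is that no changeset needs to be flagged as untrusted: rev 0 keeps a verifiable hash, and the special case is confined to one well-known revision. As noted above, it does not help when rev 0 is not a full snapshot of the old history's tip.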

