Differences between revisions 1 and 3 (spanning 2 versions)
Revision 1 as of 2016-03-19 22:31:21
Size: 1755
Editor: GregorySzorc
Comment: initial. still needs lots of details
Revision 3 as of 2016-03-19 22:44:50
Size: 2385
Editor: GregorySzorc
Comment: not on why generic parent rewriting needed

Use Cases

Prepending History

A company converts a repository to Mercurial without carrying over the history from the original repository. For example, they copy the code from a CVS repository into a Mercurial repository as revision 0. Later, they want to convert the old history to Mercurial and expose everything as a single history while preserving the existing revision nodes (so that no flag day is required and people don't have to mass-convert repos).

Fixing Bad History / Form of commit censorship

Someone did something wrong: they merged against the wrong parent, or they committed something they shouldn't have. People would like a mechanism to amend published history so that bad commits no longer appear in it. They could do this by pointing the first good commit after the bad one at the last good commit before it. For example:

A -> B -> C -> D -> E

Say C is bad. We could rewrite the p1 of D to point to B, omitting C from history.

A -> B -> D -> E

Proposal

The proposal is to create a mechanism to allow "fake parent" data in changelog entries. A changeset will be rewritten to refer to a different parent.

This will invalidate the hash of the rewritten changeset. However, all descendant changesets will still verify, because the rewritten changeset keeps its node and so their manifests and parents remain valid.
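The effect on verification can be illustrated with a minimal sketch. It assumes node ids are computed the way Mercurial's revlog does it (SHA-1 over the two parent ids in sorted order, followed by the revision text). The dict-based entries and the `build_chain` and `verifies` helpers are hypothetical, and the one-letter texts stand in for real changelog entries, which also carry a manifest node, user, date, and file list.

```python
import hashlib

NULLID = b"\0" * 20  # id of the null revision


def node_hash(text, p1=NULLID, p2=NULLID):
    """SHA-1 over the sorted parent ids followed by the revision text."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


def build_chain(texts):
    """Build a linear history; each entry records its node, parents, and text."""
    entries = []
    parent = NULLID
    for text in texts:
        node = node_hash(text, parent)
        entries.append({"node": node, "p1": parent, "p2": NULLID, "text": text})
        parent = node
    return entries


def verifies(entry):
    """True if the stored node matches the hash of the stored parents and text."""
    return node_hash(entry["text"], entry["p1"], entry["p2"]) == entry["node"]


a, b, c, d, e = build_chain([b"A", b"B", b"C", b"D", b"E"])

# "Fake parent": D keeps its node but its p1 now points at B, skipping C.
d["p1"] = b["node"]

results = {name: verifies(entry) for name, entry in zip("ABDE", [a, b, d, e])}
```

In this sketch only D fails verification. E still verifies because its stored parent is D's unchanged node.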

There will need to be a mechanism to describe which changesets have been rewritten, that is, which changesets have hashes that cannot be trusted. This is doable via several means. However, there are security implications. A MitM could potentially rewrite changesets during transport to indicate that their hashes cannot be trusted. They could also rewrite content at the same time, subtly introducing backdoors or vulnerabilities without existing hash verification catching it.
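The security concern can be made concrete with a small sketch. The `check` function and the `rewritten` set are hypothetical stand-ins for whatever mechanism ends up recording rewritten changesets. The hash function again assumes the revlog-style scheme of SHA-1 over sorted parent ids plus text.

```python
import hashlib

NULLID = b"\0" * 20


def node_hash(text, p1, p2=NULLID):
    """SHA-1 over the sorted parent ids followed by the revision text."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


def check(node, text, p1, p2, rewritten):
    """Classify a changeset as 'ok', 'untrusted' (marked rewritten), or 'corrupt'."""
    if node_hash(text, p1, p2) == node:
        return "ok"
    if node in rewritten:
        # Hypothetical policy: a hash mismatch is tolerated for marked changesets.
        return "untrusted"
    return "corrupt"


p1 = node_hash(b"base", NULLID)
node = node_hash(b"good content", p1)

honest = check(node, b"good content", p1, NULLID, rewritten=set())
tampered = check(node, b"backdoored content", p1, NULLID, rewritten=set())
# An attacker who can also alter the "rewritten" marker avoids the corruption error.
attack = check(node, b"backdoored content", p1, NULLID, rewritten={node})
```

Under these assumptions, tampered content is reported as corrupt only while the marker itself is trustworthy. Once the attacker controls the marker, the same content is merely reported as untrusted.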

Requirement for Generic Parent Rewriting

We'd prefer to allow rewriting only rev 0 in order to prepend history. If we did this, we could rewrite the parents of rev 0 and still compute its hash assuming the parents are nullid, preserving hash verification. Unfortunately, not all repos could be prepended this way. For example, mozilla-central's rev 0 introduces a .hgignore file and rev 1 is a copy of CVS. Rev 0 doesn't have the full manifest. So rewriting rev 0 would result in a massive diff between the last changeset from the prepended history, the initial rev 0, and rev 1. This would interfere with log, blame, etc.
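A minimal sketch of the rev 0 special case, under the same assumed hashing scheme (SHA-1 over sorted parent ids plus text). The names `verify_former_root`, `old_tip`, and `stored_p1` are hypothetical and the texts are placeholders.

```python
import hashlib

NULLID = b"\0" * 20


def node_hash(text, p1=NULLID, p2=NULLID):
    """SHA-1 over the sorted parent ids followed by the revision text."""
    a, b = sorted([p1, p2])
    return hashlib.sha1(a + b + text).digest()


# Tip of the newly converted (prepended) history; placeholder content.
old_tip = node_hash(b"last converted CVS changeset")

# Rev 0 was originally created as a root, so its node was hashed with nullid parents.
rev0_text = b"original rev 0"
rev0_node = node_hash(rev0_text)

# After prepending, the stored parent of rev 0 points at the old history.
stored_p1 = old_tip


def verify_former_root(node, text):
    """Verify a rewritten root by hashing as if its parents were still nullid."""
    return node_hash(text, NULLID, NULLID) == node


strict_ok = node_hash(rev0_text, stored_p1) == rev0_node  # uses the stored parent
root_rule_ok = verify_former_root(rev0_node, rev0_text)   # ignores the stored parent
```

Here the strict check fails but the former-root rule still verifies the content of rev 0. This is why restricting rewriting to rev 0 would be attractive if every repository could be prepended this way.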


CategoryNewFeatures CategoryDeveloper

FakeParentPlan (last edited 2016-03-20 21:27:55 by GregorySzorc)