Differences between revisions 7 and 8
Revision 7 as of 2010-08-12 17:24:28
Size: 2993
Editor: RenatoCunha
Comment: Corrected the lists
Revision 8 as of 2010-08-12 19:50:05
Size: 4367
Editor: RenatoCunha
Comment: Updated the current status and some bits of the design
Line 16: Line 16:
"Port" is between quotes because this is not a complete portm or a rewrite: we "Port" is between quotes because this is not a complete port or a rewrite: we
Line 40: Line 40:
== Current implementation == === "Design" of the port ===
Line 42: Line 42:
Following the suggestion given by mpm in a
[[http://selenic.com/pipermail/mercurial-devel/2010-June/022363.html|message to the development list]],
the approach used in this port consisted in:
Line 43: Line 46:
 a. teach 2to3 to change all strings in the source into bytestrings
 a. fix up the annoying b"A"[0] = 65 behavior
 a. make the minimum amount of other source changes to get it working under 3.x

The decision pointed out in a) is ok in mercurial's code because "There are
basically no Unicode objects "in the wild" in Mercurial. Their usage is more or
less restricted to a couple transcoding function in encoding.py where they
can't hurt anybody." <<FootNote(From http://selenic.com/pipermail/mercurial-devel/2010-June/022255.html)>>

== Status (Milestones) ==

 1. Port of the core C modules to py3k (./)
 2. Port of inotify's C modules to py3k (./)
 3. Removal of most the warnings issued by python2.6 run with the -3 switch (./)
 4. Implementation of a setup.py-like script that calls 2to3 with our custom fixers (./)
 5. Implementation of a fixer that translates strings into bytestrings (./)
 6. Implementation of a fixer to handle formatting with bytes (b'%s' % 'foo') (./)
 7. Implementation of a fixer to module name changes not shown by 2.6 (implemented, but not applied)
Line 73: Line 94:

== Where to go from now? ==

bytesformatter improved someday

== Notes ==

Status of the "port" of Mercurial to Py3k

This document describes the current status of mercurial's Py3k port. The work described here was developed as part of the Google Summer of Code 2010 program.

1. Summary

Last milestone: "hg manifest" runs successfully.

Current development: Documentation & Improvement of the fixers to generalize the manual edits

2. Objective and constraints

This project's objective is quite clear: to "port" mercurial to py3k. "Port" is in quotes because this is neither a complete port nor a rewrite: we want to make mercurial run on py3k while maintaining compatibility with python 2.x. There is an additional constraint, though: mercurial supports python 2.4 and later, which means the features introduced in 2.6 to ease the porting process can't really be used in the port. Also, refactoring the code so that a single code base works in both python 2 and 3 proved to be too much work because:

  1. It would be troublesome to write code that runs unmodified on several python versions;
  2. It would be a maintenance hell.

Thus, we came to the conclusion that extending 2to3, the python refactoring tool, was the way to go. So, to summarize the port's objective and constraints:

  • We want to make hg run on py3k;
  • 2to3 is being used for that;
  • We must maintain support for python 2.4 and above.

An important aspect of the approach taken is that we work "from the inside out". This means we started by porting the core C modules, then the extension C modules (inotify only, currently), then removed most of the warnings issued by python 2.6 in "3 mode" (a mode that issues warnings for deprecated modules and other incompatible changes), and only then started working on fixing the python code.

2.1. "Design" of the port

Following the suggestion given by mpm in a message to the development list (http://selenic.com/pipermail/mercurial-devel/2010-June/022363.html), the approach used in this port consisted of:

  1. teach 2to3 to change all strings in the source into bytestrings
  2. fix up the annoying b"A"[0] == 65 behavior (in py3k, indexing a bytestring yields an integer)
  3. make the minimum amount of other source changes to get it working under 3.x

The decision in item 1 is acceptable for mercurial's code because "There are basically no Unicode objects "in the wild" in Mercurial. Their usage is more or less restricted to a couple transcoding function in encoding.py where they can't hurt anybody." (From http://selenic.com/pipermail/mercurial-devel/2010-June/022255.html)
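The behavior mentioned in item 2 is easy to reproduce under python 3. The snippet below is only an illustration, not code from the port: byte_at is a hypothetical helper name, and it shows one possible way to get a one-byte bytestring back instead of an integer.

```python
# In python 3, indexing a bytes object yields an int, while slicing
# yields a bytes object of length one.
data = b"A"
first_as_int = data[0]       # an integer, not a one-character string
first_as_bytes = data[0:1]   # a bytes object


def byte_at(data, index):
    """Return the byte at `index` as a bytes object of length one.

    Hypothetical helper, not part of mercurial. Building the result from
    a one-element list keeps negative indexes and IndexError working the
    same way as plain indexing.
    """
    return bytes([data[index]])
```

Code that compares data[0] against a one-character string literal behaves differently once the literal becomes a bytestring, which is why this case needs explicit handling.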

3. Status (Milestones)

  1. Port of the core C modules to py3k (done)
  2. Port of inotify's C modules to py3k (done)
  3. Removal of most of the warnings issued by python2.6 when run with the -3 switch (done)
  4. Implementation of a setup.py-like script that calls 2to3 with our custom fixers (done)
  5. Implementation of a fixer that translates strings into bytestrings (done)
  6. Implementation of a fixer to handle formatting with bytes (b'%s' % 'foo') (done)
  7. Implementation of a fixer for module name changes that python 2.6 does not warn about (implemented, but not applied)
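To give an idea of what the string-to-bytestring translation in milestone 5 involves, here is a minimal sketch. It is an assumption-laden illustration: it uses the standard tokenize module rather than a 2to3 fixer, and add_bytes_prefix is a hypothetical name, so it should not be read as the fixer used in the port.

```python
import io
import tokenize


def add_bytes_prefix(source):
    """Return `source` with a b prefix added to every plain string literal.

    Hypothetical sketch, not mercurial's fixer. Literals that already
    carry a b, u or f prefix are left untouched.
    """
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Walk backwards so that inserting a character never shifts the
    # column offsets of tokens that still have to be processed.
    for tok in reversed(tokens):
        if tok.type != tokenize.STRING:
            continue
        body = tok.string.lstrip("rRbBuUfF")
        prefix = tok.string[:len(tok.string) - len(body)].lower()
        if set(prefix) & set("buf"):
            continue
        row, col = tok.start  # rows are 1-based, columns are 0-based
        line = lines[row - 1]
        lines[row - 1] = line[:col] + "b" + line[col:]
    return "".join(lines)
```

For example, add_bytes_prefix("x = 'foo'\n") returns "x = b'foo'\n". Note that this sketch also converts docstrings, which a real fixer may want to treat differently.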

3.1. Source code

Most of the code developed in this project has already been imported into mercurial's official repository, which means that pulling from it will give you updated code that is known to mostly work. Additionally, you can clone Renato Cunha's patch queue if you want to test more experimental code that hasn't been imported into mercurial yet.

3.2. How to run it

Highly experimental

This page describes a highly experimental feature that hasn't been completed yet. It is most useful for enthusiasts who want to know the status of the port and/or are willing to help with it.

From mercurial's source root, you can run:

python3 contrib/setup3k.py build_ext -i build_py -c -d . build_mo

This is equivalent to running "make local" in hg's source root, except that the python3 interpreter is used and the python source code is preprocessed by 2to3 before the command exits. The command takes approximately three minutes to run on a five-year-old Athlon64 3000+.

4. Where to go from now?

bytesformatter should be improved someday.
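Assuming bytesformatter refers to the handling of formatting with bytes listed under Status, the problem it addresses is that the bytes type in py3k originally had no % operator (it only came back in python 3.5). The sketch below shows one possible way to emulate it; bytesformat is a hypothetical name and this is not the code used in the port.

```python
def bytesformat(fmt, args):
    """Emulate fmt % args for a bytes format string.

    Hypothetical sketch. latin-1 maps every byte value 0-255 to the code
    point with the same number, so decoding, formatting and encoding
    again does not alter any byte. Non-bytes arguments whose text form
    falls outside latin-1 would make the final encode fail.
    """
    if not isinstance(args, tuple):
        args = (args,)
    decoded = tuple(
        a.decode("latin-1") if isinstance(a, bytes) else a for a in args
    )
    return (fmt.decode("latin-1") % decoded).encode("latin-1")
```

With such a helper, an expression like b'%s:%d' % (name, rev) could be rewritten as bytesformat(b'%s:%d', (name, rev)).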

5. Notes

Py3kPort (last edited 2012-10-25 20:48:22 by mpm)