Differences between revisions 29 and 38 (spanning 9 versions)
Revision 29 as of 2018-04-27 03:45:33
Size: 4450
Editor: AugieFackler
Comment:
Revision 38 as of 2019-04-21 15:35:22
Size: 5268
Editor: GregorySzorc
Comment: add section on porting extensions
Line 1: Line 1:
#pragma section-numbers 2
Line 5: Line 7:
This is a status page for keeping track of what needs to be done to make progress on Mercurial on Python 3. Our current aim is to support Python 3.5. This is a status page for keeping track of what needs to be done to make progress on Mercurial on Python 3. Our current aim is to support Python 3.5+.
Line 7: Line 9:
== What Works ==
`hg version`, `hg debuginstall`, `hg init`, `hg files`, `hg manifest`, `hg log`, `hg diff`, `hg export`, `hg status`, `hg summary`, `hg config`, `hg identify`, `hg update`, `hg commit`, `hg branches`, `hg bookmarks` works on Python 3 without using any out of core extensions. These won't work for you if you have out of core extensions enabled. There are certain things which don't work yet with these commands like revset, templates.
<<TableOfContents>>
Line 10: Line 11:
We have many tests passing on Python 3. You can have a look at them in [[https://www.mercurial-scm.org/repo/hg-committed/file/tip/contrib/python3-whitelist|python3-whitelist]] == Status ==
Line 12: Line 13:
== Contributing == We have been testing by installing default mercurial on our system using Python 3. Most of the things work correctly. Things which don't work can be found at BetaBugs section below.
Line 14: Line 15:
We will be happy to review patches and speed up the work related to Python 3. Before you start there are few things related to current porting and how things work currently. Most of our efforts are to make sure have Python 2 compatibility intact while making Python 3 run. We are planning to mark hg 5.0 which is scheduled for May 1 as Python 3 beta release.
Line 16: Line 17:
 * We have a source transformer which does following things on Python 3.
    1. It adds `b''` in front of string starting with `'` or `"` and not having any `u''`, `r''` or `b''` in front.
    2. Adds this line `from mercurial.pycompat import delattr, getattr, hasattr, setattr, xrange, open` on every python file.
    3. Converts every occurrence of `iteritems` to `items` on Python 3.
    4. Converts argument of *attr and encode, decode to unicodes by adding `u''`.
    5. The transformer currently works on `mercurial/, hgext/ and hgext3rd/`.
    6. The transformer code lies [[https://www.mercurial-scm.org/repo/hg/file/295625f1296b/mercurial/__init__.py#l124|here]] and you can also use transformer on your .py files by adding them in the transformer.
If you are an extension author and want to port your extension, [[https://www.mercurial-scm.org/repo/hg-committed/file/tip/mercurial/pycompat.py|pycompat.py]] contains most of our compatibility hacks. If you need help or guidance, you can ask on IRC or the devel mailing list. We will be happy to help you.
Line 24: Line 19:
 * Due to everything is unicodes by default in Python 3, and we need to rely on bytes, we have [[https://www.mercurial-scm.org/repo/hg/file/tip/mercurial/pycompat.py|pycompat.py]] which contains hacks related to various functions of `os` module on Python.
Line 26: Line 20:
 * We also have `encoding.environ` which helps us using a bytes version of `os.environ` on both Python 2 and 3. == Things need to be investigated ==
Line 28: Line 22:
 * We are also adding `r''` at some places to make it a raw string.  * Windows encoding changes
 https://docs.python.org/3/whatsnew/3.6.html#pep-529-change-windows-filesystem-encoding-to-utf-8
 * Lazy importer performance overhead. Our custom importer on Python 2 always returns a stub module during ``import``. Python 3's does I/O to verify the module exists then returns a lazy module that is loaded when first accessed. In addition to behavior differences, the I/O may contribute sufficient performance overhead to constitute a problem.
 * A mechanism for extensions to advertise that they are Python 3 compatible. Nearly every extension will break in Python 3. We may want a mechanism that requires extensions to self-declare that they are Python 3 compatible - possibly via special syntax in their source code or the presence of a well-named variable. It might have to be at the source level because Python 3 would need to evaluate code in order to obtain the value of a module-level variable.
Line 30: Line 27:
 * There are currently two tests which are based on Python 3 compatibility and few checks in our linter [[https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-check-code.t|test-check-code.t]] to make sure new patches include things from `pycompat.py`.
Line 32: Line 28:
    1. [[https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-check-py3-compat.t|test-check-py3-compat.t]] : This test was used initially to fix a lot of things, not very much helpful now.
    2. [[https://www.mercurial-scm.org/repo/hg/file/tip/tests/test-py3-commands.t|test-py3-commands.t]]: This test lists commands which actually works on Python 3. If you want an updated list anytime, the test is the best place to look for.
== Beta bugs ==
Line 35: Line 30:
 * Encoding issues are generally uncovered by our tests (as everything was byte string on Python 2.) Following are things which don't work right now:
Line 37: Line 32:
== How to start ==   * ~1% of tests fail
  * phabricator extension
  * out of core extensions
  * [[https://docs.python.org/3/whatsnew/3.6.html#pep-529-change-windows-filesystem-encoding-to-utf-8|Windows filesystem encoding]]
Line 39: Line 37:
 * You can always set up a virtual environment and run Mercurial in it, but we have a easier way around.
    1. Clone the [[https://www.mercurial-scm.org/repo/hg|main repo]] or [[https://www.mercurial-scm.org/repo/hg-committed/|committed-repo]] and say it's in folder name `hgrepo`.
    2. Have Python 3.5 installed and say you have it in variable `python3.5`.
    3. Run `hgrepo$ python3.5 hg version`. That must work, if not check that you should not have out of tree extensions enabled.
    4. Now you can run any hg command this way and test if it's working or not.
If you find anything apart from this not working, definitely go ahead and edit this page and we will fix it.
Line 45: Line 39:
Pure-python tests are sometimes easier to port, but often need to be ported to use unittest first instead of our legacy testing system. The first step in migrating such tests to Python 3 involves [[https://www.mercurial-scm.org/repo/hg-committed/rev/11d128a14ec0|porting to unittest]], followed by any necessary followups to fix issues on Python 3. A list of tests that probably still need this work done can be obtained by running `comm -23 <(hg files 'set:tests/test*py - grep(unittest)' | sed 's$tests/$$') contrib/python3-whitelist`. == Porting Extensions to Python 3 ==
Line 47: Line 41:
The practice we follow now is run commands which are not yet fixed and try to fix the exceptions raised. So our current approach is exception based. Nearly every extension will need to be ported to be compatible with Python 3. This is because of fundamental differences between Python 2 and Python 3.

The source code for Mercurial extensions will need to be Python 3 native and will need to be compatible with Mercurial's APIs. In many cases, existing source code will compile on Python 3 but will fail at run-time. Sources of run-time errors include:

 * Use of `str` instead of `bytes`. Mercurial uses `bytes` (`b''` strings) in almost all of its APIs and data structures. This is in contrast to much Python code, which uses `str` and `''` strings. It is common for extensions to `b''` prefix most strings in order to remain compatible with Mercurial.
 * Use of `iteritems()`, `iterkeys()`, etc. These methods from core data structures do not exist in Python 3.
 * Import of renamed modules. Python 3 refactored the locations of various modules in the Python standard library. Extensions may need to take this into account.

Do an Internet search for ''Python 3 porting'' to find well-written and comprehensive guides on generically porting code to Python 3.

Extension authors may find the ``mercurial.pycompat`` module useful. This module contains abstractions and utilities for bridging the differences between Python 2 and 3. It is conceptually similar to the `six` Python module.

As of at least the Mercurial 5.0 release, Mercurial uses a custom module importer on Python 3 which rewrites source code dynamically as part of importing modules. This module importer is only active for the `mercurial`, `hgext`, and `hgext3rd` packages. '''Extension loading does not use this custom importer.''' This means that Mercurial's own source code and extensions are not yet native Python 3 source code. So if you look at Mercurial's source code for ideas on how to do something in an extension, behavior in the extension may differ from Mercurial itself due to the presence of this custom module importer. For reference, in the 5.0 release, the custom module importer performs the following actions:

 * Automatically adds `b''` prefixes to strings, making all `''` literals `b''` and effectively changing `str` to `bytes` everywhere. i.e. behavior mostly matches Python 2.
 * Modules automatically have `from mercurial.pycompat import delattr, getattr, hasattr, setattr, open, unicode` added.
 * `getattr()`, `setattr()`, `hasattr()`, `safehasattr()`, `encode()`, and `decode()` functions and methods have string literals in arguments rewritten to the appropriate type because Python requires a `str` value instead of `bytes`. (This effectively selectively undoes the global `''` to `b''` source transformation.)
 * `iteritems()` and `itervalues()` are automatically rewritten to `items()` and `values()`, respectively.

The source rewriting module importer is intended to be a stop-gap to make porting Mercurial to Python 3 simpler and will be removed in a future release. This is why extensions do not use it.

Note:

This page is primarily intended for developers of Mercurial.

Python 3

This is a status page for keeping track of what needs to be done to make progress on Mercurial on Python 3. Our current aim is to support Python 3.5+.

1. Status

We have been testing by installing default Mercurial on our systems using Python 3. Most things work correctly. Things which don't work are listed in the Beta bugs section below.

We are planning to mark hg 5.0, which is scheduled for May 1, as the Python 3 beta release.

If you are an extension author and want to port your extension, https://www.mercurial-scm.org/repo/hg-committed/file/tip/mercurial/pycompat.py contains most of our compatibility hacks. If you need help or guidance, you can ask on IRC or the devel mailing list. We will be happy to help you.

2. Things that need to be investigated

  • Windows encoding changes

    https://docs.python.org/3/whatsnew/3.6.html#pep-529-change-windows-filesystem-encoding-to-utf-8

  • Lazy importer performance overhead. Our custom importer on Python 2 always returns a stub module during import. Python 3's importer does I/O to verify that the module exists, then returns a lazy module that is loaded when first accessed. In addition to the behavior differences, the I/O may add enough overhead to be a problem.

  • A mechanism for extensions to advertise that they are Python 3 compatible. Nearly every extension will break in Python 3. We may want a mechanism that requires extensions to self-declare that they are Python 3 compatible - possibly via special syntax in their source code or the presence of a well-named variable. It might have to be at the source level because Python 3 would need to evaluate code in order to obtain the value of a module-level variable.
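The lazy import behavior described in the list above can be illustrated with the standard library alone. The following is a minimal sketch based on `importlib.util.LazyLoader`, not Mercurial's actual importer; the helper name `lazy_import` is hypothetical. It shows where the I/O happens (`find_spec` must locate the module on disk) and that the module body only runs on first attribute access.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose body runs only on first attribute access.

    Hypothetical helper for illustration; it is not part of Mercurial.
    """
    # find_spec() searches the import path, which is the I/O mentioned above.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    # Wrap the real loader so that execution is deferred.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only prepares the module; it does not run its code.
    loader.exec_module(module)
    return module


# The module is located immediately but only loaded when an attribute is used.
colorsys = lazy_import("colorsys")
hue_saturation_value = colorsys.rgb_to_hsv(1, 0, 0)
```

A Python 2 style stub importer would skip the `find_spec()` step entirely, which is why a missing module is reported at different times under the two approaches.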

3. Beta bugs

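Following are things which don't work right now:

  • ~1% of tests fail

  • phabricator extension

  • out of core extensions

  • Windows filesystem encoding (https://docs.python.org/3/whatsnew/3.6.html#pep-529-change-windows-filesystem-encoding-to-utf-8)

If you find anything else that does not work, please edit this page and we will fix it.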

4. Porting Extensions to Python 3

Nearly every extension will need to be ported to be compatible with Python 3. This is because of fundamental differences between Python 2 and Python 3.

The source code for Mercurial extensions will need to be Python 3 native and will need to be compatible with Mercurial's APIs. In many cases, existing source code will compile on Python 3 but will fail at run-time. Sources of run-time errors include:

  • Use of str instead of bytes. Mercurial uses bytes (b'' strings) in almost all of its APIs and data structures. This is in contrast to much Python code, which uses str and '' strings. It is common for extensions to b'' prefix most strings in order to remain compatible with Mercurial.

  • Use of iteritems(), iterkeys(), etc. These methods from core data structures do not exist in Python 3.

  • Import of renamed modules. Python 3 refactored the locations of various modules in the Python standard library. Extensions may need to take this into account.
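The three sources of run-time errors listed above can be shown in a few lines. This is an illustrative sketch only: the function names and the sample data are made up for the example and are not Mercurial APIs. It assumes Python 3.5+, where `%` formatting on `bytes` is available.

```python
# Renamed module: urlparse (Python 2) became urllib.parse (Python 3).
try:
    from urllib.parse import urlsplit  # Python 3 location
except ImportError:
    from urlparse import urlsplit  # Python 2 location


def format_settings(settings):
    """Render a dict of bytes keys and values as b'key=value' lines."""
    lines = []
    # items() exists on both Python 2 and 3; iteritems() is Python 2 only.
    for key, value in sorted(settings.items()):
        lines.append(b"%s=%s" % (key, value))
    return b"\n".join(lines)


def mixing_str_and_bytes_fails():
    """Return True if combining bytes and str raises TypeError."""
    try:
        b"path: " + "file.txt"
    except TypeError:
        return True
    return False


rendered = format_settings({b"username": b"alice", b"editor": b"vi"})
host = urlsplit("https://www.mercurial-scm.org/repo/hg").netloc
```

On Python 2 the `b''` prefix is a no-op, so code written this way keeps working there while behaving correctly on Python 3.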

Do an Internet search for Python 3 porting to find well-written and comprehensive guides on generically porting code to Python 3.

Extension authors may find the mercurial.pycompat module useful. This module contains abstractions and utilities for bridging the differences between Python 2 and 3. It is conceptually similar to the six Python module.

As of at least the Mercurial 5.0 release, Mercurial uses a custom module importer on Python 3 which rewrites source code dynamically as part of importing modules. This module importer is only active for the mercurial, hgext, and hgext3rd packages. Extension loading does not use this custom importer. This means that the source code of Mercurial itself, and of the extensions in those packages, is not yet native Python 3 source code. So if you look at Mercurial's source code for ideas on how to do something in an extension, the same code may behave differently in your extension because the custom module importer is not applied to it. For reference, in the 5.0 release, the custom module importer performs the following actions:

  • Automatically adds b'' prefixes to strings, making all '' literals b'' and effectively changing str to bytes everywhere. i.e. behavior mostly matches Python 2.

  • Modules automatically have from mercurial.pycompat import delattr, getattr, hasattr, setattr, open, unicode added.

  • getattr(), setattr(), hasattr(), safehasattr(), encode(), and decode() functions and methods have string literals in arguments rewritten to the appropriate type because Python requires a str value instead of bytes. (This effectively selectively undoes the global '' to b'' source transformation.)

  • iteritems() and itervalues() are automatically rewritten to items() and values(), respectively.
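To make the effect of these rewrites concrete, here is a toy, token-based rewriter covering only two of the actions listed above: prefixing unprefixed string literals with b and renaming iteritems()/itervalues(). It is a simplified sketch under stated assumptions, not Mercurial's importer: the function name `rewrite_source` is hypothetical, and it ignores the `getattr()`/`encode()` exceptions, the injected pycompat import, and docstrings.

```python
import io
import tokenize

# Method renames applied by this toy rewriter.
RENAMES = {"iteritems": "items", "itervalues": "values"}


def rewrite_source(source):
    """Return source with '' literals made b'' and iter* methods renamed."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    edits = []  # (row, col, length_to_replace, replacement_text)
    for index, tok in enumerate(tokens):
        if tok.type == tokenize.STRING and tok.string[0] in "'\"":
            # No u/r/b prefix present, so insert a b in front of the quote.
            edits.append((tok.start[0], tok.start[1], 0, "b"))
        elif (
            tok.type == tokenize.NAME
            and tok.string in RENAMES
            and index > 0
            and tokens[index - 1].string == "."
        ):
            # Only rename when used as an attribute, e.g. d.iteritems().
            edits.append(
                (tok.start[0], tok.start[1], len(tok.string), RENAMES[tok.string])
            )
    lines = source.splitlines(keepends=True)
    # Apply edits from the end backwards so earlier offsets stay valid.
    for row, col, length, text in sorted(edits, reverse=True):
        line = lines[row - 1]
        lines[row - 1] = line[:col] + text + line[col + length:]
    return "".join(lines)


original = "for k, v in d.iteritems():\n    show('%s' % k, u'x', r'y')\n"
rewritten = rewrite_source(original)
```

Because extensions are loaded without any such rewriting, an extension has to contain the rewritten form (explicit b'' prefixes, items() instead of iteritems()) in its own source.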

The source rewriting module importer is intended to be a stop-gap to make porting Mercurial to Python 3 simpler and will be removed in a future release. This is why extensions do not use it.


CategoryAudit

Python3 (last edited 2023-02-19 16:08:38 by AntonShestakov)