Note:
This page is primarily intended for developers of Mercurial.
PyPy Plan
Status: In Progress
Main proponents: Pierre-YvesDavid, MaciejFijalkowski, MartijnPieter, BryanO'Sullivan
PyPy is a Python interpreter with a tracing JIT compiler. Using it could significantly boost our performance in some situations.
1. Description
The Python language is powerful but slow. Using a JIT compiler can give us a lot of that speed back without having to translate large parts of Mercurial into C. However, there are multiple challenges ahead:
- Startup and JIT warm-up time are an issue because Mercurial command-line invocations are typically short-running,
- PyPy is unable to use our CPython extension code, so it misses many of the optimisations we implemented.
All kinds of material can go here: solution descriptions, alternative solutions, etc.
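The warm-up concern can be made visible with a small timing sketch. The loop body and the names `hot_loop` and `time_batches` are illustrative assumptions, not Mercurial code. Under a tracing JIT the first batches would typically be slower than later ones; under CPython the timings should stay roughly flat.

```python
import time


def hot_loop(n):
    """A stand-in for a hot code path (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i % 7
    return total


def time_batches(batches=5, n=200000):
    """Time the same workload several times in one process.

    Returns one wall-clock duration (in seconds) per batch, in order,
    so the first entries include any JIT warm-up cost.
    """
    timings = []
    for _ in range(batches):
        start = time.perf_counter()
        hot_loop(n)
        timings.append(time.perf_counter() - start)
    return timings
```

A short-running command only ever pays for the first batch, which is why warm-up matters more for the command line than for a long-lived server process.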
2. Current approach
- We plan to mitigate startup and warm-up time using ChgPortingPlan,
- We'll also investigate ways to share/reuse JIT tracing information from one run to another, but it is a long shot,
- We are investigating using CFFI to call our carefully optimised C code. The CPython implementation should stay around in parallel,
- In some cases, improving the pure Python code should be enough.
3. Failing Tests
- Failed test-devel-warnings.t: output changed
- Failed test-strict.t: timed out
- Failed test-setdiscovery.t: output changed
- Failed test-clone-uncompressed.t: output changed
- Failed test-doctest.py: output changed
- Failed test-revset.t: output changed
4. CFFI experiment
Using the CPython API in C makes it hard for PyPy to optimise well. Calling our C code through https://cffi.readthedocs.org/en/latest/ would fix that. However, cffi only allows function calls, not building fully compatible Python objects directly from C. As a result, we probably need to keep a CPython API version and a CFFI version of our code in parallel. CPython will still use its carefully built objects through the CPython API, while PyPy will use pure Python objects that call C functions through CFFI, since it should be able to optimise the pure Python object overhead away.
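A minimal sketch of the "pure Python object calling a C function through CFFI" idea, assuming cffi's ABI mode and the C library's `strlen` as a stand-in for our own C code. The class `cstring` and its method are hypothetical names for illustration only; `ffi.dlopen(None)` loads the standard C library on POSIX systems and is not expected to work everywhere, so the sketch falls back to pure Python when cffi or the library is unavailable.

```python
try:
    from cffi import FFI  # third-party on CPython, bundled with PyPy

    _ffi = FFI()
    # Declare only the C function we intend to call.
    _ffi.cdef("size_t strlen(const char *s);")
    # ABI mode: open the standard C library (POSIX only).
    _libc = _ffi.dlopen(None)
except (ImportError, OSError):
    _libc = None


class cstring(object):
    """Hypothetical pure-Python object; only the hot path crosses into C."""

    def __init__(self, data):
        self._data = data  # bytes

    def length_to_nul(self):
        """Return the number of bytes before the first NUL byte."""
        if _libc is not None:
            # Plain function call into C; no Python object is built in C.
            return _libc.strlen(self._data)
        # Pure-Python fallback when cffi or the C library is unavailable.
        return len(self._data.split(b"\0", 1)[0])
```

The object itself stays in Python, which a JIT can see through, and only a flat function call crosses the boundary. This is the opposite of the CPython API approach, where the object is built in C.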
All our C code must be audited and sorted into the following categories:
4.1. Python Version is Just Fine
The pure version is already as fast as the C version.
- …
Link to example: 'to be added'
4.2. Python Version needs rework
The pure version should be as fast as the C version, but is currently not.
- reachableroot
- …
Link to example: 'to be added'
4.3. Simple Functions that need a cffi version
- mpatch.patches
Link to example: 'to be added'
4.4. Complex Classes that need a cffi version
- lazymanifest ?
- index ?
- …
Link to example: 'to be added'
4.5. C Code calling Python code back
This will be fun.
- …
Link to example: 'to be added'
5. Benchmark
We plan to have a look at the performance of the following commands:
- hg log
- hg log -p
- hg bundle -t gzip -a - the main component of server side clone
- hg unbundle - the main component of client side clone
- hg blame
- updating - hg up null followed by hg up tip
- performance of http server (hg serve)
- hg commit
- hg amend
- hg rebase
- hg status
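A basic harness for timing such commands could look like the sketch below. It is an assumption about how the benchmark might be driven, not an existing Mercurial tool: `COMMANDS`, `time_command` and `noop_command` are hypothetical names, and only commands from the list above that need no extra arguments are included.

```python
import subprocess
import sys
import time

# A few commands from the list above, as argument vectors.
COMMANDS = {
    "log": ["hg", "log"],
    "log -p": ["hg", "log", "-p"],
    "status": ["hg", "status"],
}


def noop_command():
    """A do-nothing command, useful for checking the harness itself
    and for estimating bare interpreter startup time."""
    return [sys.executable, "-c", "pass"]


def time_command(argv, repeat=3, cwd=None):
    """Run argv `repeat` times and return the wall-clock time of each run.

    Output is discarded; a non-zero exit status raises CalledProcessError.
    """
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        subprocess.run(
            argv,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=True,
        )
        timings.append(time.perf_counter() - start)
    return timings
```

For example, `min(time_command(COMMANDS["log"], cwd=repo))` would give a best-of-three figure for `hg log` in a repository at `repo`. Each run is a fresh process, so the figure includes startup and warm-up, which is what a command-line user actually experiences.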
6. Roadmap
- have tests pass,
- add support for a cffi module policy,
- demonstrate cffi usage for each "category":
  - simple function,
  - simple class,
  - code with round trips between Python and C,
  - updated Python code (no CFFI),
- have a basic set of benchmarks,
- have some plan to re-use/share JIT tracing data,
- have a C version for all critical code,
- fill me.
- implement that plan.
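One way to picture a "cffi module policy" is a table mapping a policy name to an ordered list of implementation kinds to try. The sketch below is only an illustration under that assumption: the policy names, `POLICY_ORDER` and `pick_implementation` are hypothetical and do not describe Mercurial's actual loader.

```python
# Hypothetical policy names, each mapped to the implementation kinds
# to try, in order of preference.
POLICY_ORDER = {
    "c": ["c"],                 # CPython extension only
    "allow": ["c", "pure"],     # C if present, else pure Python
    "cffi": ["cffi"],           # cffi only
    "cffi-allow": ["cffi", "pure"],
    "py": ["pure"],             # pure Python only
}


def pick_implementation(policy, available):
    """Return the first available implementation allowed by `policy`.

    `available` maps an implementation kind ("c", "cffi", "pure") to the
    object providing it. Raises ImportError if the policy cannot be met.
    """
    for kind in POLICY_ORDER[policy]:
        if kind in available:
            return available[kind]
    raise ImportError("no implementation available for policy %r" % policy)
```

With only a pure Python version present, `pick_implementation("cffi-allow", {"pure": impl})` returns `impl`, while the strict `"cffi"` policy raises `ImportError`. A strict policy is useful in tests to make sure the intended implementation is really the one being exercised.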