
Note: This page is primarily intended for developers of Mercurial.

Performance tracking infrastructure

Status: Project

Main proponents: Pierre-YvesDavid PhilippePepiot

Provide a continuous integration infrastructure to measure and prevent performance regressions in Mercurial.

Discussion on devel list: http://marc.info/?t=145863695000002

1. Goal

Mercurial code changes fast, and we must detect and prevent performance regressions as soon as possible. To do so, we want to:

  • Automatic execution of performance tests on a given Mercurial revision
  • Store the performance results in a database
  • Expose the performance results in a web application (with graphs, reports, dashboards, etc.)
  • Provide some regression detection alarms with email notifications
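
As a rough illustration of the last two points, here is a minimal sketch of a regression alarm with e-mail notification. It is not the actual infrastructure: the JSON result format, file name, and addresses are placeholder assumptions.

{{{#!python
# Minimal sketch of a regression alarm, not the actual infrastructure.
# Assumptions (placeholders): results are stored as a JSON list of
# {"revision": ..., "benchmark": ..., "seconds": ...} records in results.json,
# and notifications would be sent through a local SMTP server.
import json
from email.message import EmailMessage
from statistics import median

def is_regression(history, latest, threshold=1.3):
    """Flag a regression when the latest timing exceeds the median of the
    previous runs by more than `threshold` (30% by default)."""
    return bool(history) and latest > threshold * median(history)

def notification(benchmark, revision, latest, baseline):
    msg = EmailMessage()
    msg["Subject"] = f"Performance regression in {benchmark} at {revision}"
    msg["From"] = "perf-bot@example.org"        # placeholder address
    msg["To"] = "mercurial-devel@example.org"   # placeholder address
    msg.set_content(f"{benchmark}: {latest:.4f}s vs baseline {baseline:.4f}s")
    return msg  # a real deployment would send it, e.g. with smtplib

results = json.load(open("results.json"))       # placeholder results database
for bench in {r["benchmark"] for r in results}:
    timings = [r["seconds"] for r in results if r["benchmark"] == bench]
    history, latest = timings[:-1], timings[-1]
    if is_regression(history, latest):
        print(notification(bench, results[-1]["revision"], latest, median(history)))
}}}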

2. Metrics

We already have code that produces performance metrics:

  • Commands from the perf extension in contrib/perf.py
  • Revset performance tests in contrib/revsetbenchmarks.py
  • Unit test execution time

Another idea is to produce metrics from the execution time of annotated portions of the unit tests.

These metrics will be used (after some refactoring of some of the tools that produce them) as performance metrics, but we may also need new ones written specifically for performance regression detection.
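
As an illustration, the sketch below turns one of these existing metric producers (the perftags command from contrib/perf.py) into a single number that a tracking tool can record. The repository and extension paths are placeholders, and timing the whole process (including hg startup) is a simplification, not how the final benchmarks are expected to work.

{{{#!python
# Sketch: time one invocation of an hg perf* command from contrib/perf.py.
# Both paths below are placeholders for a local Mercurial clone.
import subprocess
import time

HG_REPO = "/path/to/mercurial-repo"
PERF_EXT = "/path/to/mercurial-repo/contrib/perf.py"

def run_perf_command(command="perftags"):
    """Return the wall-clock seconds of one hg perf* invocation
    (including interpreter startup, which real tooling would exclude)."""
    start = time.perf_counter()
    subprocess.run(
        ["hg", "--config", f"extensions.perf={PERF_EXT}", command],
        cwd=HG_REPO, check=True, capture_output=True,
    )
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"perftags: {run_perf_command():.4f}s")
}}}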

3. Tool selection

After evaluating several tools (a complete report is available at https://hg.logilab.org/review/hgperf/raw-file/tip/docs/tools.html), we chose to use Airspeed Velocity (ASV), which already handles most of our needs. It is used by the http://www.astropy.org/ projects (numpy) and is written in Python and Javascript (http://www.flotcharts.org/).

This tool aims at benchmarking Python packages over their lifetime. It is mainly a command-line tool, asv, that runs a series of benchmarks (described in a JSON configuration file) and produces a static HTML/JS report.

When running a benchmark suite, ASV takes care of cloning/pulling the source repository into a virtualenv and of running the configured tasks in that virtualenv.

Results of each benchmark execution are stored in a "database" (consisting of JSON files). This database is used to produce plots of the evolution of the time required to run a test (or of any other metric; out of the box, asv supports 4 types of benchmarks: timing, memory, peak memory and tracking), and to run the regression detection algorithms.
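
As an illustration of these benchmark types, ASV discovers benchmarks in a benchmarks/ package and picks the benchmark type from the function name prefix. The sketch below shows a timing and a tracking benchmark; the reference repository path and the trivial hg invocations are placeholders, not the planned benchmark suite.

{{{#!python
# benchmarks/example.py -- sketch of ASV benchmark naming conventions.
# ASV selects the benchmark type from the name prefix:
#   time_*    -> timing benchmark
#   mem_*     -> memory benchmark (size of the returned object)
#   peakmem_* -> peak memory benchmark
#   track_*   -> arbitrary tracked value (with an explicit unit)
import subprocess

HG_REPO = "/path/to/reference-repo"  # placeholder reference repository

def time_log_tip():
    """Wall-clock time of a trivial hg invocation (placeholder workload)."""
    subprocess.run(["hg", "log", "-r", "tip"], cwd=HG_REPO,
                   check=True, capture_output=True)

def track_tip_revision():
    """Example of a tracked (non-timing) metric."""
    out = subprocess.run(["hg", "log", "-r", "tip", "-T", "{rev}"], cwd=HG_REPO,
                         check=True, capture_output=True, text=True)
    return int(out.stdout)

track_tip_revision.unit = "revision"
}}}

In practice the benchmarks would call into Mercurial's own performance code (contrib/perf.py, contrib/revsetbenchmarks.py) rather than shelling out; the subprocess calls above only keep the sketch self-contained.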

One key feature of this tool is that it is very easy for every developer to use it in their own development environment. For example, it provides an asv compare command that compares the results of any two revisions.

However, ASV will require some work to fit our needs:

  • Its main drawback is that it is designed with the commit date as the X axis; we must adapt the ASV code to properly handle the "non-linearity" related to dates (see https://github.com/spacetelescope/asv/issues/390)
  • Fix Mercurial branch handling (see https://github.com/spacetelescope/asv/pull/394)
  • Tags are displayed in the graphs as secondary X-axis labels tied to the commit date of the tag; they should instead be displayed as annotations on the data points
  • Implement a notification system

A demo build with a patched ASV can be seen here: https://hg.logilab.org/review/hgperf/raw-file/454c2bd71fa4/index.html#regressions?sort=3&dir=desc

4. Q & A

  • Q: What revisions of the Mercurial source code should we run the performance regression tool on? (Public changesets on the main branch only? Which branches? ...)

    A: Let's focus for now on public changesets on the two main branches (default and stable).

  • Q: How do we manage the non-linear structure of a Mercurial history?

    A: The Mercurial repository is mostly linear as long as only a single branch is considered, but we don't (and have no reason to) enforce that. For now the plan is to follow the first parent of merge changesets to keep each branch linear.
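
A minimal sketch of the first-parent walk described above, using hg log with revsets; the repository path and branch name are placeholders:

{{{#!python
# Sketch: linearize a branch by walking first parents from its newest public
# changeset, as described in the answer above.  Paths are placeholders.
import subprocess

HG_REPO = "/path/to/mercurial-repo"

def log_node(revset):
    """Return the node hash selected by `revset`, or None if the set is empty."""
    out = subprocess.run(
        ["hg", "log", "-r", revset, "-T", "{node}\n"],
        cwd=HG_REPO, check=True, capture_output=True, text=True,
    ).stdout.strip()
    return out or None

def first_parent_chain(branch="default"):
    """Yield the first-parent ancestry of the newest public changeset on
    `branch`, newest first."""
    node = log_node(f"last(public() and branch('{branch}'))")
    while node:
        yield node
        node = log_node(f"p1({node})")  # follow only the first parent

if __name__ == "__main__":
    for node in first_parent_chain("default"):
        print(node)
}}}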

5. Plan

  • Fix Mercurial branch handling in ASV: https://github.com/spacetelescope/asv/pull/394

  • Use revision instead of commit date as the X axis in ASV (in progress)
  • Provide some ASV benchmark code (starting with revsets) and publish the results in a dedicated public repository
  • Provide Ansible configuration to deploy the tool in the existing buildbot infrastructure and expose the results on a public website when new public changesets are pushed to the main branches
  • Parametrize benchmarks against multiple reference repositories (hg, mozilla-central, ...)
  • Parametrize revset benchmarks with variants (first, last, min, max, ...); see the parametrization sketch after this list
  • Implement a notification system in ASV
  • Add more revset benchmarks
  • Add other benchmarks from contrib/perf.py

  • Add unit test execution time as a benchmark (/!\ we must handle the case where the test itself has changed /!\)

  • Write an annotation system for unit tests and collect execution-time metrics for the annotated portions
  • Write a system of scenario-based benchmarks; they should be written as Mercurial tests (with annotations) and might be kept in a dedicated repository
  • Track both improvements and regressions? A change, especially to revsets, can have a positive or negative impact on multiple benchmarks; having a global view of this information could be a useful feature
  • Discuss the maintenance of the benchmark suites and infrastructure.
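
For the two parametrization items above, here is a sketch of how ASV parametrized benchmarks could express them; the repository paths and the exact revset variants are placeholder assumptions.

{{{#!python
# benchmarks/revsets.py -- sketch of ASV parametrization for the plan items
# about reference repositories and revset variants.  All values are placeholders.
import subprocess

class RevsetSuite:
    # ASV runs each benchmark once per combination of these parameters.
    params = (
        ["/path/to/hg", "/path/to/mozilla-central"],                  # reference repos
        ["first(all())", "last(all())", "min(all())", "max(all())"],  # revset variants
    )
    param_names = ["repo", "revset"]

    def time_revset(self, repo, revset):
        """Time hg log evaluating one revset variant in one reference repository."""
        subprocess.run(["hg", "log", "-r", revset, "-T", "x"],
                       cwd=repo, check=True, capture_output=True)
}}}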


CategoryDeveloper CategoryNewFeatures
