Command Server Project Proposal
Name: Idan Kamara
Contact: idankk86@gmail.com, idank on #mercurial
Background: I'm a Computer Science and Mathematics undergraduate at the Open University of Israel.
My first encounter with Mercurial was at my previous workplace. When I started working there, the development teams were using a (godforsaken) source control system called StarTeam. I quickly grew tired of it and started looking for alternatives to take its place. Among the top DVCSs at the time, my absolute favorite was Mercurial, thanks to its user-friendly approach, gentle learning curve, cross-platform support, and very open and helpful community.
Since then I've been following Mercurial looking for opportunities to give back. GSoC looks like a great one.
Most of my programming experience is in C++; I've also done some Java and C# here and there. I've been using Python for a lot of small tasks over the past couple of years, but I've always wanted to see how a real application is written in it, and in my opinion Mercurial is an excellent example of one.
Project title: Command Server
Synopsis: Mercurial's primary stable API is its command line interface. Creating a tool and library to communicate with this API over a pipe or a socket will help improve performance for third-party tools that use Mercurial.
Benefits to Mercurial:
- As stated above, improving performance (saving process startup time, caching the repository object).
- It can also benefit regular users by having a small, fast client talk to the server directly.
- Easier integration with Mercurial for programming languages other than Python.
- Allow a remote repository to be queried without needing a local clone.
Deliverables:
- A functioning server.
- A sample client (possibly a tiny C program that simply forwards its argv to the server and can be used by regular users instead of hg's main Python module).
- Make the test suite pass while using the server rather than talking to hg directly.
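To give a feel for how small the sample client could be, here is a rough Python sketch (the real one might well be in C). It assumes the "<path-to-repository>;<command-line>" request and "<exit-status>;<output>" response strings discussed under Project details, one request per connection, and UTF-8 text; the function name forward is hypothetical, and none of this is a settled design.

```python
import shlex
import socket


def forward(sock, repo_path, argv):
    """Send one command to the server and return (exit_status, output).

    Assumes a hypothetical wire format: the request is
    '<path-to-repository>;<command-line>' and the reply is
    '<exit-status>;<output>', with one request per connection.
    """
    command_line = " ".join(shlex.quote(arg) for arg in argv)
    request = "%s;%s\n" % (repo_path, command_line)
    sock.sendall(request.encode("utf-8"))
    # Half-close the connection so the server knows the request is complete.
    sock.shutdown(socket.SHUT_WR)

    # Read the reply until the server closes its side.
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        chunks.append(data)

    # Split on the first ';' only, since the output may contain ';' too.
    status, _, output = b"".join(chunks).decode("utf-8").partition(";")
    return int(status), output
```

A command-line wrapper would connect a socket to the server, call forward(sock, os.getcwd(), sys.argv[1:]), print the output and exit with the returned status.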
Project details:
When integrating with Mercurial, the approach recommended by the Mercurial team is to use the command-line interface. Mercurial goes to great lengths to make sure the command-line interface doesn't change very often, so that existing tools that rely on it keep working across upgrades. The other, unrecommended option (available to Python applications) is to use Mercurial's internal API [1], yielding better performance and more control at the cost of possibly breaking between releases of Mercurial (examples of such tools can be seen here [2], [5]).
The command server will aim to be the best of those two worlds. It will remain stable across Mercurial releases and offer better performance than calling the command-line interface directly.
Existing tools I've looked at (MercurialEclipse, TortoiseHg, VisualHG, MacHG etc.) take the recommended approach and use the command-line interface. This is done by opening a process for every hg command. Tools written in Python usually import Mercurial and call it directly (saving process creation).
The specifics of the requests to the server are yet to be determined, but an initial thought is something of this sort: "<path-to-repository>;<command-line>". The server's answer might look like this: "<exit-status>;<output>". A Mercurial developer has already made an attempt (hgrpc, source here [4]) to write something that behaves roughly like that. It saves process creation but doesn't offer anything beyond that in terms of performance. It can be improved in that regard by caching the repository object for each path it serves and reusing it in subsequent requests. The ui object might be handled in a similar way.
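The request/response strings and the repository cache described above could be sketched roughly as follows. This is only an illustration of the idea: the names parse_request, format_response and RepoCache are hypothetical, and open_repo stands in for whatever Mercurial call ends up creating the repository object.

```python
import os
import shlex


def parse_request(line):
    """Split '<path-to-repository>;<command-line>' into (path, argv)."""
    # Split on the first ';' only. A path that itself contains ';' would
    # need an escaping rule, which this sketch does not define.
    path, sep, command_line = line.rstrip("\n").partition(";")
    if not sep or not path:
        raise ValueError("malformed request: %r" % line)
    return path, shlex.split(command_line)


def format_response(exit_status, output):
    """Build the '<exit-status>;<output>' reply."""
    return "%d;%s" % (exit_status, output)


class RepoCache:
    """Keep one repository object per path and reuse it across requests."""

    def __init__(self, open_repo):
        # open_repo is a hypothetical callable: path -> repository object.
        self._open_repo = open_repo
        self._repos = {}

    def get(self, path):
        key = os.path.abspath(path)
        if key not in self._repos:
            # Only the first request for a path pays the cost of opening it.
            self._repos[key] = self._open_repo(key)
        return self._repos[key]
```

The sketch leaves out what happens when a repository changes on disk between requests; a real server would need some way to notice that and refresh the cached object.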
There are many hg commands that give the user meaningful output (status, log, diff...) beyond an exit status. The tedious part is parsing that output, and this work is bound to be duplicated among tools. In this regard, Mercurial helps by offering a way to customize its output (as explained here [3]). Tools use this facility to arrange the output of commands such as 'hg log' in a way that suits their needs. We might be able to use this to provide output that can be parsed more easily.
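As an illustration of how tools lean on templates today, the sketch below builds an 'hg log' command line with a tab-separated template and parses the resulting text. The template keywords (rev, node, author, desc with the firstline filter) are standard ones; the choice of tab as a separator and the names LOG_TEMPLATE, log_argv and parse_log are assumptions made for this example.

```python
# One changeset per line, fields separated by tabs. Mercurial itself
# interprets the \t and \n escapes inside the template string.
LOG_TEMPLATE = r"{rev}\t{node}\t{author}\t{desc|firstline}\n"


def log_argv(limit):
    """Return the command line a tool would run for the last `limit` changesets."""
    return ["hg", "log", "-l", str(limit), "--template", LOG_TEMPLATE]


def parse_log(output):
    """Turn the templated 'hg log' output into a list of dicts."""
    entries = []
    for line in output.split("\n"):
        if not line:
            continue
        # Split at most three times so that a tab inside the summary
        # stays part of the summary. A tab in the author field would
        # still confuse this sketch.
        rev, node, author, summary = line.split("\t", 3)
        entries.append({
            "rev": int(rev),
            "node": node,
            "author": author,
            "summary": summary,
        })
    return entries
```

Every tool that wants structured log data ends up writing something like parse_log, which is the duplication the command server could take over.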
The above request/response 'protocol' is quite simple and can be taken further, perhaps by introducing a small protocol for the output of various hg commands. For instance, a response to an 'hg status' might look like this in JSON (or some other suitable format):
{
  "exitcode": 0,
  "modified": [
    "name": "foo",
    ..
  ..
}
Basically this will remove all the boilerplate code tools need to write to parse output of certain hg commands (the server will have to take care of that) by having it in a nice data structure. But going this route means that the transition from how tools integrate with Mercurial today won't be as seamless.
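A minimal sketch of what the server-side conversion for 'hg status' might look like, assuming the default output format of one "<code> <filename>" line per file. The "exitcode" and "modified" keys and the "name" field follow the example above; the remaining key names and the function name status_to_dict are assumptions, not a proposed final format.

```python
import json

# Status codes printed by 'hg status', mapped to hypothetical JSON keys.
STATUS_KEYS = {
    "M": "modified",
    "A": "added",
    "R": "removed",
    "C": "clean",
    "!": "missing",
    "?": "unknown",
    "I": "ignored",
}


def status_to_dict(exitcode, output):
    """Convert plain 'hg status' output into a JSON-friendly dict."""
    result = {"exitcode": exitcode}
    for line in output.split("\n"):
        # Expect '<code><space><filename>'; skip anything else, such as
        # the indented copy-source lines printed with --copies.
        if len(line) < 3 or line[1] != " ":
            continue
        key = STATUS_KEYS.get(line[0])
        if key is None:
            continue
        result.setdefault(key, []).append({"name": line[2:]})
    return result


if __name__ == "__main__":
    print(json.dumps(status_to_dict(0, "M foo\n? notes.txt\n"), indent=2))
```

A client in any language with a JSON library could then read the result without knowing anything about Mercurial's text output.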
The command server also opens up the possibility of querying a Mercurial repository without having to clone it locally (or having Mercurial installed, for that matter), similar to what hgweb offers. Some tools only need read access to a repository. They can benefit by talking to the command server (which will run on the centralized server) rather than keeping a local clone that is constantly being updated. Another idea could be for GUI tools to add a 'repository explorer' that lets the user explore the tree, logs and diffs of a remote repository (somewhat similar to SVN's repository explorer).
I see the command server being useful mostly for applications written in languages other than Python. But Python applications that choose not to mess with Mercurial's internal API can also gain some performance improvements.
[1]: http://mercurial.selenic.com/wiki/MercurialApi
[2]: https://developers.kilnhg.com/Repo/Kiln/Group/Kiln-Storage-Service
[3]: http://hgbook.red-bean.com/read/customizing-the-output-of-mercurial.html
[4]: https://bitbucket.org/wbruna/hgrpc/
[5]: http://trac.edgewall.org/browser//plugins/0.10/mercurial-plugin/tracvc/hg/backend.py
Project schedule:
April 26th - May 22nd
Familiarize myself with Mercurial's command-line interface.
Discuss with my mentor and the community to set goals for what the finished project should look like.
May 23rd - July 10th
Finish the design after community comments and write it up in a wiki page.
Code the command server.
Make the necessary Mercurial internal modifications to support performance improvements.
July 11th - July 15th
Mid-term evaluations. Continue coding.
July 16th - August 15th
Finish last tasks in the command server.
Write test cases.
Integrate the command server with the existing test suite.
August 30th
Project ends. Submit code to Google.
Exams and other commitments: Because the summer break here doesn't overlap with the one in the US, I will still be in the middle of a semester. I'm only taking one course, so I don't think it will hamper my ability to finish the project successfully.
Other summer plans: I also plan on taking a 10-day vacation in early May. Since coding begins May 23rd, I plan on putting in more time before I go on vacation so I don't fall behind.
Post GSoC plans: I will gladly continue my involvement in Mercurial after GSoC. I've been following several open source projects through the years but never really got involved in any. I think Mercurial is a great opportunity for me due to my growing interest in Python and in open source software in general.