Differences between revisions 16 and 17
Revision 16 as of 2016-01-25 23:23:36
Size: 11493
Editor: JunWu
Comment: Update to reflect discussion at https://bitbucket.org/yuja/chg/pull-requests/4
Revision 17 as of 2016-01-26 19:19:53
Size: 11488
Editor: JunWu
Comment:

Note:

This page is primarily intended for developers of Mercurial.

cHg Porting Plan

Steps to merge cHg into the Mercurial tree, plus possible future improvements.

1. How to Merge

Perhaps (B), the second option below.

  1. single big commit
    • {o} hard to review

    • {o} useless history for digging

  2. reorganize as a few patches (base implementation, pager, setenv, sendfds, ...)
    • {*} can improve the code incrementally

    • {o} less useful history for digging

  3. pull, rename and merge
    • {o} results in more than one root

  4. convert with filemap and rebase
    • {*} full history is useful when digging into bugs

    • {o} 300+ uninteresting revisions

2. Source Layout

Perhaps (A) or (B) because we don't have to worry about the extension path.

Original:

README
hgext/chgserver.py
      chgutil.c
src/Makefile
    chg.c
    hgclient.[ch]
    util.[ch]

A. Merge extension part into hgext:

contrib/chg/Makefile
            README
            chg.c
            hgclient.[ch]
            util.[ch]
hgext/chgserver.py
mercurial/osutil.c <- chgutil.c

B. Merge extension part into core:

contrib/chg/Makefile
            README
            chg.c
            hgclient.[ch]
            util.[ch]
mercurial/chgserver.py
mercurial/osutil.c <- chgutil.c

C. Put everything under contrib/chg:

contrib/chg/hgext/...
            src/...

3. Coding Style

Current state:

  • .py mostly follows the Mercurial style

  • .c is similar to the Mercurial style, but

    • uses 4 spaces instead of tabs
    • uses C99 comments
    • uses C99 variable declarations

What to do:

  • update import lines (easy)

  • replace 4 spaces with tabs (easy)
  • replace C99 comments (easy)
  • move variable declarations to top (really?)
    this appears non-trivial; I won't do it for the contrib/chg sources

4. Future Improvements

4.1. Handling config change

To behave exactly like the original hg, the server needs to react when the "config" (config files, environment variables, etc.) changes.

This can be tricky with config items like extensions.*, since it's impossible to undo the side effects caused by pre-loading an extension.

4.1.1. Places affecting config

  • Config files (Note: $HGRCPATH can override where we look up configs)

  • Command line arguments like --config, --repo, --cwd (--repo and --cwd can affect where we read .hg/hgrc)

  • Current directory (like --cwd)

  • Environment variables: LD*, TZ, HG*, LANG, LC_*

  • Python source files, like __version__, or even extension files (used by developers)

Be careful with config files: if they are read in different processes or at different times, there can be race conditions.

4.1.2. What to do on config change

Restart? Reload? Multiple processes?

On config change, it isn't always necessary to restart the server. For example, ui.style can be reloaded by recreating the ui object. On the other hand, extensions can't be unloaded. Anyway, restarting the server aggressively won't be a problem, assuming that the config files don't change too often.

Environment variables are also a problem. The two things above (config files and version) change on a "global event" basis: at some point in time the config/version changes and all running servers need to be restarted. However, environment variables can change more often, and can differ between two shells that use Mercurial at the same time. We probably need multiple servers, keyed by environment, here.

Considering that a user can disable or change extensions in $REPO/.hg/hgrc, multiple processes are the way to go. Otherwise, if the user is working on multiple repos with different configs at the same time, the server may restart frequently.

The basic idea is to have different processes listening on different socket files. However, once filesystem race conditions are considered, reload or restart still makes sense; otherwise we would need to verify that two processes see the same files.

Putting it all together: multiple processes listen on different socket files. The socket paths are decided by everything in "Places affecting config" except for files. These processes are responsible for restarting themselves when config (or source code) files change.

See "How to detect config change?", "How to handle Environment Variables difference", "Who restarts the server?" below for possible implementation details.
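As a rough illustration of how a socket path could be derived from the non-file inputs, here is a minimal Python 3 sketch. The function name worker_socket_path, the use of SHA-1, the 12-character truncation and the "worker-" file name prefix are assumptions made for illustration; they are not taken from the chg sources.

```python
import hashlib
import os


def worker_socket_path(base_dir, sensitive_env, config_args, cwd):
    """Derive a worker socket path from non-file config inputs (hypothetical).

    sensitive_env: dict of environment variables that affect config
    config_args:   list of command line arguments such as --config values
    cwd:           the directory the command runs in
    """
    h = hashlib.sha1()
    # Sort the environment so that dict ordering does not change the hash.
    for key in sorted(sensitive_env):
        h.update(("env:%s=%s\0" % (key, sensitive_env[key])).encode("utf-8"))
    # Argument order can matter, so keep the arguments in the order given.
    for arg in config_args:
        h.update(("arg:%s\0" % arg).encode("utf-8"))
    h.update(("cwd:%s\0" % cwd).encode("utf-8"))
    return os.path.join(base_dir, "worker-" + h.hexdigest()[:12])
```

Two shells with the same relevant environment, arguments and directory map to the same socket file, while any difference yields a different path and therefore a different server process.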

4.1.3. Where to detect the change

Client or server? Whichever we choose, we may want to do it entirely on one side, so that the logic is easier to follow and maintain.

The client has difficulty detecting the hg installation location. This makes detecting changes to __version__ and extension source code extremely hard.

So do it on the server, after receiving all the information (environ, command line), just before executing the actual command.

4.1.4. Server model

Probably master-workers, with a double connect from the client, as the first step. Each worker is responsible for a specific config (not considering the filesystem). Workers (and the master) restart themselves when files change (see "Transparent restart" below).

client      master      workers
  :           |listen()   :         # master socket (/tmp/chg${UID}/master)
  |-connect()>|accept()   :
  |-send()--->|calc_hash():         # send env, argv and calculate non-files hash
  :           |-fork()--->|exec()   # spawn if no worker for that non-files hash (by checking filesystem and pid)
  :           :           |listen() # worker socket (/tmp/chg${UID}/worker-${HASH})
  |<---send()-|           :         # tell worker socket name
  |-close()-->|           :         # client no longer needs master 
  |-connect()------------>|accept()
  |           :           |fork()   # fork per connection
  |-send()--------------->|...      # send env, argv and run the actual command
  ...

For the first step, workers are forked per connection. See "Forking model" below for possible future improvements.

If we want to eliminate the double connect from the client (for performance), we may want to do some fd magic and maintain worker state in the master, which is not trivial to do correctly.
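The master's "spawn if no worker for that hash" step might be sketched as follows in Python 3. WorkerTable, pid_alive and the spawn callback are hypothetical names; this sketch checks only the pid, whereas the diagram also mentions checking the filesystem.

```python
import errno
import os


def pid_alive(pid):
    """Return True if a process with this pid appears to exist."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence and permission
    except OSError as err:
        # EPERM means the process exists but belongs to another user.
        return err.errno == errno.EPERM
    return True


class WorkerTable(object):
    """Map a non-files hash to a worker, spawning one on demand (hypothetical)."""

    def __init__(self, spawn):
        self._spawn = spawn  # callable: confighash -> pid of the new worker
        self._pids = {}

    def socket_name(self, confighash):
        """Return the worker socket name, spawning a worker if needed."""
        pid = self._pids.get(confighash)
        if pid is None or not pid_alive(pid):
            self._pids[confighash] = self._spawn(confighash)
        return "worker-%s" % confighash
```

The master would send the returned name back to the client, which then closes the master connection and connects to the worker socket.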

4.1.5. Transparent restart

A restart that is visible from outside is error-prone: consider multiple restarts happening in parallel, downtime, client-side wait and retry, locks, etc.

We want to continue serving chg clients during a restart. Ideally the restart is transparent to the client. This seems to be possible if we transfer file descriptors from the old process to the new one.

The restart process looks like:

  1. Server process A is listening. Forked process B, handling the request, detects that files have changed.
  2. B forks and execs a new server with the correct environ and command line arguments, then passes it two fds: the listening socket and the one returned by accept.

  3. The new server initializes itself (preloading extensions) and forks immediately to handle the request.
  4. The new server takes over the unix socket.
  5. The old server ends itself (see below).
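Passing the two fds in step 2 could be done with SCM_RIGHTS ancillary data over a unix socket. The following Python 3 sketch follows the sendmsg()/recvmsg() recipe from the Python socket documentation; how the old and new server obtain a shared unix socket (for example a socketpair inherited across exec) is left open and is an assumption here.

```python
import array
import socket


def send_fds(sock, msg, fds):
    """Send msg plus a list of file descriptors over a unix socket."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    return sock.sendmsg([msg], ancillary)


def recv_fds(sock, msglen, maxfds):
    """Receive a message and up to maxfds file descriptors."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = sock.recvmsg(
        msglen, socket.CMSG_LEN(maxfds * fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore a truncated trailing integer, if any.
            fds.frombytes(data[:len(data) - (len(data) % fds.itemsize)])
    return msg, list(fds)
```

The receiving side gets new descriptor numbers that refer to the same open sockets, so the new server can keep accepting on the listening socket and answer the request that was already accepted.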

To make the unix socket file always available to clients, and work correctly even when multiple restarts happen at the same time, do something like:

  1. Instead of bind(server_address), bind to a temporary, unique address such as server_address + pid and then rename it to server_address. This gives no downtime, and it also allows us to do 2 (see below).

  2. Periodically check if the socket file is owned by current process (by checking inode). If not, exit.

This handles parallel restarts cleanly and without locks, at a small cost in CPU and filesystem activity.
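The two steps above can be sketched in Python 3 as follows. The helper names bind_atomically and still_owner, the "address.pid" temporary name and the listen backlog of 5 are assumptions for illustration.

```python
import os
import socket


def bind_atomically(server_address):
    """Bind to a unique temporary path, then rename it over server_address.

    Returns (listening socket, inode of our socket file).
    """
    tmp_address = "%s.%d" % (server_address, os.getpid())
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(tmp_address)
    sock.listen(5)  # listen before the rename so early clients can connect
    inode = os.stat(tmp_address).st_ino
    # rename() replaces any existing socket file atomically, so clients
    # always find some server at server_address.
    os.rename(tmp_address, server_address)
    return sock, inode


def still_owner(server_address, inode):
    """Return True if the socket file at server_address is still ours."""
    try:
        return os.stat(server_address).st_ino == inode
    except OSError:
        return False
```

A server would call still_owner() periodically and exit once it returns False, which means that a newer server has renamed its own socket over the path.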

4.1.6. How to detect config change?

A. keep hashes of all config files and compare them:

  1. hook ui.readconfig -> config.parse to know all involved files

  2. keep full text or hash of these files
  3. read all config files and compare them with (2) per connection

https://bitbucket.org/yuja/chg/pull-requests/3/483b35203d92/diff#comment-8188548

/!\ We can't know if chg server is about to start when config files are loaded, so it would have to be always enabled.
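Steps 2 and 3 of approach A could look roughly like this Python 3 sketch. The names hash_config_files and config_changed are hypothetical, and the list of paths is assumed to come from the hook in step 1, which is not shown.

```python
import hashlib


def hash_config_files(paths):
    """Return one digest covering the contents of all given config files.

    A missing or unreadable file is hashed as a marker, so creating or
    deleting a config file also counts as a change.
    """
    h = hashlib.sha1()
    for path in sorted(paths):
        h.update(path.encode("utf-8") + b"\0")
        try:
            with open(path, "rb") as f:
                h.update(b"present\0" + f.read())
        except (IOError, OSError):
            h.update(b"missing\0")
    return h.hexdigest()


def config_changed(paths, known_digest):
    """Step 3: re-read the files and compare with the stored digest."""
    return hash_config_files(paths) != known_digest
```

Reading every file per connection costs some I/O, but it avoids relying on mtime, which can miss quick successive edits.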

B. (another idea)

  1. always recreate ui

  2. request to restart the server if extensions are changed

4.1.7. How to handle Environment Variables difference

Some of them are sent by the client and updated in the worker for each request (e.g. HGEDITOR).

Others have a global effect and require dispatching to a different server (e.g. HGPLAIN*, HGENCODING*, HGRCPATH, LANG, LANGUAGE, LC_*).
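A minimal Python 3 sketch of this split, using only the example variables named above. The pattern lists are illustrative rather than exhaustive, and the function name split_environ is hypothetical.

```python
import fnmatch

# Patterns taken from the examples in this section; illustrative only.
PER_REQUEST = ["HGEDITOR"]
DISPATCH = ["HGPLAIN*", "HGENCODING*", "HGRCPATH", "LANG", "LANGUAGE", "LC_*"]


def _matches(name, patterns):
    """Return True if name matches any of the shell-style patterns."""
    return any(fnmatch.fnmatchcase(name, pat) for pat in patterns)


def split_environ(environ):
    """Split environ into (dispatch_env, per_request_env) dicts.

    dispatch_env selects which server handles the request; per_request_env
    is sent by the client and applied in the worker for each connection.
    Variables matching neither list are ignored by this sketch.
    """
    dispatch_env = {}
    per_request_env = {}
    for name, value in environ.items():
        if _matches(name, DISPATCH):
            dispatch_env[name] = value
        elif _matches(name, PER_REQUEST):
            per_request_env[name] = value
    return dispatch_env, per_request_env
```

Only dispatch_env would feed into the choice of server; per_request_env can simply be applied after the fork that handles the connection.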

4.1.8. Who restarts the server? (previous discussion)

  • this would be deeply linked to the server model:
    • fork per connection, or pre-forked worker pool (for PyPy JIT)

    • round-robin on accept(), or more intelligent dispatcher (e.g. dispatch per repo.root for better caching of repo instance)

  • but we'll need a simple implementation first. otherwise we can't run tests!

(from previous discussion)

  • marmoute, lcharignon: server tells dirty and dies, client starts the server ?
  • yuya: server tells dirty, client kills and restarts the server

(from Dec. 18, 2015 with marmoute, lcharignon, junw)

  • start multiple servers per hash (see below)
  • the hash is calculated at the client
  • server listens to a unix socket, whose path includes the hash
  • server does gc: kills itself after being idle for a long time
  • therefore no restart needed
  • things like filesystem races and __version__.py make it actually not trivial
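The "server does gc" idea from the list above can be sketched in Python 3 as an accept loop with a timeout. The function name serve_until_idle and the handle callback are hypothetical, and handling connections inline (rather than forking) is a simplification.

```python
import socket


def serve_until_idle(listener, idle_timeout, handle):
    """Accept connections until none arrives for idle_timeout seconds.

    Returns the number of connections handled before going idle.
    """
    listener.settimeout(idle_timeout)
    handled = 0
    while True:
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            return handled  # idle for too long: let the server exit
        try:
            handle(conn)
        finally:
            conn.close()
        handled += 1
```

When the function returns, the server would remove its socket file and exit; the next client with the same hash simply starts a fresh server.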

4.1.9. See also

4.2. Forking Model

Need long-lived workers for better JIT optimization and caching of repo objects.

cHg was originally a pre-forking server, which worked as follows:

  1. master bind() and listen() on shared socket

  2. master fork() pre-configured number of workers

  3. each worker accept() connection (round-robin)

https://bitbucket.org/yuja/chg/src/prefork/hgext/chgsupport.py

There were two major issues:

  1. client stuck if no idle worker available
  2. global space could be tainted by running command, for example:
    • hg unknown-command loads all extension modules and breaks things (iirc, "factotum" or "inotify" changed the global socket timeout value and made chg fail)

    • commands or extensions could be loaded from .hg/hgrc

(a), the first issue, can be solved by a master-worker channel:

  1. worker tells master it is busy on accept()

  2. master fork() one more worker if no idle worker available

(b), the second issue, was solved by terminating the worker if unwanted changes were detected.

4.3. Known Issues

  • serve --daemon does not work

  • --time is a no-op

  • extensions loaded by {repo}/.hg/hgrc can't be found by extensions loaded globally

  • ^Z does not stop the server (needs handling of SIGTSTP and SIGCONT ?)

  • see also CommandServer

4.4. Random Thoughts

  • eliminate copy-pasted code
    • pager
    • _requesthandler

    • util.system

  • testing: ./run-tests.py --with-hg=chg ?

  • environment variables
  • debian package


CategoryDeveloper CategoryNewFeatures

ChgPortingPlan (last edited 2016-12-07 23:26:33 by JunWu)