Relink Extension
When repositories are cloned locally, their data files will be hardlinked so that they only use the space of a single repository.
Unfortunately, subsequent pulls into either repository will break hardlinks for any files touched by the new changesets, even if both repositories end up pulling the same changes. Also, pulling with --rev never uses hardlinks.
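To check whether a particular store file is still shared between two clones, you can compare device and inode numbers. The following is a minimal sketch, not part of Mercurial; the helper name `is_hardlinked` is hypothetical, and it assumes a filesystem that reports meaningful inode numbers.

```python
import os


def is_hardlinked(path_a, path_b):
    """Return True if both paths refer to the same file on disk."""
    a = os.stat(path_a)
    b = os.stat(path_b)
    # Two directory entries are hardlinks of one file exactly when
    # they live on the same device and share an inode number.
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

Run it on the same file under `.hg/store/data` in each clone: a `False` result after a pull means the hardlink for that file was broken.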
You can recreate those hardlinks and reclaim the wasted space with the Relink Extension. Enable it in your hgrc:
    [extensions]
    relink =

    # optionally specify the origin, if you do this a lot:
    [paths]
    default-relink = ../incoming
and run it:
    $ hg relink
    relinking /home/hacker/src/incoming to /home/hacker/src/work
    collected 9999 candidate storage files
    pruned down to 5555 probably relinkable files
    relinked 4444 files (12345678 bytes reclaimed)
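The three phases reported in that output (collect, prune, relink) can be sketched roughly as follows. This is an illustration modelled on the approach of the older contrib script, not the extension's actual code. The function names are hypothetical, and the sketch takes no repository locks, so it should not be run against repositories that are in use.

```python
import filecmp
import os


def collect(src_store):
    """Gather revlog files (.i and .d) under the source store with their stat data."""
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(src_store):
        for name in filenames:
            if name.endswith(('.i', '.d')):
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, src_store)
                candidates.append((rel, os.stat(full)))
    return candidates


def prune(candidates, dst_store):
    """Keep only files that exist in both stores, have equal size, and are not yet linked."""
    targets = []
    for rel, st in candidates:
        try:
            ts = os.stat(os.path.join(dst_store, rel))
        except OSError:
            continue  # destination does not have this file
        if st.st_dev != ts.st_dev:
            raise RuntimeError('source and destination are on different devices')
        if st.st_ino == ts.st_ino:
            continue  # already hardlinked
        if st.st_size != ts.st_size:
            continue  # contents must differ
        targets.append((rel, ts.st_size))
    return targets


def relink(src_store, dst_store, targets):
    """Replace byte-identical destination files with hardlinks to the source files."""
    relinked = 0
    saved = 0
    for rel, size in targets:
        source = os.path.join(src_store, rel)
        target = os.path.join(dst_store, rel)
        # Same size is not enough: compare the full contents.
        if not filecmp.cmp(source, target, shallow=False):
            continue
        backup = target + '.bak'
        os.rename(target, backup)
        try:
            os.link(source, target)
        except OSError:
            os.rename(backup, target)  # restore the original on failure
            raise
        os.remove(backup)
        relinked += 1
        saved += size
    return relinked, saved
```

Pruning on size and inode first keeps the expensive byte-by-byte comparison limited to files that could plausibly be relinked, which is why the output reports "probably relinkable" files before the final count.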
In Mercurial 1.3.1 and older (prior to Issue919), contrib/hg-relink in the source tarball can be used for the same purpose.