Differences between revisions 1 and 51 (spanning 50 versions)
Revision 1 as of 2010-04-01 08:21:03
Size: 986
Editor: Pradeepkumar
Comment:
Revision 51 as of 2010-06-23 04:49:20
Size: 2692
Editor: Pradeepkumar
Comment:
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
##master-page:HomepageTemplate
#format wiki
#language en
== @``ME@ ==
## page was renamed from in3xes
## page was renamed from Pradeepkumar
== Google Summer of Code-2010 ==
----
'''
'''
Line 6: Line 8:
Email: <<MailTo(deepu.aprjc AT SPAMFREE gmail DOT com)>> '''Work Progress:'''
Line 8: Line 10:
IRC Nick: inXs_ ''' '''Few things related revlog:
Line 10: Line 12:
'''About Me:''' 1) A [[http://markmail.org/download.xqy?id=ooqvjdaor4t3sxan&number=1|script]] by tonfa, proof for parent delta.
Line 12: Line 14:
- Student at [[IIT|www.iitr.ac.in]] in India. Mathematics is my major. 2) mpm [[http://markmail.org/thread/ooqvjdaor4t3sxan|explaining]] how space inefficiency is caused.
Line 14: Line 16:
- Hardcore fan of C, Python, Haskell. You can find some of my small projects on [[Github|http://www.github.com/pradeepkumarg]]. - 3) A [[http://mercurial.selenic.com/hg/hg/file/45aabc523c86/contrib/shrink-revlog.py|python script]] in mercurial repo to shrink revlog by sorting in topological order.
Line 16: Line 18:
- Love solving mathematics problems by programming. [[Here|http://www.github.com/pradeepkumar/projecteuler.git]] are solutions for the problems in [[Projecteuler|www.projecteuler.net]]. Earlier in this week I started working on revlog, most of the work is already done by tonfa. He shared patches. All I did was understand what he did, and rewrite those patches with minor changes.
Line 18: Line 20:
- Experienced user of git. Comfortable with Bazaar. '''strategy:'''
Line 20: Line 22:
  '''Google Summer of Code-2010''' As mpm suggested, I have changed the way new revision is being added to existing revlog, and make corresponding changes in creating a given revision of file. The crucial step (performance affecting step) is creating a given version of file. previously to create a given version of a file first we take base revision , to that we keep adding the deltas till given revision. Earlier deltas are added in linear manner. So creating a given revision of file is simply seeking O(1) times.
Line 23: Line 24:
I am interested in "Parent delta". I can also work conversion tools as I have good idea about git and bazaar. But I am mainly focusing on Parent Delta Plan. ''Defination of base in revlog is not given anywhere. From revlog, a base is the version where we add full text of file to the revlog instead of delta.''
Line 25: Line 26:
example scenario:
Line 26: Line 28:
## You can even more obfuscate your email address by adding more uppercase letters followed by a leading and trailing blank. We want to create a R1 revision of a file. The parents of R1 are P1 and P2. Base of R1 is B1.
Line 28: Line 30:
... Consider a situation where P2 is null, and P1 < B1. In this case I simply can't take B1 and create R1. I have first find base and parents of P1 and construct P1. The same case might repeat. where parent of P1 is < base of P1. So, first I have to find a base which is ancestor of all the parents, then create a chain of deltas from that base to construct a given revision.
Line 30: Line 32:
---- Output of --profile says that, what ever the decrease in the performance is caused only by this function which used for constructing the delta chain. So, I need to find a better algorithm to construct the delta chain. The current algorithm for constructing the delta chain is.

{{{
    def deltachain(self, rev):
        chain = []
        while self.base(rev) != rev:
            chain.append(rev)
            rev = self.deltaparent(rev)
        chain.reverse()
        return rev, chain

    def deltaparent(self, rev):
        if self.base(rev) == rev:
            return nullrev
        elif self.flags(rev) & REVLOG_PARENTDELTA_FLAGS:
            return self.parentrevs(rev)[0]
        else:
            return rev-1
}}}

Google Summer of Code-2010


Work Progress:

Few things related revlog:

1) A script by tonfa, proof for parent delta.

2) mpm explaining how space inefficiency is caused.

3) A python script in mercurial repo to shrink revlog by sorting in topological order.

Earlier in this week I started working on revlog, most of the work is already done by tonfa. He shared patches. All I did was understand what he did, and rewrite those patches with minor changes.

strategy:

As mpm suggested, I have changed the way new revision is being added to existing revlog, and make corresponding changes in creating a given revision of file. The crucial step (performance affecting step) is creating a given version of file. previously to create a given version of a file first we take base revision , to that we keep adding the deltas till given revision. Earlier deltas are added in linear manner. So creating a given revision of file is simply seeking O(1) times.

Defination of base in revlog is not given anywhere. From revlog, a base is the version where we add full text of file to the revlog instead of delta.

example scenario:

We want to create a R1 revision of a file. The parents of R1 are P1 and P2. Base of R1 is B1.

Consider a situation where P2 is null, and P1 < B1. In this case I simply can't take B1 and create R1. I have first find base and parents of P1 and construct P1. The same case might repeat. where parent of P1 is < base of P1. So, first I have to find a base which is ancestor of all the parents, then create a chain of deltas from that base to construct a given revision.

Output of --profile says that, what ever the decrease in the performance is caused only by this function which used for constructing the delta chain. So, I need to find a better algorithm to construct the delta chain. The current algorithm for constructing the delta chain is.

    def deltachain(self, rev):
        chain = []
        while self.base(rev) != rev:
            chain.append(rev)
            rev = self.deltaparent(rev)
        chain.reverse()
        return rev, chain

    def deltaparent(self, rev):
        if self.base(rev) == rev:
            return nullrev
        elif self.flags(rev) & REVLOG_PARENTDELTA_FLAGS:
            return self.parentrevs(rev)[0]
        else:
            return rev-1

CategoryHomepage

Pradeepkumar (last edited 2010-10-22 18:17:21 by mpm)