Differences between revisions 7 and 56 (spanning 49 versions)
Revision 7 as of 2010-04-01 10:52:54
Size: 927
Editor: Pradeepkumar
Comment:
Revision 56 as of 2010-06-27 20:48:29
Size: 3410
Editor: Pradeepkumar
Comment:
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
##master-page:HomepageTemplate
#format wiki
#language en
Email: <<MailTo(deepu.aprjc AT SPAMFREE gmail DOT com)>>

IRC Nick: inXs_

'''About Me:'''

- Student at [[http://www.iitr.ac.in|IIT]] in India. Mathematics is my major.

- Hardcore fan of C, Python, Haskell. You can find some of my small projects on [[http://www.github.com/pradeepkumarg|Github]]. -

- Love solving mathematics problems by programming. [[http://www.github.com/pradeepkumarg/projecteuler|Solutions]] for the problems in [[http://www.projecteuler.net|Project Euler]]

- Experienced user of git. Comfortable with Bazaar.

'''Google Summer of Code-2010'''

I am interested in "Parent delta". I can also work on conversion tools. But I am mainly focusing on Parent Delta Plan.

## You can even more obfuscate your email address by adding more uppercase letters followed by a leading and trailing blank.
...
## page was renamed from in3xes
## page was renamed from Pradeepkumar
== Google Summer of Code-2010 ==
----
=== Contact Details: ===
{{{
        Name: Pradeepkumar Gayam
        Email: in3xes@gmail.com
        IRC Nick: in3xes (freenode.net)
        Jabber ID: in3xes [AT] gmail.com (talk.google.com)
}}}
My GSoC [[http://bitbucket.org/in3xes/gsoc/src/tip/application|application]] on bitbucket.
Line 26: Line 15:
''' '''

=== Journal: ===
2010-06-23: Study shallow clone how they are related to parent delta

2010-06-24: Revlog experiments to improve performance

2010-06-25: More experiments, fixed the issue, so that it will not differ for old revlogs, still expriments needs to be done.

2010-07-26: Study wire-proto, understand how bundle, unbundle is being done.

2010-06-27: Week end

----
=== Work Progress: ===
''' '''Few things related revlog:

1) A [[http://markmail.org/download.xqy?id=ooqvjdaor4t3sxan&number=1|script]] by tonfa, proof for parent delta.

2) mpm [[http://markmail.org/thread/ooqvjdaor4t3sxan|explaining]] how space inefficiency is caused.

3) A [[http://mercurial.selenic.com/hg/hg/file/45aabc523c86/contrib/shrink-revlog.py|python script]] in mercurial repo to shrink revlog by sorting in topological order.

Earlier in this week I started working on revlog, most of the work is already done by tonfa. He shared patches. All I did was understand what he did, and rewrite those patches with minor changes.

'''strategy:'''

As mpm suggested, I have changed the way new revision is being added to existing revlog, and make corresponding changes in creating a given revision of file. The crucial step (performance affecting step) is creating a given version of file. previously to create a given version of a file first we take base revision , to that we keep adding the deltas till given revision. Earlier deltas are added in linear manner. So creating a given revision of file is simply seeking O(1) times.

''Defination of base in revlog is not given anywhere. From revlog, a base is the version where we add full text of file to the revlog instead of delta.''

example scenario:

We want to create a R1 revision of a file. The parents of R1 are P1 and P2. Base of R1 is B1.

Consider a situation where P2 is null, and P1 < B1. In this case I simply can't take B1 and create R1. I have first find base and parents of P1 and construct P1. The same case might repeat. where parent of P1 is < base of P1. So, first I have to find a base which is ancestor of all the parents, then create a chain of deltas from that base to construct a given revision.

Output of --profile says that, what ever the decrease in the performance is caused only by this function which used for constructing the delta chain. So, I need to find a better algorithm to construct the delta chain. The current algorithm for constructing the delta chain is.

{{{
    def deltachain(self, rev):
        chain = []
        while self.base(rev) != rev:
            chain.append(rev)
            rev = self.deltaparent(rev)
        chain.reverse()
        return rev, chain

    def deltaparent(self, rev):
        if self.base(rev) == rev:
            return nullrev
        elif self.flags(rev) & REVLOG_PARENTDELTA_FLAGS:
            return self.parentrevs(rev)[0]
        else:
            return rev-
}}}

Google Summer of Code-2010


Contact Details:

        Name:           Pradeepkumar Gayam
        Email:          in3xes@gmail.com
        IRC Nick:       in3xes (freenode.net)
        Jabber ID:      in3xes [AT] gmail.com (talk.google.com)

My GSoC application on bitbucket.


Journal:

2010-06-23: Study shallow clone how they are related to parent delta

2010-06-24: Revlog experiments to improve performance

2010-06-25: More experiments, fixed the issue, so that it will not differ for old revlogs, still expriments needs to be done.

2010-07-26: Study wire-proto, understand how bundle, unbundle is being done.

2010-06-27: Week end


Work Progress:

Few things related revlog:

1) A script by tonfa, proof for parent delta.

2) mpm explaining how space inefficiency is caused.

3) A python script in mercurial repo to shrink revlog by sorting in topological order.

Earlier in this week I started working on revlog, most of the work is already done by tonfa. He shared patches. All I did was understand what he did, and rewrite those patches with minor changes.

strategy:

As mpm suggested, I have changed the way new revision is being added to existing revlog, and make corresponding changes in creating a given revision of file. The crucial step (performance affecting step) is creating a given version of file. previously to create a given version of a file first we take base revision , to that we keep adding the deltas till given revision. Earlier deltas are added in linear manner. So creating a given revision of file is simply seeking O(1) times.

Defination of base in revlog is not given anywhere. From revlog, a base is the version where we add full text of file to the revlog instead of delta.

example scenario:

We want to create a R1 revision of a file. The parents of R1 are P1 and P2. Base of R1 is B1.

Consider a situation where P2 is null, and P1 < B1. In this case I simply can't take B1 and create R1. I have first find base and parents of P1 and construct P1. The same case might repeat. where parent of P1 is < base of P1. So, first I have to find a base which is ancestor of all the parents, then create a chain of deltas from that base to construct a given revision.

Output of --profile says that, what ever the decrease in the performance is caused only by this function which used for constructing the delta chain. So, I need to find a better algorithm to construct the delta chain. The current algorithm for constructing the delta chain is.

    def deltachain(self, rev):
        chain = []
        while self.base(rev) != rev:
            chain.append(rev)
            rev = self.deltaparent(rev)
        chain.reverse()
        return rev, chain

    def deltaparent(self, rev):
        if self.base(rev) == rev:
            return nullrev
        elif self.flags(rev) & REVLOG_PARENTDELTA_FLAGS:
            return self.parentrevs(rev)[0]
        else:
            return rev-

CategoryHomepage

Pradeepkumar (last edited 2010-10-22 18:17:21 by mpm)