Size: 5911
Comment:
|
Size: 6352
Comment:
|
Deletions are marked like this. | Additions are marked like this. |
Line 24: | Line 24: |
2010-07-26: Study wire-proto, understand how bundle, unbundle is being done. | 2010-06-26: Study wire-proto, understand how bundle, unbundle is being done. |
Line 27: | Line 27: |
2010-06-28: More expreiments, found the reason for the performance issue, [[http://paste.lisp.org/+2EEE|this]] paste explains. Conclusion: 'The number of patches to be applied increases at starting rev of new branch'. ---- 2010-06-29: Update blog, journal, Attened GSoC meeting, 2010-06-30: Worked through bundle/unbundle code and understand wire-proto completely 2010-07-01: Worked through bundle/unbundle code and understand wire-proto completely 2010-07-02: - (Tried to work, but, unproductive day) 2010-07-03: - (another unproductive day :-( ) 2010-07-04: - Week end 2010-07-05: Worked on bundle/unbundle, it works for now, have to rewrite the patches,(Ugly code) |
|
Line 75: | Line 92: |
In the starting of the week, I stopped working on revlog for sometime to look at other parts of the api. In the starting I studied abit abotu shallow clones, and parent delta's role in it. Though I didn't go much detailed in it, I got general idea about how things work in shallow clones. Later studied wire protocol abit. | In the starting of the week, I stopped working on revlog for sometime to look at other part of API. I studied abit about shallow clones, and parent delta's role in it. Though I didn't study in much detail, got general idea about how things work in shallow clones. Later worked on wire protocol abit. |
Line 77: | Line 94: |
After some gap I again started working on revlog performance, did a bit of exprimentation, fixed some issues(which brought back the performace when the revlog is non-parent deltad). From the results I concluded that the decrease in the performance is bacause of cache (As tonfa said). In the previous post I wrote that revlog slowed down becase of the deltachain(), and deltaparent() functions are repeated being called. Thought that is one of the reason, but not the major reason. Here are some results. | After some gap I again started working on revlog performance, did some exprimentation, fixed some issues (which brought back the performace when the revlog is non-parent deltad). From the results I concluded that the decrease in the performance is bacause of number of patches being applied in to construct a revision of a file. In the previous post I wrote that revlog slowed down becase of the deltachain(), and deltaparent() functions are repeated being called. Thought that is one of the reason, but not the major reason. Some important results are pasted below. |
Line 81: | Line 98: |
-- Old revlog, Normal repo -- | * Old revlog, Normal repo |
Line 83: | Line 100: |
''hg perfrevlog .hg/store/00manifest.i'' | {{{ hg perfrevlog .hg/store/00manifest.i ! wall 0.570377 comb 0.570000 user 0.560000 sys 0.010000 (best of 18) hg verify --time Time: real 11.290 secs (user 10.790+0.000 sys 0.250+0.000) }}} * New revlog (before fix), Normal repo |
Line 85: | Line 108: |
''! wall 0.570377 comb 0.570000 user 0.560000 sys 0.010000 (best of 18)'' | {{{ hg perfrevlog .hg/store/00manifest.i ! wall 0.646474 comb 0.650000 user 0.640000 sys 0.010000 (best of 16) hg verify --time Time: real 15.930 secs (user 15.350+0.000 sys 0.290+0.000) }}} * New revlog (after fix), Normal repo |
Line 87: | Line 116: |
''hg verify --time'' | {{{ hg perfrevlog .hg/store/00manifest.i ! wall 0.589325 comb 0.550000 user 0.540000 sys 0.010000 (best of 17) hg verify --time Time: real 12.420 secs (user 11.700+0.000 sys 0.300+0.000) }}} * New revlog (before fix), Repo with parent delta'ed manifest |
Line 89: | Line 124: |
''Time: real 11.290 secs (user 10.790+0.000 sys 0.250+0.000)'' | {{{ hg perfrevlog .hg/store/00manifest.i ! wall 2.044643 comb 2.050000 user 2.020000 sys 0.030000 (best of 5) hg verify --time Time: real 42.450 secs (user 41.330+0.000 sys 0.300+0.000) }}} * New revlog (after fix), Repo with parent delta'ed manifest |
Line 91: | Line 132: |
-- New revlog(before fix), Normal repo -- ''hg perfrevlog .hg/store/00manifest.i'' '' '' ''! wall 0.646474 comb 0.650000 user 0.640000 sys 0.010000 (best of 16)'' ''hg verify --time'' '' '' ''Time: real 15.930 secs (user 15.350+0.000 sys 0.290+0.000)'' -- New revlog (after fix), Normal repo -- ''hg perfrevlog .hg/store/00manifest.i'' '' '' ''! wall 0.589325 comb 0.550000 user 0.540000 sys 0.010000 (best of 17)'' ''hg verify --time'' ''Time: real 12.420 secs (user 11.700+0.000 sys 0.300+0.000)'' -- New revlog (before fix), Repo with parent delta'ed manifest -- ''hg perfrevlog .hg/store/00manifest.i'' '' '' ''! wall 2.044643 comb 2.050000 user 2.020000 sys 0.030000 (best of 5)'' ''hg verify --time'' ''Time: real 42.450 secs (user 41.330+0.000 sys 0.300+0.000)'' -- New revlog (after fix), Repo with parent delta'ed manifest -''-'' ''hg perfrevlog .hg/store/00manifest.i'' '' '' ''! wall 1.967572 comb 1.970000 user 1.950000 sys 0.020000 (best of 6)'' ''hg verify --time'' ''Time: real 39.150 secs (user 37.470+0.000 sys 0.450+0.000)'' [[http://paste.lisp.org/+2EEE|This]] paste is self explanatory.'''' |
{{{ hg perfrevlog .hg/store/00manifest.i ! wall 1.967572 comb 1.970000 user 1.950000 sys 0.020000 (best of 6) hg verify --time Time: real 39.150 secs (user 37.470+0.000 sys 0.450+0.000) }}} |
Google Summer of Code-2010
Contact Details:
Name: Pradeepkumar Gayam Email: in3xes@gmail.com IRC Nick: in3xes (freenode.net) Jabber ID: in3xes [AT] gmail.com (talk.google.com)
My GSoC application on bitbucket.
Journal:
2010-06-23: Study shallow clone how they are related to parent delta
2010-06-24: Revlog experiments to improve performance
2010-06-25: More experiments, fixed the issue, so that it will not differ for old revlogs, still expriments needs to be done.
2010-06-26: Study wire-proto, understand how bundle, unbundle is being done.
2010-06-27: Week end
2010-06-28: More expreiments, found the reason for the performance issue, this paste explains. Conclusion: 'The number of patches to be applied increases at starting rev of new branch'.
2010-06-29: Update blog, journal, Attened GSoC meeting,
2010-06-30: Worked through bundle/unbundle code and understand wire-proto completely
2010-07-01: Worked through bundle/unbundle code and understand wire-proto completely
2010-07-02: - (Tried to work, but, unproductive day)
2010-07-03: - (another unproductive day )
2010-07-04: - Week end
2010-07-05: Worked on bundle/unbundle, it works for now, have to rewrite the patches,(Ugly code)
Work Progress:
Few things related revlog:
1) A script by tonfa, proof for parent delta.
2) mpm explaining how space inefficiency is caused.
3) A python script in mercurial repo to shrink revlog by sorting in topological order.
4th Week:
Earlier in this week I started working on revlog, most of the work is already done by tonfa. He shared patches. All I did was understand what he did, and rewrite those patches with minor changes.
strategy:
As mpm suggested, I have changed the way new revision is being added to existing revlog, and make corresponding changes in creating a given revision of file. The crucial step (performance affecting step) is creating a given version of file. previously to create a given version of a file first we take base revision , to that we keep adding the deltas till given revision. Earlier deltas are added in linear manner. So creating a given revision of file is simply seeking O(1) times.
Defination of base in revlog is not given anywhere. From revlog, a base is the version where we add full text of file to the revlog instead of delta.
example scenario:
We want to create a R1 revision of a file. The parents of R1 are P1 and P2. Base of R1 is B1.
Consider a situation where P2 is null, and P1 < B1. In this case I simply can't take B1 and create R1. I have first find base and parents of P1 and construct P1. The same case might repeat. where parent of P1 is < base of P1. So, first I have to find a base which is ancestor of all the parents, then create a chain of deltas from that base to construct a given revision.
Output of --profile says that, what ever the decrease in the performance is caused only by this function which used for constructing the delta chain. So, I need to find a better algorithm to construct the delta chain. The current algorithm for constructing the delta chain is.
def deltachain(self, rev): chain = [] while self.base(rev) != rev: chain.append(rev) rev = self.deltaparent(rev) chain.reverse() return rev, chain def deltaparent(self, rev): if self.base(rev) == rev: return nullrev elif self.flags(rev) & REVLOG_PARENTDELTA_FLAGS: return self.parentrevs(rev)[0] else: return rev-1
5th Week:
In the starting of the week, I stopped working on revlog for sometime to look at other part of API. I studied abit about shallow clones, and parent delta's role in it. Though I didn't study in much detail, got general idea about how things work in shallow clones. Later worked on wire protocol abit.
After some gap I again started working on revlog performance, did some exprimentation, fixed some issues (which brought back the performace when the revlog is non-parent deltad). From the results I concluded that the decrease in the performance is bacause of number of patches being applied in to construct a revision of a file. In the previous post I wrote that revlog slowed down becase of the deltachain(), and deltaparent() functions are repeated being called. Thought that is one of the reason, but not the major reason. Some important results are pasted below.
(All the experiments are done on mercurial repo, Right now, I don't have good enough computer for larger repos )
- Old revlog, Normal repo
hg perfrevlog .hg/store/00manifest.i ! wall 0.570377 comb 0.570000 user 0.560000 sys 0.010000 (best of 18) hg verify --time Time: real 11.290 secs (user 10.790+0.000 sys 0.250+0.000)
- New revlog (before fix), Normal repo
hg perfrevlog .hg/store/00manifest.i ! wall 0.646474 comb 0.650000 user 0.640000 sys 0.010000 (best of 16) hg verify --time Time: real 15.930 secs (user 15.350+0.000 sys 0.290+0.000)
- New revlog (after fix), Normal repo
hg perfrevlog .hg/store/00manifest.i ! wall 0.589325 comb 0.550000 user 0.540000 sys 0.010000 (best of 17) hg verify --time Time: real 12.420 secs (user 11.700+0.000 sys 0.300+0.000)
- New revlog (before fix), Repo with parent delta'ed manifest
hg perfrevlog .hg/store/00manifest.i ! wall 2.044643 comb 2.050000 user 2.020000 sys 0.030000 (best of 5) hg verify --time Time: real 42.450 secs (user 41.330+0.000 sys 0.300+0.000)
- New revlog (after fix), Repo with parent delta'ed manifest
hg perfrevlog .hg/store/00manifest.i ! wall 1.967572 comb 1.970000 user 1.950000 sys 0.020000 (best of 6) hg verify --time Time: real 39.150 secs (user 37.470+0.000 sys 0.450+0.000)