Differences between revisions 5 and 12 (spanning 7 versions)
Revision 5 as of 2008-10-25 10:25:57
Size: 2605
Editor: hk29kj1
Comment:
Revision 12 as of 2010-04-01 14:25:19
Size: 3391
Comment: Added links to prototype repos
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
'''[http://29kj.com >>>> 香港六合彩曾道人白小姐提供六合彩资料和六合彩开奖结果]''' ## page was renamed from WireProtocolNG
Line 3: Line 3:
'''[http://29kj.com >>>> 香港六合彩专业网站,香港赛马会连结,提供香港六合彩特]''' Status: Draft/Experiment
Line 5: Line 5:
'''[http://29kj.com >>>> 香港六合彩官方网站曾道人内幕玄香港六合 彩大公开]''' == Proposed new discovery/changegroup protocol ==
Line 7: Line 7:
'''[http://1878.cc >>>> 香港六合彩天线宝宝公开内幕玄香港六合 彩]''' The wire protocol has several flaws:
 * it uses the roots of the new branches, this is susceptible to a race (see issue1320) .
 * if the client is missing a lot of nodes but doesn't have any local changes, it will still have to do a lot of roundtrips to discover the base nodes.
Line 9: Line 11:
'''[http://29kj.com >>>> 香港六合彩曾道人特码主论坛六合彩图库白小姐liuhecai]''' See WireProtocol for the current protocol.
Line 11: Line 13:
'''[http://1878.cc >>>> 香港六合彩内幕玄香港六合 彩大公开]''' === Overview ===
Line 13: Line 15:
'''[http://29kj.com >>>> 香港六合彩开奖结果预测、特码、图库、单双、大小、头数、尾数、平码、特围等开奖综合信息]'''
Line 15: Line 16:
'''[http://1878.cc >>>> 香港六合彩特码白小姐网站:六合彩开奖结果最快现场直播网站]'''
Line 17: Line 17:
'''[http://1878.cc >>>> 香港六合彩开奖现场 香港六合彩官方网 香港六合彩开奖记录 香港六合彩开奖结果 香港六合彩图库]''' === The Protocol ===
Line 19: Line 19:
'''[http://1878.cc >>>> 香港六合彩公司 香港六合彩开奖 香港六合彩网站 香港六合彩总公司 香港六合彩特码 ]''' ==== Discovery ====
Line 21: Line 21:
'''[http://29kj.com >>>> 香港六合彩开奖现场 香港六合彩官方网 香港六合彩开奖记录 香港六合彩 六合彩开奖结果]''' Instead of exploring the missing remote nodes like the current protocol, this proposal explores the local nodes, exploiting more local knowledge and having a stateful approach (which can't be done on the server).
Line 23: Line 23:
'''[http://29kj.com >>>> 香港六合彩-六合彩图库/六合彩网站/偷偷暗示每期内幕特码资料]''' It only needs one new server call:
Line 25: Line 25:
'''[http://29kj.com >>>> 香港六合彩开奖结果:香港六合彩图库-六合彩曾道人特码-六合彩开奖结果]''' {{{discover(nodes)}}}
 * list of local nodes
{{{
  return a boolean array, value is True if the server has the node, False otherwise. (TODO: investigate what is cheaper, between list and boolean array / to avoid races, should we pass the head as argument?)
}}}
Line 27: Line 31:
'''[http://29kj.com >>>> 香港六合彩曾道人公开内幕资料— 香港六合彩-特码搜索]''' The client, keeps all it's nodes between three states: unknown, missing and common.
At each iteration, it builds a sample of the nodes in unknown state, send it to discover and update the sets accordingly.
The sample size should obvisously be bounded.
Line 29: Line 35:
'''[http://29kj.com >>>> 六合彩|香港六合彩|六合彩公司|曾道人六合彩|六合彩最快开奖结果]''' Proposed sample construction: first breadth first search, starting from the nodes at {{{headsof(unknown)}}}, keep nodes at a distance 0, 1, 2, 4, 8, 16, ... from the heads. Second breadth first search starting at {{{rootsof(unknown)}}}, keep the nodes at a distance 0, 1, 2, 4, 8, 16, ... from the roots.
If the sample is bigger than MAX_SAMPLE_SIZE, take a random sample from it, but make sure it includes the heads. If the number of unknown nodes is smaller than MAX_SAMPLE_SIZE, send all the unknown nodes to discover.
Line 31: Line 38:
'''[http://29kj.com >>>> 香港六合彩/曾道人特码主论坛/六合彩特码主论坛/黄大仙老牌主论坛..]''' During testing in largish repository (netbeans, kernel), the number of iteration was almost always lower than 5.
Line 33: Line 40:
'''[http://29kj.com >>>> 香港六合彩图库,特码,平码,六合彩开奖结果,六合彩开奖历史记录]''' Proposed tweak: the server can compute it's own unknown set and return a sample on it's own. Testing shown that it didn't make a big difference in the number of iterations (it usually removes at most one roundtrip)
Line 35: Line 42:
'''[http://29kj.com >>>> 香港六合彩资料|香港六合彩:香港六合彩曾道人特码|香港六合彩开奖结果...]''' ==== Changegroup ====
Line 37: Line 44:
'''[http://29kj.com >>>> 六合彩香港六合彩香港六合彩总公司六合彩官方网六合彩图库美香港六合 彩六肖图]''' The current changegroup() uses base nodes, it should instead use common nodes.
Line 39: Line 46:
'''[http://29kj.com >>>> 香港六合彩公司|六合彩网|香港六合彩网|一字解特码|六合彩518222总公司]''' {{{changegroup(roots, heads)}}}
 * roots = list of nodes, that the server can assume the client knows (as well as all their ancestors).
{{{
 find all changesets ancestors from heads and not descended from roots and return them as a single changegroup
}}}
Line 41: Line 52:
'''[http://29kj.com >>>> 香港六合彩网站,香港赛马会资料,香港六合彩玄香港六合 彩at 今天最新资料]''' A changegroup is a single stream containing:
 * a changelog group
 * a manifest group
 * a list of
   * filename length
   * filename
   * file group (terminated by a zero length filename)

A group is a list of chunks:
 * chunk length
 * self hash, p1 hash, p2 hash, link hash
 * uncompressed delta to p1 (or optionally to the previous node)
 * (terminated by a zero length chunk)
 * where should lightweight copies be put?

=== Prototypes ===

Prototypes can be found here:

 * http://bitbucket.org/parren/dag-discovery/
 * http://bitbucket.org/bboissin/dag-discovery/ (fork)

=== Wishlist ===

 * Estimate early how much data or items have to be transfered and communicate
 this to the other side, so a progress indicator could be more useful.
 -- ThomasArendsenHein <<DateTime(2008-10-24T14:23:01Z)>>

----
CategoryInternals

Status: Draft/Experiment

Proposed new discovery/changegroup protocol

The wire protocol has several flaws:

  • it uses the roots of the new branches, this is susceptible to a race (see issue1320) .
  • if the client is missing a lot of nodes but doesn't have any local changes, it will still have to do a lot of roundtrips to discover the base nodes.

See WireProtocol for the current protocol.

Overview

The Protocol

Discovery

Instead of exploring the missing remote nodes like the current protocol, this proposal explores the local nodes, exploiting more local knowledge and having a stateful approach (which can't be done on the server).

It only needs one new server call:

discover(nodes)

  • list of local nodes

  return a boolean array, value is True if the server has the node, False otherwise. (TODO: investigate what is cheaper, between list and boolean array / to avoid races, should we pass the head as argument?)

The client, keeps all it's nodes between three states: unknown, missing and common. At each iteration, it builds a sample of the nodes in unknown state, send it to discover and update the sets accordingly. The sample size should obvisously be bounded.

Proposed sample construction: first breadth first search, starting from the nodes at headsof(unknown), keep nodes at a distance 0, 1, 2, 4, 8, 16, ... from the heads. Second breadth first search starting at rootsof(unknown), keep the nodes at a distance 0, 1, 2, 4, 8, 16, ... from the roots. If the sample is bigger than MAX_SAMPLE_SIZE, take a random sample from it, but make sure it includes the heads. If the number of unknown nodes is smaller than MAX_SAMPLE_SIZE, send all the unknown nodes to discover.

During testing in largish repository (netbeans, kernel), the number of iteration was almost always lower than 5.

Proposed tweak: the server can compute it's own unknown set and return a sample on it's own. Testing shown that it didn't make a big difference in the number of iterations (it usually removes at most one roundtrip)

Changegroup

The current changegroup() uses base nodes, it should instead use common nodes.

changegroup(roots, heads)

  • roots = list of nodes, that the server can assume the client knows (as well as all their ancestors).

 find all changesets ancestors from heads and not descended from roots and return them as a single changegroup

A changegroup is a single stream containing:

  • a changelog group
  • a manifest group
  • a list of
    • filename length
    • filename
    • file group (terminated by a zero length filename)

A group is a list of chunks:

  • chunk length
  • self hash, p1 hash, p2 hash, link hash
  • uncompressed delta to p1 (or optionally to the previous node)
  • (terminated by a zero length chunk)
  • where should lightweight copies be put?

Prototypes

Prototypes can be found here:

Wishlist

  • Estimate early how much data or items have to be transfered and communicate this to the other side, so a progress indicator could be more useful.

    -- ThomasArendsenHein 2008-10-24 14:23:01


CategoryInternals

DiscoveryPlan (last edited 2013-04-18 18:37:27 by KevinBullock)