1 / 28

Replication Strategies in Unstructured Peer-to-Peer Networks

Replication Strategies in Unstructured Peer-to-Peer Networks. Edith Cohen AT&T Labs-research. Scott Shenker ICIR. Peer-to-peer Networks. Peers are connected by an overlay network. Users cooperate to share files (e.g., music, videos, etc.). (Search in) Basic P2P Architectures.

astra
Download Presentation

Replication Strategies in Unstructured Peer-to-Peer Networks

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Replication Strategies in Unstructured Peer-to-Peer Networks Edith Cohen AT&T Labs-research Scott Shenker ICIR Sigcomm 2002

  2. Peer-to-peer Networks • Peers are connected by an overlay network. • Users cooperate to share files (e.g., music, videos, etc.) Sigcomm 2002

  3. (Search in) Basic P2P Architectures • Centralized: central directory server. (Napster) Supports versatile queries, scope, legal troubles • Decentralized: search is performed by probing peers • Structured (DHTs): (Freenet, Can, Chord,…)location is coupled with topology - search is routed by the query.Scope, Only exact-match queries, tightly controlled overlay. • Unstructured:(Gnutella, FastTrack); search is “blind” - probed peers are unrelated to query. Resilient to transient peers; versatile queries; Harsh scope/scalability tradeoff. Sigcomm 2002

  4. (replication in) P2P architectures • No proactive replication (Gnutella) • Hosts store and serve only what they requested • A copy can be found only by probing a host with a copy • Proactive replication of “keys” (= meta data + pointer) for search efficiency (FastTrack, DHTs) • Proactive replication of “copies” – for search and download efficiency, anonymity. (Freenet) Sigcomm 2002

  5. Question: how to use replication to improve search efficiency in unstructured networks with a proactive replication mechanism ? Sigcomm 2002

  6. Search and replication model Unstructured networks with replication of keys or copies. Peers probed (in the search and replication process) are unrelated to query/item - Probe success likelihood can not be better, on average, than random probes. • Search: probe hosts, uniformly at random, until the query is satisfied (or the search max size is exceeded) • Replication: Each host can store up to r copies (or keys=metadata+pointer) of items. Goal: minimize average search size (number of probes till query is satisfied) Sigcomm 2002

  7. Search size • Query is soluble if there are sufficiently many copies of the item. • Query is insoluble if item is rare or non existent. What is the search size of a query ? • Insoluble queries: maximum search size • Soluble queries: number of probes until answer is found. We look at the Expected Search Size (ESS) of each item. The ESS is inversely proportional to the fraction of peers with a copy of the item. Sigcomm 2002

  8. Search Example 2 probes 4 probes Sigcomm 2002

  9. Expected Search Size (ESS) • m items with relative query rates • q1> q2> q3> … > qm. Siqi = 1 • Allocation : p1, p2, p3,…, pmSipi = 1 ith item is allocated pi fraction of storage. (keys placed in pir fraction of hosts) • Search size for ith item is a Geometric r.v. with mean Ai = 1/(r pi ). • ESS is SiqiAi = (Siqi / pi)/r Sigcomm 2002

  10. Example: 3 items, q1=1/2, q2=1/3, q3=1/6 Uniform Proportional Uniform and Proportional Replication • Two natural strategies: • Uniform Allocation: pi = 1/m • Simple, resources are divided equally • Proportional Allocation: pi = qi • “Fair”, resources per item proportional to demand • Reflects current P2P practices Sigcomm 2002

  11. Basic Questions • How do Uniform and Proportional allocations perform/compare ? • Which strategy minimizes the Expected Search Size (ESS) ? • Is there a simple protocol that achieves optimal replication in decentralized unstructured networks ? Sigcomm 2002

  12. Insoluble queries • Search always extends to the maximum allowed search size. • If we fix the available storage for copies, the query rate distribution, and the number of items that we wish to be “locatable”, then • The maximum required search size depends on the smallest allocation of an item. Thus, • Uniform allocation minimizes this maximum and thus the cost induced by insoluble queries. What about the cost of soluble queries? Answer is more surprising … Sigcomm 2002

  13. ESS under Uniform and Proportional Allocations (soluble queries) • Lemma: The ESS under either Uniform or Proportional allocations is m/r • Independent of query rates (!!!) • Same ESS for Proportional and Uniform (!!!) • Proof… Proportional: ASS is(Siqi / pi)/r = (Siqi / qi)/r = m/r Uniform: ASS is(Siqi / pi)/r = (Sim qi)/r = (m/r) Si qi = m/r Sigcomm 2002

  14. Proportional and Uniform are the worst “reasonable” strategies (!!!) Space of Possible Allocations Definition: Allocation p1, p2, p3,…, pm is “in-between” Uniform and Proportional if for 1< i <m, q i+1/q i< p i+1/p i< 1 Theorem1: All (strictly) in-between strategies are (strictly) better than Uniform and Proportional • Theorem2: p is worse than Uniform/Proportional if • for all i, p i+1/p i> 1 (more popular gets less) OR • for all i, q i+1/q i> p i+1/p i (less popular gets less than “fair share”) Sigcomm 2002

  15. Worse than prop/uni More popular item gets less. SR Better than prop/uni Worse than prop/uni More popular gets more than its proportional share Space of allocations on 2 items Uniform Proportional p2/p1 q2/q1 Sigcomm 2002

  16. So, what is the best strategy for soluble queries ? Sigcomm 2002

  17. Square-Root Allocation pi is proportional to square-root(qi) • Lies “In-between” Uniform and Proportional • Theorem: Square-Root allocation minimizes the ESS (on soluble queries) • Minimize Siqi / pi such thatSipi = 1 Sigcomm 2002

  18. How much can we gain by using SR ? Zipf-like query rates Sigcomm 2002

  19. OPT is a hybrid of Uniform and SR Tuned to balance cost of soluble and insoluble queries. • OK • SR is best for soluble queries • Uniform minimizes cost of insoluble queries What is the optimal strategy? Sigcomm 2002

  20. 10^4 items, Zipf-like w=1.5 All Soluble 85% Soluble All Insoluble SR Uniform Sigcomm 2002

  21. We now know what we need. How do we get there? Sigcomm 2002

  22. Replication Algorithms • Uniform and Proportional are “easy” :- • Uniform: When item is created, replicate its key in a fixed number of hosts. • Proportional: for each query, replicate the key in a fixed number of hosts • Fully distributed where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search. • Converge to/obtain SR allocation when query rates remain steady. Desired properties of algorithm: Sigcomm 2002

  23. Property of the process: <Ci> average value of C used to replicate ith item. Claim: If <Ci>/<Cj> remains fixed over time, and <Ci>, <Cj> >e, then pi/pjg qi <Ci>/qj <Cj> Model for Copy Creation/Deletion • Creation: after a successful search, C(s) new copies are created at random hosts. • Deletion: is independent of the identity of the item; copy survival chances are non-decreasing with creation time. (i.e., FIFO at each node) Sigcomm 2002

  24. Algorithm for square-root allocation needs to have <Ci> equal to or converge to a value inversely proportional to Creation/Deletion Process Corollary: If then Sigcomm 2002

  25. SR Replication Algorithms • Path replication: number of new copies C(s) is proportional to the size of the search (Freenet) • Converges to SR allocation (+reasonable conditions) • Convergence unstable with delayed creations • Sibling memory: each copy remembers the number of sibling copies, • Quickly “on target” • For “good estimates” need to find several copies. • Probe memory: each peer records number and combined search size of probes it sees for each item. C(S) is determined by collecting this info from number of peers proportional to search size. • Immediately “on target” • Extra communication (proportional to that needed for search). Sigcomm 2002

  26. Alg1: Path Replication • Number of new copies produced per query, <Ci>, is proportional to search size 1/pi • Creation rate is proportional to qi <Ci> • Steady state: creation rate proportional to allocation pi, thus Sigcomm 2002

  27. Simulation Delay = 0.25 * copy lifetime; 10000 hosts Path replication Sibling number Hosts with copy time Sigcomm 2002

  28. Summary • Random Search/replication Model: probes to “random” hosts • Proportional allocation – current practice • Uniform allocation – best for insoluble queries • Soluble queries: • Proportional and Uniform allocations are two extremes with same average performance • Square-Root allocation minimizes Average Search Size • OPT (all queries) lies between SR and Uniform • SR/OPT allocation can be realized by simple algorithms. Sigcomm 2002

More Related