
SHELL: A Distributed Oblivious Heap & its Applications

An overview of SHELL, a distributed heap overlay architecture for robust systems and P2P networks, explained by Christian Scheideler and Stefan Schmid in a 2008 lecture. Topics include dynamics in peer-to-peer computing, heterogeneity challenges, distributed heap properties, and the concept of an oblivious distributed heap. The presentation covers the motivation, objectives, and construction of the SHELL overlay graph designed for dynamic and fault-tolerant networks.



Presentation Transcript


  1. SHELL: A Distributed and Oblivious Heap with Applications for Robust Information Systems and Heterogeneous Peer-to-Peer Networks • Christian Scheideler, Stefan Schmid • Network Algorithms, Summer 2008

  2. Before we look at SHELL... • Prof. Scheideler is at a conference • Therefore: special program • Shell builds on what we have already learned! • Ongoing work... - no handouts - still has gaps, possibly also errors • Slides are in English so that they can be reused elsewhere! • Open to input / ideas! DISTRIBUTED COMPUTING Stefan Schmid @ TU München, 2008

  3. Motivation • Today, there are still many challenges in distributed systems (e.g., the Internet) • E.g., viruses, spam, DoS attacks, selfish users, etc. • Very active research • For example, peer-to-peer computing • Dynamics / churn: peers join and leave frequently • In a network of 1,000,000 peers where sessions last around 60 minutes, there are hundreds of membership changes every second! • Peer-to-peer systems rely on the contributions of their participants: problematic if users are selfish! • E.g., BitThief free-rides in BitTorrent • Heterogeneity: peers have different Internet connections, different CPUs, run different operating systems, etc.
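The churn figure on this slide can be checked with a back-of-the-envelope calculation. The sketch below assumes a steady state in which every peer leaves once per session on average and every departure is balanced by one join; the function name is made up for illustration.

```python
def membership_changes_per_second(num_peers, mean_session_seconds):
    """Estimate joins plus leaves per second in a steady-state network."""
    # Assumption: each peer leaves once per session on average.
    leaves_per_second = num_peers / mean_session_seconds
    # Assumption: the network size is stable, so joins match leaves.
    joins_per_second = leaves_per_second
    return leaves_per_second + joins_per_second


rate = membership_changes_per_second(1_000_000, 60 * 60)
print(f"about {rate:.0f} membership changes per second")
```

Under these assumptions the estimate lands in the range of several hundred changes per second, which matches the "hundreds" quoted on the slide.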

  4. SHELL Overview • SHELL = our overlay architecture • Basically, a distributed heap • Refresher: min heap - children have a larger key than their parent - e.g., useful for priority queues (fast removeMin()) [Figure: heap slide from the GAD lecture 2008]
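As a refresher on the sequential data structure, here is a minimal sketch of a min heap used as a priority queue, using Python's standard `heapq` module. The keys are arbitrary example values, and `heappop` plays the role of removeMin().

```python
import heapq

queue = []
for key in [21, 3, 17, 9, 10]:
    heapq.heappush(queue, key)  # keeps the smallest key at index 0

smallest = heapq.heappop(queue)  # removeMin(): returns the smallest key
drained = [heapq.heappop(queue) for _ in range(len(queue))]
print(smallest, drained)
```

Repeated removals return the keys in increasing order, which is what makes the heap useful for priority queues.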

  5. Heap Refresher • Heap in GAD...

  6. A Distributed Heap? • What is a distributed heap? • We assume that peers have a key / order / rank / ID - for example: the time when the peer joined • (Min-)heap property: peers only connect to peers of lower order - for example: peers only connect to older peers - Shell constructs a directed overlay (peers additionally store backward edges, see later) [Figure: example overlay with peers of order 28, 26, 23, 21, 20, 19, 18, 17, 16, 9, 10, 3]

  7. An Oblivious Distributed Heap? (1) • What is an oblivious distributed heap? • Oblivious = the overlay topology depends only on the set of currently active peers (and their IDs / orders) in the network - but not on history, e.g., on the time when these peers joined! - example: if at join time a new peer is inserted at the end of a list of peers, the resulting topology is not oblivious - example: if a new peer is inserted into a list of peers according to the peer's order, the topology is oblivious
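The two list examples on this slide can be made concrete with a small sketch. The function names and the join histories are hypothetical; the point is only that inserting by order yields the same list for every join history of the same peer set, while appending does not.

```python
import bisect


def append_topology(join_sequence):
    """Non-oblivious: each new peer is placed at the end of the list."""
    peers = []
    for order in join_sequence:
        peers.append(order)
    return peers


def ordered_topology(join_sequence):
    """Oblivious: each new peer is inserted according to its order."""
    peers = []
    for order in join_sequence:
        bisect.insort(peers, order)
    return peers


# Two different join histories of the same set of peers.
history_a = [9, 3, 17, 10]
history_b = [17, 10, 3, 9]

print(append_topology(history_a), append_topology(history_b))
print(ordered_topology(history_a), ordered_topology(history_b))
```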

  8. An Oblivious Distributed Heap? (2) • Why is oblivious good? - the oblivious property is useful when it comes to fault tolerance - e.g., desktops may crash temporarily and will then rejoin - if the topology is oblivious, peers can "remember" their old contacts, and when an old contact reappears, it can be integrated immediately (instantaneous rejoin) • Many systems today are oblivious - e.g., Pastry, Chord, etc. - but not: e.g., Pagoda - many systems in practice are not: Gnutella, BitTorrent, etc.

  9. Objectives of Shell • Primary goal: dynamic and robust overlay • In particular: - maintaining the heap property - low peer degree, low network diameter, low congestion - fast join / rejoin / leave - peers can simply crash • Applications - i-SHELL: a distributed information system robust to Sybil attacks - h-SHELL: a peer-to-peer system for heterogeneous environments

  10. Overlay Graph (1) • How to achieve these goals? • Overlay based on the continuous-discrete approach - basically a de Bruijn graph • Refresher: continuous-discrete approach - peers are placed in the cyclic [0,1)-interval - a peer at position x is connected to the peers responsible for the continuous positions x/2 and (x+1)/2
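The continuous part of this construction can be sketched in a few lines. The code below only works on positions in [0,1) and ignores the mapping from positions to responsible peers; the helper names are made up. Moving from x to x/2 or (x+1)/2 shifts one bit into the binary expansion of x, so feeding in the first m bits of a target (least significant first) ends within 2^-m of that target.

```python
def debruijn_points(x):
    """The two continuous positions a peer at x links to."""
    return (x / 2, (x + 1) / 2)


def leading_bits(target, count):
    """First `count` bits of the binary expansion of target in [0, 1)."""
    bits = []
    for _ in range(count):
        target *= 2
        bit = int(target)
        bits.append(bit)
        target -= bit
    return bits


def continuous_route(x, target, hops):
    """Positions visited when shifting the target's bits into x."""
    path = [x]
    for bit in reversed(leading_bits(target, hops)):
        x = (x + bit) / 2  # bit 0 follows x/2, bit 1 follows (x+1)/2
        path.append(x)
    return path


path = continuous_route(0.3, 0.7, hops=10)
print(path[-1])
```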

  11. Overlay Graph (2) • Our distributed heap has a larger peer degree • The space is divided into different partitions - partition i = 2^i intervals of size 1/2^i - a global partition renders the analysis simpler ("same views")

  12. Overlay Graph (3) • A peer connects to all peers of lower order in - its level-i home interval (the interval which includes the position x of the peer) - the level-i intervals adjacent to the home interval - the de Bruijn intervals: the intervals which include the positions x/2 and (x+1)/2 • What is level i? - level i is chosen such that there are c log n_p peers in the interval - n_p = total number of peers in the system with lower order - n_p can be estimated; in the following we assume it is given
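The interval structure can be illustrated with a short sketch. It assumes that partition i splits [0,1) into 2^i equal intervals, and it picks the level with one possible rule that is not taken from the slides: the finest level whose home interval still holds at least c log2(n_p) lower-order peers. All function names and the constant c are hypothetical.

```python
import math


def interval_index(x, level):
    """Index of the level-`level` interval (size 1/2^level) containing x."""
    return int(x * 2 ** level) % (2 ** level)


def connected_intervals(x, level):
    """Home, adjacent and de Bruijn intervals of a peer at position x."""
    count = 2 ** level
    home = interval_index(x, level)
    return {
        home,
        (home - 1) % count,
        (home + 1) % count,
        interval_index(x / 2, level),
        interval_index((x + 1) / 2, level),
    }


def choose_level(x, lower_positions, c=2.0, max_level=32):
    """Assumed rule: finest level with >= c*log2(n_p) lower-order peers at home."""
    n_p = len(lower_positions)
    if n_p < 2:
        return 0
    needed = c * math.log2(n_p)
    level = 0
    for i in range(1, max_level + 1):
        home = interval_index(x, i)
        inside = sum(1 for y in lower_positions if interval_index(y, i) == home)
        if inside < needed:
            break
        level = i
    return level


lower = [k / 64 for k in range(64)]  # 64 evenly spaced lower-order peers
print(connected_intervals(0.3, 3), choose_level(0.3, lower, c=1.0))
```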

  13. Overlay Graph (4) • In order to ensure connectivity when many peers leave, the interval size must be increased over time (the peer upgrades to a larger partition) • Similarly, if many peers of lower order join in the interval, the peer needs to downgrade • In addition to these forward edges, peers store incoming edges - called backward edges

  14. Overlay Graph (5) • These edges are already sufficient for Shell • However, in order to speed up changes between levels, a peer additionally stores pointers to the peers it would connect to if it upgraded - these pointers form the "funnel" of the peer - of course, the peer only connects to these lower order peers once they are on the corresponding level - requires a notification mechanism • In the following, we will not consider funnel edges in further detail! [Figure: funnel across Level 1, ..., Level i-2, Level i-1, Level i]

  15. Implication: Monotonicity • From this construction, we can already derive some properties • For instance, Shell features a monotonicity property: if two peers p and p' are connected to the same interval I and if p is of larger order than p', then p knows strictly more peers in I - because peers only connect to lower order peers in an interval
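A tiny sketch of why monotonicity follows from the construction: if a peer connects to every lower-order peer in an interval, then the set known to a higher-order peer contains the set known to a lower-order one. The function name and the example orders are made up for illustration.

```python
def known_in_interval(order, interval_orders):
    """Peers of interval I that a peer of the given order connects to."""
    return {q for q in interval_orders if q < order}


interval_orders = [3, 9, 10, 16, 17]
known_by_p = known_in_interval(23, interval_orders)        # p has order 23
known_by_p_prime = known_in_interval(16, interval_orders)  # p' has order 16
print(known_by_p_prime, known_by_p)
```

In this example the higher-order peer knows a strict superset, because the interval contains peers whose order lies between the two.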

  16. Distributed Order...: A Simplification • In the following, we will assume that peers have distinct IDs • E.g., assigned at join time by the network entry point • Otherwise: in case of multiple joins close in time, peers may not be able to decide which is older => need to introduce blackout zones, etc. • In the following, we will not consider this issue in more detail

  17. Analysis of Degree (1) • The topological description allows us to analyze the peer degree • Peers employ the following strategy: if the number of neighbors falls below c log n_p in at least one interval, all intervals are doubled • According to Chernoff bounds, if one interval contains c log n peers, then with high probability no interval contains more than (1+d) c log n peers, for any d > 0 • Therefore, the degree is in O(log n) w.h.p. - with funnel edges, the degree is O(log^2 n)

  18. Analysis of Degree (2) • What about incoming / backward edges?

  19. Routing (1) • The Shell overlay allows peers to route messages • Similar to continuous-discrete routing (adjusting one bit after another) • The routing operation route(x) consists of two phases - Phase 1: route along forward edges to the peer of lower order which is closest to x (or: to a lower order peer whose home region contains position x) - Phase 2: descend along backward edges to the peer which is closest to x • Implication: if a peer wants to send a message to a peer of lower order, only Phase 1 is necessary, and the message will not traverse any higher order peers!

  20. Routing (2) • Observe that in our overlay, peers have multiple neighbors which could be used for the next de Bruijn routing hop (log n neighbors per interval) • This can be exploited in order to minimize congestion • Routing policy: peer p always forwards packets to the neighbor which is of largest order among the eligible peers (those of lower order than p) • This alleviates the load on very low order peers
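The forwarding rule on this slide reduces to a one-line selection. The sketch below assumes the candidate list already contains the neighbors in the interval of the next de Bruijn hop; the function name and the example orders are hypothetical.

```python
def next_hop(own_order, candidate_orders):
    """Pick the eligible neighbor (lower order than ours) of largest order."""
    eligible = [q for q in candidate_orders if q < own_order]
    return max(eligible) if eligible else None


print(next_hop(20, [3, 9, 17, 26]))
```

Choosing the largest eligible order keeps messages away from the lowest order peers for as long as possible, which is the load-balancing effect the slide describes.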

  21. Routing (3) • Visualization of routing towards higher order peers • Messages travel towards lower order peers • But on each hop, a peer of as high an order as possible is taken

  22. Routing (4) towards higher order peers • Analysis of Phase 1 - according to continuous-discrete routing, at most log n hops are needed to reach the destination - we make the following observation, which combines two probabilities: the probability that all peers of order lower than p but higher than n_p - l_1 are in other intervals, and the probability that this peer is located in the corresponding interval

  23. Routing (5) towards higher order peers • Generally for the i-th hop: • Summing up, after some lines of calculation, the probability that the final peer reached is of order n_p/2 or smaller is at most O(n_p^{-c}) for some constant c • With high probability, in the first phase of routing, the request travels to a peer of order at least n_p/2.

  24. Routing (6) towards higher order peers • Definition of congestion: • So what is the congestion in the first routing phase?

  25. Routing (7) towards higher order peers • So what is the congestion in the first routing phase? • See our argument before... • At most k peers can send via p, the routing path is of length log 2k, and the probability that it enters the interval on one of these hops is c log k / k

  26. Routing (8) • Theorem: The first phase of routing terminates in logarithmic time and yields a congestion of asymptotically log2 np.

  27. Routing (9) • Routing phase 2: descend along backward edges to higher order peers - idea: binary search which exploits the monotonicity property - higher order peers know more about the interval - on each level i, go to the highest order peer which is located in the interval that includes the final position x - terminates in logarithmic time - logarithmic congestion: in each hop, a peer forwards at most one request

  28. Join and Leave • Join: similar to lookup; find the highest order peer in the final interval and get integrated • Leave: peers can even crash, no particular operation is needed • Change of level in time O(1), update cost induced at other peers in O(log2 n)

  29. Application 1: i-Shell • i-Shell is a distributed information system • Idea: data management through a consistent hashing approach • Generalized to multiple levels: on each level, data is stored on the peer closest to x - on each hop during insertion, a replica is placed • Order of peers: time-stamps (assigned by the network entry point) • Thus: peers only connect to older peers
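The consistent hashing idea can be sketched for a single level as follows. The hash function (SHA-256 truncated to 53 bits), the key, and the peer positions are assumptions made for illustration; the multi-level placement and the replicas mentioned on the slide are not modeled.

```python
import hashlib


def hash_to_unit(key):
    """Map a string key to a position in [0, 1) (assumed hash choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2 ** 53


def cyclic_distance(a, b):
    """Distance between two positions on the cyclic [0, 1) interval."""
    d = abs(a - b)
    return min(d, 1.0 - d)


def responsible_peer(x, peer_positions):
    """Position of the peer closest to x on the cycle."""
    return min(peer_positions, key=lambda pos: cyclic_distance(x, pos))


peers = [0.1, 0.5, 0.95]
x = hash_to_unit("example-item")
print(x, responsible_peer(x, peers))
```

Because the placement depends only on the key and the current peer positions, a peer that joins or leaves only affects the items closest to it.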

  30. i-Shell • Therefore: - we immediately get that two peers p and p' can communicate on paths which include only peers that are at least their age - this renders the communication independent of younger peers • Side benefit: measurement studies have shown that older peers typically have a longer remaining session time - renders the topology more stable • Shell implies robustness to various attacks • E.g., Sybil attack

  31. Sybil Attack (1) • Sybil attack - big problem in the Internet - e.g., spam - Sybil: book by Flora Rheta Schreiber about a person with 16 identities • The attacker seeks to acquire many identities - e.g., to control a large fraction of the network • Countermeasures - virtual identities: captchas etc. - real identities? botnet? - Douceur has shown that the issue is difficult to deal with in distributed environments...

  32. Sybil Attack (2) • Shell is resilient to Sybil attacks of any scale! • Model: the Sybil attack starts at some time t0 • Theorem: traffic of old peers is independent of the Sybil attack • Techniques - admission control - rate control [Figure: heap overlay of peers 3, 5, 4, 7, 9, 12, 10, 8, 21, 14, 15, 11, annotated: traffic between older peers unaffected; higher peers can perform a rate control algorithm; attack originates from lower peers]

  33. Application 2: h-Shell • Alternatively, IDs could represent the inverse of the peers' capabilities • Therefore: peers only connect to peers with stronger capabilities • Interesting architecture for heterogeneous systems • Corollary: paths between strong peers only include strong peers • Interesting, e.g., for multi-quality live streaming

  34. Conclusion • Distributed heap based on the continuous-discrete approach • Oblivious, for highly transient environments • Robustness to Sybil attacks of arbitrary scale • Alternatively, useful for heterogeneous environments • Work in progress...

