
Peer to Peer and Distributed Hash Tables


Presentation Transcript


  1. Peer to Peer and Distributed Hash Tables CS 271

  2. Distributed Hash Tables Challenge: To design and implement a robust and scalable distributed system composed of inexpensive, individually unreliable computers in unrelated administrative domains. (Partial thanks to Idit Keidar)

  3. Searching for distributed data • Goal: Make billions of objects available to millions of concurrent users • e.g., music files • Need a distributed data structure to keep track of objects on different sites • maps objects to locations • Basic Operations: • Insert(key) • Lookup(key) (a minimal sketch follows below)
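
To make Insert(key)/Lookup(key) concrete, here is a minimal sketch (not from the slides; the function and node names are illustrative) of the naive approach: hash the key straight to one of the current nodes. It works while membership is fixed, but almost every key moves when a node joins or leaves, which is what motivates the consistent hashing used later in the deck.

```python
import hashlib

# Naive placement: hash the key and take it modulo the number of nodes.
def node_for(key: str, nodes: list[str]) -> str:
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

nodes = ["n1", "n2", "n3", "n4"]
print(node_for("some-song.mp3", nodes))   # every peer computes the same owner

# Insert(key) would contact that node to record the object's location;
# Lookup(key) would contact the same node to retrieve it.
```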

  4. Searching [Diagram: a publisher holding (Key="title", Value=MP3 data…) and a client issuing Lookup("title") are connected through nodes N1..N6 over the Internet; the open question is how the lookup finds the right node.]

  5. Simple Solution • First there was Napster • Centralized server/database for lookup • Only file sharing is peer-to-peer; lookup is not • Launched in 1999, peaked at 1.5 million simultaneous users, and shut down in July 2001.

  6. Napster: Publish [Diagram: a peer at 123.2.21.23 announces "I have X, Y, and Z!" to the central server, which records insert(X, 123.2.21.23) and likewise for Y and Z.]

  7. Napster: Search [Diagram: a client asks the central server "Where is file A?"; the server replies search(A) --> 123.2.0.18, and the client fetches the file directly from that peer.]
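
A minimal sketch of the centralized lookup described on the last two slides (class and method names are mine, not Napster's actual protocol): peers publish what they hold, the server answers searches, and only the file transfer itself is peer-to-peer.

```python
# Toy central index in the spirit of Napster's lookup server.
class CentralIndex:
    def __init__(self):
        self.index = {}                          # filename -> set of peer addresses

    def publish(self, peer_addr, filenames):
        for name in filenames:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, filename):
        return self.index.get(filename, set())

server = CentralIndex()
server.publish("123.2.21.23", ["X", "Y", "Z"])   # "I have X, Y, and Z!"
print(server.search("X"))                        # {'123.2.21.23'}; fetch the file from that peer
```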

  8. Overlay Networks • A virtual structure imposed over the physical network (e.g., the Internet) • A graph, with hosts as nodes and some edges [Diagram: hash functions map both keys and node IDs into the overlay network's ID space.]

  9. Unstructured Approach: Gnutella • Build a decentralized, unstructured overlay • Each node has several neighbors • Each node holds several keys in its local database • When asked to find a key X: • Check the local database to see if X is known • If yes, return it; if not, ask your neighbors • Use a limiting threshold (TTL) to bound propagation (see the flooding sketch below)
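
A sketch of the flooding search just described, under assumed simplifications (a static neighbor graph and a simple visited set instead of Gnutella's real duplicate suppression by message ID); the ttl parameter plays the role of the propagation threshold.

```python
# Gnutella-style flooding: breadth-first search from the requester,
# decrementing a time-to-live at each hop.
def flood_search(start, key, neighbors, has_key, ttl=4):
    frontier, visited, hits = [(start, ttl)], {start}, []
    while frontier:
        node, t = frontier.pop(0)
        if key in has_key.get(node, set()):
            hits.append(node)                    # this peer has the key
        if t == 0:
            continue                             # propagation threshold reached
        for nb in neighbors.get(node, []):
            if nb not in visited:
                visited.add(nb)
                frontier.append((nb, t - 1))
    return hits

neighbors = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(flood_search("A", "fileX", neighbors, {"D": {"fileX"}}, ttl=2))   # ['D']
```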

  10. Gnutella: Search [Diagram: a query "Where is file A?" is flooded from the requester to its neighbors and onward; two peers that have file A send replies back.]

  11. Structured vs. Unstructured • The examples we described are unstructured • There is no systematic rule for how edges are chosen; each node "knows some" other nodes • Any node can store any data, so the searched-for data might reside at any node • Structured overlay: • The edges are chosen according to some rule • Data is stored at a pre-defined place • Tables define the next hop for lookup

  12. Hashing • Data structure supporting the operations: • void insert( key, item ) • item search( key ) • Implementation uses a hash function to map keys to array cells • Expected search time O(1), provided that there are few collisions (a minimal sketch follows below)
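
A minimal chained hash table illustrating the insert/search interface on this slide (illustrative only; Python's built-in dict already provides this): the hash function picks the array cell, and with few collisions both operations take expected O(1) time.

```python
class HashTable:
    def __init__(self, num_cells=8):
        self.cells = [[] for _ in range(num_cells)]   # one bucket per array cell

    def _cell(self, key):
        return hash(key) % len(self.cells)            # hash function -> array cell

    def insert(self, key, item):
        self.cells[self._cell(key)].append((key, item))

    def search(self, key):
        for k, item in self.cells[self._cell(key)]:   # scan one (short) bucket
            if k == key:
                return item
        return None

t = HashTable()
t.insert("song.mp3", "object location")
print(t.search("song.mp3"))
```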

  13. Distributed Hash Tables (DHTs) • Nodes store table entries • lookup( key ) returns the location of the node currently responsible for this key • We will mainly discuss Chord [Stoica, Morris, Karger, Kaashoek, and Balakrishnan, SIGCOMM 2001] • Other examples: CAN (Berkeley), Tapestry (Berkeley), Pastry (Microsoft Research Cambridge), etc.

  14. CAN [Ratnasamy et al.] • Maps nodes and keys to coordinates in a multi-dimensional Cartesian space • Each node owns a zone of the space; a query is routed from the source toward the key's coordinates along the shortest Euclidean path • For d dimensions, routing takes O(d·n^(1/d)) hops (see the ballpark figures below)
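
To get a feel for the O(d·n^(1/d)) bound, the snippet below simply plugs numbers into the asymptotic formula (constant factors from the CAN paper are ignored; this is not a CAN simulation).

```python
# Ballpark CAN routing cost ~ d * n**(1/d) hops for n nodes in d dimensions.
n = 1_000_000
for d in (2, 3, 6):
    print(f"d={d}: ~{d * n ** (1 / d):.0f} hops")   # 2000, 300, 60
```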

  15. Chord Logical Structure (MIT) • m-bit ID space (2^m IDs), usually m = 160 • Nodes are organized in a logical ring according to their IDs [Diagram: a ring of nodes N1, N8, N10, N14, N21, N30, N38, N42, N48, N51, N56.]

  16. DHT: Consistent Hashing • A key is stored at its successor: the node with the next higher ID [Diagram: a circular ID space with nodes N32, N90, N105 and keys K5, K20, K80, each key attached to the first node at or after it on the circle.] (Thanks to CMU for the animation)
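
A sketch of the successor rule, assuming a small ID space for readability (M = 2^7 here, whereas Chord uses m = 160 with SHA-1); the helper names ring_id and successor are mine.

```python
import bisect, hashlib

M = 2 ** 7                                         # size of the circular ID space

def ring_id(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % M

def successor(key_id: int, node_ids: list[int]) -> int:
    """The key is stored at the first node ID >= key_id, wrapping around."""
    ids = sorted(node_ids)
    i = bisect.bisect_left(ids, key_id)
    return ids[i % len(ids)]                       # wrap past the largest node ID

nodes = [ring_id(n) for n in ("N32", "N90", "N105")]
print(successor(ring_id("K80"), nodes))            # ring ID of the node responsible for K80
```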

  17. Consistent Hashing Guarantees • For any set of N nodes and K keys: • A node is responsible for at most (1 + ε)K/N keys • When an (N + 1)st node joins or leaves, responsibility for O(K/N) keys changes hands

  18. DHT: Chord Basic Lookup • Each node knows only its successor • Routing goes around the circle, one node at a time (see the sketch below) [Diagram: on a ring with nodes N10, N32, N60, N90, N105, N120, node N10 asks "Where is key 80?"; the query is passed from successor to successor until the answer "N90 has K80" comes back.]
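
A sketch of this successor-only lookup on the ring from the diagram (node and key numbers follow the slide; the Node class and helper functions are mine). Because each hop only uses the current node's successor pointer, a lookup can take O(N) hops.

```python
class Node:
    def __init__(self, node_id):
        self.id, self.successor = node_id, None

def in_interval(x, a, b):
    """True if x lies in the half-open ring interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start: Node, key_id: int) -> Node:
    n = start
    while not in_interval(key_id, n.id, n.successor.id):
        n = n.successor                      # walk the circle one node at a time
    return n.successor                       # successor(key) stores the key

ids = [10, 32, 60, 90, 105, 120]
nodes = [Node(i) for i in ids]
for a, b in zip(nodes, nodes[1:] + nodes[:1]):
    a.successor = b                          # close the ring
print(lookup(nodes[0], 80).id)               # 90, i.e. "N90 has K80"
```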

  19. DHT: Chord "Finger Table" • Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i • In other words, the ith finger points 1/2^(m-i) of the way around the ring [Diagram: node N80 with fingers reaching 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.]
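
A sketch of building a finger table by exactly that rule (the function name is mine); the 3-bit example matches the join sequence on the next few slides.

```python
import bisect

def finger_table(n: int, node_ids: list[int], m: int) -> list[int]:
    """Entry i is the first live node that succeeds or equals (n + 2**i) mod 2**m."""
    ids = sorted(node_ids)
    table = []
    for i in range(m):
        target = (n + 2 ** i) % 2 ** m
        j = bisect.bisect_left(ids, target)
        table.append(ids[j % len(ids)])       # wrap around the ring if needed
    return table

# 3-bit ring with nodes 0, 1, 2, 6 (as in the join example that follows):
print(finger_table(1, [0, 1, 2, 6], m=3))     # [2, 6, 6] for targets 2, 3, 5
```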

  20. DHT: Chord Join • Assume a 3-bit identifier space [0..7] • Node n1 joins • n1's successor table (i, id+2^i, succ): (0, 2, 1), (1, 3, 1), (2, 5, 1); with only one node in the ring, n1 is the successor for every entry [Diagram: ring positions 0..7 with node 1 present.]

  21. DHT: Chord Join • Node n2 joins • n1's successor table (i, id+2^i, succ): (0, 2, 2), (1, 3, 1), (2, 5, 1) • n2's successor table: (0, 3, 1), (1, 4, 1), (2, 6, 1) [Diagram: ring positions 0..7 with nodes 1 and 2 present.]

  22. DHT: Chord Join • Nodes n0 and n6 join • n0's successor table (i, id+2^i, succ): (0, 1, 1), (1, 2, 2), (2, 4, 6) • n1's: (0, 2, 2), (1, 3, 6), (2, 5, 6) • n6's: (0, 7, 0), (1, 0, 0), (2, 2, 2) • n2's: (0, 3, 6), (1, 4, 6), (2, 6, 6) [Diagram: ring positions 0..7 with nodes 0, 1, 2, 6 present.]

  23. DHT: Chord Join • Nodes: n1, n2, n0, n6 • Items: f7, f1 • Each item is stored at the successor of its id: f7 at node 0, f1 at node 1 • Successor tables are as on the previous slide [Diagram: ring positions 0..7 with nodes 0, 1, 2, 6, their successor tables, item 7 at node 0, and item 1 at node 1.]

  24. DHT: Chord Routing • Upon receiving a query for item id, a node: • Checks whether it stores the item locally • If not, forwards the query to the largest node in its successor table that does not exceed id (a worked sketch follows below) [Diagram: a query(7) is forwarded along successor-table entries until it reaches node 0, which stores item f7.]
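
A worked sketch of this routing rule on the 3-bit example ring (the successor tables and item placement come from the previous slides; the query helper and the choice of starting node are mine). "Largest entry that does not exceed id" is interpreted with clockwise ring distances so that it also behaves sensibly across the wrap-around at 0.

```python
# Successor tables for nodes 0, 1, 2, 6 and item placement (f7 at node 0, f1 at node 1).
SUCC = {0: [1, 2, 6], 1: [2, 6, 6], 2: [6, 6, 6], 6: [0, 0, 2]}
ITEMS = {0: {7}, 1: {1}}

def dist(a, b, m=3):
    return (b - a) % 2 ** m                      # clockwise distance on the ring

def query(node, item_id, m=3):
    hops = [node]
    while item_id not in ITEMS.get(node, set()):
        # Pick the table entry that gets closest to item_id without passing it;
        # if no entry precedes item_id, fall back to the immediate successor.
        best, best_d = SUCC[node][0], -1
        for s in SUCC[node]:
            d = dist(node, s, m)
            if d <= dist(node, item_id, m) and d > best_d:
                best, best_d = s, d
        node = best
        hops.append(node)
    return hops

print(query(1, 7))                               # [1, 6, 0]: node 0 stores f7
```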

  25. Chord Data Structures • Finger table • First finger is the successor • Predecessor • What if each node knew all other nodes? • O(1) routing, but expensive updates

  26. Routing Time • Node n looks up a key stored at node p • p is in n's ith interval: p ∈ ((n + 2^(i-1)) mod 2^m, (n + 2^i) mod 2^m] • n contacts f = finger[i] • The interval is not empty, so f ∈ ((n + 2^(i-1)) mod 2^m, (n + 2^i) mod 2^m] • f is at least 2^(i-1) away from n • p is at most 2^(i-1) away from f • The distance is halved at each hop [Diagram: the arc from n + 2^(i-1) to n + 2^i, containing f = finger[i] and p.]

  27. Routing Time • Assuming a uniform node distribution around the circle, the number of nodes in the search space is halved at each step • Expected number of steps: log N • Note that m = 160, while for 1,000,000 nodes log N ≈ 20

  28. P2P Lessons • Decentralized architecture: avoid centralization • Flooding can work • Logical overlay structures provide strong performance guarantees • Churn is a problem • Useful in many distributed contexts
