
Ecs251 Winter 2013 : Operating System #6: Distributed Hash Table

Dr. S. Felix Wu, Computer Science Department, University of California, Davis. http://www.facebook.com/group.php?gid=29670204725 http://cyrus.cs.ucdavis.edu/~wu/ecs251


Presentation Transcript


  1. Ecs251 Winter 2013: Operating System #6: Distributed Hash Table. Dr. S. Felix Wu, Computer Science Department, University of California, Davis. http://www.facebook.com/group.php?gid=29670204725 http://cyrus.cs.ucdavis.edu/~wu/ecs251 DHT

  2. Structured Peering • Peer identity and routability • Key/content assignment: which identity owns what? • GFS/Napster: centralized index service • Skype/Kazaa: login server and super peers • DNS: hierarchical DNS servers • Two problems: (1) How to connect to the “topology”? (2) How to prevent failures/changes?

  3. DHT • Most structured P2P (s-P2P) systems are DHT-based. • Distributed hash tables (DHTs): • a decentralized lookup service for a hash table • (name, value) pairs are stored in the DHT • any peer can efficiently retrieve the value associated with a given name • the mapping from names to values is distributed among peers

  4. HT as a search table (BitTorrent, Napster) • Information/content is distributed, and we need to know where it is: • Where is this GFS chunk? • Where is this piece of music? • Is this BT piece available? • What is the location of this type of content? • What is the current IP address of this Skype user? [Figure labels: “160 bits”, Index key, Content, Object/Peer naming]

  5. DHT as a search table [Figure labels: ???, Index key]

  6. DHT as a search table [Figure labels: ???, Index key]

  7. DHT segment ownership [Figure labels: ???, Index key]

  8. DHT • Scalable • Peer arrivals, departures, and failures • Unstructured versus structured

  9. DHT (Name, Value) • How can a DHT be used to avoid trackers in BitTorrent?

  10. DHT-based Tracker • Example content: FreeBSD 5.4 CD images • Whoever owns this hash entry is the tracker for the corresponding key! • Publish the key on the class web site. [Figure labels: Index key, Seed’s IP address, PUT & GET]

  11. Chord • Given a key (content object), Chord maps the key onto a peer using consistent hashing. • Assigns keys to peers. • Solves the problem of locating a key in a collection of distributed peers. • Maintains routing information as peers join and leave the system.

  12. Chord • Consistent Hashing • A Simple Key Lookup Algorithm • Scalable Key Lookup Algorithm • Node Joins and Stabilization • Node Failures

  13. Consistent Hashing • A consistent hash function assigns each peer and key an m-bit identifier (e.g., 140 bits). • SHA-1 serves as the base hash function. • A peer’s identifier is defined by hashing the peer’s IP address. (Other possibilities?) • A content identifier is produced by hashing the key: • ID(peer) = SHA-1(IP, Port) • ID(content) = SHA-1(related to the content object) • Application-dependent!
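
The two ID formulas above can be sketched in Python. This is only an illustration: the "ip:port" string encoding, the function names, and the choice to keep the top m bits of the digest are assumptions, not something the slides specify.

```python
import hashlib

SHA1_BITS = 160  # width of a SHA-1 digest


def _sha1_id(data, m):
    """Hash bytes with SHA-1 and keep the top m bits as an integer ID."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") >> (SHA1_BITS - m)


def peer_id(ip, port, m=SHA1_BITS):
    # Assumed encoding: the slides only say SHA-1(IP, Port).
    return _sha1_id("{}:{}".format(ip, port).encode(), m)


def content_id(name, m=SHA1_BITS):
    # What gets hashed is application-dependent; a name string is assumed here.
    return _sha1_id(name.encode(), m)
```

Because peers and content share one hash function, both land in the same identifier space, which is what lets a key be compared directly with peer identifiers.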

  14. Peer, Content • In an m-bit identifier space, there are 2^m identifiers (for both peer and content). • Which peer handles which content?

  15. Peer, Content • In an m-bit identifier space, there are 2^m identifiers (for both peer and content). • Which peer handles which content? • We will not have 2^m peers/contents! • Each peer might need to handle more than one content object. • In that case, which peer has what?

  16. Consistent Hashing • In an m-bit identifier space, there are 2^m identifiers. • Identifiers are ordered on an identifier circle modulo 2^m. • The identifier ring is called the Chord ring. • Content X is assigned to the first peer whose identifier is equal to or follows (the identifier of) X in the identifier space. • This peer is the successor peer of key X, denoted by successor(X).
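
A minimal sketch of the successor rule, assuming every peer identifier is known in one sorted list (a real DHT has no such global view; this only illustrates the assignment rule). The function name and signature are illustrative.

```python
from bisect import bisect_left


def successor(x, peers, m):
    """Return the first peer ID equal to or following x on the 2^m ring."""
    ring = sorted(peers)
    i = bisect_left(ring, x % (2 ** m))  # first index with ring[i] >= x
    return ring[i % len(ring)]           # wrap around past the largest ID


peers = [0, 1, 3]  # a 3-bit ring with three peers
owners = {key: successor(key, peers, 3) for key in (1, 2, 6)}
```

With peers 0, 1, and 3 on a 3-bit ring, this gives successor(1) = 1, successor(2) = 3, and successor(6) = 0 (key 6 wraps around to peer 0).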

  17. Successor Peers [Figure: identifier circle with positions 0 to 7; nodes at 0, 1, and 3; keys 1, 2, and 6] • successor(1) = 1 • successor(2) = 3 • successor(6) = 0

  18. Join and Departure • When a node N joins the network, certain contents previously assigned to N’s successor now become assigned to N. • When node N leaves the network, all of its assigned contents are reassigned to N’s successor.
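
The effect of a join or departure on key ownership can be sketched by recomputing the assignment before and after. The ring (peers 0, 1, 3 with keys 1, 2, 6), the joining node 6, and the departing node 1 are chosen here for illustration only.

```python
from bisect import bisect_left


def owner(key, peers):
    """Successor rule: first peer ID equal to or following the key."""
    ring = sorted(peers)
    return ring[bisect_left(ring, key) % len(ring)]


def assignment(keys, peers):
    """Map each peer to the sorted list of keys it is responsible for."""
    table = {p: [] for p in peers}
    for k in sorted(keys):
        table[owner(k, peers)].append(k)
    return table


keys = [1, 2, 6]
before = assignment(keys, [0, 1, 3])         # key 6 wraps to peer 0
after_join = assignment(keys, [0, 1, 3, 6])  # peer 6 joins and takes key 6 from peer 0
after_leave = assignment(keys, [0, 3, 6])    # peer 1 leaves; key 1 goes to its successor 3
```

Only the keys between the changed node and its neighbors move; every other assignment stays put, which is the point of consistent hashing.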

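  19. Join [Figure: identifier circle with positions 0 to 7, showing which keys each node holds after a join]

  20. Departure [Figure: identifier circle with positions 0 to 7, showing which keys each node holds after a departure]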

  21. Join/Depart • What information must be maintained?

  22. Join/Depart • What information must be maintained? • Pointer to successor(s) • Content itself (but application dependent)

  23. Tracker gone? • Example content: FreeBSD 5.4 CD images • Whoever owns this hash entry is the tracker for the corresponding key! • Publish the key on the class web site. [Figure labels: Index key, Seed’s IP address, PUT & GET]

  24. How to identify the tracker? • And its IP address, of course?

  25. A Simple Key Lookup • A very small amount of routing information suffices to implement consistent hashing in a distributed environment. • If each node knows only how to contact its current successor node on the identifier circle, all nodes can be visited in linear order. • Queries for a given identifier could be passed around the circle via these successor pointers until they encounter the node that contains the key.

  26. A Simple Key Lookup • Pseudocode for finding the successor:
      // ask node N to find the successor of id
      N.find_successor(id)
        if (id ∈ (N, successor])
          return successor;
        else
          // forward the query around the circle
          return successor.find_successor(id);
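
The successor-only lookup can be sketched as runnable Python. The `Node` class, the `build_ring` helper, and the optional `path` argument are illustrative additions. The ten-node ring is an assumption: only N8, N14, N21, N32, and N38 are named on the slides, and the remaining IDs are filled in for the example.

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], allowing wrap-around."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero (a == b covers the whole ring)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self  # a lone node is its own successor

    def find_successor(self, ident, path=None):
        if path is not None:
            path.append(self.id)  # record each node the query visits
        if in_half_open(ident, self.id, self.successor.id):
            return self.successor
        # forward the query around the circle
        return self.successor.find_successor(ident, path)


def build_ring(ids):
    """Link nodes into a ring in increasing ID order (test helper)."""
    nodes = [Node(i) for i in sorted(ids)]
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        a.successor = b
    return {n.id: n for n in nodes}


ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
path = []
found = ring[8].find_successor(54, path)
```

On this assumed ring, the query from node 8 for key 54 visits 8, 14, 21, 32, 38, 42, 48, and 51 before node 51 answers with its successor, 56: one hop per node, which is where the linear cost comes from.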

  27. A Simple Key Lookup • The path taken by a query from node 8 for key 54: [figure not included in the transcript]

  28. Successor • Each active node MUST know the IP address of its successor! • N8 has to know that the next node on the ring is N14. • Departure (of N14): N8 => N21 • But what about a failure or crash?

  29. Robustness • Keep the successors within R hops • N8 => N14, N21, N32, N38 (R=4) • Periodically ping along the path to check liveness, and to find out whether there are “new members” in between
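
A minimal sketch of how a successor list gives robustness: when the immediate successor stops answering, the node falls back to the next live entry. The `is_alive` callback is a hypothetical stand-in for a real ping, and the crashed set is made up for the example.

```python
def first_live_successor(successor_list, is_alive):
    """Return the first entry in the successor list that answers a ping."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None  # all R successors failed; the ring is broken at this node


successors_of_n8 = [14, 21, 32, 38]  # R = 4, as on the slide
crashed = {14, 21}                   # assumed failures, for illustration
next_hop = first_live_successor(successors_of_n8, lambda n: n not in crashed)
```

Here N8 skips the crashed N14 and N21 and continues with N32. The ring only breaks at N8 if all R successors fail at once.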

  30. Is that good enough?

  31. Without Periodic Ping…?? • Triggered only by dynamics (Join/Depart)!

  32. Complexity of the search • Time/messages: O(N) • N: # of nodes on the ring • Space: O(1) • We only need to remember R IP addresses • Stabilization depends on the “period”.

  33. Scalable Key Location • To accelerate lookups, Chord maintains additional routing information. • This additional information is not essential for correctness, which is achieved as long as each node knows its correct successor.

  34. Finger Tables • Each node N maintains a routing table with up to m entries (m is the number of bits in identifiers), called the finger table. • The i-th entry in the table at node N contains the identity of the first node s that succeeds N by at least 2^{i-1} on the identifier circle. • s = successor(N + 2^{i-1}). • s is called the i-th finger of node N, denoted by N.finger(i)
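
The finger-table definition can be checked with a short sketch. It assumes a global sorted list of peer IDs, which a real node would not have; the function names are illustrative.

```python
from bisect import bisect_left


def successor(x, ring):
    """First peer ID equal to or following x in a sorted ring list."""
    return ring[bisect_left(ring, x) % len(ring)]


def finger_table(n, peers, m):
    """Return [(start, successor(start))] for i = 1..m, start = n + 2^(i-1) mod 2^m."""
    ring = sorted(peers)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((start, successor(start, ring)))
    return table


tables = {n: finger_table(n, [0, 1, 3], 3) for n in (0, 1, 3)}
```

For the 3-bit ring with peers 0, 1, and 3, node 0 gets starts 1, 2, 4 with successors 1, 3, 0; node 1 gets starts 2, 3, 5 with successors 3, 3, 0; and node 3 gets starts 4, 5, 7, all pointing to node 0.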

  35. Finger Tables • s = successor(n + 2^{i-1}) • Identifier circle with positions 0 to 7 and nodes 0, 1, and 3:

      node  keys  i  start        succ.
      0     6     1  0+2^0 = 1    1
                  2  0+2^1 = 2    3
                  3  0+2^2 = 4    0
      1     1     1  1+2^0 = 2    3
                  2  1+2^1 = 3    3
                  3  1+2^2 = 5    0
      3     2     1  3+2^0 = 4    0
                  2  3+2^1 = 5    0
                  3  3+2^2 = 7    0

  36. Finger Tables • A finger table entry includes both the Chord identifier and the IP address (and port number) of the relevant node. • The first finger of N is the immediate successor of N on the circle.

  37. Example query • The path of a query for key 54 starting at node 8: [figure not included in the transcript]

  38. Scalable Key Location • Since each node has finger entries at power-of-two intervals around the identifier circle, each node can forward a query at least halfway along the remaining distance between the node and the target identifier. • From this intuition follows a theorem: • Theorem: With high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N).
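
A sketch of finger-based lookup, to show the "at least halfway" forwarding in action. The ring and m = 6 are assumptions (only N8, N14, N21, N32, and N38 are named on the slides), fingers are recomputed from a global list instead of being stored per node, and all function names are illustrative.

```python
from bisect import bisect_left

M = 6  # assumed identifier width, so the ring has 2^6 = 64 positions
IDS = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # assumed ring


def successor(x):
    return IDS[bisect_left(IDS, x % 2 ** M) % len(IDS)]


def fingers(n):
    """Finger i (1-based) is successor(n + 2^(i-1))."""
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


def between(x, a, b, inclusive_right=False):
    """Ring interval test: x in (a, b), or (a, b] if inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero


def find_successor(n, key, path):
    path.append(n)
    succ = fingers(n)[0]  # first finger is the immediate successor
    if between(key, n, succ, inclusive_right=True):
        return succ
    # forward to the farthest finger that still precedes the key
    for f in reversed(fingers(n)):
        if between(f, n, key):
            return find_successor(f, key, path)
    return find_successor(succ, key, path)


path = []
owner = find_successor(8, 54, path)
```

On this assumed ring, the query from node 8 for key 54 visits only nodes 8, 42, and 51 before resolving to node 56, instead of walking through every node in between.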

  39. Complexity of the Search • Time/messages: O(log N) • N: # of nodes on the ring • Space: O(log N) • We need to remember R IP addresses • We need to remember log N fingers • Stabilization depends on the “period”.

  40. An Example • M = 140 (identifier size), so the ring size is 2^140 • N = 2^16 (# of nodes) • How many entries do we need in the finger table? • Each node n maintains a routing table with up to m entries (m is the number of bits in identifiers), called the finger table. • The i-th entry in the table at node n contains the identity of the first node s that succeeds n by at least 2^{i-1} on the identifier circle. • s = successor(n + 2^{i-1}).
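
One way to explore the question is a small experiment: place 2^16 nodes at random on a 2^140 ring and count how many distinct nodes one finger table points to. The random placement, the seed, and the choice of which node to inspect are all assumptions of this sketch.

```python
import random
from bisect import bisect_left

M = 140      # identifier bits, as on the slide
N = 2 ** 16  # number of nodes, as on the slide

rng = random.Random(251)  # fixed seed so the run is repeatable
ids = set()
while len(ids) < N:
    ids.add(rng.getrandbits(M))  # assumed: node IDs are uniformly random
ring = sorted(ids)

n = ring[0]  # inspect one node's finger table
fingers = [
    ring[bisect_left(ring, (n + 2 ** (i - 1)) % 2 ** M) % N]
    for i in range(1, M + 1)
]
distinct = len(set(fingers))
```

The table has M = 140 slots, but most low-order fingers fall in the gap before the node's immediate successor and therefore all point to that same successor. The number of distinct nodes is therefore much smaller than 140, on the order of log2(N) = 16.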

  41. Complexity of the Search • Time/messages: O(M) • M: # of bits of the identifier • Space: O(M) • We need to remember R IP addresses • We need to remember M fingers • Stabilization depends on the “period”.

  42. Kademlia routing • http://en.wikipedia.org/wiki/Kademlia

  43. Finger Tables (same figure as slide 35) • s = successor(n + 2^{i-1}) on the identifier circle with positions 0 to 7 and nodes 0, 1, and 3.

  44. Structural Search • Distributed, P2P • Attributes about the nodes • Nodes are connected via some structure (ring, grid, or hypergraph) • Objective: Where is X? • X could be some content or a node identity

  45. Structured/Clustered • Trade-off between D and C_cluster! Davis Social Links

  46. p, q, r • p: lattice distance between one node and all its local neighbors • q: number of long-range contacts • r: inverse probability [d(u,v)]^{-r} • What is the intuition about r? • What about r = 0?
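
A sketch of how a long-range contact could be drawn with probability proportional to [d(u,v)]^{-r}, to build intuition for r. The n-by-n grid, the Manhattan lattice distance, and the function names are assumptions of this example.

```python
import random


def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def contact_weights(u, n, r):
    """Unnormalized weights d(u, v)^(-r) for every other node v on an n x n grid."""
    others = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    return others, [lattice_distance(u, v) ** (-r) for v in others]


def long_range_contact(u, n, r, rng):
    """Draw one long-range contact for u with probability proportional to d^(-r)."""
    others, weights = contact_weights(u, n, r)
    return rng.choices(others, weights=weights, k=1)[0]


contact = long_range_contact((2, 2), 5, 2, random.Random(0))
```

With r = 0 every weight equals 1, so long-range contacts are chosen uniformly and carry no information about distance. As r grows, the weights concentrate on nearby nodes.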

  47. Kleinberg’s Basic setting

  48. Kleinberg’s results • A decentralized routing/search problem • For nodes s, t with known lattice coordinates, find a short path from s to t. • At any step, a node can only use local information. • Kleinberg suggests a simple greedy algorithm and analyzes it:
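
A sketch of greedy routing under assumed settings: an n-by-n grid, p = 1 (four lattice neighbors), q = 1 long-range contact per node drawn with weight d^{-r}, and forwarding to whichever known contact is closest to the target. Function names and parameters are illustrative.

```python
import random


def dist(u, v):
    """Manhattan (lattice) distance."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_contacts(n, r, rng):
    """Give each node its lattice neighbors plus one long-range contact."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    contacts = {}
    for u in nodes:
        x, y = u
        local = [(x + dx, y + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < n and 0 <= y + dy < n]
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** (-r) for v in others]
        contacts[u] = local + rng.choices(others, weights=weights, k=1)
    return contacts


def greedy_route(s, t, contacts):
    """Forward to the contact closest to t, using only the current node's contacts."""
    path = [s]
    while path[-1] != t:
        path.append(min(contacts[path[-1]], key=lambda v: dist(v, t)))
    return path


contacts = build_contacts(10, 2, random.Random(0))
route = greedy_route((0, 0), (9, 9), contacts)
```

Some lattice neighbor is always one step closer to the target, so each hop strictly reduces the distance and the route never needs more hops than the lattice distance. Long-range contacts can only shorten it.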

  49. Local Information • Local contacts • Coordinates of the target • The locations and long-range contacts of all nodes that have come in contact with the message.

  50. Results • If r = 0, the expected delivery time is at least a_0 n^{2/3} (lower bound). • If r = 2, p = q = 1: a_2 (log n)^2 • Martel/Nguyen’s newer results • 0 <= r < 2: ~ a_r n^{(2-r)/3} • r > 2: ~ a_r n^{(r-2)/(r-1)}
