Ecs251 Winter 2013 : Operating System #6: Distributed Hash Table

Ecs251 Winter 2013: Operating System #6: Distributed Hash Table • Dr. S. Felix Wu • Computer Science Department, University of California, Davis • http://www.facebook.com/group.php?gid=29670204725 • http://cyrus.cs.ucdavis.edu/~wu/ecs251 DHT

Structured Peering • Peer identity and routability • Key/content assignment • Which identity owns what? • GFS/Napster: centralized index service • Skype/Kazaa: login server & super peers • DNS: hierarchical DNS servers • Two problems: (1) How to connect to the “topology”? (2) How to prevent failures/changes?

DHT • Most structured P2P (s-P2P) systems are DHT-based. • A distributed hash table (DHT) is a decentralized lookup service with a hash-table interface: • (name, value) pairs are stored in the DHT • any peer can efficiently retrieve the value associated with a given name • the mapping from names to values is distributed among the peers

HT as a search table (BitTorrent, Napster) • Information/content is distributed, and we need to know where it is: • Where is this GFS chunk? • Where is this piece of music? • Is this BT piece available? • What is the location of this type of content? • What is the current IP address of this Skype user? [Figure labels: “160 bits”, Index key, Content, Object/Peer naming]

DHT as a search table [Figure labels: Index key, ???]

DHT segment ownership [Figure labels: Index key, ???]

DHT • Scalable • Peer arrivals, departures, and failures • Unstructured versus structured

DHT (Name, Value) • How can a DHT be used to avoid trackers in BitTorrent?

DHT-based Tracker • Example content: FreeBSD 5.4 CD images • Index key → seed’s IP address • Whoever owns this hash entry is the tracker for the corresponding key! • Publish the key on the class web site. • PUT & GET
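The PUT & GET interface on this slide can be sketched as follows. This is a minimal single-process sketch, not BitTorrent's actual DHT protocol: the class `ToyDHT`, its method names, and the sample addresses are hypothetical, and one dictionary stands in for the peers that would own each hash entry.

```python
import hashlib


class ToyDHT:
    """Stand-in for a DHT: one dict plays the role of all peers (assumption)."""

    def __init__(self):
        self._table = {}

    @staticmethod
    def index_key(name: str) -> str:
        # 160-bit index key: SHA-1 of a content name, written as 40 hex digits.
        return hashlib.sha1(name.encode("utf-8")).hexdigest()

    def put(self, key: str, seed_addr: str) -> None:
        # The owner of this hash entry acts as tracker: it records seed addresses.
        self._table.setdefault(key, []).append(seed_addr)

    def get(self, key: str) -> list:
        # Any peer that knows the key can ask for the recorded seed addresses.
        return list(self._table.get(key, []))


dht = ToyDHT()
key = ToyDHT.index_key("FreeBSD 5.4 CD images")  # the key that gets published
dht.put(key, "192.0.2.10:6881")  # hypothetical seed addresses
dht.put(key, "192.0.2.11:6881")
print(dht.get(key))
```

In a real DHT the dictionary would be partitioned across peers, which is what the Chord slides that follow address.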

Chord • Given a key (content object), Chord maps the key onto a peer using consistent hashing. • Assigns keys to peers. • Solves the problem of locating a key in a collection of distributed peers. • Maintains routing information as peers join and leave the system.

Chord • Consistent Hashing • A Simple Key Lookup Algorithm • Scalable Key Lookup Algorithm • Node Joins and Stabilization • Node Failures

Consistent Hashing • A consistent hash function assigns each peer and key an m-bit identifier (e.g., 140 bits). • SHA-1 serves as the base hash function. • A peer’s identifier is defined by hashing the peer’s IP address. (Other possibilities?) • A content identifier is produced by hashing the key: • ID(peer) = SHA-1(IP, Port) • ID(content) = SHA-1(related to the content object) • Application-dependent!
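The two ID formulas above could be computed along these lines. This is a sketch under stated assumptions: the function names, the "ip:port" string encoding, and the reduction modulo 2^m to obtain an m-bit identifier are illustrative choices, not something the slides specify.

```python
import hashlib


def chord_id(data: bytes, m: int = 160) -> int:
    """Map bytes to an m-bit identifier via SHA-1 (reduction mod 2**m is an assumption)."""
    digest = hashlib.sha1(data).digest()  # 20 bytes = 160 bits
    return int.from_bytes(digest, "big") % (2 ** m)


def peer_id(ip: str, port: int, m: int = 160) -> int:
    # ID(peer) = SHA-1(IP, Port); the "ip:port" encoding is a hypothetical choice.
    return chord_id(f"{ip}:{port}".encode("utf-8"), m)


def content_id(key: str, m: int = 160) -> int:
    # ID(content) = SHA-1(something related to the content object).
    return chord_id(key.encode("utf-8"), m)


print(peer_id("192.0.2.1", 6881, m=8))
print(content_id("FreeBSD 5.4 CD images", m=8))
```

Because peers and content share one identifier space, the same function serves both, which is what makes the assignment of content to peers on the following slides possible.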

Peer, Content • In an m-bit identifier space, there are 2^m identifiers (for both peer and content). • Which peer handles which content?

Peer, Content • In an m-bit identifier space, there are 2^m identifiers (for both peer and content). • Which peer handles which content? • We will not have 2^m peers/contents! • Each peer might need to handle more than one content item. • In that case, which peer has what?

Consistent Hashing • In an m-bit identifier space, there are 2^m identifiers. • They form an identifier circle modulo 2^m. • The identifier ring is called the Chord ring. • Content X is assigned to the first peer whose identifier is equal to or follows (the identifier of) X in the identifier space. • This peer is the successor peer of key X, denoted by successor(X).
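The successor rule above can be sketched with a sorted list and a binary search. This assumes a global view of all peer identifiers, which a real Chord node does not have; the function name and signature are illustrative. The sample peers and keys are the ones from the successor example in these slides.

```python
import bisect


def successor(peers, x, m):
    """Return the first peer identifier equal to or following x on the ring mod 2**m."""
    ring = sorted(peers)
    i = bisect.bisect_left(ring, x % (2 ** m))
    return ring[i % len(ring)]  # past the largest identifier, wrap to the smallest


peers = [0, 1, 3]
for key in (1, 2, 6):
    print(key, "->", successor(peers, key, m=3))
# 1 -> 1
# 2 -> 3
# 6 -> 0
```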

Successor Peers [Figure: identifier circle with identifiers 0-7, marking nodes and keys; successor(1) = 1, successor(2) = 3, successor(6) = 0.]

Join and Departure • When a node N joins the network, certain contents previously assigned to N’s successor now become assigned to N. • When node N leaves the network, all of its assigned contents are reassigned to N’s successor.
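The effect of a join or departure on key ownership can be illustrated as below. This is a sketch with assumptions: it uses a 3-bit ring with peers 0, 1, 3 and keys 1, 2, 6, and the joining node 6 and departing node 1 are hypothetical scenarios. For simplicity it recomputes the whole assignment instead of transferring keys between neighbours.

```python
import bisect


def successor(peers, x):
    ring = sorted(peers)
    return ring[bisect.bisect_left(ring, x) % len(ring)]


def assign(peers, keys):
    """Map each peer to the sorted list of keys it is responsible for."""
    owned = {p: [] for p in peers}
    for k in sorted(keys):
        owned[successor(peers, k)].append(k)
    return owned


keys = [1, 2, 6]
before = assign([0, 1, 3], keys)
after_join = assign([0, 1, 3, 6], keys)  # node 6 joins: key 6 moves from node 0 to node 6
after_leave = assign([0, 3], keys)       # node 1 leaves: key 1 moves to its successor, node 3
print(before, after_join, after_leave, sep="\n")
```

Only the keys between the affected node and its neighbours change owner; every other assignment stays put, which is the point of consistent hashing.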

Join [Figure: identifier circle with identifiers 0-7, showing the keys held by each node after a node joins.]

Departure [Figure: identifier circle with identifiers 0-7, showing the keys held by each node after a node departs.]

Join/Depart • What information must be maintained?

Join/Depart • What information must be maintained? • Pointer to successor(s) • Content itself (but application dependent)

Tracker gone? • Example content: FreeBSD 5.4 CD images • Index key → seed’s IP address • Whoever owns this hash entry is the tracker for the corresponding key! • Publish the key on the class web site. • PUT & GET

How to identify the tracker? • And its IP address, of course?

A Simple Key Lookup • A very small amount of routing information suffices to implement consistent hashing in a distributed environment. • If each node knows only how to contact its current successor node on the identifier circle, all nodes can be visited in linear order. • Queries for a given identifier can be passed around the circle via these successor pointers until they encounter the node that contains the key.

A Simple Key Lookup • Pseudocode for finding the successor:
    // ask node N to find the successor of id
    N.find_successor(id)
        if (id ∈ (N, successor])
            return successor;
        else
            // forward the query around the circle
            return successor.find_successor(id);
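The pseudocode above can be turned into a small runnable sketch. Assumptions: the `Node` class and helper names are hypothetical, the loop is an iterative version of the recursive pseudocode so that the visited nodes can be recorded, and the ten-node ring is an assumed example chosen to include node 8 and a successor for key 54.

```python
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self  # a lone node is its own successor


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past zero (a == b covers the whole ring)


def find_successor(start, ident):
    """Follow successor pointers; return (owner node, ids of the nodes visited)."""
    node, path = start, [start.id]
    while not in_half_open(ident, node.id, node.successor.id):
        node = node.successor  # forward the query around the circle
        path.append(node.id)
    return node.successor, path


def build_ring(ids):
    nodes = [Node(i) for i in sorted(ids)]
    for i, n in enumerate(nodes):
        n.successor = nodes[(i + 1) % len(nodes)]
    return {n.id: n for n in nodes}


ring = build_ring([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # assumed node set
owner, path = find_successor(ring[8], 54)
print(owner.id, path)  # 56 [8, 14, 21, 32, 38, 42, 48, 51]
```

The query touches eight nodes before reaching the owner, which illustrates why this scheme costs a number of messages linear in the ring size.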

A Simple Key Lookup • The path taken by a query from node 8 for key 54:

Successor • Each active node MUST know the IP address of its successor! • N8 has to know that the next node on the ring is N14. • On a departure (N14 leaves): N8 => N21 • But what about a failure or crash?

Robustness • Keep the successors within R hops • N8 => N14, N21, N32, N38 (R = 4) • Ping periodically along the path to check liveness and to discover any “new members” in between
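One way to use such a list of R successors is sketched below. The function name and the `is_alive` callback are hypothetical; in a deployment the callback would be a network ping with a timeout, and here a set of crashed nodes simulates the outcome.

```python
def first_live_successor(successor_list, is_alive):
    """Return the nearest successor that answers a ping, or None if all R have failed."""
    for node_id in successor_list:
        if is_alive(node_id):
            return node_id
    return None


successors_of_n8 = [14, 21, 32, 38]  # R = 4, as on the slide
crashed = {14}                       # hypothetical failure of N14
print(first_live_successor(successors_of_n8, lambda n: n not in crashed))  # 21
```

The ring only breaks at N8 if all R successors fail within one ping period, so a larger R trades memory and ping traffic for robustness.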

Is that good enough?

Without Periodic Ping…?? Triggered only by dynamics (Join/Depart)!

Complexity of the search • Time/messages: O(N) • N: # of nodes on the ring • Space: O(1) • We only need to remember R IP addresses • Stabilization depends on the “period”.

Scalable Key Location • To accelerate lookups, Chord maintains additional routing information. • This additional information is not essential for correctness, which is achieved as long as each node knows its correct successor.

Finger Tables • Each node N maintains a routing table with up to m entries (m is the number of bits in identifiers), called the finger table. • The i-th entry in the table at node N contains the identity of the first node s that succeeds N by at least 2^{i-1} on the identifier circle. • s = successor(N + 2^{i-1}). • s is called the i-th finger of node N, denoted by N.finger(i).

Finger Tables • s = successor(n + 2^{i-1}) on the identifier circle 0-7 (m = 3), for nodes 0, 1, and 3:
Node 0 (keys: 6)
  For.    start  succ.
  0+2^0   1      1
  0+2^1   2      3
  0+2^2   4      0
Node 1 (keys: 1)
  For.    start  succ.
  1+2^0   2      3
  1+2^1   3      3
  1+2^2   5      0
Node 3 (keys: 2)
  For.    start  succ.
  3+2^0   4      0
  3+2^1   5      0
  3+2^2   7      0

Finger Tables • A finger table entry includes both the Chord identifier and the IP address (and port number) of the relevant node. • The first finger of N is the immediate successor of N on the circle.
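The finger-table definition can be checked with a short sketch. Assumptions: the code takes a global, sorted view of the ring (a real node learns its fingers through lookups), stores only identifiers rather than IP addresses and ports, and uses the 3-bit ring with nodes 0, 1, and 3 from the slides; the function names are illustrative.

```python
import bisect


def successor(ring, x):
    return ring[bisect.bisect_left(ring, x) % len(ring)]


def finger_table(n, peers, m):
    """Return [(start, succ)] for i = 1..m, where start = (n + 2**(i-1)) mod 2**m."""
    ring = sorted(peers)
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        table.append((start, successor(ring, start)))
    return table


for n in (0, 1, 3):
    print(n, finger_table(n, [0, 1, 3], m=3))
# 0 [(1, 1), (2, 3), (4, 0)]
# 1 [(2, 3), (3, 3), (5, 0)]
# 3 [(4, 0), (5, 0), (7, 0)]
```

In each table the first entry's successor is the node's immediate successor on the circle, as the slide states.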

Example query • The path of a query for key 54 starting at node 8:

Scalable Key Location • Since each node has finger entries at power-of-two intervals around the identifier circle, each node can forward a query at least halfway along the remaining distance between the node and the target identifier. • From this intuition follows a theorem: with high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N).
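The halving intuition can be seen in a small simulation of finger-based lookup. This is a sketch, not a networked implementation: fingers are recomputed from a global list instead of being stored per node, m = 6 and the ten-node ring are assumed values, and the helper names are illustrative.

```python
import bisect

M = 6  # identifier bits (assumed for this example)
IDS = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # assumed node set


def successor(x):
    return IDS[bisect.bisect_left(IDS, x % 2 ** M) % len(IDS)]


def fingers(n):
    # i-th finger = successor(n + 2**(i-1)); fingers(n)[0] is n's immediate successor.
    return [successor(n + 2 ** (i - 1)) for i in range(1, M + 1)]


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def closest_preceding_finger(n, ident):
    for f in reversed(fingers(n)):  # try the farthest finger first
        if in_open(f, n, ident):
            return f
    return n


def find_successor(n, ident):
    """Return (owner id, ids of the nodes the query visits)."""
    path = [n]
    while not in_half_open(ident, n, fingers(n)[0]):
        nxt = closest_preceding_finger(n, ident)
        n = nxt if nxt != n else fingers(n)[0]  # always make progress
        path.append(n)
    return fingers(n)[0], path


print(find_successor(8, 54))  # (56, [8, 42, 51])
```

On this ring the query visits three nodes, compared with eight when only successor pointers are followed.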

Complexity of the Search • Time/messages: O(log N) • N: # of nodes on the ring • Space: O(log N) • We need to remember R IP addresses • We need to remember log N fingers • Stabilization depends on the “period”.

An Example • M = 140 (identifier size), so the ring size is 2^{140} • N = 2^{16} (# of nodes) • How many entries do we need in the finger table? • Each node n maintains a routing table with up to m entries (m is the number of bits in identifiers), called the finger table. • The i-th entry in the table at node n contains the identity of the first node s that succeeds n by at least 2^{i-1} on the identifier circle. • s = successor(n + 2^{i-1}).
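The question in this example can be explored numerically. The sketch below is an illustration under assumptions: uniformly random M-bit identifiers stand in for hashed ones, the function name and seed are hypothetical, and only one node's table is inspected. It counts the table entries (always M) and how many distinct nodes they point to, which is typically far fewer because the low-order fingers all land on the immediate successor.

```python
import bisect
import random


def count_fingers(m, n_nodes, seed=0):
    """Return (table entries, distinct nodes among them) for one node of a random ring."""
    rng = random.Random(seed)
    ids = set()
    while len(ids) < n_nodes:
        ids.add(rng.getrandbits(m))  # random m-bit identifiers stand in for hash output
    ring = sorted(ids)
    n = ring[0]
    targets = set()
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % (2 ** m)
        targets.add(ring[bisect.bisect_left(ring, start) % len(ring)])
    return m, len(targets)


entries, distinct = count_fingers(m=140, n_nodes=2 ** 16)
print(entries, distinct)
```

So the table has up to M = 140 slots, while the number of distinct fingers stays on the order of log N, consistent with the O(log N) space bound quoted for finger tables.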

Complexity of the Search • Time/messages: O(M) • M: # of bits of the identifier • Space: O(M) • We need to remember R IP addresses • We need to remember M fingers • Stabilization depends on the “period”.

Kademlia routing: http://en.wikipedia.org/wiki/Kademlia


Structural Search • Distributed, P2P • Attributes about the nodes • Nodes are connected via some structure (ring, grid, or hypergraph) • Objective: Where is X? • X could be some content or a node identity

Structured/Clustered • Trade-off between D and C_cluster! Davis Social Links

p, q, r • p: lattice distance between one node and all its local neighbors • q: number of long-range contacts • r: exponent of the inverse-power probability [d(u,v)]^{-r} with which u chooses v as a long-range contact • What is the intuition about r? • What about r = 0?
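To build intuition about r, the long-range contact distribution can be computed directly on a small grid. This is a sketch with assumptions: an n x n grid without wrap-around, Manhattan distance as the lattice distance d(u, v), and illustrative function names.

```python
def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def long_range_probs(u, n, r):
    """Probability that u picks each other node v on an n x n grid, proportional to d(u, v)**-r."""
    weights = {}
    for x in range(n):
        for y in range(n):
            v = (x, y)
            if v != u:
                weights[v] = lattice_distance(u, v) ** (-r)
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


uniform = long_range_probs((0, 0), n=4, r=0)    # r = 0: every node equally likely
clustered = long_range_probs((0, 0), n=4, r=2)  # r = 2: nearby nodes favoured
print(uniform[(0, 1)], uniform[(3, 3)])
print(clustered[(0, 1)], clustered[(3, 3)])
```

With r = 0 the long-range links ignore geography entirely; as r grows they concentrate on nearby nodes and start to resemble the local links.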

Kleinberg’s Basic setting

Kleinberg’s results • A decentralized routing/search problem • For nodes s, t with known lattice coordinates, find a short path from s to t. • At any step, only local information can be used. • Kleinberg suggests a simple greedy algorithm and analyzes it.
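A greedy decentralized search of this kind can be sketched as follows: at each step the message holder forwards to whichever of its contacts is closest to the target in lattice distance. Assumptions: an n x n grid without wrap-around, p = 1 (four grid neighbours), q long-range contacts drawn with probability proportional to d^{-r}, and hypothetical function names and seed.

```python
import random


def dist(u, v):
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_contacts(n, r, q=1, seed=0):
    """Give every node its grid neighbours (p = 1) plus q long-range contacts."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(n) for y in range(n)]
    contacts = {}
    for u in nodes:
        local = [v for v in nodes if dist(u, v) == 1]
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** (-r) for v in others]
        contacts[u] = local + rng.choices(others, weights=weights, k=q)
    return contacts


def greedy_route(contacts, s, t):
    """Forward to whichever contact of the current node is closest to t."""
    path = [s]
    while path[-1] != t:
        path.append(min(contacts[path[-1]], key=lambda v: dist(v, t)))
    return path


contacts = build_contacts(n=8, r=2)
path = greedy_route(contacts, (0, 0), (7, 7))
print(len(path) - 1, "hops")
```

The search always terminates because some grid neighbour is one step closer to the target; the long-range contacts only ever shorten the path.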

Local Information • Local contacts • Coordinates of the target • The locations and long-range contacts of all nodes that have come in contact with the message.

Results • If r = 0, expected delivery time is at least a_0 n^{2/3}. • Lower bound • If r = 2, p = q = 1, expected delivery time is at most a_2 (log n)^2. • Martel/Nguyen’s newer results • 0 <= r < 2: ~ a_r n^{(2-r)/3} • r > 2: ~ a_r n^{(r-2)/(r-1)}
