Explore the fundamentals of Distributed Hash Tables (DHT) in Peer-to-Peer (P2P) systems, including DHT concepts, structured vs. unstructured networks, Chord protocol, and advantages of using DHT for resource location.
Peer-to-Peer (P2P) Systems: DHT, Chord, Pastry, …
Unstructured vs Structured • Unstructured P2P networks allow resources to be placed at any node. The network topology is arbitrary, and growth is spontaneous. • Structured P2P networks simplify resource location and load balancing by defining a topology and rules for resource placement. • They guarantee efficient search even for rare objects. What are the rules??? The Distributed Hash Table (DHT)
Hash Tables • Store arbitrary keys and satellite data (values) • put(key, value) • value = get(key) • Lookup must be fast • Compute a hash function h() on the key that returns a storage cell • Chained hash table: store the key (and optional value) there
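To make the put()/get() interface and chaining concrete, here is a minimal single-node sketch in Python (an illustration only; the class name and cell count are our own choices, not from the slides):

    # Minimal chained hash table: each cell (bucket) holds a list of (key, value) pairs.
    class ChainedHashTable:
        def __init__(self, num_cells=16):
            self.cells = [[] for _ in range(num_cells)]

        def _cell(self, key):
            return hash(key) % len(self.cells)   # h(key) -> storage cell

        def put(self, key, value):
            cell = self.cells[self._cell(key)]
            for i, (k, _) in enumerate(cell):
                if k == key:                     # key already present: overwrite
                    cell[i] = (key, value)
                    return
            cell.append((key, value))            # chain a new entry onto the cell

        def get(self, key):
            for k, v in self.cells[self._cell(key)]:
                if k == key:
                    return v
            raise KeyError(key)

    table = ChainedHashTable()
    table.put("song.mp3", "stored at node 42")
    print(table.get("song.mp3"))                 # -> stored at node 42

Lookups stay fast because h() narrows the search to a single cell; chaining resolves collisions within it.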
What Is a DHT? • Single-node hash table: key = Hash(name); put(key, value); get(key) -> value • How do I do this across millions of hosts on the Internet? • Answer: a Distributed Hash Table
[Diagram: a distributed application spread over many nodes calls put(key, data) / get(key) on the distributed hash table (DHash); the DHT calls lookup(key) on the lookup service (Chord), which returns the IP address of the responsible node.] • Application may be distributed over many nodes • DHT distributes data storage over many nodes
Why the put()/get() interface? • API supports a wide range of applications • DHT imposes no structure/meaning on keys
Distributed Hash Table • Hash table functionality in a P2P network: lookup of data indexed by keys • Key hash → node mapping • Assign a unique live node to each key • Find this node in the overlay network quickly and cheaply • Maintenance, optimization • Load balancing: maybe even change the key → node mapping on the fly • Replicate entries on more nodes to increase robustness
The lookup problem [Diagram: a publisher at N4 holds key="title" with value = MP3 data…; a client elsewhere on the Internet issues Lookup("title") — which of the nodes N1…N6 holds the key?]
Centralized lookup (Napster) [Diagram: the publisher registers its content with a central DB via SetLoc("title", N4); the client asks the DB Lookup("title") and is directed to N4, which holds key="title", value = MP3 data….] Simple, but O(N) state and a single point of failure
Flooded queries (Gnutella) [Diagram: the client floods Lookup("title") to its neighbors, which forward it until it reaches the publisher's node holding key="title", value = MP3 data….] Robust, but a large number of messages per lookup
Routed queries (Chord, Pastry, etc.) [Diagram: the client's Lookup("title") is forwarded hop by hop through the overlay toward the node responsible for key="title".]
Routing challenges • Define a useful key nearness metric • Keep the hop count small • Keep the tables small • Stay robust despite rapid change • Freenet: emphasizes anonymity • Chord: emphasizes efficiency and simplicity
What is Chord? What does it do? • In short: a peer-to-peer lookup service • Solves the problem of locating a data item in a collection of distributed nodes, even under frequent node arrivals and departures • The core operation in most P2P systems is efficient location of data items • Supports just one operation: given a key, it maps the key onto a node
Chord Characteristics • Simplicity, provable correctness, and provable performance • Each Chord node needs routing information about only a few other nodes • Resolves lookups via messages to other nodes (iteratively or recursively) • Maintains routing information as nodes join and leave the system
Mapping onto Nodes vs. Values • Traditional name and location services provide a direct mapping between keys and values • What are examples of values? A value can be an address, a document, or an arbitrary data item • Chord can easily implement a mapping onto values by storing each key/value pair at the node to which that key maps
Napster, Gnutella etc. vs. Chord • Compared to Napster and its centralized servers, Chord avoids single points of control or failure through a fully decentralized design • Compared to Gnutella and its widespread use of broadcasts, Chord remains scalable because each node keeps only a small amount of routing information
Addressed Difficult Problems (1) • Load balance: a distributed hash function spreads keys evenly over the nodes • Decentralization: Chord is fully distributed; no node is more important than any other, which improves robustness • Scalability: lookup costs grow logarithmically with the number of nodes, so even very large systems are feasible
Addressed Difficult Problems (2) • Availability: Chord automatically adjusts its internal tables to ensure that the node responsible for a key can always be found • Flexible naming: no constraints on the structure of keys – the key space is flat, giving flexibility in how names are mapped to Chord keys
Chord properties • Efficient: O(log(N)) messages per lookup • N is the total number of servers • Scalable: O(log(N)) state per node • Robust: survives massive failures
The Base Chord Protocol (1) • Specifies how to find the locations of keys • How new nodes join the system • How to recover from the failure or planned departure of existing nodes
Consistent Hashing • A hash function assigns each node and key an m-bit identifier using a base hash function such as SHA-1 • ID(node) = hash(IP, Port) • ID(key) = hash(key) • Properties of consistent hashing: • The function balances load: all nodes receive roughly the same number of keys • When the N-th node joins (or leaves) the network, only an O(1/N) fraction of the keys is moved to a different location
Chord IDs • Key identifier = SHA-1(key) • Node identifier = SHA-1(IP, Port) • Both are uniformly distributed • Both exist in the same ID space • How to map key IDs to node IDs?
Chord IDs • Consistent hashing (SHA-1) assigns each node and object an m-bit ID • IDs are ordered on an ID circle ranging from 0 to 2^m − 1 • New nodes assume slots in the ID circle according to their ID • Key k is assigned to the first node whose ID ≥ k: that node is successor(k)
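A minimal sketch of this mapping in Python, assuming SHA-1 identifiers truncated to m bits and a globally known, sorted list of node IDs (a simplification for illustration; real Chord nodes know only a few peers, and the addresses below are made up):

    import hashlib
    from bisect import bisect_left

    m = 7                                        # identifier space: 0 .. 2^m - 1

    def chord_id(data: bytes) -> int:
        # SHA-1, truncated to an m-bit identifier
        return int.from_bytes(hashlib.sha1(data).digest(), "big") % (2 ** m)

    def successor(key_id, node_ids):
        # first node whose ID >= key_id, wrapping around the circle
        ids = sorted(node_ids)
        i = bisect_left(ids, key_id)
        return ids[i] if i < len(ids) else ids[0]

    nodes = [chord_id(f"10.0.0.{i}:4000".encode()) for i in range(1, 6)]
    k = chord_id(b"title")
    print(f"key {k} is stored at node {successor(k, nodes)}")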
Consistent hashing [Diagram: circular 7-bit ID space with nodes N32, N90, N105; keys K5 and K20 are stored at N32, key K80 at N90.] A key is stored at its successor: the node with the next-higher ID
Successor Nodes [Diagram: identifier circle with m = 3 (IDs 0–7) and nodes 0, 1, 3; keys 1, 2, 6 map to successor(1) = 1, successor(2) = 3, and, wrapping around the circle, successor(6) = 0.]
Node Joins and Departures [Diagram: on the same 3-bit ring, node 7 joins, so key 6 moves to it: successor(6) = 7; node 1 departs, so its key moves to node 3: successor(1) = 3.]
Consistent Hashing – Join and Departure • When a node n joins the network, certain keys previously assigned to n’s successor now become assigned to n. • When node n leaves the network, all of its assigned keys are reassigned to n’s successor.
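The same helpers can demonstrate this locally: when a node joins, the only keys that move are those it takes over from its successor (a sketch reusing chord_id and successor from the earlier snippet; the new node's address is again made up):

    keys = [chord_id(str(i).encode()) for i in range(50)]
    before = {key: successor(key, nodes) for key in keys}

    new_node = chord_id(b"10.0.0.99:4000")
    after = {key: successor(key, nodes + [new_node]) for key in keys}

    moved = [key for key in keys if before[key] != after[key]]
    # every moved key is now owned by the new node, taken from its old successor
    assert all(after[key] == new_node for key in moved)
    print(f"{len(moved)} of {len(keys)} keys moved")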
Consistent Hashing – Node Join [Diagram: a new node joins the 3-bit ring and takes over, from its successor, the keys that now fall in its range; the keys held by the other nodes are unaffected.]
Consistent Hashing – Node Departure [Diagram: a node leaves the 3-bit ring and all of its keys are handed to its successor; no other node is affected.]
Scalable Key Location • A very small amount of routing information suffices to implement consistent hashing in a distributed environment • Each node need only be aware of its successor node on the circle • Queries for a given identifier can be passed around the circle via these successor pointers
A Simple Key Lookup • Pseudocode for finding the successor of an id:

    // ask node n to find the successor of id
    n.find_successor(id)
      if (id ∈ (n, successor])
        return successor;
      else
        // forward the query around the circle
        return successor.find_successor(id);
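The same walk can be simulated locally over a table of successor pointers (an illustrative sketch, not the real remote-procedure-call version; it reuses nodes and k from the earlier snippet):

    def in_half_open(x, a, b):
        # x in (a, b] on the identifier circle, handling wrap-around
        return (a < x <= b) if a < b else (x > a or x <= b)

    def find_successor_linear(n, key_id, succ):
        # follow successor pointers until key_id falls in (n, successor(n)]
        hops = 0
        while not in_half_open(key_id, n, succ[n]):
            n = succ[n]
            hops += 1
        return succ[n], hops

    ids = sorted(nodes)
    succ = {a: b for a, b in zip(ids, ids[1:] + ids[:1])}   # ring of successor pointers
    owner, hops = find_successor_linear(ids[0], k, succ)
    print(f"key {k} -> node {owner} after {hops} hops")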
Scalable Key Location • Resolution scheme correct, BUT inefficient: • it may require traversing all N nodes!
Acceleration of Lookups • Lookups are accelerated by maintaining additional routing information • Each node maintains a routing table with (at most) m entries (for m-bit identifiers, i.e., an ID space of size 2^m) called the finger table • The i-th entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i−1) on the identifier circle • s = successor(n + 2^(i−1)) (all arithmetic mod 2^m) • s is called the i-th finger of node n, denoted by n.finger(i).node
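Under these definitions a finger table can be computed directly (a sketch reusing m, nodes, and successor from the earlier snippets):

    def finger_table(n, node_ids):
        # entry i (1-based) points to successor((n + 2^(i-1)) mod 2^m)
        return [successor((n + 2 ** (i - 1)) % (2 ** m), node_ids)
                for i in range(1, m + 1)]

    for n in sorted(nodes):
        print(f"node {n}: fingers {finger_table(n, nodes)}")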
Scalable Key Location – Finger Tables [Diagram: 3-bit ring with nodes 0, 1, and 3, showing each node's finger table (start = (n + 2^(i−1)) mod 8; int. = [start_i, start_{i+1}); succ. = successor(start)):
Node 0 (stores key 6): start 1, 2, 4; int. [1,2), [2,4), [4,0); succ. 1, 3, 0
Node 1 (stores key 1): start 2, 3, 5; int. [2,3), [3,5), [5,1); succ. 3, 3, 0
Node 3 (stores key 2): start 4, 5, 7; int. [4,5), [5,7), [7,3); succ. 0, 0, 0]
Finger Tables – characteristics • Each node stores information about only a small number of other nodes, and knows more about nodes closely following it than about nodes farther away • A node’s finger table generally does not contain enough information to determine the successor of an arbitrary key k • Repeated queries to nodes that immediately precede the given key will eventually lead to the key’s successor
Node Joins – with Finger Tables [Diagram: node 6 joins the 3-bit ring. It builds its own finger table (start 7, 0, 2; int. [7,0), [0,2), [2,6); succ. 0, 0, 3) and takes over key 6 from node 0. Existing tables are updated: node 0's finger for [4,0) now points to 6, node 1's finger for [5,1) points to 6, and node 3's fingers for [4,5) and [5,7) point to 6.]
Node Departures – with Finger Tables [Diagram: node 1 departs the ring; its key 1 is reassigned to its successor, node 3, and the remaining nodes repair their tables: node 0's finger for [1,2), which pointed to 1, now points to 3.]
Simple Key Location Scheme [Diagram: ring with nodes N1, N8, N14, N21, N32, N38, N42, N48; a query for key K45 starting at N8 follows successor pointers node by node until it reaches N48, the successor of K45.]
Scalable Lookup Scheme [Diagram: finger table for N8 on a ring with nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56: fingers 1, 2, 3 point to N14, finger 4 to N21, finger 5 to N32, finger 6 to N42.] finger[k] = first node that succeeds (n + 2^(k−1)) mod 2^m
Chord key location • Look up in the finger table the farthest node that precedes the key • → O(log N) hops
Scalable Lookup Scheme

    // ask node n to find the successor of id
    n.find_successor(id)
      n0 = find_predecessor(id);
      return n0.successor;

    // ask node n to find the predecessor of id
    n.find_predecessor(id)
      n0 = n;
      while (id ∉ (n0, n0.successor])
        n0 = n0.closest_preceding_finger(id);
      return n0;

    // return closest finger preceding id
    n.closest_preceding_finger(id)
      for i = m downto 1
        if (finger[i].node ∈ (n, id))
          return finger[i].node;
      return n;
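Rendered as a local Python simulation using the finger tables built earlier (again a sketch: every call here is local, whereas in Chord closest_preceding_finger runs on a remote node; it reuses in_half_open, finger_table, nodes, ids, succ, k, and m from the snippets above):

    def in_open(x, a, b):
        # x in (a, b) on the identifier circle, handling wrap-around
        return (a < x < b) if a < b else (x > a or x < b)

    def closest_preceding_finger(n, key_id, fingers):
        for f in reversed(fingers[n]):       # scan fingers from farthest to nearest
            if in_open(f, n, key_id):
                return f
        return n

    def find_predecessor(n, key_id, fingers, succ):
        while not in_half_open(key_id, n, succ[n]):
            n = closest_preceding_finger(n, key_id, fingers)
        return n

    fingers = {n: finger_table(n, nodes) for n in nodes}
    pred = find_predecessor(ids[0], k, fingers, succ)
    print(f"successor of key {k} is node {succ[pred]}")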
Lookup Using Finger Table [Diagram: lookup(54) issued at N8 jumps along fingers N8 → N42 → N51 and arrives at N56, the successor of key 54.]
Scalable Lookup Scheme • Each node forwards the query at least halfway along the remaining distance to the target • Theorem: with high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(log N) • For example, in a network of about a million nodes a lookup takes on the order of log2(10^6) ≈ 20 hops
“Finger table” allows log(N)-time lookups [Diagram: from node N80, successive fingers reach roughly ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the remaining ring, so each hop at least halves the distance to the target.]
Dynamic Operations and Failures Need to deal with: • Node Joins and Stabilization • Impact of Node Joins on Lookups • Failure and Replication • Voluntary Node Departures
Node Joins and Stabilization • Each node’s successor pointer must be kept up to date for lookups to execute correctly • To this end, each node periodically runs a “stabilization” protocol • It updates finger tables and successor pointers
Node Joins and Stabilization • The protocol comprises 6 functions: • create() • join() • stabilize() • notify() • fix_fingers() • check_predecessor()
create() • Creates a new Chord ring

    n.create()
      predecessor = nil;
      successor = n;
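For context, here is a hedged single-process sketch of create() together with the join/stabilize/notify cycle it anchors (objects stand in for remote nodes; based on the pseudocode in the Chord paper, and reusing in_open and in_half_open from the earlier snippets):

    class Node:
        def __init__(self, ident):
            self.id = ident
            self.predecessor = None
            self.successor = None

        def create(self):
            # start a new ring: n is its own successor
            self.predecessor = None
            self.successor = self

        def find_successor(self, key_id):
            # the simple O(N) walk along successor pointers
            n = self
            while not in_half_open(key_id, n.id, n.successor.id):
                n = n.successor
            return n.successor

        def join(self, n0):
            # join the ring known through node n0
            self.predecessor = None
            self.successor = n0.find_successor(self.id)

        def stabilize(self):
            # periodically verify our successor and tell it about us
            x = self.successor.predecessor
            if x is not None and in_open(x.id, self.id, self.successor.id):
                self.successor = x
            self.successor.notify(self)

        def notify(self, n0):
            # n0 thinks it might be our predecessor
            if self.predecessor is None or in_open(n0.id, self.predecessor.id, self.id):
                self.predecessor = n0

    a = Node(32); a.create()
    b = Node(90); b.join(a)
    for _ in range(3):                       # a few stabilization rounds
        a.stabilize(); b.stabilize()
    assert a.successor is b and b.successor is a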