

  1. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Dariotaki Roula th.dariotaki@di.uoa.gr

  2. Peer-to-peer system: A distributed system without any centralized control or hierarchical organization, where the software running at each node is equivalent in functionality • The goal: Locate the node that stores a particular data item • General idea of Chord: A distributed lookup protocol that, given a key, maps the key onto a node • Benefits: • Adapts as nodes join or leave the system • Scalable: communication cost scales logarithmically with the number of nodes

  3. Chord features • Load balance: Using a distributed hash function, it spreads keys evenly over the nodes • Decentralization: Fully distributed; no node is more important than any other • Scalability: The cost of a lookup grows as O(logN) • Availability: Adjusts its internal tables to reflect joins and failures of nodes • Flexible naming: No constraints on the structure of the keys it looks up

  4. Examples of Chord applications • Cooperative Mirroring: S/W developers publish demonstrations whose demand varies dramatically => balance load => replication and caching => ensure authenticity (by storing the data under a Chord key derived from a cryptographic hash of the data) • Time-Shared Storage: Store data on others' machines to ensure that it will always be available. The data's name can serve as a key to identify the node responsible for the data at any time • Distributed Index: Support for Gnutella/Napster-style keyword search. Keys can be derived from the desired keywords, and values can be lists of machines offering documents with those keywords.

  5. The base Chord Protocol • Consistent Hashing: Assigns each node and key an m-bit identifier, produced by hashing the node's IP address or the key. The identifier length m must be large enough to make the probability of two nodes or keys hashing to the same identifier negligible • Assignment: Identifiers are ordered on a circle modulo 2^m. Key k is assigned to the first node whose identifier is equal to or follows the identifier of k (the successor node). When node n joins the network, certain keys previously assigned to n's successor become assigned to n; when node n leaves the network, all of its assigned keys are reassigned to n's successor (figure: identifier circle with m=3)
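As a rough illustration of the assignment rule above, here is a minimal Python sketch. The helper names (chord_id, successor_of) and the choice of SHA-1 truncated to m bits are my assumptions for illustration; the slide only requires some base hash function.

```python
# A minimal sketch of Chord-style consistent hashing (not the paper's code).
import hashlib
from bisect import bisect_left

M = 6                      # identifier length in bits (small for illustration)
RING = 2 ** M              # identifier space: 0 .. 2^m - 1

def chord_id(value: str) -> int:
    """Hash a node's IP address or a key to an m-bit identifier."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % RING

def successor_of(key_id: int, node_ids: list[int]) -> int:
    """First node whose identifier is equal to or follows key_id, mod 2^m."""
    nodes = sorted(node_ids)
    i = bisect_left(nodes, key_id)
    return nodes[i % len(nodes)]   # wrap around the circle

# Example: three nodes; each key is stored at its successor node.
nodes = [chord_id(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
print(successor_of(chord_id("my-file.txt"), nodes))
```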

  6. Theorem 1 • For any set of N nodes and K keys, with high probability: • Each node is responsible for at most (1+ε)K/N keys (ε=O(logN)) • When an (N+1)st node joins or leaves the network, responsibility for O(K/N) keys changes hands (and only to or from the joining or leaving node) • Scalable Key Location: To implement consistent hashing, each node needs only be aware of its successor node on the circle. This can be inefficient, e.g. a lookup may have to traverse all N nodes. Instead, each node maintains a routing table with m entries, the finger table (a sketch follows below) • ith finger of n: the ith entry in the finger table of n, i.e. the identity of the first node s that succeeds n by at least 2^(i-1) on the identifier circle
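A minimal sketch of the finger-table definition on this slide, reusing the helpers from the previous sketch (illustrative assumptions, not the paper's code):

```python
# finger[i] = successor of (n + 2^(i-1)) mod 2^m, for i = 1..m.
def finger_table(n: int, node_ids: list[int], m: int = M) -> list[int]:
    """Build node n's m-entry finger table from the known node IDs."""
    return [successor_of((n + 2 ** (i - 1)) % RING, node_ids)
            for i in range(1, m + 1)]
```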

  7. Two important characteristics • Each node stores information about only a small number of nodes, and knows more about nodes closely following it on the circle • A node's finger table generally does not contain enough information to determine the successor of an arbitrary key • What happens if a node does not know the successor of a key? It forwards the query to the node in its finger table whose ID most immediately precedes the key; that node's ID is closer to the key than its own (see the sketch below) • Theorem 2: With high probability, the number of nodes that must be contacted to find a successor in an N-node network is O(logN)
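A sketch of the lookup path, modeled on the paper's find_successor / closest_preceding_finger pseudocode but simulated locally in Python; the Node class and in_interval helper are assumptions made for illustration.

```python
class Node:
    def __init__(self, nid: int):
        self.id = nid
        self.successor: "Node" = self
        self.predecessor: "Node | None" = None
        self.fingers: list["Node"] = []   # filled as in the previous sketch

def in_interval(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b        # interval wraps past zero

def find_successor(n: Node, key_id: int) -> Node:
    # Walk around the ring; each hop at least halves the remaining distance.
    while not in_interval(key_id, n.id, n.successor.id):
        nxt = closest_preceding_finger(n, key_id)
        if nxt is n:              # no closer finger known; fall back
            break
        n = nxt
    return n.successor

def closest_preceding_finger(n: Node, key_id: int) -> Node:
    # Highest finger that falls strictly between n and the key.
    for f in reversed(n.fingers):
        if in_interval(f.id, n.id, key_id) and f.id != key_id:
            return f
    return n
```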

  8. Joining the network • Requires: 1. Each node's successor is correctly maintained 2. For every key k, node successor(k) is responsible for k (for fast lookups, the finger tables must also be correct) • Theorem 3: With high probability, any node joining or leaving an N-node Chord network will use O(log²N) messages to re-establish the Chord routing invariants and finger tables • For simplicity, each node also maintains a predecessor pointer • Step 1: Initialize the predecessor and fingers of node n (sketched below) - Ask an existing node n' to look them up - Naively this needs O(m·logN) lookups for the m entries - The number of entries that must actually be looked up can be reduced to O(logN) (with an additional check for empty intervals), so the overall time for this step is O(log²N)
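A sketch of join step 1, assuming find_successor from the earlier sketch. A real implementation would issue these lookups as RPCs to the existing node n1; here everything is simulated locally (an illustrative assumption).

```python
def join(n: Node, n1: Node, m: int = M):
    """Initialize node n's fingers, successor, and predecessor via node n1."""
    n.fingers = [find_successor(n1, (n.id + 2 ** (i - 1)) % RING)
                 for i in range(1, m + 1)]
    n.successor = n.fingers[0]                # first finger is the successor
    n.predecessor = n.successor.predecessor   # as in the paper's init step
```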

  9. Step 2: Update the fingers and predecessors of existing nodes to reflect the addition of n • Node n will become the ith finger of a node p iff • p precedes n by at least 2^(i-1), and • the current ith finger of node p succeeds n (see the example and the sketch below) • Finding and updating these nodes takes O(log²N), but can be reduced to O(logN) • Step 3: Notify the higher-layer s/w so that it can transfer the state associated with the keys (e.g. values) that node n is now responsible for • This step depends on the higher-level s/w using Chord • Node n can become the successor only for keys that the immediately following node was previously responsible for
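A sketch of join step 2, modeled on the paper's update_others / update_finger_table pseudocode, simulated locally with the Node class and helpers assumed above:

```python
def find_predecessor(n: Node, key_id: int) -> Node:
    """Node p such that key_id lies in (p, p.successor]."""
    while not in_interval(key_id, n.id, n.successor.id):
        n = closest_preceding_finger(n, key_id)
    return n

def update_others(n: Node, m: int = M):
    """Walk backwards to each node p whose ith finger might now be n."""
    for i in range(1, m + 1):
        # Last node p that could possibly have n as its ith finger.
        p = find_predecessor(n, (n.id - 2 ** (i - 1)) % RING)
        update_finger_table(p, n, i)

def update_finger_table(p: Node, s: Node, i: int):
    """If s lies between p and p's current ith finger, s becomes that finger."""
    if in_interval(s.id, p.id, p.fingers[i - 1].id) and s.id != p.fingers[i - 1].id:
        p.fingers[i - 1] = s
        if p.predecessor is not None:
            update_finger_table(p.predecessor, s, i)   # propagate backwards
```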

  10. Example of a node join (figure: node 6 joins the network)

  11. Concurrent joins • Use of a stabilization protocol to keep successors up to date; every node runs stabilization periodically • What happens if a lookup occurs before stabilization has finished? Three cases: 1. All finger tables are current => lookup needs O(logN) steps 2. Successors correct, fingers inaccurate => correct lookups, but slower 3. Successors inaccurate => query failure; retry after a short pause • Example of stabilization (sketched below): Existing nodes: np and ns. Joining node: n, with ID between np and ns. n acquires ns as successor -> n notifies ns, ns acquires n as predecessor -> when np runs stabilization, it asks ns for its predecessor (=n) -> np acquires n as successor -> np notifies n, n acquires np as predecessor. Correct state
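A sketch of the stabilization round described above, following the paper's stabilize/notify pseudocode but simulated locally (the Node class and in_interval helper are the assumptions from the earlier sketches):

```python
def stabilize(n: Node):
    """n asks its successor for that node's predecessor, and checks
    whether that node should be n's successor instead."""
    x = n.successor.predecessor
    if x is not None and in_interval(x.id, n.id, n.successor.id) \
            and x.id != n.successor.id:
        n.successor = x                 # a newer node slipped in between
    notify(n.successor, n)

def notify(n: Node, candidate: Node):
    """candidate thinks it might be n's predecessor."""
    if n.predecessor is None or in_interval(candidate.id, n.predecessor.id, n.id):
        n.predecessor = candidate
```

Running stabilize periodically at every node reproduces the np/ns example on this slide: first n learns ns, then np learns n, and the ring converges to the correct state.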

  12. Theorems 4-5 • Once a node can successfully resolve a given query, it will always be able to do so in the future • At some time after the last join, all successor pointers will be correct • Joins do not substantially damage the performance of lookups: O(logN) hops to reach the interval close to the target node, then a linear search to find the exact node • Theorem 6: If we take a stable network with N nodes, and another set of up to N nodes joins the network with no finger pointers (but with correct successor pointers), then lookups will still take O(logN) time with high probability

  13. Failures • Each node keeps a successor-list of its r nearest successors • If a node notices that its successor has failed, it replaces it with the first live entry in its successor-list (sketched below) • When stabilization next runs, the finger tables are corrected • Theorems 7-8: If we use a successor-list of length r=O(logN) in a network that is initially stable, and then every node fails with probability ½, then with high probability find_successor returns the closest living successor to the query key, and the expected time to execute find_successor in the failed network is O(logN) • Chord can store replicas of the data associated with a key at the k nodes succeeding the key => it can inform the higher layer when successors come and go, i.e. when the s/w should propagate new replicas
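A sketch of successor-list failover: replace a dead successor with the first live entry. The is_alive() function stands in for a failure detector (e.g. a missed keep-alive) and is an assumption of this sketch, not part of the protocol as stated.

```python
def check_successor(n: Node, successor_list: list[Node]):
    """On detecting a failed successor, promote the first live entry."""
    if not is_alive(n.successor):
        for candidate in successor_list:
            if is_alive(candidate):
                n.successor = candidate   # first live entry takes over
                break

def is_alive(node: Node) -> bool:
    """Placeholder failure detector; a real system would probe the node."""
    return getattr(node, "alive", True)
```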

  14. Simulation and experimental results • Load balance • Number of nodes: 10^4 • Total number of keys: 10×10^4 to 100×10^4 • (figure: PDF of keys per node for 50×10^4 keys) • There are nodes that hold no keys; the maximum number of keys per node is 9.1× the mean • Notes: • The node IDs do not cover the identifier space uniformly • The number of keys per node increases linearly with the total number of keys

  15. Adding virtual nodes to improve load balance • Number of nodes: 10^4 • Total number of keys: 100×10^4 • The distribution of keys improves as the number of virtual nodes per real node increases (a sketch follows below) • Notes: • The tradeoff is that routing-table usage increases: each actual node needs r times as much space to store the finger tables of its r virtual nodes. In practice this is not a problem • The number of virtual nodes needed is at most O(logN), so the total number of nodes must be known. Solution: use an upper bound on the number of nodes
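A minimal sketch of the virtual-node idea: each physical host joins the ring under several derived identifiers. Deriving IDs by hashing "ip#index" is an illustrative assumption, not the paper's prescription.

```python
def virtual_ids(ip: str, count: int) -> list[int]:
    """m-bit identifiers for `count` virtual nodes of one physical host."""
    return [chord_id(f"{ip}#{i}") for i in range(count)]

# With O(logN) virtual nodes per host, keys spread far more evenly.
nodes = [vid for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")
         for vid in virtual_ids(ip, 4)]
```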

  16. Path length: the number of nodes traversed during a lookup operation; O(logN) • Number of nodes: 2^k • Number of keys: 100×2^k • (figure: PDF of the path length for 2^12 nodes) • Notes: • The mean path length increases logarithmically with the number of nodes • Path length ≈ ½·log₂N (e.g. for N = 2^12 nodes, about 6 hops)

  17. Simultaneous node failures • Number of nodes: 10^4 • Number of keys: 100×10^4 • Note: The fraction of lookups that fail is almost exactly equal to the fraction of nodes that fail

  18. Failed lookups and stabilization • Causes: 1. The node responsible for the key has failed 2. Some finger tables and predecessor pointers are inconsistent • Note: It may take more than one round of stabilization to completely clear out a failed node
