P2P Group Meeting (ICS/FORTH) Monday, 28 March, 2005

A Scalable Content-Addressable Network Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker P2P Group Meeting (ICS/FORTH)Monday, 28 March, 2005

CAN is a distributed hash table. Consider a virtual world with specified geometry. Add some nodes uniformly. Each node holds information about its zone. Each node holds information about its neighbors. Zone: a chunk of the entire hash table. In a nutshell...

We want to store a (key, data) pair. Use a uniform hash function to map the key in a point P in the virtual world. P lies in a zone. The node which is the owner of the zone now holds the (key, data) pair. How it works?

We want to retrieve the (key, value) pair. Apply the same hash function to key. You'll get P. Now you can travel from your node to the node-owner of the zone which P belongs to, using simple geometry. How it works?

CAN uses a virtual d-dimentional Cartesian coordinate space on a d-torus. The entire coordinate space is dynamically partitioned to zones, owned by n nodes. Add some black magic and you get: Average Routing Path: (d/4)(n1/d) hops. Nodes maintain: 2d neighbors. (*If d=lgn/2, then we have O(lgn)) Now, the horry details...

Bootstrap process: via host caches. After the initial connection the new node selects a random point P and sends a JOIN request. The node that owns P splits its zone and sets the new node as a neighbor. Periodically update messages are exchanged between nodes located closed to the "neighborhood". Node insertion is a local process and affects only O(dimensions) existing nodes. Design – Construction

Explicit hand over: a node sends its zone to one of its neighbors (the one with the smallest one), before it leaves the system. TAKEOVER: If a node doesn't receive an UPDATE message from one of its neighbors for a long period, then it takes the zone (recovery). Node Departure

Goal: decrease routing latency by adding some nice features. Feature addition tradeoff: Per node state and system complexity is increased. Design Improvements

1. Multiple dimensions

Reality: An independent coordinate space with a specific zone mapping. Having multiple realities, a data object is replicated to various locations in the system. 2. Realities

2. Realities

Realities vs Dimensions

Each node forwards a message to the neighbor with higher RTT (round-trip-time) ratio. 3. RTT based routing

Multiple peers (MAXPEERS~3,4) share the same zone in the system. Advantages - Reduced path length. Zone overloading has the same affect as reducing the nodes of the system. - Reduced per-hop latency. Each node has multiple choices. - Improved fault tolerance. A zone is less possible to be vacant. 4. Zone Overloading

5. Multiple hash functions

Goal: apply an ordering based on RTT measures between each node and a set of landmarks (well known machines, i.e. DNS root names servers). Stretch: the ratio of the latency on the CAN network to the average latency on the IP network. 6. Topology Forcing

6. Topology Forcing

When a new node enters in the system, another node must split its zone. If uniform partitioning is enabled, the node with the largest zone volume will be selected. Uniform partitioning is a load balancing technique. 7. Uniform Partitioning

Caching: Each node maintains a cash with keys of popular data objects. Replication: Each node may replicate popular data objects to its neighbors. 8. Hot Spot Management

Design Review

System Size = n18 Results with a fixed system size

TS topologies model networks using a 2-level hierarchy of routing domains with transit domains that interconnect lower level stub domains. H(intra-transit, transit-stub, intra-stub) R(low_limit, up_limit) {random values } Example H(100, 10, 1): A Transit-Stub topology with a hierarchical link delay assignment of 100ms on intra -transit links, 10ms on transit-stub links and 1ms on intra-stub links. Transit-Stub Topology Generator

In a system with approximately 260,000 peers, CAN may achieve routing with a latency that is well within a factor of two of the underlying network latency. Main Result

Elias Athanasopoulos elathan@ics.forth.gr http://www.csd.uoc.gr/~elathan/ Thank you for your time. :-)

P2P Group Meeting (ICS/FORTH) Monday, 28 March, 2005