Scalable Architecture of Plaxton Routing and Tapestry Overlay for P2P Systems

Space complexity of Plaxton routing • There are n nodes, and logBn digits in the id, where B = 2b. • The neighbor table of each node consists of • primary neighbors (log n)/b per level, I.e. O(log n) • secondary neighbors O(log n) • reverse neighbors O(log n) expected and O(log2n) w.h.p • (proof appears in a technical report by the authors) • In addition, each node contains a pointer list <Obj, Server, Cost> • size of the pointer list = O(log2n) w.h.p. So, total auxiliary • memory per node = O(log2n) w.h.p (Scalable architecture)

Need for global knowledge to construct neighbor table Static architecture - no provision for node addition or deletion Single root for an object is a bottleneck. It is a single point of failure Tapestry provides innovative solutions to some of the bottlenecks of classical Plaxton routing. Limitations of Plaxton routing

Tapestry Tapestry: A Resilient Global-scale Overlay for Service Deployment Ben Y. Zhao, Ling Huang, Jeremy Stribling, Sean C. Rhea, Anthony D Joseph, and John Kubiatowicz: IEEE Journal on Selected Areas in Communications, January 2004, A self-organizingrobust scalable wire-area infrastructure for efficient location and delivery of contents in presence of heavy load and node or link failure. It is the backbone of the Oceanstore, a persistent wide-area storage system.

Major features of Tapestry Most contemporary P2P systems did not take take network distances into account when constructing their routing overlay; thus, a single overlay hop may span the diameter of the network. In contrast, Tapestry (and Pastry) construct locally optimal routing tables from initialization and subsequently maintain them.

Major features of Tapestry DHT based systems fix the number and location of object replicas. In contrast, Tapestry allows applications to place objects according to their needs. Tapestry “publishes” location pointers throughout the network to facilitate efficient routing to those objects with low network stretch.This makes Tapestry locality-aware.

Major features of Tapestry Allows node insertion and deletion to support a dynamic environment.These algorithms are fairly efficient.

Routing and Location Namespace (both nodes and objects) 160 bits using the hash function SHA-1 Each object has has one or more roots H (ObjectID) = RootID Suffix routing from A to B At hth hop, arrive at nearest node Nh that shares suffix with B of length h digits • Example: 5324 routes to 0629 via5324 --> 2349 --> 1429 --> 7629 --> 0629

Tapestry API PUBLISH OBJECT : Publish (make available) object on the local node. UNPUBLISH OBJECT : Best-effort attempt to remove location mappings for an object . ROUTE TO OBJECT : Routes message to location of an object with GUID (globally unique id). ROUTE TO NODE : Route message to application on the exact node .

Requirements for Insert and Delete • No central directory can be used • No hot spot or single point of failure • Reduce danger/threat of DoS. • Must be fast (should contact only a few nodes) • Keep objects available

Node Insertion • Need-to-know nodes are notified of, since the inserted node fills a null entry in their routing tables. • The new node might become the root for existing objects. • References to those objects must be moved to maintain object availability. • The algorithms must construct a near optimal routing table for the inserted node. Nodes near the inserted node are notified and such nodes may consider using it in their routing tables as an optimization.

Choosing Root in Tapestry Compute H (ObjectID) = RootID. Attempt to route to node the root(without knowing if it exists). If it exists, then it becomes the root. But otherwise • Whenever null entry encountered, choose the next “higher” non-null pointer entry (thus, if xx53 does not exist, try xx63) • If current node S is only non-null pointer in rest of map, terminate route, and choose S as the root (surrogate routing)

04345 & 54345   Acknowledged Multicast AlgorithmLocates & Contacts all nodes with a given suffix • Create a tree based on IDs • Starting node knows when all nodes reached The node then sends to any ?0345, any ?1345, any ?2345, etc. if possible ??345 ?1345 ?4345 ?4345 sends to 04345, 54345… if they exist 04345 54345

Three Parts of Insertion • Establish pointers from surrogates to new node. • Notify the need-to-know nodes • Create routing tables & notify other nodes

01234 01334 ????4 ???34 79334 39334 Gate Finding the surrogates surrogates • The new node sends a join message to a surrogate • The primary surrogate multicasts to all other surrogates. • Each surrogate establishes a pointer to the new node. • When all pointers established, continue new node

Need-to-know nodes • “Need-to-know” = a node with a hole in neighbor table filled by new node • If 01234 is new node, and no 234s existed, must notify ???34 nodes • Acknowledged multicast to all matching nodes • During this time, object requests may go either to new node or former surrogate, but that’s okay • Once done, delete pointers from surrogates.

Constructing the Neighbor Table via a nearest neighbor search • Suppose we have an algorithm A for finding the three nearest neighbors for a given node. • To fill in a slot, apply A to the subnetwork of three nodes that could fill that slot. • For ????1, run A on network of nodes ending in 1 • Pick nearest neighbor.

01234 01334 Finding Nearest Neighbor j-list is the closest k=O(log n) nodes matching in j digits • Let G be such that surrogate matches new node in last j digits of node ID • G sends j-list to new node; new node pings all nodes on j-list. • If one is closer, goto that node. If not, done with this level, and let j = j-1 and goto A. 32134 61524 11111

Is this the nearest node?Yes, with high probability under an assumption New node • Pink circle = ball around new node of radius d(G, new node) • Progress = find any node in pink circle • Consider the ball around the G containing all its j-list. Two cases: • Black ball contain pink ball; found closest node • High overlap between pink ball and G-ball so unlikely pink ball empty while G-ball has k nodes G, matches in j digits

out-neighbors 11111 xxxx1 exiting node 12345 xxx45 xxxx5 11115 54321 In-neighbors Deleting a node

Planned Delete • Notify its out-neighbors: Exiting node says “I’m no longer pointing to you” • To in-neighbors: Exiting node says it is going and proposes at least one replacement. • Exiting node republishes all objects ptrs it stores • Use republish-on-delete to clean things up • Objects rooted at exiting node get new roots

republish republish Republish-On-Delete republish republish republish

Unplanned Delete • Planned delete relied exiting node’s neighbor table. • List of out-neighbors • List of in-neighbors • Closest matching node for each level. • Can we reconstruct this information? • Not easily • Fortunately, we probably don’t need to.

Lazy Handling of Unplanned Delete • A notices B is dead, A fixes its own state • A removes B from routing tables • If removing B produces a hole, A must fill the hole, or be sure that the hole cannot be filled—use acknowledged multicast • A republishes all objs with next hop = B. • Use republish-on-delete as before

3 4 2 NodeID 0x43FE 1 4 3 2 1 3 4 4 3 2 3 4 2 3 1 2 1 2 3 1 Tapestry MeshIncremental suffix-based routing (slide borrowed from the original authors) NodeID 0x79FE NodeID 0x23FE NodeID 0x993E NodeID 0x43FE NodeID 0x73FE NodeID 0x44FE NodeID 0xF990 NodeID 0x035E NodeID 0x04FE NodeID 0x13FE NodeID 0x555E NodeID 0xABFE NodeID 0x9990 NodeID 0x239E NodeID 0x73FF NodeID 0x1290 NodeID 0x423E

Fault detection • Soft-state vs. explicit fault-recovery - Soft-state periodic republish is more attractive • Expected additional traffic for periodic republish is low • Redundant roots for better resilience • Object names hashed w/ small salts i.e. multiple names/roots • Queries and publishing utilize all roots in parallel

Summary of Insertion steps Step 1: Build up N’s routing maps • Send messages to each hop along path from gateway to current node N • The ith hop along the path sends its ith level route table to N • N optimizes those tables where necessary Step 2: Move appropriate data from N’ to N Step 3: Use back pointers from N’ to find nodes which have null entries for N’s ID, tell them to add new entry to N Step 4: Notify local neighbors to modify paths to route through N where appropriate

3 2 1 4 3 2 1 3 4 4 3 2 3 4 2 3 1 2 1 3 1 Dynamic Insertion Exampleborrowed from the original slides 4 NodeID 0x779FE NodeID 0xA23FE NodeID 0x6993E NodeID 0x243FE NodeID 0x243FE NodeID 0x973FE NodeID 0x244FE NodeID 0x4F990 NodeID 0xC035E NodeID 0x704FE NodeID 0x913FE NodeID 0xB555E NodeID 0x0ABFE NodeID 0x09990 NodeID 0x5239E NodeID 0x71290 Gateway 0xD73FF NEW 0x143FE

Scalable Architecture of Plaxton Routing and Tapestry Overlay for P2P Systems

Scalable Architecture of Plaxton Routing and Tapestry Overlay for P2P Systems

Presentation Transcript

Space Complexity

A Routing Protocol for Space Communication

Space and Time Complexity in Chapter 1

time complexity space complexity

THE SPACE ECONOMY Symposium Internet Routing in Space (IRIS)

Space-bounded Communication Complexity

The Data Stream Space Complexity of Cascaded Norms

THEORY OF COMPUTATION TOPIC:COMPLEXITY CLASSES SPACE COMPLEXITY SHESHAN SRIVATHSA

Lecture 23 Space Complexity of DTM

Space Complexity

Space Complexity

Szymanski’s Conjecture and the Complexity of Hypercube Routing

Multi-Layer Channel Routing Complexity and Algorithms

Space Complexity

Offloading Routing Complexity to the Cloud(s)

Space complexity

Space complexity

Plaxton Routing

Multi-Layer Channel Routing Complexity and Algorithm

Time and Space Complexity

Space Complexity