
Turning Heterogeneity into an Advantage in Overlay Routing

This presentation discusses distributed hash table (DHT) based overlay networks in large distributed systems, which offer scalability, fault tolerance, security, reliability, and low maintenance cost. It introduces a new approach to proximity awareness in the overlay network: constructing an auxiliary routing network using an AS-level topology derived from BGP reports. Simulation results show routing performance close to optimal and better than that of previous approaches.


Presentation Transcript


  1. Turning Heterogeneity into an Advantage in Overlay Routing (To be presented at IEEE Infocom’03) Zhichen Xu, Mallik Mahalingam, Magnus Karlsson, Internet Systems and Storage Lab, Hewlett-Packard Company

  2. Motivation • For a large distributed system to function well, it must be scalable, fault-tolerant, secure, and reliable, and have a low maintenance cost • Distributed hash table (DHT) based overlay networks provide a simple abstraction that maps “keys” to “values” • They can be used in many important applications, which then inherit these properties • E.g., distributed storage, DNS, media streaming, web caching, content-based searching, distributed firewalls, etc. • Several proposals: Pastry, Tapestry, CAN, eCAN, SkipNet, etc. • These provide a homogeneous abstraction to the applications, but vary in their logical structures and flexibility Zhichen Xu

  3. Baseline DHT, a 2-dimensional CAN [Figure labels: node, zone] • Cartesian space partitioned into zones • A node serves as “owner” of a zone • A key is a “point” in the Cartesian space • “Value” stored on the node that owns the zone containing the point (key)

  4. Low maintenance cost & self-organizing… [Figure labels: new zone, new node] • Node join: pick a point and split the zone with the node that currently owns the point • Node departure: a neighboring node takes over the “state” of the departing node • These dynamics are shielded from the users and applications!

  5. Logical routing [Figure: logical hops 1, 2, 3] • Routing: traverse a series of neighboring zones from source to destination
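
As an illustration of the zone ownership and logical routing described on slides 3 to 5, here is a minimal Python sketch. It assumes a fixed 4 x 4 grid of equal-sized zones with no wrap-around, which is a simplification of a real CAN; the names `GRID`, `zone_of`, and `route` are hypothetical and do not come from the presentation.

```python
GRID = 4  # assumption: the unit square is split into 4 x 4 equal zones


def zone_of(point):
    """Return the (col, row) zone that contains a point in [0, 1) x [0, 1)."""
    x, y = point
    return (int(x * GRID), int(y * GRID))


def route(src_zone, key_point):
    """Greedy logical routing: move to a neighboring zone one step at a time
    until the zone that owns the key is reached. Returns the list of zones."""
    dst = zone_of(key_point)
    col, row = src_zone
    path = [(col, row)]
    while (col, row) != dst:
        if col != dst[0]:
            col += 1 if dst[0] > col else -1
        else:
            row += 1 if dst[1] > row else -1
        path.append((col, row))
    return path


# The key (0.6, 0.3) lives in zone (2, 1); starting at zone (0, 0)
# the message crosses three logical hops.
print(route((0, 0), (0.6, 0.3)))
```

Each step in the returned path is one logical hop; nothing in this sketch accounts for how many physical hops a logical hop costs.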

  6. Each logical hop can correspond to multiple physical hops [Figure: logical hops 1, 2, 3 and the corresponding physical hops] • It is important that the structure of the overlay efficiently uses the underlying physical network!

  7. Techniques for achieving proximity awareness • Within the overlay [Castro et al] • Geographic layout, e.g., topologically-aware CAN: uneven distribution of the nodes and a chance of overloading nodes

  8. Techniques for achieving proximity awareness • Within the overlay [Castro et al] • Geographic layout, e.g., topologically-aware CAN: uneven distribution of the nodes and a chance of overloading nodes • Proximity routing, e.g., Chord: choices limited [Figure labels: s: source, d: destination, candidates 1 to 3, closest to s]

  9. Techniques for achieving proximity awareness • Within the overlay [Castro et al] • Geographic layout, e.g., topologically-aware CAN: uneven distribution of the nodes and a chance of overloading nodes • Proximity routing, e.g., Chord: choices limited • Proximity-neighbor selection, e.g., Pastry, Tapestry, eCAN: routing table entries selected according to a proximity metric among nodes that satisfy the constraint [Figure: numbered zone grids]

  10. Techniques for achieving proximity awareness • Within the overlay [Castro et al] • Geographic layout, e.g., topologically-aware CAN: uneven distribution of the nodes and a chance of overloading nodes • Proximity routing, e.g., Chord: choices limited • Proximity-neighbor selection, e.g., Pastry, Tapestry, eCAN: routing table entries selected according to a proximity metric among nodes that satisfy the constraint • Performance constrained by the logical structure of the default overlay

  11. Techniques for achieving proximity awareness • Auxiliary networks, e.g., Brocade • Construct a secondary overlay network • Still use logical routing in the secondary network • Pushes the problem to an auxiliary network of a smaller size • Dilemma in picking the size of the secondary network • Within the overlay [Castro et al] • Geographic layout, e.g., topologically-aware CAN: uneven distribution of the nodes and a chance of overloading nodes • Proximity routing, e.g., Chord: choices limited • Proximity-neighbor selection, e.g., Pastry: routing table entries selected according to a proximity metric among nodes that satisfy the constraint • Performance constrained by the logical structure of the default overlay

  12. Our contributions • Decouple the homogeneous abstraction from routing • Construct an auxiliary routing network using an AS-level topology derived from BGP reports and a landmark-numbering scheme • Route advertisement using a “distance vector” algorithm with route summarization to reduce state • Works with all currently existing overlays • Simulation results show that our approach can achieve close to optimal routing performance • 1.04 to 1.12 times optimal for an Internet-like topology • Previous approaches: 2.5 to 5 times optimal for the same topology

  13. Outline • Motivation • Related work • Default overlay network eCAN • Expressway: unconstrained auxiliary network • How does a node find the close-by nodes? • How do we control the routing state? • What can the expressway be used for? • Experimental results • Discussions & conclusions

  14. eCAN represents the state of the art [Figure label: CAN zones (order-1 zones)]

  15. K default CAN zones make an order-2 zone [Figure label: order-2 zones]

  16. K order-2 zones make an order-3 zone [Figure label: order-3 zones]
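
The zone hierarchy on slides 14 to 16 can be sketched as follows. This is only an illustration under stated assumptions: a 2-d space, K = 4, and each higher-order zone formed from a 2 x 2 block of zones one order below. The functions `high_order_zone` and `share_zone` are hypothetical names, not eCAN's actual data structures.

```python
def high_order_zone(zone, order):
    """Return the order-`order` zone that contains an order-1 zone.

    `zone` is a (col, row) pair of order-1 grid coordinates.
    Assumption: K = 4, so each order-(i+1) zone is a 2 x 2 block
    of order-i zones and every extra order halves both coordinates.
    """
    if order < 1:
        raise ValueError("order must be >= 1")
    shift = order - 1
    col, row = zone
    return (col >> shift, row >> shift)


def share_zone(a, b, order):
    """True if order-1 zones `a` and `b` fall in the same order-`order` zone."""
    return high_order_zone(a, order) == high_order_zone(b, order)


print(high_order_zone((5, 6), 2))  # order-2 zone containing zone (5, 6)
print(high_order_zone((5, 6), 3))  # order-3 zone containing zone (5, 6)
```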

  17. High-order routing neighbors • High-order routing tables are soft-state • Allows for proximity-neighbor selection • Neighbor selection based on landmark clustering / controlled data placement • Topology-aware Chord is equivalent to 1-d eCAN

  18. Expressway definitions & challenges • Expressway nodes are nodes that have good connectivity and availability • Expressway nodes connect to other close-by expressway nodes to form a backbone • Ordinary nodes connect to the closest expressway node • Traffic goes through the expressway, if possible • Challenges: • How does a node (ordinary or expressway) find the close-by expressway nodes? • How do we control the routing state? • What can the expressway be used for?

  19. Outline • Motivation • Related work • Default overlay network eCAN • Expressway: unconstrained auxiliary network • How does a node find the close-by nodes? • How do we control the routing state in the expressway? • What can the expressway be used for? • Experimental results • Discussions & conclusions

  20. Landmark clustering • Related work: landmark ordering [Ratnasamy et al 2002]; coordinate-based [Eugene and Zhang 2001] • Landmark vector <d1, d2, d3>, where di is the distance to landmark i • Nodes with similar distances to the landmarks are likely close to each other [Figure: landmark space with axes Landmark1, Landmark2, Landmark3]

  21. Locating close-by expressway node • The landmark vector is used as the key to store information about expressway nodes on the DHT, such that distances in the “landmark space” are preserved • A node uses its landmark vector to search the DHT to find close-by nodes • Expressway nodes find and connect to physically close-by expressway nodes to form the expressway network [Figure: nodes a, b, c in the landmark space (axes Landmark1, Landmark2, Landmark3) and on the DHT]
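
The idea that nodes with similar landmark vectors are likely close to each other can be sketched in Python as below. This is a simplified illustration: it compares vectors with a linear scan and Euclidean distance, whereas the slides store the vectors on the DHT and search it. The names `landmark_distance` and `closest_expressway` and all delay values are made up for the example.

```python
import math


def landmark_distance(u, v):
    """Euclidean distance between two landmark vectors (requires Python 3.8+)."""
    return math.dist(u, v)


def closest_expressway(node_vec, expressway):
    """Return the name of the expressway node whose landmark vector is nearest.

    `expressway` maps a node name to its landmark vector <d1, d2, d3>.
    Assumption: a plain linear scan stands in for the DHT search.
    """
    return min(expressway, key=lambda name: landmark_distance(node_vec, expressway[name]))


# Hypothetical measured delays (ms) to three landmarks.
expressway = {"a": (10, 50, 80), "b": (60, 20, 40), "c": (90, 90, 10)}
print(closest_expressway((12, 48, 75), expressway))
```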

  22. But the dimensionality of the landmark space and that of the DHT can be different, so dimension reduction is needed [Figure: nodes a, b, c in the landmark space (axes Landmark1, Landmark2, Landmark3) mapped onto the DHT by dimension reduction]

  23. Space-filling curves: Hilbert curve • Points close to each other in n-d space are mapped to points close to each other in 1-d space, and vice versa [Figure: Hilbert curve through cells numbered 1 to 8]
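
To make the space-filling-curve idea concrete, the following Python sketch computes the Hilbert index of a cell in a 2-d grid using the widely known iterative bit-manipulation algorithm. It is an illustration only: the slides do not say which construction the authors used, and the function name `hilbert_index` is an assumption.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its position
    along the Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Order the 16 cells of a 4 x 4 grid along the curve.
cells = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(4, *c))
print(cells)
```

Consecutive cells in the printed order are always grid neighbors, which is the locality property the slide relies on. A landmark vector could be quantized into grid cells and numbered along such a curve in higher dimensions; that step is not shown here.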

  24. Proximity-preserving dimension reduction of landmark vectors: landmark numbering [Figure: panels (a) and (b), cells labeled with their landmark numbers]

  25. Discussions • A similar procedure can be used for other overlays • For Chord, we use the landmark number as the DHT key to store information about the expressway nodes on a node whose ID is greater than or equal to the landmark number • For Tapestry and Pastry, we can use a prefix of the node IDs to partition the logical space into grids • In summary, our goal is to store expressway node information such that information about close-by nodes is stored close together on the overlay • In contrast, Pastry, for example, relies on the ability to find the physically closest node at node join and requires message exchanges to fix up the existing routing tables
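
For the Chord case, the placement rule (store the record on a node whose ID is greater than or equal to the landmark number) can be sketched as a successor lookup. This is a minimal sketch assuming a sorted list of node IDs and the usual Chord wrap-around to the smallest ID when no larger one exists; `successor` and the sample IDs are hypothetical.

```python
from bisect import bisect_left


def successor(node_ids, key):
    """Return the first node ID >= key, wrapping around the ring if needed."""
    ids = sorted(node_ids)
    i = bisect_left(ids, key)
    return ids[i % len(ids)]


ring = [8, 21, 42, 60]
# Expressway nodes with nearby landmark numbers land on the same or adjacent nodes.
for landmark_number in (19, 20, 22):
    print(landmark_number, "->", successor(ring, landmark_number))
```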

  26. Outline • Motivation • Related work • Default overlay network eCAN • Expressway: unconstrained auxiliary network • How does a node find the close-by nodes? • How do we control the routing state in the expressway? • What can the expressway be used for? • Experimental results • Discussions & conclusions

  27. Route advertisement with summarization • An expressway node periodically advertises all local nodes that are in its physical proximity to neighboring expressway nodes • Same as the standard distance vector algorithm, except that • it advertises a summary of multiple nodes, plus the transport address of one representative node • only expressway nodes participate in route advertisement • Route advertisement messages are controlled with a time-to-live (TTL) expressed as the number of expressway hops
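
A TTL-limited distance-vector exchange among expressway nodes can be sketched as below. This is an illustration under several assumptions: rounds are synchronous, each round forwards advertisements one expressway hop (so `ttl` rounds bound the hop count), and link delays are symmetric. The function `advertise`, the node names, and the delays are invented for the example.

```python
def advertise(links, local, ttl):
    """Run `ttl` rounds of distance-vector advertisement.

    links: {expressway_node: {neighbor: link_delay}}
    local: {expressway_node: set of summary IDs for its nearby nodes}
    Returns {node: {summary_id: (cost, next_hop)}}.
    """
    tables = {n: {g: (0, n) for g in local.get(n, ())} for n in links}
    for _ in range(ttl):
        updates = []
        # Every node sends its current table to each neighbor.
        for n, table in tables.items():
            for nbr, delay in links[n].items():
                for summary, (cost, _) in table.items():
                    updates.append((nbr, summary, cost + delay, n))
        # Receivers keep the cheapest route they have heard for each summary.
        for nbr, summary, cost, via in updates:
            if summary not in tables[nbr] or cost < tables[nbr][summary][0]:
                tables[nbr][summary] = (cost, via)
    return tables


links = {"A": {"B": 10}, "B": {"A": 10, "C": 20}, "C": {"B": 20}}
local = {"A": {0}, "B": {1}, "C": {2}}
print(advertise(links, local, ttl=1)["A"])  # A has heard only from B
print(advertise(links, local, ttl=2)["A"])  # A now also knows a route to C's summary
```

A smaller TTL keeps routing tables short at the cost of knowing fewer remote summaries, which is the state-control trade-off the slide describes.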

  28. Route summarization: aggregate multiple nodes • For CAN, we partition the Cartesian space into virtual grids • For Pastry, we can summarize multiple nodes with a nodeID prefix • For Chord, we can summarize multiple nodes with a nodeID range • Nodes whose zone falls in a virtual grid are summarized by the ID of the virtual grid • The pair <GridID, IP of representative node> is propagated [Figure: 4 x 4 virtual grid with rows 0 4 8 12 / 1 5 9 13 / 2 6 10 14 / 3 7 11 15, one representative node marked]
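
For the CAN case, summarization by virtual grid can be sketched as follows. The sketch assumes a 4 x 4 virtual grid numbered column by column (matching the 0 4 8 12 / 1 5 9 13 layout on the slide), identifies each node by the center of its zone, and picks the representative arbitrarily as the smallest IP string; the presentation does not say how the representative is chosen. The names `grid_id` and `summarize` and the addresses are hypothetical.

```python
def grid_id(zone_center, grids_per_side=4):
    """Return the virtual-grid ID for a point in [0, 1) x [0, 1),
    numbering cells column by column."""
    x, y = zone_center
    col = int(x * grids_per_side)
    row = int(y * grids_per_side)
    return col * grids_per_side + row


def summarize(nodes):
    """Collapse {ip: zone_center} into {grid_id: representative_ip}.

    Assumption: the representative is simply the first IP in sorted order.
    """
    summary = {}
    for ip, center in sorted(nodes.items()):
        summary.setdefault(grid_id(center), ip)
    return summary


nodes = {
    "10.0.0.1": (0.10, 0.10),
    "10.0.0.2": (0.20, 0.05),  # same virtual grid as 10.0.0.1
    "10.0.0.3": (0.80, 0.30),
}
print(summarize(nodes))  # one <GridID, IP> pair per occupied virtual grid
```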

  29. Outline • Motivation • Related work • Default overlay network eCAN • Expressway: unconstrained auxiliary network • How does a node find the close-by nodes? • How do we control the routing state in the expressway? • What can the expressway be used for? • Experimental results • Discussions & conclusions

  30. Direct route vs. expressway-node forwarding • Direct route: requires slightly more storage space to keep the route summary and relies on IP routing • Expressway-node forwarding: • If a node leaves the system, it is less expensive to repair • May deliver routing performance better than default IP routing [RON 2001, Detour 1999] • Ordinary nodes cache addresses of nodes associated with the same expressway node [Figure: source, destination, ordinary nodes, and expressway nodes, with the direct route marked]

  31. Experimental evaluation: 2-d eCAN as default overlay • AS topology: • 1,000 ASes from a total of 13,000 active ASes • Assume 100 ms inter-AS delay and 10 ms intra-AS delay • Each node is assigned to one of the 1,000 ASes • Transit-stub graph using GT-ITM: • 10,000 nodes, 228 transit domains, 5 nodes per transit domain, 4 stub domains per transit node, and 2 nodes in each stub domain • 100 ms for cross-transit links, 20 ms for links inside a transit domain, 5 ms for links connecting a transit node and a stub node, and 2 ms for links inside a stub domain • Compare against • eCAN with roughly the same amount of state • Logical auxiliary: a Brocade-like system that uses a homogeneous auxiliary logical overlay network

  32. eCAN with similar state • For fairness, we compare with eCAN with similar state • How do we make use of the additional state? • Rather than always routing to the physically closest next-hop candidate, we route to the next hop that can bring down the overall delay [Figure: next-hop candidates 1, 2, 3]
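
The difference between the two next-hop policies can be sketched as below. The slide does not say how the overall delay is estimated, so the `est_remaining` field (an estimate of the delay from a candidate to the destination) is purely an assumption, as are the function names and numbers.

```python
def closest_next_hop(candidates):
    """Policy 1: pick the physically closest candidate."""
    return min(candidates, key=lambda c: c["delay_to_hop"])["name"]


def lowest_total_delay_next_hop(candidates):
    """Policy 2: pick the candidate that minimizes the estimated end-to-end delay.

    Assumption: each candidate carries an estimate of its remaining delay
    to the destination, made possible by the additional routing state.
    """
    return min(candidates, key=lambda c: c["delay_to_hop"] + c["est_remaining"])["name"]


candidates = [
    {"name": "n1", "delay_to_hop": 5, "est_remaining": 200},
    {"name": "n2", "delay_to_hop": 20, "est_remaining": 60},
]
print(closest_next_hop(candidates), lowest_total_delay_next_hop(candidates))
```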

  33. Logical auxiliary • 0: ordinary nodes advertise themselves on the auxiliary network, using nodeIDs as keys to store their IP addresses • 0.5: caching along advertising paths • 1: contact local super node • 2: look up the IP address of the destination node • 3: route to the destination [Figure: default overlay and homogeneous auxiliary overlay network]

  34. Parameters used • # of nodes: 512 to 8K (4K as default) • TTL: 1 to 9 (9 as default) • Virtual grids: 1 virtual grid per node to 1 virtual grid per 16 nodes (1 per node as default) • # of landmarks: 15 • Fraction of nodes that are expressway nodes: 1/1 to 1/64 (1/10 as default) • Routing: direct, expressway-node forwarding • Performance metric: stretch = routing delay / shortest-path delay
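
The stretch metric can be computed as in the following Python sketch, which divides the delay of the path actually taken by the shortest-path delay found with Dijkstra's algorithm. The toy graph, its delays, and the function names are illustrative assumptions and not the simulation setup used in the presentation.

```python
import heapq


def shortest_delay(graph, src, dst):
    """Dijkstra's algorithm over {node: {neighbor: delay}}; returns the delay."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    raise ValueError("destination unreachable")


def path_delay(graph, path):
    """Total delay along a concrete sequence of nodes."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def stretch(graph, overlay_path):
    """Routing delay divided by shortest-path delay (1.0 is optimal)."""
    best = shortest_delay(graph, overlay_path[0], overlay_path[-1])
    return path_delay(graph, overlay_path) / best


graph = {
    "s": {"a": 10, "b": 100},
    "a": {"s": 10, "d": 10},
    "b": {"s": 100, "d": 10},
    "d": {"a": 10, "b": 10},
}
print(stretch(graph, ["s", "b", "d"]))  # overlay detour through b
print(stretch(graph, ["s", "a", "d"]))  # overlay path equals the shortest path
```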

  35. Summary of results • Expressway produces good average routing performance • Landmark clustering: • For the AS topology, 1.07 times shortest-path routing on average, with individual measurements ranging from 1.04 to 1.12 • For the transit-stub graph, 1.41 on average, with individual measurements ranging from 1.20 to 1.55 (this can be better, as ordinary nodes associated with the same expressway node do not establish direct routes among themselves) • eCAN and the homogeneous auxiliary network stay between 2.5 and 7 times shortest-path routing

  36. Comparison of various approaches • Our approach: 1.07 to 1.41 times optimal • Other approaches: 2.5 to 7 times optimal [Figure panels: AS topology; transit-stub graph]

  37. Direct route vs. expressway-node forwarding • Direct route performs better than expressway-node forwarding, due to shortest-path routing • Performance of our approach improves as the number of nodes increases

  38. Effect of varying the ratio of expressway nodes in the system • As the percentage of expressway nodes increases, the expressway better approximates the underlying physical network • “Logical auxiliary”, in contrast, cannot take advantage of this

  39. Conclusions • Propose generic techniques to construct an auxiliary network for DHT-based overlays • Decouples routing from the DHT abstraction to take advantage of heterogeneity that exists in the system • Achieves routing performance close to optimal • The protocol is relatively complicated • The expressway nodes need to be relatively stable

  40. More about eCAN • Topology-aware Chord: 1-d eCAN • High-order zones allow for locality-preserving data placement (SkipNet) • Placement of objects can be controlled to preserve locality • Machines that belong to certain organizations can be co-located logically [Figure: nodes grouped into high-order zones 1 to 5]

  41. Varying the number of virtual grids [Figure panels: 1 node/virtual grid; 4 nodes/virtual grid; 16 nodes/virtual grid]

  42. Effect of varying TTL for route advertisement

  43. Example applications • Distributed storage space • Content → SHA-1 → key • Place the <Key, document> pair on top of the DHT • Object lookup translates to routing • Distributed content-based search • Controlled placement of document info on the DHT such that documents that are similar in content are co-located • Search space is effectively controlled • It is important that the structure of the overlay efficiently uses the underlying physical network!
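
The distributed-storage example (content hashed with SHA-1 to obtain a DHT key) can be sketched with Python's standard `hashlib`. The mapping of the key to a point in a 2-d space is an assumption made for illustration (it takes the top two 32-bit words of the digest as coordinates); the presentation does not specify it. `content_key` and `key_to_point` are hypothetical names.

```python
import hashlib


def content_key(document: bytes) -> int:
    """Hash the document content with SHA-1 and return the 160-bit key."""
    return int.from_bytes(hashlib.sha1(document).digest(), "big")


def key_to_point(key: int):
    """Assumed mapping of a 160-bit key to a point in [0, 1) x [0, 1):
    the top 32 bits give x and the next 32 bits give y."""
    x = (key >> 128) / 2**32
    y = ((key >> 96) & 0xFFFFFFFF) / 2**32
    return (x, y)


document = b"abc"
key = content_key(document)
store = {key: document}          # stand-in for placing <Key, document> on the DHT
print(hex(key), key_to_point(key))
```

Looking an object up then amounts to recomputing the key from the content identifier and routing to the node that owns the corresponding point.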
