
An Overlay Infrastructure for Decentralized Object Location and Routing

This paper discusses a decentralized overlay infrastructure for object location and routing, using node IDs and keys from a randomized namespace. It explores incremental routing, small sets of outgoing routes, and log(n) neighbors per node. The paper compares this approach with unstructured and structured peer-to-peer overlays.

Presentation Transcript


  1. An Overlay Infrastructure for Decentralized Object Location and Routing • Ben Y. Zhao (ravenben@cs.ucsb.edu) • University of California at Santa Barbara

  2. Structured Peer-to-Peer Overlays • Node IDs and keys from randomized namespace (SHA-1) • incremental routing towards destination ID • each node has small set of outgoing routes, e.g. prefix routing • log(n) neighbors per node, log(n) hops between any node pair • [Figure: a message addressed to ID ABCD is routed through nodes A930, AB5F, ABC0, ABCE, matching a longer prefix at each hop]
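To make the namespace idea concrete, here is a minimal sketch (illustration only, not code from the talk or the Tapestry release) that derives 160-bit IDs with SHA-1 and counts the shared hex-digit prefix between a key and a node ID, the quantity that incremental prefix routing grows by at least one digit per hop:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class Namespace {
    static final int DIGITS = 40;          // 160-bit SHA-1 ID = 40 hex digits

    // hash an arbitrary name (node address, object name) into the ID space
    static String idFor(String name) {
        try {
            byte[] d = MessageDigest.getInstance("SHA-1")
                                    .digest(name.getBytes(StandardCharsets.UTF_8));
            return String.format("%040x", new BigInteger(1, d));
        } catch (Exception e) { throw new RuntimeException(e); }
    }

    // number of leading hex digits two IDs share; prefix routing tries to
    // increase this by at least one on every overlay hop
    static int sharedPrefix(String a, String b) {
        int i = 0;
        while (i < DIGITS && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    public static void main(String[] args) {
        String key = idFor("some-object-name");      // hypothetical object
        String node = idFor("node-42");              // hypothetical node
        System.out.println("key  = " + key);
        System.out.println("node = " + node);
        System.out.println("shared prefix digits: " + sharedPrefix(key, node));
    }
}
```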

  3. Related Work • Unstructured Peer-to-Peer Approaches • Napster, Gnutella, KaZaa • probabilistic search (optimized for the hay, not the needle) • locality-agnostic routing (resulting in high network b/w costs) • Structured Peer-to-Peer Overlays • the first protocols (2001): Tapestry, Pastry, Chord, CAN • then: Kademlia, SkipNet, Viceroy, Symphony, Koorde, Ulysses… • distinction: how to choose your neighbors • Tapestry, Pastry: latency-optimized routing mesh • distinction: application interface • distributed hash table: put (key, data); data = get (key); • Tapestry: decentralized object location and routing

  4. Chord • NodeIDs are numbers on a ring • Closeness defined by numerical proximity • Finger table • keep routes for next node 2^i away in namespace • routing table size: log₂ n • n = total # of nodes • Routing • iterative hops from source • at most log₂ n hops • [Figure: Chord ring over namespace 0–1024 with nodes at 0, 128, 256, 384, 512, 640, 768, 896]
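The finger-table rule can be written down in a few lines. The following is a minimal sketch (an illustration using the slide's 0–1024 ring, not Chord's actual code): entry i is the first live node clockwise of (self + 2^i) mod 2^m.

```java
import java.util.TreeSet;

public class ChordSketch {
    static final int M = 10;                  // 2^10 = 1024-slot ring, as in the figure
    static final int RING = 1 << M;

    // successor: first live node clockwise from id (wrapping around the ring)
    static int successor(TreeSet<Integer> nodes, int id) {
        Integer s = nodes.ceiling(id % RING);
        return (s != null) ? s : nodes.first();
    }

    // finger[i] = successor(self + 2^i), i = 0 .. M-1, so the table holds
    // routes at exponentially increasing distances in the namespace
    static int[] fingerTable(TreeSet<Integer> nodes, int self) {
        int[] fingers = new int[M];
        for (int i = 0; i < M; i++) {
            fingers[i] = successor(nodes, (self + (1 << i)) % RING);
        }
        return fingers;
    }

    public static void main(String[] args) {
        TreeSet<Integer> nodes = new TreeSet<>();
        for (int n : new int[]{0, 128, 256, 384, 512, 640, 768, 896}) nodes.add(n);
        int[] fingers = fingerTable(nodes, 0);
        for (int i = 0; i < M; i++)
            System.out.println("finger[" + i + "]: target " + (1 << i) + " -> node " + fingers[i]);
    }
}
```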

  5. Chord II • Pros • simplicity • Cons • limited flexibility in routing • neighbor choices unrelated to network proximity (but can be optimized over time) • Application Interface: • distributed hash table (DHash)

  6. Tapestry / Pastry • incremental prefix routing • 1111 → 0XXX → 00XX → 000X → 0000 • routing table • keep nodes matching at least i digits to destination • table size: b × log_b n • routing • recursive routing from source • at most log_b n hops • [Figure: ring over namespace 0–1024 with nodes at 0, 128, 256, 384, 512, 640, 768, 896]
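A minimal sketch (illustration only, not the Tapestry or Pastry implementation) of such a routing table: level i holds, for each possible next digit, some node that shares i digits with the local ID, and each routing step forwards to an entry matching at least one more digit of the destination, so routes take at most log_b n hops.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixRouter {
    final String selfId;                 // e.g. "2175" in octal digits
    // table.get(level) maps next digit -> a neighbor sharing 'level' digits with us
    final Map<Integer, Map<Character, String>> table = new HashMap<>();

    PrefixRouter(String selfId) { this.selfId = selfId; }

    void addNeighbor(String id) {
        int level = sharedPrefix(selfId, id);
        if (level == id.length()) return;               // that's us
        table.computeIfAbsent(level, k -> new HashMap<>())
             .putIfAbsent(id.charAt(level), id);        // real Tapestry keeps the lowest-latency candidate
    }

    static int sharedPrefix(String a, String b) {
        int i = 0;
        while (i < a.length() && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    // one routing step: pick a neighbor matching one more digit of dest
    String nextHop(String dest) {
        int level = sharedPrefix(selfId, dest);
        if (level == dest.length()) return selfId;      // we are the destination (or its root)
        Map<Character, String> row = table.get(level);
        return (row == null) ? null : row.get(dest.charAt(level));
        // a real implementation falls back to a surrogate when the slot is empty
    }
}
```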

  7. Routing in Detail • Example: octal digits, 2^12 namespace, route 2175 → 0157 • Path: 2175 → 0880 → 0123 → 0154 → 0157 (each hop matches at least one more prefix digit of 0157) • Neighbor map for node 2175 (octal), one column per routing level; ---- marks the node's own digit at that level:
  Level 1: 0xxx 1xxx ---- 3xxx 4xxx 5xxx 6xxx 7xxx
  Level 2: 20xx ---- 22xx 23xx 24xx 25xx 26xx 27xx
  Level 3: 210x 211x 212x 213x 214x 215x 216x ----
  Level 4: 2170 2171 2172 2173 2174 ---- 2176 2177
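As a follow-up, this toy trace (illustration only; the node list and the greedy choice are assumptions, not the slide's real neighbor tables) reproduces the path above by repeatedly jumping to any known node that matches more digits of 0157:

```java
import java.util.Arrays;
import java.util.List;

public class RouteTrace {
    static int sharedPrefix(String a, String b) {
        int i = 0;
        while (i < a.length() && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    public static void main(String[] args) {
        // the five nodes that appear on the slide's example route
        List<String> nodes = Arrays.asList("2175", "0880", "0123", "0154", "0157");
        String cur = "2175", dest = "0157";
        System.out.print(cur);
        while (!cur.equals(dest)) {
            int have = sharedPrefix(cur, dest);
            String next = null;
            // forward to any node that matches at least one more digit of dest
            for (String n : nodes)
                if (sharedPrefix(n, dest) > have) { next = n; break; }
            cur = next;
            System.out.print(" -> " + cur);
        }
        System.out.println();   // prints 2175 -> 0880 -> 0123 -> 0154 -> 0157
    }
}
```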

  8. Tapestry / Pastry II • Pros • large flexibility in neighbor choice: choose nodes closest in physical distance • can tune routing table size and routing hops using parameter b • Cons • more complex than Chord to implement / understand • Application Interface • Tapestry: decentralized object location • Pastry: distributed hash table

  9. Talk Outline • Motivation and background • What makes Tapestry different • Tapestry deployment performance • Wrap-up

  10. So What Makes Tapestry Different? • It’s all about performance • Proximity routing • leverage flexibility in routing rules • for each routing table entry, choose node • that satisfies prefix requirement • and is closest in network latency • result: end-to-end latency “proportional” to actual IP latency • DOLR interface • applications choose where to place objects • use application-level knowledge to optimize access time
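A minimal sketch (illustration only; the candidate IDs and latencies are made up) of the selection rule described above: among nodes that already satisfy the prefix requirement for a routing-table slot, keep the one with the lowest measured network latency.

```java
import java.util.List;

public class ProximitySelection {
    static class Candidate {
        final String id;
        final double latencyMs;     // measured ping to this candidate (assumed available)
        Candidate(String id, double latencyMs) { this.id = id; this.latencyMs = latencyMs; }
    }

    // all candidates already satisfy the prefix requirement for this slot;
    // the only remaining criterion is network proximity
    static Candidate pickForSlot(List<Candidate> eligible) {
        Candidate best = null;
        for (Candidate c : eligible)
            if (best == null || c.latencyMs < best.latencyMs) best = c;
        return best;
    }

    public static void main(String[] args) {
        List<Candidate> eligible = List.of(
            new Candidate("ABC1", 85.0),
            new Candidate("ABC4", 12.5),   // closest in latency, so it is chosen
            new Candidate("ABC9", 40.2));
        System.out.println("chosen neighbor: " + pickForSlot(eligible).id);
    }
}
```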

  11. Why Proximity Routing? • Fewer/shorter IP hops: shorter e2e latency, less bandwidth/congestion, less likely to cross broken/lossy links

  12. Performance Impact (Proximity) • Simulated Tapestry w/ and w/o proximity on 5000 node transit-stub network • Measure pair-wise routing stretch between 200 random nodes

  13. Decentralized Object Location &amp; Routing • redirect data traffic using log(n) in-network redirection pointers • average # of pointers/machine: log(n) × avg files/machine • keys to performance • proximity-enabled routing mesh with routing convergence • [Figure: publish(k) plants pointers to object k along the path toward its root; routeobj(k) requests from clients across the backbone follow those pointers to the object]
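A minimal sketch (illustration only, not the Tapestry API) of the pointer mechanism: publish(k) walks from the object's server toward k's root and drops a redirection pointer at each of the ~log(n) hops, while routeObj(k) walks from the client toward the same root and diverts to the server at the first pointer it meets, often well before reaching the root. The RoutingFn hop function is an assumed stand-in for the overlay's prefix routing.

```java
import java.util.HashMap;
import java.util.Map;

public class DolrSketch {
    interface RoutingFn { String nextHop(String fromNode, String key); }  // null once at the root

    // pointer state per overlay node: key -> server currently holding the object
    final Map<String, Map<String, String>> pointers = new HashMap<>();
    final RoutingFn route;

    DolrSketch(RoutingFn route) { this.route = route; }

    void publish(String serverNode, String key) {
        String cur = serverNode;
        while (cur != null) {                       // walk toward the root, dropping pointers
            pointers.computeIfAbsent(cur, n -> new HashMap<>()).put(key, serverNode);
            cur = route.nextHop(cur, key);
        }
    }

    String routeObj(String clientNode, String key) {
        String cur = clientNode;
        while (cur != null) {
            Map<String, String> p = pointers.get(cur);
            if (p != null && p.containsKey(key)) return p.get(key);  // pointer found: divert to server
            cur = route.nextHop(cur, key);
        }
        return null;                                // reached the root without a pointer: not published
    }
}
```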

  14. DOLR vs. Distributed Hash Table • DHT: hash content → name → replica placement • modifications → replicating new version into DHT • DOLR: app places copy near requests, overlay routes msgs to it
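The contrast can be stated as two interfaces. These signatures are illustrative only; they echo the put/get shown on slide 3, not the exact DHash or Tapestry APIs.

```java
public interface Interfaces {
    // Distributed hash table: the overlay decides where data lives
    // (the hash of the key), and stores/fetches the bytes itself.
    interface DistributedHashTable {
        void put(byte[] key, byte[] data);
        byte[] get(byte[] key);
    }

    // DOLR: the application stores the object wherever it likes (e.g. near
    // its clients) and only announces it; the overlay then routes messages
    // to a nearby announced copy.
    interface DecentralizedObjectLocation {
        void publish(byte[] objectId);                    // "this node has objectId"
        void routeToObject(byte[] objectId, byte[] msg);  // deliver msg to some copy
    }
}
```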

  15. Performance Impact (DOLR) • simulated Tapestry w/ DOLR and DHT interfaces on 5000 node transit-stub network • measure route-to-object latency from clients in 2 stub networks • DHT: 5 object replicas • DOLR: 1 replica placed in each stub network

  16. Weaving a Tapestry • inserting node 0123 into the network: 1. route to own ID, find 012X nodes, fill last column 2. request backpointers to 01XX nodes 3. measure distance, add to rTable 4. prune to nearest K nodes 5. repeat steps 2–4 • [Figure: new node ID = 0123 and its routing table columns (XXXX, 0XXX, 01XX, 012X) being filled from the existing Tapestry]
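A minimal sketch (illustration only; backpointersOf, ping, and the constant K are assumed stand-ins for the real RPCs and parameters) of the iterative nearest-neighbor loop behind steps 2–5: ask the current candidate set for backpointers, measure distance, keep the nearest K, and move to the next-shorter prefix level.

```java
import java.util.*;

public class JoinSketch {
    static final int K = 3;   // candidates kept per level (assumed constant)

    interface Net {
        List<String> backpointersOf(String node, int level);  // who points to 'node' at this level
        double ping(String node);                              // measured RTT in ms
    }

    // returns, per prefix level, the K nearest nodes discovered for the new node
    static Map<Integer, List<String>> join(String newId, List<String> longestPrefixNodes, Net net) {
        Map<Integer, List<String>> table = new HashMap<>();
        List<String> seeds = longestPrefixNodes;               // e.g. the 012X nodes for new ID 0123
        for (int level = newId.length() - 1; level >= 0; level--) {
            Set<String> candidates = new HashSet<>(seeds);
            for (String n : seeds)                             // step 2: request backpointers
                candidates.addAll(net.backpointersOf(n, level));
            List<String> sorted = new ArrayList<>(candidates);
            sorted.sort(Comparator.comparingDouble(net::ping)); // step 3: measure distance
            seeds = sorted.subList(0, Math.min(K, sorted.size())); // step 4: prune to nearest K
            table.put(level, new ArrayList<>(seeds));
        }
        return table;                                          // step 5: loop repeated per level
    }
}
```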

  17. Talk Outline • Motivation and background • What makes Tapestry different • Tapestry deployment performance • Wrap-up

  18. Implementation Performance • Java implementation • 35000+ lines in core Tapestry, 1500+ downloads • Micro-benchmarks • per msg overhead: ~50 μs, most latency from byte copying • performance scales w/ CPU speedup • 5KB msgs on P-IV 2.4 GHz: throughput ~10,000 msgs/sec • Routing stretch • route to node: &lt; 2 • route to objects/endpoints: &lt; 3 (higher stretch for close-by objects)

  19. Stability Under Membership Changes • Routing operations on 40-node Tapestry cluster • Churn: nodes join/leave every 10 seconds, average lifetime = 2 mins • [Figure: join success rate (%) under kill-nodes, constant-churn, and large-group-join scenarios]

  20. Micro-benchmark Methodology • Experiment run in LAN, GBit Ethernet • Sender sends 60001 messages at full speed • Measure inter-arrival time for last 50000 msgs • skip first 10000 msgs: remove cold-start effects • average over 50000 msgs: remove network jitter effects • [Diagram: sender-side and receiver-side Tapestry stacks with control processes, connected by a LAN link]
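A minimal sketch (illustration only) of the measurement itself: take receive timestamps, drop the cold-start messages, and convert the mean inter-arrival gap of the remaining messages into a throughput figure. The synthetic timestamps in main stand in for real measurements.

```java
public class InterArrival {
    // timestampsNanos[i] = receive time of message i (already collected elsewhere)
    static double throughputMsgsPerSec(long[] timestampsNanos, int warmupMsgs) {
        int n = timestampsNanos.length;
        if (n - warmupMsgs < 2) throw new IllegalArgumentException("not enough samples");
        long first = timestampsNanos[warmupMsgs];      // skip cold-start messages
        long last  = timestampsNanos[n - 1];
        long intervals = (n - 1) - warmupMsgs;         // inter-arrival gaps in the measured window
        double meanGapSec = (last - first) / 1e9 / intervals;
        return 1.0 / meanGapSec;                       // messages per second
    }

    public static void main(String[] args) {
        // synthetic example: 60001 messages arriving every ~100 microseconds
        long[] ts = new long[60001];
        for (int i = 1; i < ts.length; i++) ts[i] = ts[i - 1] + 100_000;  // 100 us in ns
        System.out.printf("throughput ~ %.0f msgs/sec%n", throughputMsgsPerSec(ts, 10000));
    }
}
```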

  21. Micro-benchmark Results (LAN) • Per-msg overhead ~50 μs, latency dominated by byte copying • Performance scales with CPU speedup • For 5KB messages, throughput = ~10,000 msgs/sec

  22. Large Scale Methodology • PlanetLab global network • 500 machines at 100+ institutions, in North America, Europe, Australia, Asia, Africa • 1.26 GHz PIII (1GB RAM), 1.8 GHz P4 (2GB RAM) • North American machines (2/3) on Internet2 • Tapestry Java deployment • 6-7 nodes on each physical machine • IBM Java JDK 1.30 • Node virtualization inside JVM and SEDA • Scheduling between virtual nodes increases latency

  23. Node to Node Routing (PlanetLab) • Ratio of end-to-end latency to ping distance between nodes • All node pairs measured, placed into buckets • [Figure annotations: median = 31.5, 90th percentile = 135]
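For reference, the metric plotted here can be computed as follows (an illustrative helper, not the actual analysis script): routing stretch is overlay end-to-end latency divided by direct ping latency, computed per node pair and grouped into ping-distance buckets.

```java
import java.util.*;

public class Stretch {
    // group per-pair stretch values by ping-distance bucket
    static Map<Integer, List<Double>> bucketedStretch(
            double[] overlayMs, double[] pingMs, double bucketWidthMs) {
        Map<Integer, List<Double>> buckets = new TreeMap<>();
        for (int i = 0; i < pingMs.length; i++) {
            double stretch = overlayMs[i] / pingMs[i];
            int bucket = (int) (pingMs[i] / bucketWidthMs);    // bucket by direct ping distance
            buckets.computeIfAbsent(bucket, b -> new ArrayList<>()).add(stretch);
        }
        return buckets;
    }

    public static void main(String[] args) {
        double[] overlay = {45.0, 160.0, 12.0};   // made-up overlay latencies (ms)
        double[] ping    = {30.0, 135.0,  4.0};   // made-up direct pings (ms)
        System.out.println(bucketedStretch(overlay, ping, 20.0));
    }
}
```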

  24. Latency to Insert Node • Latency to dynamically insert a node into an existing Tapestry, as function of size of existing Tapestry • Humps due to expected filling of each routing level

  25. Thanks! Questions, comments? ravenben@cs.ucsb.edu

  26. Object Location (PlanetLab) • Ratio of end-to-end latency to client-object ping distance • Local-area stretch improved w/ additional location state • [Figure annotation: 90th percentile = 158]

  27. Bandwidth to Insert Node • Cost in bandwidth of dynamically inserting a node into the Tapestry, amortized for each node in network • Per-node bandwidth decreases with size of network
