
Yaping Zhu Advisor: Prof. Jennifer Rexford With: Andy Bavier and Nick Feamster ( Georgia Tech )

UFO: A Resilient Layered Routing Architecture. Yaping Zhu Advisor: Prof. Jennifer Rexford With: Andy Bavier and Nick Feamster ( Georgia Tech ). Scalability + High Availability ?. Scalability : Scalability of routing control plane Efficiency of routing data plane.





Presentation Transcript


  1. UFO: A Resilient Layered Routing Architecture Yaping Zhu Advisor: Prof. Jennifer Rexford With: Andy Bavier and Nick Feamster (Georgia Tech)

  2. Scalability + High Availability? • Scalability: • Scalability of routing control plane • Efficiency of routing data plane • High availability: • Quick adaptation and re-routing

  3. Can We Have the Best of Both Worlds? Basic Idea: 1. Layered routing architecture (borrowing idea from overlay routing) 2. Underlay Support for efficient and scalable overlay routing

  4. Outline • Background • Internet routing architecture • Overlay routing (Resilient Overlay Networks) • Basic idea of Layered routing architecture • Efficient overlay forwarding • Scalable overlay monitoring • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  5. Internet Routing Designed for Scalability [Figure: many Autonomous Systems (ASes) interconnected through transit and peering relationships]

  6. Internet Routing without High Availability • Scalability • Statistics: 25K ASes, 200K prefixes, millions of routers • Hierarchical: intra-domain / inter-domain routing • Prefix aggregation • Routing protocols oblivious to performance • Intra-domain: static link weights • Inter-domain: routing policies • Slow outage detection and recovery • Disruptions during convergence • Performance suffers from black-holes and loops

  7. Scalable Internet Routing without Customization • IP does destination-based forwarding • All traffic follows the same paths • Independent of the application requirements • Yet, applications have different needs • Voice and gaming: low latency and loss • File sharing: high throughput [Figure: two paths, one with high throughput but high latency, the other with low latency but low throughput]

  8. Outline • Background • Internet routing architecture • Overlay routing (Resilient Overlay Networks) • Basic idea of Layered routing architecture • Efficient overlay forwarding • Scalable overlay monitoring • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  9. RON: Resilient Overlay Networks (by D. Andersen) [Figure label: Scalable IP routing substrate]

  10. RON: Resilient Overlay Networks System Components • Overlay Control Plane • Probing, overlay path evaluation • Disseminate routing messages, update routes • Overlay Data Plane • Tunnel setup: packet encapsulation/decapsulation • User Opt-in Method • DNS redirection to overlay server • Connection to overlay server: tunnels (e.g., VPN)

  11. Overlay Routing • Pros: • High availability: End hosts discover network-level path failure and cooperate to re-route. • Customization: Forwarding paths tailored to the application • Applications: • Content distribution (e.g. Akamai SureRoute) • Application layer multicast

  12. Overlay Routing: Poor Efficiency • Problem: traffic must traverse bottleneck link both inbound and outbound • Additional latency overhead • Additional traffic consumption [Figure label: Upstream ISP]

  13. Overlay Routing: Poor Scalability [Figure: overlay nodes over a scalable IP routing substrate, captioned “Let’s just keep probing”, “Shall I re-route if one packet is lost?”, and “I don’t know when failure happens”]

  14. Overlay Routing: Poor Scalability • Fundamental trade-off between probing frequency and adaptation speed • Quick adaptation -> aggressive probing at short intervals -> poor scalability -> RON only supports a small set (i.e., < 50 nodes) of connected hosts • Cannot differentiate packet loss caused by different events • Failure -> fast re-route • Congestion -> maybe re-route more slowly? -> oscillation?
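
The probing trade-off on this slide can be made concrete with a small calculation. This is a minimal sketch, assuming every overlay node probes every other node once per probe interval (full-mesh probing); the function name `probes_per_second` is hypothetical and not taken from the RON code.

```python
def probes_per_second(num_nodes: int, probe_interval_s: float) -> float:
    """Aggregate probe rate for the whole overlay.

    Assumption: full-mesh probing, i.e. each node probes each of the
    other nodes once per interval, giving n * (n - 1) directed pairs.
    """
    directed_pairs = num_nodes * (num_nodes - 1)
    return directed_pairs / probe_interval_s


if __name__ == "__main__":
    # Shorter intervals adapt faster but multiply the probe load,
    # and the load grows quadratically with the number of nodes.
    for n in (10, 50, 200):
        for interval in (12.0, 1.0):
            rate = probes_per_second(n, interval)
            print(f"nodes={n:4d} interval={interval:5.1f}s probes/s={rate:10.1f}")
```

Halving the interval doubles the load, while doubling the node count roughly quadruples it, which is why quick adaptation and large overlays pull in opposite directions.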

  15. Outline • Background • Internet routing architecture • Overlay routing (Resilient Overlay Networks) • Basic idea of Layered routing architecture • Efficient overlay forwarding • Scalable overlay monitoring • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  16. Can We Have the Best of Both Worlds?

  17. A Resilient Layered Routing Architecture • Combination of underlay and overlay routing

  18. UFO: Underlay Friendly to Overlays • In-network support for overlays

  19. A Resilient Layered Routing Architecture • Questions: • Which functionality belongs to which layer? • What are the interfaces between the two layers? • Cross-layer design • Efficiency improvement: • Direct control over forwarding table entries • Scalability improvement: • Explicit notification about changing network conditions

  20. Outline • Efficient overlay forwarding • Overlay forwarding on line cards • Hosting the overlay control plane • Scalable overlay monitoring • Registration of overlay links • Notification of network events • Lazy recovery • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  21. Outline • Efficient overlay forwarding • Overlay forwarding on line cards • Hosting the overlay control plane • Scalable overlay monitoring • Registration of overlay links • Notification of network events • Lazy recovery • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  22. Efficient Overlay Forwarding • Problem: traffic must traverse bottleneck link both inbound and outbound • Solution: reflection points in routers [Figure label: Upstream ISP]

  23. Overlay Forwarding on Router Line Cards • Building block: tunnels
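
Tunnel-based overlay forwarding can be sketched as a lookup in an overlay forwarding table followed by encapsulation. This is a minimal illustration, not the UFO line-card implementation: the types `Packet` and `TunnelPacket`, the functions `encapsulate` and `decapsulate`, and the dictionary-based table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes


@dataclass(frozen=True)
class TunnelPacket:
    outer_src: str   # address of the encapsulating node
    outer_dst: str   # next overlay hop (tunnel endpoint)
    inner: Packet    # original packet, carried unchanged


def encapsulate(pkt: Packet, local_addr: str,
                overlay_fib: Dict[str, str]) -> Union[Packet, TunnelPacket]:
    """Wrap the packet in a tunnel header if the overlay table has an
    entry for its destination; otherwise leave it for ordinary IP
    forwarding."""
    next_hop = overlay_fib.get(pkt.dst)
    if next_hop is None:
        return pkt
    return TunnelPacket(outer_src=local_addr, outer_dst=next_hop, inner=pkt)


def decapsulate(tpkt: TunnelPacket) -> Packet:
    """Strip the tunnel header at the tunnel endpoint."""
    return tpkt.inner


if __name__ == "__main__":
    fib = {"D": "R2"}  # hypothetical entry: reach D via overlay hop R2
    wrapped = encapsulate(Packet("S", "D", b"hello"), "R1", fib)
    print(wrapped)
    print(decapsulate(wrapped))
```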

  24. Where Does the Overlay Control Plane Run? On Routers • On routers: by router virtualization • Pros: fast updates of forwarding tables • Pros: efficient transmission of control messages • Pros: fate-sharing [Figure: router with processors, switching fabric, and line cards]

  25. Where Does the Overlay Control Plane Run? On Servers

  26. Where Does the Overlay Control Plane Run? On Servers • On a separate set of servers • Update forwarding tables on router line cards • Data packets reflected in-network • Pros: • Cheap compared to routers • Compatibility with legacy overlay servers • Cons: • Lack of fate-sharing

  27. Outline • Efficient overlay forwarding • Overlay forwarding on line cards • Hosting the overlay control plane • Scalable overlay monitoring • Registration of overlay links • Notification of different kinds of network events • Lazy recovery • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  28. Scalable Overlay Monitoring • Assumption: • Rich connectivity, multiple alternative overlay paths • Overlays could even tolerate “false positive” notifications • What to notify? • Different applications may want notification of different events • Notification benefits: • Accurate adaptation (compared with RON) • Reduced probing overhead and increased scalability

  29. Scalable Overlay Monitoring • Notifications preserve the overlay link abstraction • Message format: (overlay source, overlay destination, event) • Routers store state through explicit overlay registration • Explicit notification of events that affect the performance of overlay applications • Physical failures of routers or links • Reachability failures: route withdrawal, routing session failure • Network congestion • A few “hello” packets lost
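
The notification message format above, (overlay source, overlay destination, event), maps naturally onto a small record type. The sketch below is illustrative only: the names `Event`, `Notification`, and `wants`, and the per-application event subscription set, are assumptions and not part of the UFO prototype.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class Event(Enum):
    # Event classes listed on the slide
    PHYSICAL_FAILURE = "physical failure of a router or link"
    REACHABILITY_FAILURE = "route withdrawal or routing session failure"
    CONGESTION = "network congestion"
    HELLO_LOSS = "a few hello packets lost"


@dataclass(frozen=True)
class Notification:
    overlay_src: str
    overlay_dst: str
    event: Event


def wants(notification: Notification, subscribed: Set[Event]) -> bool:
    """Hypothetical filter: an application only acts on the event
    classes it has chosen to be notified about."""
    return notification.event in subscribed


if __name__ == "__main__":
    voip_events = {Event.PHYSICAL_FAILURE, Event.CONGESTION}
    n = Notification("A", "B", Event.CONGESTION)
    print(n, wants(n, voip_events))
```

Keeping the overlay link (source, destination) in the message means the overlay never needs to learn the underlying router topology.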

  30. Registration of Overlay Links • Overlay nodes: A, B, C • Routers: 1, 2, 3, 4 • Register for uni-directional overlay links A->B and A->C [Figure: topology of overlay nodes A, B, C and routers 1, 2, 3, 4]

  31. Periodic Registration of Overlay Links • ACK for successful registration [Figure: registration (A,B) installed at routers in the topology of overlay nodes A, B, C and routers 1, 2, 3, 4]

  32. Periodic Registration of Overlay Links • Registration kept as soft state • Periodic re-registration [Figure: registrations (A,B) and (A,C) stored at routers in the topology of overlay nodes A, B, C and routers 1, 2, 3, 4]
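
Soft-state registration means a router forgets an overlay link unless the overlay keeps refreshing it. The following is a minimal sketch of that behavior under stated assumptions: the class `RegistrationTable`, its `lifetime_s` parameter, and the explicit `now` argument (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
from typing import Dict, Set, Tuple

OverlayLink = Tuple[str, str]  # (overlay source, overlay destination)


class RegistrationTable:
    """Per-router table of registered overlay links, kept as soft state."""

    def __init__(self, lifetime_s: float) -> None:
        self.lifetime_s = lifetime_s
        self._expiry: Dict[OverlayLink, float] = {}

    def register(self, src: str, dst: str, now: float) -> bool:
        """Install or refresh a registration; the return value stands in
        for the ACK sent on successful registration."""
        self._expiry[(src, dst)] = now + self.lifetime_s
        return True

    def active_links(self, now: float) -> Set[OverlayLink]:
        """Drop registrations that were not refreshed in time and
        return the ones that remain."""
        stale = [link for link, t in self._expiry.items() if t <= now]
        for link in stale:
            del self._expiry[link]
        return set(self._expiry)


if __name__ == "__main__":
    table = RegistrationTable(lifetime_s=30.0)
    table.register("A", "B", now=0.0)
    table.register("A", "C", now=0.0)
    table.register("A", "B", now=20.0)   # periodic re-registration
    print(table.active_links(now=35.0))  # only the refreshed link remains
```

No explicit teardown message is needed: if an overlay node disappears, its state simply times out.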

  33. Notification of Network Events [Figure: routers holding registrations (A,B) and (A,C) in the topology of overlay nodes A, B, C and routers 1, 2, 3, 4]

  34. Reactive Routing and Lazy Recovery • Assumption: rich connectivity • Reactive routing after notification • Re-route via alternative overlay paths • Disseminate notification message to peers • Lazy recovery • Stick to alternative overlay paths (e.g., for minutes) • Re-register for the failed overlay link • Reason: the transient period while recovery converges can cause loops and blackholes
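
Reactive routing with lazy recovery can be sketched as a path selector with a hold-down timer: switch to an alternative path as soon as a notification arrives, and stay there for a while even if the direct link comes back. This is an illustrative sketch only; the class `OverlayRouter`, the `hold_down_s` parameter, and the choice of the first alternative path are assumptions.

```python
from typing import List, Optional, Tuple

Path = Tuple[str, ...]


class OverlayRouter:
    def __init__(self, direct: Path, alternatives: List[Path],
                 hold_down_s: float) -> None:
        self.direct = direct
        self.alternatives = alternatives
        self.hold_down_s = hold_down_s
        self._hold_until: Optional[float] = None

    def on_notification(self, now: float) -> None:
        """Reactive routing: a failure notification for the direct
        overlay link starts (or restarts) the hold-down period."""
        self._hold_until = now + self.hold_down_s

    def current_path(self, now: float) -> Path:
        """Lazy recovery: keep using the alternative until the hold-down
        expires, so traffic avoids the transient convergence period."""
        if self._hold_until is not None and now < self._hold_until:
            return self.alternatives[0]
        return self.direct


if __name__ == "__main__":
    router = OverlayRouter(("A", "B"), [("A", "C", "B")], hold_down_s=120.0)
    print(router.current_path(0.0))    # direct path before any failure
    router.on_notification(10.0)
    print(router.current_path(11.0))   # alternative path right after notification
    print(router.current_path(130.0))  # back on the direct path after hold-down
```

In a fuller version the node would also forward the notification to its peers and re-register the failed overlay link before returning to it.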

  35. Outline • Efficient overlay forwarding • Overlay forwarding on line cards • Hosting the overlay control plane • Scalable overlay monitoring • Registration of overlay links • Notification of network events • Lazy recovery • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  36. Unicast Registration is Inefficient • Overlay nodes: A, B, C, D, E; routers: 1, 2, 3, 4 • Register for overlay links B->A, C->A, D->A, E->A [Figure: routers store separate entries (B,A), (C,A), (D,A), (E,A)]

  37. Unicast Notification is Inefficient [Figure: routers holding separate entries (B,A), (C,A), (D,A), (E,A) in the topology of overlay nodes A to E and routers 1, 2, 3, 4]

  38. Multicast Registration [Figure: routers store a single GroupA entry in the topology of overlay nodes A to E and routers 1, 2, 3, 4]

  39. Multicast Notification [Figure: routers holding a single GroupA entry in the topology of overlay nodes A to E and routers 1, 2, 3, 4]

  40. Benefits of Multicast Registration/Notification • Reduce registration state stored at routers • Unicast: store state for each (src, dst) pair, O(n^2) • Multicast: store state for each multicast group, O(n) • Reduce notification message overhead • Deployment benefits: • Exploit IP multicast (which routers already have)
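
The state reduction can be checked with a quick count. This sketch assumes a fully connected overlay of n nodes in the worst case where a router sees every overlay link, and one multicast group per destination node (as with GroupA in the earlier slides); the function names are hypothetical.

```python
def unicast_state(num_nodes: int) -> int:
    """Worst-case entries at a router: one per directed (src, dst) pair."""
    return num_nodes * (num_nodes - 1)


def multicast_state(num_nodes: int) -> int:
    """Worst-case entries at a router: one group per destination node."""
    return num_nodes


if __name__ == "__main__":
    for n in (5, 50, 500):
        print(f"n={n:4d} unicast={unicast_state(n):7d} multicast={multicast_state(n):4d}")
```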

  41. Outline • Efficient overlay forwarding • Overlay forwarding on line cards • Hosting the overlay control plane • Scalable overlay monitoring • Registration of overlay links • Notification of network events • Lazy recovery • Enhancing the scalability of UFO • Implementation and Evaluation • Conclusion and deployment

  42. Prototype Implementation on VINI • What’s finished? • RON • Control plane: probing and reactive routing • Data plane: overlay tunnel setup • User Opt-in: user data packets delivered by overlays • UFO: Notification of link failure • What to do next? • UFO • Evaluate inter-domain routing convergence • Notification of link congestion • Run applications: e.g. VoIP

  43. Prototype Implementation on VINI • Overlay: RON • Overlay FIB • Client opt-in • Notification by filter [Figure labels: PlanetLab VM, UML, RON, XORP, IP Router, eth0 to eth3, Control, Data, Overlay FIB, Packet Forward Engine, UmlSwitch element, Tunnel table, Click, Filters, VPN Server, Clients]

  44. Evaluation Setup • Topology • Routers and overlay nodes [Figure: topology with nodes labeled s, d, r]

  45. Evaluation 1: Reactive Routing of RON • How much time does RON take to detect an outage? • RON probe interval: 12s • RON probe timeout: 3s • Average detection time = probe interval / 2 + probe timeout * 3 • What to evaluate? • Fundamental trade-off between probe frequency and detection time • Parameter: probe interval

  46. Evaluation 1: Reactive Routing of RON • Detection time = probe interval / 2 + probe timeout * 3
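
The detection-time formula can be evaluated directly for the RON parameters given in the evaluation (probe interval 12s, probe timeout 3s). The function name `average_detection_time` and the `timeouts_needed` parameter are hypothetical; the factor of 3 comes from the formula on the slide.

```python
def average_detection_time(probe_interval_s: float, probe_timeout_s: float,
                           timeouts_needed: int = 3) -> float:
    """Average outage detection time.

    On average a failure happens halfway through a probe interval, and
    the outage is declared after `timeouts_needed` probe timeouts.
    """
    return probe_interval_s / 2 + probe_timeout_s * timeouts_needed


if __name__ == "__main__":
    # RON settings from the evaluation: 12 / 2 + 3 * 3 = 15 seconds
    print(average_detection_time(12.0, 3.0))
    # Varying the probe interval shows the trade-off being evaluated
    for interval in (1.0, 4.0, 12.0, 30.0):
        print(interval, average_detection_time(interval, 3.0))
```

With the timeout fixed, the interval term is the only part that more frequent probing can shrink, at the cost of more probe traffic.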

  47. Evaluation 2: Comparison of Convergence Speed • Controlled experiment • Fail a link by filtering all the packets • Comparison of convergence speed • IP routing (XORP) • RON reactive routing • Reactive routing with UFO notification

  48. Evaluation 2: Comparison of Convergence Speed • IP routing (XORP) • Hello-interval: 15s • Router-dead-interval: 45s [Figure: timeline marking link down and link up]

  49. Evaluation 2: Comparison of Convergence Speed • RON • Probe interval: 12s • Probe timeout: 3s • Re-route immediately after outage detection [Figure: timeline marking link down, link up, and RON up]

  50. Evaluation 2: Comparison of Convergence Speed • UFO routing with explicit notification • Re-route immediately after outage notification [Figure: timeline marking link down, link up, RON up, and UFO up]
