
Consensus Routing: The Internet as a Distributed System

2009. 2. 26. John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson. Presented by John P. John. Modified by Moonyoung Chung.


Presentation Transcript


  1. Consensus Routing: The Internet as a Distributed System
     2009. 2. 26
     John P. John, Ethan Katz-Bassett, Arvind Krishnamurthy, and Thomas Anderson
     Presented by John P. John
     Modified by Moonyoung Chung

  2. Contents
     • Introduction
     • Motivation and Goals
     • Consensus Routing
     • Stable Mode
     • Transient Mode
     • Evaluation
     • Conclusions
     NSDI '08

  3. Internet Routing
     • A goal of the Internet is global reachability
     • But BGP fails to achieve this goal
     • Physical paths exist, but BGP paths do not
     • 10-15% of BGP updates cause loops and blackholes
     • 90% of all packet losses on the Internet are due to loops

  4. BGP
     • Opaque policy routing
       • Preferred routes are visible to neighbors
       • Underlying policies are not visible and stay under local control
     • Mechanism:
       • Autonomous Systems (ASes) send their preferred path to neighbors
       • If an AS receives a new path, it starts using it right away
       • It forwards the path to its neighbors after some delay
       • The path eventually propagates to all ASes

  5. Example
     [Diagram: ASes 1 to 5, with AS5 as the destination. Route entries for destination 5 shown in the figure: 4-5 and 3-4-5; 1-5; 4-5 and 2-4-5.]

  6. BGP link failure
     [Diagram: same topology as the previous slide, ASes 1 to 5 with AS5 as the destination.]
     • Link 4-5 fails
     • AS4 withdraws the path from upstream ASes

  7. BGP link failure
     [Diagram: same topology, after the withdrawal from AS4.]
     • AS2 and AS3 pick their next best paths
     • A routing loop is formed!
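
     The loop on this slide can be checked mechanically by following next hops. The sketch below is only an illustration: the helper name `find_loop` and the next-hop table are assumptions, chosen to match the case where AS2 falls back to 3-4-5 and AS3 falls back to 2-4-5 once link 4-5 is gone.

```python
def find_loop(next_hop, src, dst):
    """Follow next hops from src toward dst.

    Returns the list of ASes forming a loop (first AS repeated at the end),
    or None if the walk reaches dst or runs out of routes.
    """
    visited = []
    node = src
    while node != dst:
        if node in visited:
            # We came back to an AS already on the walk: that is a loop.
            return visited[visited.index(node):] + [node]
        visited.append(node)
        node = next_hop.get(node)
        if node is None:
            return None  # no route at this AS (a blackhole, not a loop)
    return None


# Hypothetical next hops toward AS5 after link 4-5 fails:
# AS2 now prefers 3-4-5 and AS3 now prefers 2-4-5.
next_hop_after_failure = {2: 3, 3: 2}
loop = find_loop(next_hop_after_failure, src=2, dst=5)
print(loop)
```

     Each AS made a locally sensible choice from the paths it had stored; the loop only appears when the two forwarding tables are read together.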

  8. BGP policy change
     [Diagram: ASes 1 to 6, with AS5 as the destination. Route entries for destination 5 shown in the figure include 4-5, 3-4-5, 2-4-5, 6-4-5, and 1-5.]
     • AS4 wants all traffic destined for AS5 to come through AS6
     • AS4 withdraws the path from AS2 and AS3

  9. BGP policy change
     [Diagram: same topology, after AS4 withdraws the path from AS2 and AS3.]
     • AS2 and AS3 pick their next best paths
     • A routing loop is formed!

  10. Lack of Consistency
     • The underlying cause of all these problems is inconsistent global state
     • Events that trigger it:
       • Link failures
       • Traffic engineering
       • Scheduled maintenance
       • Links coming up
     • Protocol behavior is complex and unpredictable
     • There is no indicator of when the system has converged to a consistent state

  11. Motivation and Goal
     • Goal:
       • Networks that have high availability
     • Insight:
       • Consistency is the key

  12. Consensus Routing
     • Lesson from distributed system design:
       • Decouple safety and liveness
     • Safety: forwarding tables are always consistent and policy compliant, based on a consistent view of global state
     • Liveness: the routing system adapts to failures quickly and maintains high availability

  13. Safety: Stable Mode
     • Problem: Inconsistent state
     • Solution:
       • Apply updates only after they have reached all dependent ASes
       • Apply updates synchronously across ASes

  14. Stable Mode
     • Consistent view of global state
     • Stable Forwarding Table (SFT)
       • at the kth epoch
     • Update log
     • Distributed snapshot
     • Frontier computation
     • SFT computation
     • View change

  15. Update log
     [Diagram: ASes 1 to 6.]
     • ASes compute and forward routes as before, but don't apply them to the forwarding table

  16. Distributed Snapshot
     [Diagram: ASes 1 to 6.]
     • Steps so far:
       • Run BGP, but don't apply the updates
     • Periodically, a distributed snapshot is taken
     • Some node(s) call for the (k+1)th distributed snapshot
     • Updates in transit or being processed are marked incomplete

  17. Frontier Computation: Aggregation
     [Diagram: ASes 1 to 6, with consolidators marked.]
     • Steps so far:
       • Run BGP, but don't apply the updates
       • Distributed snapshot
     • ASes send a snapshot report to the consolidators:
       • the saved sequence of updates
       • the set of incomplete updates
     * frontier: the most recent complete update at each AS

  18. Frontier Computation: Consensus
     [Diagram: ASes 1 to 6, with consolidators marked.]
     • Steps so far:
       • Run BGP, but don't apply the updates
       • Distributed snapshot
       • Send info to consolidators
     • Consolidators run a consensus algorithm to agree on the set of incomplete updates

  19. Frontier Computation: Flood
     [Diagram: ASes 1 to 6, with consolidators marked.]
     • Steps so far:
       • Run BGP, but don't apply the updates
       • Distributed snapshot
       • Send info to consolidators
       • Consensus
     • Consolidators flood the incomplete set to all the ASes

  20. SFT Computation & View Change
     [Diagram: ASes 1 to 6.]
     • Steps so far:
       • Run BGP, but don't apply the updates
       • Distributed snapshot
       • Send info to consolidators
       • Consensus
       • Flood
     • Apply completed updates
     • Versioning, garbage collection
     • Details and proof of consistency are in the paper
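
     The stable-mode steps on slides 15 to 20 can be walked through in a toy form. The sketch below is a heavily simplified illustration, not the paper's protocol: the class and function names are hypothetical, the consensus step is replaced by a plain set union, and dependencies between updates, versioning, and garbage collection are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class AutonomousSystem:
    name: int
    sft: dict = field(default_factory=dict)  # destination -> path used for forwarding
    log: list = field(default_factory=list)  # (update_id, destination, path), not yet applied

    def receive(self, update_id, dest, path):
        # Update log: record the route, but leave the forwarding table alone.
        self.log.append((update_id, dest, path))

    def snapshot_report(self, in_flight):
        # Report the saved update ids and the ones still marked incomplete.
        saved = [update_id for update_id, _, _ in self.log]
        incomplete = {u for u in saved if u in in_flight}
        return saved, incomplete

    def apply_epoch(self, incomplete):
        # SFT computation: apply complete updates, keep the rest for a later epoch.
        remaining = []
        for update_id, dest, path in self.log:
            if update_id in incomplete:
                remaining.append((update_id, dest, path))
            else:
                self.sft[dest] = path
        self.log = remaining


def agree_on_incomplete(reports):
    # Stand-in for the consolidators' consensus step (assumption: simple union).
    incomplete = set()
    for _saved, reported_incomplete in reports:
        incomplete |= reported_incomplete
    return incomplete


as2, as3 = AutonomousSystem(2), AutonomousSystem(3)
as2.receive("u1", 5, [4, 5])
as3.receive("u1", 5, [4, 5])
as2.receive("u2", 5, [6, 4, 5])  # u2 has not reached AS3 when the snapshot is taken
in_flight = {"u2"}

reports = [a.snapshot_report(in_flight) for a in (as2, as3)]
incomplete = agree_on_incomplete(reports)  # this set is what gets flooded to all ASes
for a in (as2, as3):
    a.apply_epoch(incomplete)

print(as2.sft, as3.sft)
```

     Both ASes end the epoch with the same entry for destination 5, and the incomplete update stays in AS2's log until a later snapshot finds it complete.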

  21. Mechanism
     • Other details in the paper:
       • Transition between epochs
       • Slow/unresponsive ASes
       • Failed ASes
       • Reintegration of failed ASes
       • Provable safety and liveness properties

  22. Transient Mode: Liveness
     • Problem: Upon link failure, an AS would need to wait until the new path reaches everyone
     • Solution: Dynamically re-route around the failed link
     • Use existing techniques:
       • Pre-computed backup paths
       • Deflection
       • Detour routing
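
     A per-packet view of the split between the two modes might look like the sketch below. It is an assumption-laden illustration: the function name, the table shapes, and the `link_is_up` callback are all hypothetical, and the fallback simply tries neighbors in order rather than modeling any specific deflection, backup, or detour scheme.

```python
def choose_next_hop(dest, sft_next_hop, backups, link_is_up):
    """Pick a next hop for dest.

    sft_next_hop: destination -> next hop taken from the stable forwarding table
    backups:      destination -> ordered list of alternative neighbors (assumed)
    link_is_up:   callable telling whether the link to a neighbor works
    """
    primary = sft_next_hop.get(dest)
    if primary is not None and link_is_up(primary):
        return "stable", primary
    # The stable route is unusable: route around the failure right away
    # instead of waiting for a new path to reach every AS.
    for neighbor in backups.get(dest, []):
        if link_is_up(neighbor):
            return "transient", neighbor
    return "drop", None


failed_neighbors = {4}


def link_is_up(neighbor):
    return neighbor not in failed_neighbors


decision = choose_next_hop(5, {5: 4}, {5: [3, 1]}, link_is_up)
print(decision)
```

     The stable table is left untouched in this sketch; only the forwarding decision changes while the failure lasts.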

  23. Routing Deflection
     [Diagram: source S, nodes 1 to 3, destination D.]
     • Deflect the packet to a neighbor
     • The packet traverses a different route

  24. Backtracking
     [Diagram: source S, nodes 1 to 4, destination D, with a backtracking step marked.]

  25. Detour Routing
     [Diagram: source S tunnels packets to B (labeled Tier 1); other nodes shown are 3, 4, and 5, with destination D.]
     • B is responsible for forwarding packets

  26. Backup routes
     • Pre-computed failover paths
     • e.g., R-BGP, a scheme for pre-computing backup routes to each destination

  27. BGP
     [Figure: connectivity (from completely unreachable to global reachability) over time. Marked events: link failure (or other BGP event), BGP converges to alternate path.]

  28. Consensus Routing
     [Figure: connectivity (from completely unreachable to global reachability) over time. Marked events: link failure (or other BGP event), switch to transient routing, snapshot.]

  29. Evaluation
     • In the talk, answer the following:
       • How does consensus routing affect connectivity?
       • What is the traffic overhead?
     • Methodology
       • Extensive simulations on realistic Internet-scale topologies
       • An implemented XORP prototype
       • Experiments on PlanetLab

  30. Methodology
     [Diagram: ASes 1 to 5.]
     • 23,390 ASes, 46,095 links
     • 9,100 multi-homed stub ASes
     • Fail each access link of each multi-homed stub AS
     • See what fraction of ASes are temporarily disconnected until convergence
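
     The measured quantity, the fraction of ASes that cannot reach the destination while routing is still converging, can be illustrated on a tiny hand-written table. This sketch is not the simulator used in the evaluation: the function names and the five-AS next-hop table are assumptions made up for the example.

```python
def reaches(next_hop, src, dst):
    """Return True if following next hops from src arrives at dst."""
    seen = set()
    node = src
    while node != dst:
        if node is None or node in seen:
            return False  # blackhole or forwarding loop
        seen.add(node)
        node = next_hop.get(node)
    return True


def fraction_disconnected(next_hop, ases, dst):
    """Fraction of source ASes whose packets do not reach dst."""
    sources = [a for a in ases if a != dst]
    unreachable = sum(1 for a in sources if not reaches(next_hop, a, dst))
    return unreachable / len(sources)


# Hypothetical next hops toward AS5 shortly after a link failure:
# AS1 still has a working route, AS2 and AS3 point at each other, AS4 has no route.
next_hop = {1: 5, 2: 3, 3: 2, 4: None}
fraction = fraction_disconnected(next_hop, ases=[1, 2, 3, 4, 5], dst=5)
print(fraction)
```

     Repeating this measurement for every failed access link gives one data point per failure case.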

  31. Connectivity
     • Consensus routing maintains complete connectivity in over 99% of the cases
     • BGP maintains complete connectivity in < 40% of the failure cases

  32. Overhead
     [Figure: traffic overhead.]
     • The entire update is not sent, only the identifiers of the updates

  33. Conclusions
     • BGP's transient problems are due to inconsistent global state
     • Consensus routing enables consistent routing state with opaque policies
       • Key technique: separation of safety and liveness
     • We can have an Internet that has high availability!
