
On Understanding of Transient Interdomain Routing Failures

Feng Wang, Lixin Gao, Jia Wang, and Jian Qiu. Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01002. AT&T Labs-Research, 180 Park Ave, Florham Park, NJ 07869.


Presentation Transcript


  1. On Understanding of Transient Interdomain Routing Failures • Feng Wang, Lixin Gao, Jia Wang, and Jian Qiu • Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, MA 01002 • AT&T Labs-Research, 180 Park Ave, Florham Park, NJ 07869

  2. Outline • What are transient routing failures? • When can transient routing failures occur? • How long can transient routing failures last? • Measurement results

  3. Internet Routing • Autonomous systems (ASes) • Internet Service Providers (ISPs) • Companies • Universities • Intradomain Routing Protocols • Static Routing, OSPF, IS-IS • Interdomain Routing Protocol • Border Gateway Protocol (BGP)

  4. Long Convergence Delay • Long convergence delay (Labovitz et al., TON 2001) • Bringing a route back (Tup): < shortest path length × MRAI • Disconnecting a route (Tdown): < longest path length × MRAI • Fail-over: rerouting from Path A to Path B • While Path B is being discovered, routers might experience transient routing failures, i.e., no route is available
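
As a rough illustration of the two bounds on this slide, the sketch below multiplies a path length by the MRAI value. The function name `convergence_bound` and the path lengths 3 and 7 are hypothetical, chosen only to show the arithmetic; the 30-second figure is the default eBGP MRAI value listed on slide 12.

```python
# Sketch only: the bound is "path length x MRAI", as stated on the slide.
EBGP_MRAI_SECONDS = 30  # default eBGP MRAI value from slide 12


def convergence_bound(path_length_hops: int,
                      mrai_seconds: float = EBGP_MRAI_SECONDS) -> float:
    """Return the upper bound (in seconds) on convergence delay."""
    if path_length_hops < 0:
        raise ValueError("path length must be non-negative")
    return path_length_hops * mrai_seconds


# Hypothetical topology: shortest path of 3 AS hops, longest of 7 AS hops.
t_up_bound = convergence_bound(3)    # bound for bringing a route back
t_down_bound = convergence_bound(7)  # bound for disconnecting a route
print(f"Tup < {t_up_bound} s, Tdown < {t_down_bound} s")
```

With these assumed path lengths, the Tdown bound is more than twice the Tup bound, which is why withdrawing a route is the slower case.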

  5. An Example of Transient Routing Failure [Figure: AS0, AS1, AS2, and AS3 with destination d; update labels A:10 and W:20; path labels 10, 20, 120, and 210; legend: BGP update, BGP routing table, traffic on data plane, losing reachability]
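
The fail-over behavior behind this example can be sketched as a replay of BGP updates against one router's table, recording the intervals in which the table holds no route to the destination. This is a simplified model, not the authors' method: the function `no_route_intervals`, the event format, and the timestamps are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[int, ...]                     # AS path, e.g. (1, 0) = via AS1 to AS0
Event = Tuple[float, int, Optional[Path]]  # (time, neighbor, path); None = withdrawal


def no_route_intervals(events: List[Event],
                       end_time: float) -> List[Tuple[float, float]]:
    """Replay updates and return the intervals during which no route is held."""
    rib: Dict[int, Path] = {}              # routes learned, keyed by neighbor
    gaps: List[Tuple[float, float]] = []
    gap_start: Optional[float] = None
    for time, neighbor, path in sorted(events, key=lambda e: e[0]):
        if path is None:
            rib.pop(neighbor, None)        # withdrawal removes that neighbor's route
        else:
            rib[neighbor] = path           # announcement installs or replaces it
        if not rib and gap_start is None:
            gap_start = time               # last route just disappeared
        elif rib and gap_start is not None:
            gaps.append((gap_start, time)) # an alternate route arrived
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, end_time)) # still unreachable at the end
    return gaps


# Hypothetical timeline: route via AS1 is withdrawn at t=10 s and the
# alternate route via AS2 is only announced at t=40 s.
events = [(0, 1, (1, 0)), (10, 1, None), (40, 2, (2, 1, 0))]
print(no_route_intervals(events, end_time=60))
```

In this assumed timeline the router has no route between the withdrawal and the later announcement, even though an alternate path exists in the network the whole time.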

  6. Our Contributions • Identify transient routing failures • Sufficient conditions • Bound transient routing failure duration

  7. Outline • What are transient routing failures? • When can transient routing failures occur? • How long can transient routing failures last? • Measurement results

  8. When Can Transient Routing Failures Occur? • Two sufficient conditions under which a node must experience a transient routing failure (a certain transient routing failure). • One sufficient condition under which a node may experience a transient routing failure (a potential transient routing failure). [Figure: topology of nodes 0, 1, 2, and 3 with path labels 10, 20, 210, and 310 and markers w]

  9. When Can Transient Routing Failures Occur? (contd.) [Figure: topology of nodes 0, 1, 2, and 3 with path labels 10, 20, 210, 310, and 320 and markers w and A]

  10. Outline • What are transient routing failures? • When can transient routing failures occur? • How long can transient routing failures last? • Measurement results

  11. How Long Do Transient Routing Failures Last? [Figure: nodes 0, 1, and 2 with destination d; update labels A: 10 and W: 20; path labels 10, 120, and 210; two MRAI timers]

  12. MRAI Timers • Minimum Route Advertisement Interval (MRAI) timer • Minimum amount of time that must elapse between successive routing updates • Applied to BGP announcements or withdrawals • Default MRAI values • eBGP session: 30 seconds • iBGP session: 5 seconds
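
The rate-limiting effect of the MRAI timer can be sketched as a small per-session limiter. This is a minimal model under stated assumptions, not how any particular router implements it: the class `MraiLimiter`, its methods, and the string-valued updates are hypothetical, and real implementations differ in details such as per-prefix versus per-peer timers and jitter.

```python
from typing import Optional


class MraiLimiter:
    """Hold back updates so that at least `mrai` seconds separate two sends."""

    def __init__(self, mrai_seconds: float) -> None:
        self.mrai = mrai_seconds
        self.last_sent: Optional[float] = None  # time of the last update sent
        self.pending: Optional[str] = None      # latest update being held back

    def submit(self, now: float, update: str) -> Optional[str]:
        """Return the update if it may go out now, otherwise hold it."""
        if self.last_sent is None or now - self.last_sent >= self.mrai:
            self.last_sent = now
            self.pending = None                 # anything older is superseded
            return update
        self.pending = update                   # newer update replaces a held one
        return None

    def tick(self, now: float) -> Optional[str]:
        """Release the held update once the timer has expired."""
        if self.pending is not None and now - self.last_sent >= self.mrai:
            update, self.pending = self.pending, None
            self.last_sent = now
            return update
        return None


# eBGP session with the default 30-second MRAI from the slide.
session = MraiLimiter(30)
print(session.submit(0, "A:10"))  # sent immediately
print(session.submit(5, "W:20"))  # held: only 5 s since the last update
print(session.tick(30))           # released when the timer expires
```

The held update in this run waits 25 seconds before leaving the router, which is the kind of delay that lets a neighbor sit without a route during fail-over.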

  13. Upper Bound for Transient Routing Failure Duration • Transient routing failure duration ≤ min(d_u + d_u) × MRAI, where the indices that distinguish the two d_u terms did not survive extraction [Figure: nodes u, v, and 0]

  14. Occurrence of Transient Failures in a Typical BGP System • In a typical BGP system, transient failures are prevalent. • Tier-1 ASes can experience transient routing failures, where alternate routes come from their edge routers. • Non-tier-1 ASes can experience transient routing failures, where alternate routes are obtained from other ASes.

  15. Outline • What are transient routing failures? • When can transient routing failures occur? • How long can transient routing failures last? • Measurement results

  16. Measuring Transient Failures within a Tier-1 AS • BGP updates, BGP tables, and router configuration files were collected during July 2004 [Figure: cumulative distribution of transient failure duration] [Figure: percentage of transient failures among all routing failures that last less than 30 seconds]

  17. Measuring Transient Failures (contd.) • Transient failures in tier-2 ASes, measured using Oregon RouteViews BGP updates (July 2004)

  18. Popularity of Prefixes Experiencing Transient Failures • We aggregate the NetFlow data collected in the tier-1 AS during the week of 1/2/2005 to 1/8/2005 • Transient routing failures can affect both popular and unpopular prefixes [Figure: fraction of transient routing failures]

  19. Conclusions • Transient routing failures are prevalent in the Internet and can last for a significant period of time. • The majority of transient failures occur under commonly applied routing policy settings. • Both popular and unpopular prefixes can experience transient failures.

  20. Thanks
