
A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance

Feng Wang (1), Zhuoqing Morley Mao (2), Jia Wang (3), Lixin Gao (1), Randy Bush (4). (1) University of Massachusetts, Amherst; (2) University of Michigan; (3) AT&T Labs-Research; (4) Internet Initiative Japan.


Presentation Transcript


  1. A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance
     Feng Wang (1), Zhuoqing Morley Mao (2), Jia Wang (3), Lixin Gao (1), Randy Bush (4)
     (1) University of Massachusetts, Amherst; (2) University of Michigan; (3) AT&T Labs-Research; (4) Internet Initiative Japan
     September 14, 2006

  2. Motivation
     • Real-time services have made high availability of end-to-end Internet paths of paramount importance.
       • low packet loss rate, low delay, high network availability, and fast reaction time
     • Internet path failures are widespread [Labovitz:98, Markopoulou:04, Feamster:03].
       • can last as long as 10 minutes
     • Degraded end-to-end path performance is correlated with routing dynamics.

  3. Open Questions
     • How do routing changes result in degraded end-to-end path performance?
     • What kinds of routing dynamics cause degraded end-to-end performance?
     • How do factors such as topological properties or routing policies affect performance degradation?

  4. Our Work
     • Study end-to-end performance under realistic topologies.
     • Investigate several metrics to characterize end-to-end loss, delay, and out-of-order packets.
     • Characterize the kinds of routing changes that impact end-to-end path performance.
     • Analyze the impact of topology, routing policies, the MRAI timer, and iBGP configurations on end-to-end path performance.

  5. Methodology
     • A multi-homed prefix
       • BGP Beacon prefix: 192.83.230.0/24
     • Controlled routing changes
       • Failover events: the Beacon changes from having both providers to having only a single provider.
       • Recovery events: the Beacon changes from having a single provider for connectivity to having both providers.
     [Figure: the Beacon and its links to Provider 1 and Provider 2 before and after a failover event and a recovery event.]
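
The two event definitions above can be sketched as a small classifier over the set of providers the Beacon is attached to before and after a change. This is only an illustration: the function name `classify_event` and the provider labels are assumptions, not part of the study's tooling.

```python
from __future__ import annotations


def classify_event(before: set[str], after: set[str]) -> str:
    """Label a Beacon state change using the slide's two definitions.

    `before` and `after` are the sets of providers (hypothetical labels)
    that give the Beacon connectivity before and after the change.
    """
    if len(before) == 2 and len(after) == 1 and after < before:
        return "failover"   # both providers -> a single provider
    if len(before) == 1 and len(after) == 2 and before < after:
        return "recovery"   # a single provider -> both providers
    return "other"          # not one of the two controlled event types


both = {"Provider 1", "Provider 2"}
print(classify_event(both, {"Provider 2"}))
print(classify_event({"Provider 2"}, both))
```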

  6. Active Probing
     • From 37 PlanetLab hosts to the Beacon host (a host within the Beacon prefix)
       • Back-to-back traceroutes
       • Back-to-back pings
       • UDP probing (50 ms interval)
     • Data plane performance metrics
     [Figure: hosts A, B, and C probing the Beacon host across the Internet via Provider 1 and Provider 2.]

  7. Packet Loss
     • Loss burst: consecutive UDP probing packets lost during a routing change event.
     [Figure: panels for Failover and Recovery.]
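
A minimal sketch of extracting loss bursts from a probe trace, following the definition above. It assumes that each UDP probe carries a sequence number and that the receiver logs which sequence numbers arrived; the names `loss_bursts`, `sent_count`, and `received` are hypothetical.

```python
from __future__ import annotations


def loss_bursts(sent_count: int, received: set[int]) -> list[tuple[int, int]]:
    """Return (first_lost_seq, length) for each maximal run of consecutive lost probes.

    Probes are assumed to be numbered 0 .. sent_count - 1.
    """
    bursts = []
    start = None  # sequence number where the current burst began, if any
    for seq in range(sent_count):
        if seq not in received:
            if start is None:
                start = seq
        elif start is not None:
            bursts.append((start, seq - start))
            start = None
    if start is not None:  # trace ended while probes were still being lost
        bursts.append((start, sent_count - start))
    return bursts


# Probes 2-4 and 7-8 are missing from the receiver log.
print(loss_bursts(10, {0, 1, 5, 6, 9}))
```

Each burst is reported by its first lost sequence number and its length in packets, which is the unit the deck uses when it reports burst lengths.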

  8. Correlating Packet Loss with Routing Failures
     • ICMP replies
       • temporary loss of reachability (!N or !H)
       • forwarding loops (exceeded TTL)
     • Routing failures
       • temporary loss of reachability and transient routing loops
     • Correlate loss bursts with ICMP messages
       • time window [-1 sec, 1 sec]
     • This underestimates the number of loss bursts due to routing failures
       • some ICMP packets are missing.
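
The correlation step can be sketched as a time-window match between loss bursts and ICMP messages. The slide gives only the window [-1 sec, 1 sec]; applying it around the whole burst interval is an assumption made here, as are the names `explained_by_icmp` and `icmp_times`.

```python
from __future__ import annotations

from bisect import bisect_left


def explained_by_icmp(burst_start: float, burst_end: float,
                      icmp_times: list[float], window: float = 1.0) -> bool:
    """True if some ICMP message time lies in [burst_start - window, burst_end + window].

    `icmp_times` must be sorted in ascending order (seconds).
    """
    # Index of the first ICMP message at or after the window's left edge.
    i = bisect_left(icmp_times, burst_start - window)
    return i < len(icmp_times) and icmp_times[i] <= burst_end + window


icmp_times = [10.0, 50.5]  # made-up timestamps, for illustration only
print(explained_by_icmp(10.8, 12.0, icmp_times))
print(explained_by_icmp(20.0, 21.0, icmp_times))
```

A burst with no matching ICMP message stays unidentified, which is why missing ICMP packets lead to an underestimate.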

  9. An Example
     • planet02.csc.ncsu.edu experiences packet loss on July 30, 2005.

  10. Loss Bursts due to Routing Failures
      • Failover events: 76% packets lost
      • Recovery events: 26% packets lost
      [Figure: panels for Failover and Recovery.]

  11. How Do Routing Failures Occur? (Failover)
      • Prefer-customer routing policy: routes received from a provider's customers are always preferred over those received from its peers.
      [Figure: Provider 1 and Provider 2 (routers R1 to R6), joined by a peer link and attached by customer links to Beacon AS 0, with AS-path labels on the routers.]
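
The prefer-customer policy can be sketched as a ranking over the neighbor type a route was learned from. The slide states only that customer routes beat peer routes; ranking provider routes last and breaking ties on AS-path length are assumptions of this sketch, and `Route`, `REL_RANK`, and `best_route` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass

# Lower rank = more preferred. Customer over peer is from the slide;
# placing provider routes last is an assumption.
REL_RANK = {"customer": 0, "peer": 1, "provider": 2}


@dataclass(frozen=True)
class Route:
    as_path: tuple[int, ...]   # e.g. (2, 0) = via AS 2 to Beacon AS 0
    learned_from: str          # "customer", "peer", or "provider"


def best_route(routes: list[Route]) -> Route:
    """Pick the preferred route: neighbor type first, then shorter AS path (assumed tie-break)."""
    return min(routes, key=lambda r: (REL_RANK[r.learned_from], len(r.as_path)))


direct = Route(as_path=(0,), learned_from="customer")
via_peer = Route(as_path=(2, 0), learned_from="peer")
print(best_route([via_peer, direct]))   # the customer route wins
print(best_route([via_peer]))           # after the customer link fails
```

Under this ranking a router keeps using the customer route until it is withdrawn, and only then falls back to the route through the peer.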

  12. How Do Routing Failures Occur? (Failover, contd.)
      • No-valley routing policy: peers do not transit traffic from one peer to another.
      [Figure: Provider 1, Provider 2, and Provider 3 (routers R1 to R9), joined by peer links, with Provider 1 and Provider 2 attached to Beacon AS 0 and AS-path labels on the routers.]
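
The no-valley policy is commonly expressed as an export rule. The sketch below uses the usual valley-free formulation (routes learned from customers are exported to everyone; routes learned from peers or providers are exported only to customers), which covers the peer-to-peer case on the slide. The function name `may_export` is hypothetical.

```python
def may_export(learned_from: str, export_to: str) -> bool:
    """Valley-free export check.

    Both arguments are one of "customer", "peer", or "provider" and describe
    the neighbor relationship from the exporting AS's point of view.
    """
    return learned_from == "customer" or export_to == "customer"


# A route learned from one peer is not passed to another peer.
print(may_export("peer", "peer"))
# A customer's route (such as the Beacon's) may be announced to a peer.
print(may_export("customer", "peer"))
```

Because of this rule, an AS that loses its customer route may have no usable alternative from its peers' peers, even though a physical path exists.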

  13. How Do Routing Failures Occur? (Recovery)
      • iBGP constraint: a route received from one iBGP router cannot be re-advertised to another iBGP router.
      • Sequence of events:
        1. Path 0 recovery.
        2. R3 sends the path to R2.
        3. R2 sends a withdrawal to R1.
        4. R3 sends the recovery path to R1.
        5. R1 regains its connection to the Beacon.
      [Figure: routers R1 to R4 in Provider 1 and Provider 2 and Beacon AS 0, annotated with the messages Path (0) and Withdraw (2 0).]
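
The iBGP constraint can be sketched as a one-line re-advertisement check. It assumes a full iBGP mesh without route reflectors or confederations, which relax this rule; `may_readvertise` and the session labels are hypothetical names.

```python
def may_readvertise(learned_via: str, send_via: str) -> bool:
    """Full-mesh iBGP rule: a route learned over iBGP is not passed on over iBGP.

    Both arguments are "ebgp" or "ibgp".
    """
    return not (learned_via == "ibgp" and send_via == "ibgp")


# R3 learns the recovered path (0) over eBGP and may send it to R2 over iBGP.
print(may_readvertise("ebgp", "ibgp"))
# R2 learned that path from R3 over iBGP, so it may not relay it to R1 over iBGP.
print(may_readvertise("ibgp", "ibgp"))
```

This is why, in the sequence above, R1 has to wait for R3's own announcement: R2 can withdraw its old route but cannot hand R1 the new one.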

  14. Summary
      • During failover and recovery events
        • Routing changes impact packet loss significantly.
        • Multiple loss bursts are observed in 60% of events.
        • Routing changes can lead to long packet round-trip delays and reordering.
        • Loss bursts explained by routing failures last longer than unidentified ones.
        • Loss bursts caused by forwarding loops last longer than those caused by loop-free routing failures.

  15. Conclusions
      • During failover and recovery events
        • routing failures contribute significantly to end-to-end packet loss.
      • Routing policies, iBGP configuration, and MRAI timer values play a major role in causing packet loss during routing events.
      • Degraded end-to-end performance can be experienced by a diverse set of hosts when there is a routing change.
      • Accommodating routing redundancy may eliminate the majority of identified path failures.

  16. The End Thanks!

  17. Location of Loss Bursts (Failover events)
      • Location of the first loss bursts caused by routing failures.
      • From ISP 2's BGP updates:
        • Routing failures do occur and are not visible from ICMP messages due to their short duration.
      • From another AS's BGP updates and Oregon RouteViews:
        • Routing failures cascade to other ASes.

  18. Location of Loss Bursts (Recovery events)
      • Location of the first loss bursts caused by routing failures.
      • BGP updates from ISP 2
        • 12 withdrawals over 724 recovery events

  19. Representativeness
      • Connectivity of destination prefixes
        • SS: Single-homed prefixes via a single upstream link
        • SM: Single-homed prefixes via multiple upstream links
        • MS: Multi-homed prefixes via a single upstream link
        • MM: Multi-homed prefixes via multiple upstream links
      • Routing tables from one tier-1 ISP on January 15, 2006
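
The four connectivity classes can be sketched as a two-letter label built from two counts. How the study derived these counts from the routing tables is not stated on the slide; the parameters `num_providers` and `num_upstream_links` and the function `connectivity_class` are hypothetical.

```python
def connectivity_class(num_providers: int, num_upstream_links: int) -> str:
    """Return "SS", "SM", "MS", or "MM" for a destination prefix.

    First letter: single-homed (one provider) or multi-homed (more than one).
    Second letter: reached via a single upstream link or via multiple links.
    """
    if num_providers < 1 or num_upstream_links < 1:
        raise ValueError("a reachable prefix needs at least one provider and one link")
    homing = "S" if num_providers == 1 else "M"
    links = "S" if num_upstream_links == 1 else "M"
    return homing + links


print(connectivity_class(num_providers=2, num_upstream_links=2))
```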

  20. Representativeness (contd.)
      • Multi-homed destination prefixes
      [Figure: a destination, ISP 1, ISP 2, and ISP 3, connected by customer links and a peer link.]

  21. Representativeness (contd.)
      • Multi-homed destination prefixes with multiple upstream links
      [Figure: two panels, each showing ISP 1 and ISP 2.]

  22. Loss Burst Length
      • Loss burst length can be as long as 480 packets for failover events and 180 packets for recovery events.
      [Figure: loss burst length; panels for Failover events and Recovery events.]
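
To put these burst lengths in time units, a rough conversion multiplies the number of lost probes by the 50 ms UDP probing interval from the Active Probing slide. This is an approximation rather than a figure reported in the deck; `approx_outage_seconds` is a hypothetical helper.

```python
PROBE_INTERVAL_MS = 50  # UDP probing interval stated on the Active Probing slide


def approx_outage_seconds(burst_length_packets: int,
                          interval_ms: int = PROBE_INTERVAL_MS) -> float:
    """Approximate outage duration as (lost probes) x (probe interval)."""
    return burst_length_packets * interval_ms / 1000


print(approx_outage_seconds(480))  # longest failover burst
print(approx_outage_seconds(180))  # longest recovery burst
```

Under this approximation, the longest bursts correspond to about 24 seconds of lost connectivity for failover events and about 9 seconds for recovery events.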

  23. Multiple Loss Bursts
      • Multiple loss bursts occur after the injection of a withdrawal message or an announcement.
      [Figure: panels for Failover and Recovery.]

  24. Methodology Evaluation
      • Our measurement is not significantly biased by ICMP blocking.
        • The number of ICMP messages in the absence of routing change is small (0.6%).
        • ICMP messages come from 68 ASes, and 53% of them belong to 10 tier-1 ASes.
        • 52% of ISP 1's routers and 95% of ISP 2's routers generate ICMP messages.
