
Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding

Andrey Ermolinskiy, Scott Shenker. University of California – Berkeley and ICSI.


Presentation Transcript


  1. Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding
     Andrey Ermolinskiy, Scott Shenker
     University of California – Berkeley and ICSI

  2. What’s the problem?
     • One of the central goals of the Internet is continuous end-to-end connectivity
     • BGP convergence is a major cause of connectivity disruption
     • Routers operate upon potentially inconsistent local views
     • Temporary inconsistencies give rise to anomalies such as loops and black holes that disrupt end-to-end packet delivery

  3. Example: transient routing loop with BGP
     [Figure: AS-level topology of domains A through G, with each domain's ranked routes to A. F: 1. ECBA, 2. GA. E: 1. CBA, 2. DBA. C: 1. BA, 2. DBA. D: 1. BA, 2. CBA. B withdraws route BA.]

  4. Example: transient routing loop with BGP
     Routing loop between C and D incurs temporary loss of connectivity between {B, C, D, E, F} and A.
     [Figure: same topology and route tables as slide 3.]

  5. Related Work
     • Shrinking the convergence time window through BGP protocol extensions
        • Ghost flushing
        • Consistency assertions
     • Protecting end-to-end packet delivery from adverse effects of convergence
        • R-BGP
           • Forward packets on pre-computed failover paths
           • Propagate root cause information to prevent loops
        • Consensus Routing
           • Enforce a globally-consistent view via distributed snapshots and strategically delay adoption of incoming BGP updates
        • Anomaly-Cognizant Forwarding

  6. Anomaly-Cognizant Forwarding (ACF)
     • Approach
        • Accept routing anomalies as an unavoidable fact
        • Protect end-to-end packet delivery by detecting and recovering from anomalies on the forwarding path
     • Main hypothesis
        • Several simple and lightweight extensions to conventional IP forwarding enable us to sustain packet delivery during periods of BGP instability
           • without the use of pre-computed backup paths
           • without modifying the core routing protocol or altering its timing dynamics

  7. ACF Overview
     • Domain S has anomalous forwarding state for destination D if S’s outgoing packets destined for D arrive back at S as a result of a routing loop.
     • Main idea of ACF:
        • Detect occurrences of anomalous state
        • Avoid forwarding packets via domains that are known to have anomalous state.
     • Packet header fields:
        • pathTrace: each packet carries a list of prior AS-level hops
        • blackList: each packet carries a list of domains with anomalous state
     [Figure: source S, destination D, and a domain with anomalous forwarding state between them.]

  8. ACF Overview
     Forward(packet p) {
        if (localASNum in p.pathTrace)
           Move loop elements from p.pathTrace to p.blackList
        nextHop ← lookupNextHop(p.destAddr)
        if (nextHop in p.blackList)
           Invoke the control plane, look for alternate non-blacklisted routes in the RIB
        if (nextHop != NONE) {
           Append localASNum to p.pathTrace
           SendPacket(p, nextHop)
        } else
           Initiate recovery-mode forwarding for p
     }
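The Forward procedure on slide 8 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Packet`, `forward`, and `rib` are hypothetical, `rib[dest]` is assumed to hold candidate next-hop domains in preference order (its first entry standing in for the forwarding-table lookup), and "loop elements" is read as the pathTrace entries recorded after the packet last left the local domain.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Packet:
    dest: str
    path_trace: List[str] = field(default_factory=list)  # prior AS-level hops
    black_list: Set[str] = field(default_factory=set)    # domains with anomalous state


def forward(local_as: str, p: Packet, rib: Dict[str, List[str]]) -> Optional[str]:
    """Run one forwarding step at domain `local_as`.

    Returns the chosen next hop, or None when the caller would have to
    initiate recovery-mode forwarding (not modelled in this sketch).
    """
    # Loop detection: the packet has already passed through this domain.
    if local_as in p.path_trace:
        i = p.path_trace.index(local_as)
        # Assumption: the "loop elements" are the hops recorded after local_as.
        p.black_list.update(p.path_trace[i + 1:])
        del p.path_trace[i:]

    routes = rib.get(p.dest, [])
    next_hop = routes[0] if routes else None  # stand-in for lookupNextHop
    if next_hop in p.black_list:
        # "Invoke the control plane": look for a non-blacklisted alternate.
        next_hop = next((h for h in routes if h not in p.black_list), None)

    if next_hop is not None:
        p.path_trace.append(local_as)
    return next_hop
```

With `rib` tables modelling the C and D loop from slide 4 (C routes to A via D, D routes to A via C), the third call at C detects the loop, blacklists D, and returns None because no alternate route remains.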

  9. ACF Recovery-mode forwarding
     • If a router is unable to forward a packet because it does not have a valid non-blacklisted route, it initiates recovery forwarding.
        • Chooses a recovery destination R from a static and well-known set of highly-connected Tier-1 domains.
        • Detours the packet through R.
     • Intuition: R or some router along the path to R may know a working alternate route to the original destination.
     [Figure: a router with nextHop=NONE detours a packet toward recovery destinations R1 and R2. Legend: normal-mode forwarding, recovery-mode forwarding.]

  10. Anomaly-Cognizant Forwarding
     p.Header: dst = A, origDst = (empty), pathTrace = [C], blackList = {}
     [Figure: topology and route tables from slide 3, with packet p in transit.]

  11. Anomaly-Cognizant Forwarding
     p.Header: dst = A, origDst = (empty), pathTrace = [C D], blackList = {}

  12. Anomaly-Cognizant Forwarding
     p.Header: dst = A, origDst = (empty), pathTrace = [C D], blackList = {D}
     C initiates recovery forwarding through domain F

  13. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [], blackList = {C D}

  14. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [], blackList = {C D}

  15. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [C], blackList = {C D}

  16. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [C], blackList = {C D}

  17. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [C], blackList = {C D E}

  18. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [C E], blackList = {C D E}

  19. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [C E], blackList = {C D E}

  20. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [], blackList = {C D E}
     F resumes normal-mode forwarding

  21. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [F], blackList = {C D E}

  22. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [F G], blackList = {C D E}

  23. Anomaly-Cognizant Forwarding
     p.Header: dst = F, origDst = A, pathTrace = [F G], blackList = {C D E}

  24. Anomaly-Cognizant Forwarding
     [Figure: topology of domains A through G.]
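The header states in slides 10 to 23 can be reproduced with a small simulation. The sketch below rests on several assumptions that the slides do not state outright. The `ROUTES` table is a reading of the figure (next hops toward A after B withdraws BA, plus inferred C-E and E-F links toward F). A domain with no usable route to the original destination is assumed to add itself to blackList. The first domain on the detour with a usable route is assumed to resume normal mode and reset pathTrace. All function and variable names are hypothetical. Unlike slides 20 to 23, which still show dst = F, the sketch restores dst to A when normal mode resumes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Assumed candidate next hops per domain and destination, in preference order.
ROUTES = {
    "C": {"A": ["D"], "F": ["E"]},
    "D": {"A": ["C"]},
    "E": {"A": ["C", "D"], "F": ["F"]},
    "F": {"A": ["E", "G"]},
    "G": {"A": ["A"]},
}
RECOVERY_DESTS = ["F"]  # the example uses F as the recovery destination


@dataclass
class Packet:
    dst: str
    orig_dst: Optional[str] = None  # set only while in recovery mode
    path_trace: List[str] = field(default_factory=list)
    black_list: Set[str] = field(default_factory=set)


def usable(domain: str, dest: str, p: Packet) -> Optional[str]:
    """First non-blacklisted next hop from `domain` toward `dest`, if any."""
    hops = ROUTES.get(domain, {}).get(dest, [])
    return next((h for h in hops if h not in p.black_list), None)


def step(domain: str, p: Packet) -> Optional[str]:
    """Process p at `domain`; return the next hop, or None if p is dropped."""
    if p.orig_dst is not None:  # recovery mode
        if usable(domain, p.orig_dst, p) is not None:
            # Assumed rule: resume normal mode with a fresh pathTrace.
            p.dst, p.orig_dst, p.path_trace = p.orig_dst, None, []
        else:
            p.black_list.add(domain)  # assumed rule: no valid route here
    if domain in p.path_trace:  # loop detected
        i = p.path_trace.index(domain)
        p.black_list.update(p.path_trace[i + 1:])
        del p.path_trace[i:]
    hop = usable(domain, p.dst, p)
    if hop is None and p.orig_dst is None:  # initiate recovery forwarding
        p.black_list.add(domain)
        for r in RECOVERY_DESTS:
            hop = usable(domain, r, p)
            if hop is not None:
                p.orig_dst, p.dst = p.dst, r
                break
    if hop is not None:
        p.path_trace.append(domain)
    return hop


def deliver(src: str, dst: str, ttl: int = 32):
    """Forward a packet from src toward dst; return (visited domains, packet)."""
    p = Packet(dst=dst)
    node, hops = src, [src]
    while not (node == p.dst and p.orig_dst is None) and len(hops) <= ttl:
        node = step(node, p)
        if node is None:  # no usable route at all: the packet is dropped
            break
        hops.append(node)
    return hops, p


hops, p = deliver("C", "A")
print(hops, p.path_trace, sorted(p.black_list))
```

Under these assumptions the packet visits C, D, C, E, F, G, A and arrives with pathTrace = [F G] and blackList = {C D E}, matching the final header state on slide 23.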

  25. ACF: Observations
     • ACF does not use pre-computed failover paths
        • Discovers alternate routes dynamically using state in the packet header
     • The two forwarding modes make use of the same forwarding table
     • Paths to recovery destinations are not assumed to be stable and anomaly-free
        • We protect recovery-mode forwarding using the same mechanism (pathTrace and blackList)

  26. ACF: Preliminary Evaluation
     • Evaluation metrics
        • Effectiveness in eliminating transient disconnectivity
        • Efficiency of alternate paths
        • Packet header overhead

  27. ACF: Preliminary Evaluation
     • Simulation methodology
        • CAIDA AS-level topology (27969 nodes) annotated with inferred inter-AS relationships
        • 12937 multihomed edge domains, 29426 adjacent provider links
     • Provider link failure experiment
        • For each multihomed domain D and each provider link L:
           • Fail L and simulate packet delivery from every other domain to D during convergence
     [Figure: sources S1 to S4 sending packets to D. Annotations: Packet TTL = 32 hops; Recovery destinations = 10 highly-connected Tier-1 ISPs.]

  28. ACF: Preliminary Evaluation
     • Transient disconnection after a link failure
        • BGP with conventional forwarding
           • 51% of failure cases produce unwarranted disconnection
           • Widespread disconnection (>50% of ASes) in 17% of cases
        • BGP with ACF
           • No disconnection in 92% of failure cases
           • <1% of ASes see disconnection in 98% of failure cases

  29. ACF: Preliminary Evaluation
     • Transient path efficiency
        • Causes of path dilation in ACF
           • Transient loops
           • Detouring via a recovery destination
        • In 65% of failure cases that produce disconnectivity, ACF recovers packets using ≤ 2 extra hops
        • 9% of cases require 7 hops or more
     [Figure legend: F = failure cases that produce transient disconnection with conventional forwarding.]

  30. ACF: Preliminary Evaluation
     • Packet header overhead
        • Worst-case pathTrace: 20 entries
           • 40 bytes of overhead assuming 16-bit AS numbers
        • Worst-case blackList: 16 entries
           • 10 bytes of overhead for a Bloom filter with 1% error rate
     [Figure: maximum number of pathTrace and blackList entries in a representative sample of failure cases.]

  31. Challenges / Concerns
     • Feasibility of deployment
        • ACF adds fields to the packet header and modifies core IP forwarding logic.
     • Packet processing overhead
        • Control plane is invoked only during periods of instability
        • Common case: check pathTrace and blackList. Both operations admit efficient implementation in hardware and parallelization.
     • ACF and routing policies

  32. Thank you. Questions?
