Advanced Networks

Advanced Networks 1. Delayed Internet Routing Convergence 2. The Impact of Internet Policy and Topology on Delayed Routing Convergence

The Problem • How to Recover from Failure Quickly? • Phone systems recover (fail over) in milliseconds • The Internet takes on the order of minutes • Loss of Connectivity • Packet Loss • Latency

The Problem (cont) • Failover on the Internet is not very good • Sluggish backup systems • Internet has to adjust to the failure • Path must be restored via a backup

The Questions • Why does convergence take so long? • What is the upper bound for convergence? • What causes this delayed convergence? • What can we do about it?

Theory • Unexpected Interaction of: • Protocol timers • Router Implementation • Policies (Safe/Unsafe)

Theory (cont) • Distance vector algorithm has issues • Lack of sufficient info to determine if next hop choice will cause loops

Convergence Accelerators • Use of Path Vector • Split Horizon • Triggered updates • Diffusion • Timers
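A minimal sketch of why a path vector helps: a distance vector update carries only a metric, so the receiver cannot tell whether the route already passes through itself, whereas a path vector update carries the full AS path and can be checked directly. The function name and the tuple representation of a path are assumptions made for illustration.

```python
def accept_route(local_as, as_path):
    """Path vector loop check: reject any path that already contains us.

    as_path is assumed to be a tuple of AS identifiers, nearest hop first.
    """
    return local_as not in as_path


# AS 0 hears the path "1 0 R" from AS 1: it contains AS 0, so it is a loop.
print(accept_route(0, (1, 0)))
# AS 2 hears the same path: no loop from its point of view.
print(accept_route(2, (1, 0)))
```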

Policies • Admins can implement unsafe policies • Policies can cause route oscillations • Routers default to Shortest Path • Even if constrained to shortest path, the upper bound might be as high as factorial

Point of Paper • Measure the convergence behavior of BGP 4 • Already done for Bellman-Ford: O(n^3) • Convergence in BGP is NOT much better than RIP • Give upper and lower bounds on convergence

The Work Done • 2 year study • 250,000 routing fault injections • 25 Internet providers • End to End performance measurements

Terminology • Tup: (New) Route Announcement • Tdown: Route Withdrawal • Tshort: Shorter Route Replaces Current • Current Route is Withdrawn Implicitly • Tlong: Shorter Route Replaced with longer one • Represents a failure and failover • Current Route is Withdrawn Implicitly
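The four event types can be sketched as a comparison of the route before and after the event. This is an illustration only: the function name is hypothetical, and it assumes "shorter" and "longer" mean AS path length, with `None` standing for "no route".

```python
def classify_event(old_path, new_path):
    """Label a routing event using the Tup/Tdown/Tshort/Tlong terminology.

    Paths are assumed to be tuples of AS hops; None means no route.
    Returns None when the change fits none of the four types.
    """
    if old_path is None and new_path is None:
        return None
    if old_path is None:
        return "Tup"      # new route announcement
    if new_path is None:
        return "Tdown"    # route withdrawal
    if len(new_path) < len(old_path):
        return "Tshort"   # shorter route implicitly replaces the current one
    if len(new_path) > len(old_path):
        return "Tlong"    # failover to a longer route
    return None


print(classify_event(None, (1,)))
print(classify_event((1,), (2, 3, 1)))
```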

Latency

Latency (cont) • Oscillation greater than 3 minutes • 20% of Tlong • 40% of Tdown • Equivalence Latency Classes • Tlong,Tdown • Tshort,Tup

Latency per ISP

BGP Update Volume • Average Messages Per Event Type • Tup: Route Announcement • Tdown: Route Withdrawal • Tshort: Shorter Route Replacement • Tlong: Longer Route Replacement

Questions • Why do Tlong and Tdown cause 2 times the amount of updates? • Why do certain ISPs produce more updates per event? • Relationship between number of updates and convergence latency?

Questions (cont) • What makes an ISP have a higher latency? • Interesting Points • ISP3: Japan’s National Backbone • ISP5: Canadian ISP • Latency NOT dependent on geographic distance or network distance (aka hop count)

Graph Analysis • No relationship between day of the week and Latency! • Independent of Network load and congestion

End to End Measurements • Route oscillation affects performance • Dropped packets, buffering of packets • Out-of-order delivery

Failover from an end-to-end view • Time until ICMP echoes arrived after Tup • Simulates a failover • 80% of test sites began returning after 30 seconds • 100% after one minute

BGP Convergence Model • IBGP ignored • Full Mesh • Ignore ingress and egress filters • Exclude MinRouteAdver • Update messages follow FIFO ordering

BGP Convergence Example • Start: 0(*R, 1R, 2R) 1(0R, *R, 2R) 2(0R, 1R, *R) • R withdraws routes • R -> 0 W • R -> 1 W • R -> 2 W

BGP Convergence Example • 0(-, *1R, 2R) 1(*0R, -, 2R) 2(*0R, 1R, -) • 1 and 2 receive new announcement from 0 • 0 -> 1 01R (loop) • 0 -> 2 01R • 0(-, *1R, 2R) 1(-, -, *2R) 2(01R, *1R, -) • 0 and 2 receive new announcement from 1 • 1 -> 0 10R (loop) • 1 -> 2 10R • 0(-, -, *2R) 1(-, -, *2R) 2(*01R, 10R, -)

BGP Convergence Example • 0 and 1 receive new announcement from 2 • 2 -> 0 20R • 2 -> 1 20R • 0(-, -, -) 1(-, -, *20R) 2(*01R, 10R, -) • 0 and 2 receive new announcement from 1 • 1 -> 0 12R • 1 -> 2 12R • 0(-, *12R, -) 1(-, -, *20R) 2(*01R, -, -) • … 48 steps later • 0(-, -, -) 1(-, -, -) 2(-, -, -)
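The withdrawal example can be replayed with a small simulation of the convergence model (full mesh, no MinRouteAdver, FIFO message ordering, receiver-side loop detection). This is a sketch under stated assumptions, not the paper's simulator: the function names, the single global FIFO queue, and the tie-break among equal-length paths are all choices made here, so the message count it reports is not claimed to match the step count on the slide.

```python
from collections import deque


def select_best(routes):
    """Shortest path wins; ties go to the lowest hop numbers (assumed tie-break)."""
    candidates = [p for p in routes.values() if p is not None]
    return min(candidates, key=lambda p: (len(p), p)) if candidates else None


def simulate_withdrawal(n, max_steps=1_000_000):
    """Replay 'R withdraws its route from every AS' in a full mesh of n ASes.

    A path is a tuple of AS hops with the final hop to R left implicit,
    so () is the direct route and (1,) is the slide's "1R".
    Returns (messages processed, final best path per AS).
    """
    # rib[i][j] is the path AS i last learned from neighbor j; key "R" is the direct route.
    rib = [{j: (j,) for j in range(n) if j != i} for i in range(n)]
    for table in rib:
        table["R"] = ()
    best = [()] * n

    # R sends a withdrawal (None) to every AS; messages are handled in FIFO order.
    queue = deque(("R", i, None) for i in range(n))
    processed = 0
    while queue:
        processed += 1
        if processed > max_steps:
            raise RuntimeError("no convergence within max_steps")
        sender, receiver, path = queue.popleft()
        if path is not None and receiver in path:
            path = None  # receiver-side loop detection: treat as a withdrawal
        rib[receiver][sender] = path
        new_best = select_best(rib[receiver])
        if new_best != best[receiver]:
            best[receiver] = new_best
            advert = None if new_best is None else (receiver,) + new_best
            for peer in range(n):
                if peer != receiver:
                    queue.append((receiver, peer, advert))
    return processed, best


messages, final = simulate_withdrawal(3)
print(messages, final)
```

Running it with larger `n` gives a feel for how quickly the number of explored backup paths grows before every AS ends up with no route.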

Upper Bound • For n nodes there exist O((n-1)!) distinct paths • When a route is withdrawn, a new route is found of equal or increasing length • Message count could be as bad as (n-1)O((n-1)!) until convergence • Not really possible on the Internet
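To get a feel for the factorial growth, the loop-free paths can be enumerated directly. The sketch below assumes the topology of the three-node example generalized to n ASes: a full mesh in which every AS also has its own link to R. The function name is hypothetical.

```python
from itertools import permutations
from math import factorial


def count_paths(n):
    """Count loop-free paths from one AS to R in a full mesh of n ASes.

    A path visits any ordered selection of the other n-1 ASes before reaching R.
    """
    others = range(1, n)
    return sum(1 for k in range(n) for _ in permutations(others, k))


for n in range(3, 8):
    print(n, count_paths(n), factorial(n - 1))
```

Under this assumption the count stays within a small constant factor of (n-1)!, which is what the O((n-1)!) bound expresses.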

Lower Bound • Made possible by MinRouteAdver timers • (n-1) Rounds to convergence

MinRouteAdver • Minimum time between route advertisements • Gives an AS time to pick a good route before announcing it • In standard BGP, timer only applied to announcements • Does NOT apply to explicit withdrawals

Example Reloaded • With MinRouteAdver the example took only 13 rounds instead of 48
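The timer behavior described above can be sketched as a per-peer outbox that holds back announcements but lets withdrawals through immediately. This is an illustration, not a router implementation: the class and method names are hypothetical, the clock is passed in explicitly, and the 30 second default is an assumed value.

```python
class PeerOutbox:
    """Rate-limits announcements to one peer; withdrawals bypass the timer."""

    def __init__(self, min_interval=30.0):
        self.min_interval = min_interval  # assumed MinRouteAdver interval, in seconds
        self.last_sent = None             # time of the last announcement sent
        self.pending = None               # newest announcement still being held

    def submit(self, now, path):
        """Offer an update (path=None is a withdrawal); return what is sent now."""
        if path is None:
            self.pending = None           # a held announcement is now obsolete
            return [None]
        if self.last_sent is None or now - self.last_sent >= self.min_interval:
            self.last_sent = now
            return [path]
        self.pending = path               # only the latest choice is kept
        return []

    def tick(self, now):
        """Release the held announcement once the interval has elapsed."""
        if self.pending is not None and now - self.last_sent >= self.min_interval:
            path, self.pending = self.pending, None
            self.last_sent = now
            return [path]
        return []


outbox = PeerOutbox()
print(outbox.submit(0, (1, 2)))   # sent immediately
print(outbox.submit(5, (1, 3)))   # held
print(outbox.submit(10, (1, 4)))  # replaces the held path
print(outbox.tick(30))            # the latest held path goes out
```

Holding only the most recent path is what lets an AS skip intermediate choices it would otherwise have announced.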

Example Reloaded

Question Reloaded • Why do Tup/Tshort converge quicker than Tdown/Tlong? • Answer: Tup/Tshort are decreasing while Tdown/Tlong are increasing • Once a path is selected, a longer one will not be picked • While on Tdown/Tlong you pick the next best one until you are out of choices • O(1) for Tup while O(n) for Tdown

Question Reloaded • Why are there different latencies between the five ISPs? • Answer: Topological factors: the length and number of possible paths (peering relationships, policies and agreements) • Longer routes announced, longer latencies • The longer the routes, the more MinRouteAdver rounds

Loop Detection • Loop detection done at receiver side • If done at the sender, you can get more out of each MinRouteAdver round • MinRouteAdver is good but causes a 30 second delay in end to end communication at best

Convergence Delay Due to Policies and Topology • 2nd study of convergence • 20 unique advertisements between 200 pairs of ISPs, 6 months • Measure the impact of Policies • Measure the impact of Topology • Analysis
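Sender-side loop detection can be sketched as a filter applied before an announcement leaves the router: a peer that already appears on the path would only discard the update, so it is not sent that path at all. The function name is hypothetical, and what the sender does instead (for example, sending a withdrawal) is left out of this sketch.

```python
def peers_to_announce(as_path, peers):
    """Return the peers that should receive as_path.

    Peers already on the path are skipped, since they would reject it as a loop.
    """
    return [peer for peer in peers if peer not in as_path]


# AS 0 has picked "1R" and would announce "0 1 R" to its peers 1 and 2.
# In the convergence example the copy sent to AS 1 is the one marked "(loop)".
print(peers_to_announce((0, 1), [1, 2]))
```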

Multi-home Networks • One network, two ISPs • Better connectivity + backup • Failover = New route convergence • Work done in this Paper • Convergence Analysis of Tdown event

Work Done • Fault injection announcements • Logged table snapshot to disk • Survey of backbone providers • Routing and peering policies • Used data to discuss impact on convergence

Policy • How policy impacts the number and length of ASPaths for a given route • Limited inbound acceptance by all ISPs

Inbound Filtering Example • ISP D filters peering session with ISP G • D only accepts G’s backbone and customer routes • ISP A filters peering session with D • A only accepts D’s backbone and customer routes • ISP A will accept G’s routes by chaining

Outbound Filters • A will advertise routes with paths “D G” and “D” but not “C D G” • Done by 13% of ISPs • Combinations of ASPath and prefix filters create unintentional back-up transit paths

Topological Effect • Interaction of MinRouteAdver timers • MinRouteAdver is per peer, not per prefix • MinRouteAdver interference delays convergence

Backup Path Selection

Convergence Latency

Convergence Latency (cont) • ISP1 explored one backup path of length 2 • ISP2 explored backup paths of length 2 and 3 • ISP3 explored backup paths of length 5

Convergence Latency (cont)
