
Advanced Networks

1. Delayed Internet Routing Convergence

2. The Impact of Internet Policy and Topology on Delayed Routing Convergence


The Problem

  • How to Recover from Failure Quickly?

  • Phone systems recover (fail over) in milliseconds

  • The Internet takes on the order of minutes

    • Loss of Connectivity

    • Packet Loss

    • Latency


The Problem (cont)

  • Failover on the Internet is not very good

  • Sluggish backup systems

  • The Internet has to adjust to the failure

    • The path must be re-established over the backup


The Questions

  • Why does convergence take so long?

  • What is the upper bound for convergence?

  • What causes this delayed convergence?

  • What can we do about it?


Theory

  • Unexpected Interaction of:

    • Protocol timers

    • Router Implementation

    • Policies (Safe/Unsafe)


Theory (cont)

  • Distance vector algorithm has issues

  • Lack of sufficient info to determine if next hop choice will cause loops


Convergence Accelerators

  • Use of Path Vector

  • Split Horizon

  • Triggered updates

  • Diffusion

  • Timers
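
The first accelerator, path vector, addresses the distance-vector problem noted earlier: a router that sees the whole AS path can tell whether a route would loop back through itself. A minimal sketch of that check, assuming AS paths are plain Python lists and using the hypothetical helper names `accept_route` and `advertise`:

```python
def accept_route(own_as, as_path):
    """Return True if the advertised AS path is loop-free for this AS.

    Assumption: a path is a list of AS identifiers ending at the destination.
    A path that already contains our own AS would form a loop, so it is rejected.
    """
    return own_as not in as_path


def advertise(own_as, as_path):
    """Prepend our own AS before passing the route on to a peer."""
    return [own_as] + list(as_path)


# AS 0 advertises its route via AS 1 to destination R.
path = advertise("0", ["1", "R"])
print(path, accept_route("1", path), accept_route("2", path))
```

Here AS 1 rejects the path because it already appears in it, while AS 2 accepts it.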


Policies

  • Admins can implement unsafe policies

  • Policies can cause route oscillations

  • Routers default to Shortest Path

  • Even when constrained, the upper bound might be as high as factorial


Point of Paper

  • Measure the convergence behavior of BGP 4

  • Already done for Bellman-Ford: O(n^3)

  • Convergence in BGP is NOT much better than RIP

  • Give upper and lower bounds on convergence


The Work Done

  • 2 year study

  • 250,000 routing fault injections

  • 25 Internet providers

  • End to End performance measurements


Terminology

  • Tup: (New) Route Announcement

  • Tdown: Route Withdrawal

  • Tshort: Shorter Route Replaces Current

    • Current Route is Withdrawn Implicitly

  • Tlong: Longer Route Replaces Current (Shorter) Route

    • Represents a failure and failover

    • Current Route is Withdrawn Implicitly
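
The four event types can be told apart by comparing the route before and after the change. The sketch below is only an illustration: it assumes a route is either `None` (no route) or a list of ASes, and that "shorter" and "longer" refer to AS-path length. The function name `classify_event` is hypothetical.

```python
def classify_event(old_path, new_path):
    """Classify a routing change as Tup, Tdown, Tshort or Tlong.

    Assumption: None means "no route"; otherwise the route is a list of ASes
    and its length is the AS-path length.
    """
    if old_path is None and new_path is None:
        raise ValueError("no route before or after: nothing to classify")
    if old_path is None:
        return "Tup"      # previously unavailable route is announced
    if new_path is None:
        return "Tdown"    # route is withdrawn
    if len(new_path) < len(old_path):
        return "Tshort"   # shorter route implicitly replaces the current one
    if len(new_path) > len(old_path):
        return "Tlong"    # failover to a longer route
    return None           # same length: not one of the four event types


print(classify_event(["1", "R"], ["2", "0", "R"]))
```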


Latency


Latency (cont)

  • Oscillation greater than 3 minutes

    • 20% of Tlong

    • 40% of Tdown

  • Equivalence Latency Classes

    • Tlong,Tdown

    • Tshort,Tup


Latency per ISP


BGP Update Volume

Average Messages per Event Type

Tup: Route Announcement

Tdown: Route Withdrawal

Tshort: Shorter Route Replacement

Tlong: Longer Route Replacement


Questions

  • Why do Tlong and Tdown cause twice the amount of updates?

  • Why do certain ISPs produce more updates per event?

  • Relationship between number of updates and convergence latency?


Questions (cont)

  • What makes an ISP have a higher latency?

  • Interesting Points

    • ISP3: Japan’s National Backbone

    • ISP5: Canadian ISP

    • Latency is NOT dependent on geographic distance or network distance (aka hop count)


Graph Analysis

  • No relationship between day of the week and Latency!

  • Independent of Network load and congestion


End to End Measurements

  • Route oscillation affects performance

  • Drop Packets, Buffering of Packets

    • Out of order delivery


Failover from end to end view

  • Time until ICMP echo replies arrived after Tup

  • Simulates a failover

  • 80% of test sites began returning after 30 seconds

  • 100% after one minute


BGP Convergence Model

  • IBGP ignored

  • Full Mesh

  • Ignore ingress and egress filters

  • Exclude MinRouteAdver

  • Update messages follow FIFO ordering


BGP Convergence Example

  • Start: 0(*R, 1R, 2R) 1(0R, *R, 2R) 2(0R, 1R, *R)

R Withdraws routes

R -> 0 W

R -> 1 W

R -> 2 W


BGP Convergence Example

0(-, *1R, 2R) 1(*0R, -, 2R) 2(*0R, 1R, -)

  • 1 and 2 receive new announcement from 0

    • 0 -> 1 01R (loop)

    • 0 -> 2 01R

0(-, *1R, 2R) 1(-, -, *2R) 2(01R, *1R, -)

  • 0 and 2 receive new announcement from 1

    • 1 -> 0 10R (loop)

    • 1 -> 2 10R

0(-, -, *2R) 1(-, -, *2R) 2(*01R, 10R, -)


BGP Convergence Example

0 and 1 receive new announcement from 2

2 -> 0 20R

2 -> 1 20R

0(-, -, -) 1(-, -, *20R) 2(*01R, 10R, -)

0 and 2 receive new announcement from 1

1 -> 0 12R

1 -> 2 12R

0(-, *12R, -) 1(-, -, *20R) 2(*01R, -, -)

… 48 steps later

0(-, -, -) 1(-, -, -) 2(-, -, -)
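
The walk-through above can be replayed with a small simulation of the simplified model: full mesh, no MinRouteAdver, no filters, FIFO message ordering. This is a sketch under stated assumptions rather than the paper's own simulator. It assumes a single global FIFO queue, shortest-path selection with ties broken by lowest neighbor, and receiver-side loop detection; the names `select_best` and `simulate_withdrawal` are hypothetical. The exact message count depends on these ordering assumptions, so it need not match the step count quoted in the slides.

```python
from collections import deque

DEST = "R"  # the destination whose route is withdrawn


def select_best(rib):
    """Pick the shortest known path; ties go to the lowest-numbered hops."""
    candidates = [p for p in rib.values() if p is not None]
    if not candidates:
        return None
    # p[:-1] drops the trailing DEST so only integer AS numbers are compared.
    return min(candidates, key=lambda p: (len(p), p[:-1]))


def simulate_withdrawal(n):
    """Withdraw DEST from a full mesh of n ASes; return (messages, best)."""
    nodes = range(n)
    # rib[i][j] is the last path AS i heard from neighbor j.
    rib = {i: {j: (j, DEST) for j in nodes if j != i} for i in nodes}
    for i in nodes:
        rib[i][DEST] = (DEST,)          # direct route to the destination
    best = {i: (DEST,) for i in nodes}

    # DEST sends a withdrawal (None) to every AS.
    queue = deque((DEST, i, None) for i in nodes)
    messages = 0
    while queue:
        sender, receiver, path = queue.popleft()
        messages += 1                   # counts DEST's withdrawals too
        if path is not None and receiver in path:
            path = None                 # loop detected at the receiver
        rib[receiver][sender] = path
        new_best = select_best(rib[receiver])
        if new_best != best[receiver]:
            best[receiver] = new_best
            update = None if new_best is None else (receiver,) + new_best
            for peer in nodes:
                if peer != receiver:
                    queue.append((receiver, peer, update))
    return messages, best


if __name__ == "__main__":
    for size in (2, 3, 4, 5):
        count, _ = simulate_withdrawal(size)
        print(size, "ASes:", count, "messages processed")
```

Running it for a few mesh sizes shows how quickly the message count grows as each AS works through progressively longer stale paths before giving up.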


Upper Bound

  • For n nodes there exist O((n-1)!) distinct paths

  • When a route is withdrawn, a new route is found of equal or increasing length

  • Message count could be as bad as (n-1) · O((n-1)!) until convergence

  • Not really possible on the internet
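
To get a feel for the factorial growth, the loop-free paths in a full mesh can simply be enumerated. The sketch below assumes n ASes that are all directly connected to each other and to the destination, and counts the paths from one AS to the destination through any ordered subset of the other n-1 ASes. The function name `count_paths` is hypothetical.

```python
from itertools import permutations
from math import factorial


def count_paths(n):
    """Count loop-free paths from AS 0 to the destination in a full mesh.

    Assumption: each path visits an ordered subset of the other n-1 ASes
    (possibly none) before reaching the destination.
    """
    others = range(1, n)
    return sum(1 for k in range(n) for _ in permutations(others, k))


for n in (3, 4, 5, 6):
    print(n, count_paths(n), factorial(n - 1))
```

The paths that visit every other AS already number (n-1)!, which is where the factorial term in the bound comes from.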


Lower Bound

  • Made possible by MinRouteAdver timers

  • (n-1) Rounds to convergence


MinRouteAdver

  • Minimum time between route advertisements

  • Gives an AS time to pick a good route before announcing it

  • In standard BGP, the timer is only applied to announcements

  • Does NOT apply to explicit withdrawals
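
The timer behavior described above can be sketched as a per-peer rate limiter. This is an illustration only, not a router implementation: the class name `PeerSession`, its methods, and the use of plain numbers for time are assumptions, and the 30-second interval is the value quoted elsewhere in these slides.

```python
class PeerSession:
    """Per-peer MinRouteAdver sketch: announcements are rate limited,
    explicit withdrawals are sent immediately."""

    def __init__(self, min_route_adver=30.0):
        self.min_route_adver = min_route_adver  # seconds (assumed value)
        self.last_announce = None               # time of last announcement
        self.pending = None                     # newest held-back announcement
        self.sent = []                          # (time, message) log

    def _send(self, now, path):
        self.sent.append((now, path))
        self.last_announce = now
        self.pending = None

    def announce(self, now, path):
        timer_expired = (self.last_announce is None
                         or now - self.last_announce >= self.min_route_adver)
        if timer_expired:
            self._send(now, path)
        else:
            self.pending = path  # a newer choice replaces the held one

    def withdraw(self, now):
        self.pending = None            # nothing left to announce
        self.sent.append((now, "W"))   # the timer does not delay withdrawals

    def tick(self, now):
        """Call periodically; flushes a held announcement once the timer expires."""
        if (self.pending is not None
                and now - self.last_announce >= self.min_route_adver):
            self._send(now, self.pending)


session = PeerSession()
session.announce(0, "01R")
session.announce(5, "02R")    # held back
session.announce(10, "021R")  # replaces the held announcement
session.tick(30)
print(session.sent)
```

Only the first and the last choice reach the peer; the intermediate path is never advertised, which is how the timer cuts down the number of messages.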


Example Reloaded

  • Instead of 48 rounds, it took only 13 rounds


Example Reloaded


Question Reloaded

  • Why do Tup/Tshort converge quicker than Tdown/Tlong?

  • Answer: Tup/Tshort are decreasing while Tdown/Tlong are increasing

    • Once a path is selected, a longer one will not be picked

    • While on Tdown/Tlong you pick the next best one until you are out of choices

    • O(1) for Tup while O(n) for Tdown


Question Reloaded

  • Why are there different latencies between the five ISPs?

  • Answer: Topological factors, namely the length and number of possible paths (shaped by peering relationships, policies and agreements).

    • Longer routes announced, longer latencies

    • The longer the routes, the more MinRouteAdver rounds


Loop Detection

  • Loop Detection done at receiver side

  • If done at the sender, you can get more out of each MinRouteAdver round

  • MinRouteAdver is good, but it causes a delay of at least 30 seconds in end-to-end communication


Convergence Delay Due to Policies and Topology

  • 2nd study of convergence

  • 20 unique advertisements between 200 pairs of ISPs over 6 months

  • Measure the impact of Policies

  • Measure the impact of Topology

  • Analysis


Multi-home Networks

  • One network, two ISPs

  • Better connectivity + backup

  • Failover = New route convergence

  • Work done in this Paper

    • Convergence Analysis of Tdown event


Work Done

  • Fault injection announcements

  • Logged table snapshot to disk

  • Survey of backbone providers

    • Routing and peering policies

  • Used data to discuss impact on convergence


Policy

  • How policy impacts the number and length of ASPaths for a given route

  • Limited inbound acceptance by all ISPs


Inbound Filtering Example

  • ISP D filters its peering session with ISP G

    • D only accepts G’s backbone and customer routes

  • ISP A filters its peering session with D

    • A only accepts D’s backbone and customer routes

  • ISP A will accept G’s routes by chaining


Outbound Filters

  • A will advertise routes with paths “D G” and “D” but not “C D G”

  • Done by 13% of ISPs

  • Combinations of ASPath and prefix filters create unintentional back-up transit paths
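
The outbound rule in the first bullet amounts to an allowlist on AS paths. A minimal sketch, assuming paths are written as space-separated AS names as on the slide; the names `ALLOWED_AS_PATHS` and `should_advertise` are hypothetical and do not correspond to any router configuration syntax:

```python
# Hypothetical allowlist taken from the slide's example: ISP A only
# advertises routes whose AS path is "D" or "D G".
ALLOWED_AS_PATHS = {("D",), ("D", "G")}


def should_advertise(as_path):
    """Return True if the space-separated AS path passes A's outbound filter."""
    return tuple(as_path.split()) in ALLOWED_AS_PATHS


for candidate in ("D", "D G", "C D G"):
    print(candidate, should_advertise(candidate))
```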


Topological Effect

  • Interaction of MinRouteAdver timers

  • MinRouteAdver is per peer, not per prefix

  • MinRouteAdver interference delays convergence


Backup Path Selection


Convergence Latency


Convergence Latency (cont)

  • ISP1 explored one backup path of length 2

  • ISP2 explored backup paths of length 2 and 3

  • ISP3 explored backup paths of length 5


Convergence Latency (cont)


Convergence Latency (cont)

