Computer Networks (Graduate level)

Computer Networks (Graduate level). Lecture 10: Inter-domain Routing. University of Tehran Dept. of EE and Computer Engineering By: Dr. Nasser Yazdani. Inter-Domain Routing. Border Gateway Protocol (BGP) Assigned reading [LAB00] Delayed Internet Routing Convergence Sources


Computer Networks (Graduate level)

Lecture 10: Inter-domain Routing

University of Tehran

Dept. of EE and Computer Engineering

By: Dr. Nasser Yazdani

Computer Network

Inter-Domain Routing
  • Border Gateway Protocol (BGP)
  • Assigned reading
    • [LAB00] Delayed Internet Routing Convergence
  • Sources
    • RFC1771: main BGP RFC
    • RFC1772-3-4: application, experiences, and analysis of BGP
    • RFC1965: AS confederations for BGP
    • Christian Huitema: “Routing in the Internet”, chapters 8 and 9.
    • John Stewart III: “BGP4 - Inter-domain routing in the Internet”

Outline
  • External BGP (E-BGP)
  • Internal BGP (I-BGP)
  • Multi-Homing
  • Stability Issues
  • Scalability Issues

Internet Routing
  • Internet organized as a two-level hierarchy
    • First level – autonomous systems (AS’s)
      • AS – region of network under a single administrative domain
    • Each AS assigned unique ID
    • AS’s peer at network exchanges to exchange routing information
    • AS’s run intra-domain routing protocols
      • Distance Vector, e.g., RIP
      • Link State, e.g., OSPF
    • Between AS’s run inter-domain routing protocols, e.g., Border Gateway Protocol (BGP)
      • De facto standard today, BGP-4

Example

[Figure: three autonomous systems (AS-1, AS-2, AS-3) containing interior routers and BGP routers.]

Inter-domain Routing basics
  • Internet is composed of over 16000 autonomous systems
  • BGP = Border Gateway Protocol
    • Is a Policy-Based routing protocol
    • Is the de facto inter-domain routing protocol of today’s global Internet
  • Relatively simple but configuration is complex and the entire world can see, and be impacted by, your mistakes.
History
  • Mid-80s: EGP
    • Reachability protocol (no shortest path)
    • Did not accommodate cycles (tree topology)
    • Evolved when all networks connected to NSF backbone
  • Result: BGP introduced as routing protocol
    • Latest version = BGP 4
    • BGP-4 supports CIDR
    • Primary objective: connectivity not performance

Choices
  • Link state or distance vector?
    • No universal metric – policy decisions
  • Problems with distance-vector:
    • Bellman-Ford algorithm may not converge
  • Problems with link state:
    • Metric used by routers not the same – loops
    • LS database too large – entire Internet
    • May expose policies to other AS’s

Solution: Distance Vector with Path
  • Each routing update carries the entire path
  • Loops are detected as follows:
    • When an AS receives a route, check whether the AS is already in the path
      • If yes, reject route
      • If no, add self and (possibly) advertise route further
  • Advantage:
    • Metrics are local - AS chooses path, protocol ensures no loops
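
The loop check above can be sketched in a few lines. This is a minimal illustration only: the function name `receive_route` and the list-of-integers path representation are assumptions made for the example, not part of BGP or of the lecture.

```python
from typing import List, Optional


def receive_route(my_as: int, as_path: List[int]) -> Optional[List[int]]:
    """Apply the path-vector loop check to a received route.

    Returns the path to advertise further, or None if the route is rejected.
    (Hypothetical helper for illustration.)
    """
    if my_as in as_path:
        # Our own AS number is already on the path: accepting would form a loop.
        return None
    # No loop: add ourselves at the front before (possibly) advertising further.
    return [my_as] + as_path
```

For example, AS 7018 receiving the path `[6341]` would advertise `[7018, 6341]`, while AS 1239 receiving a path that already contains 1239 would reject it.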

Interconnecting BGP Peers
  • BGP uses TCP to connect peers
  • AS’s exchange reachability information through their BGP routers, only when routes change
  • Advantages:
    • Simplifies BGP
    • No need for periodic refresh - routes are valid until withdrawn, or the connection is lost
    • Incremental updates
  • Disadvantages
    • Congestion control on a routing protocol?
    • Poor interaction during high load

BGP Operations (Simplified)
  • Establish session between AS1 and AS2 on TCP port 179
  • Exchange all active routes
  • While connection is ALIVE, exchange route UPDATE messages (incremental updates)

Customers and Providers

[Figure: a provider and a customer exchanging IP traffic.]

Customer pays provider for access to the Internet

The “Peering” Relationship
  • Peers provide transit between their respective customers
  • Peers do not provide transit between peers
  • Peers (often) do not exchange $$$

[Figure: peer, provider, and customer links, marking which traffic is allowed and which traffic is NOT allowed.]

Peering Provides Shortcuts

Peering also allows connectivity between the customers of “Tier 1” providers.

[Figure: peer, provider, and customer links.]

Peering Wars
  • Peer
    • Reduces upstream transit costs
    • Can increase end-to-end performance
    • May be the only way to connect your customers to some part of the Internet (“Tier 1”)
  • Don’t Peer
    • You would rather have customers
    • Peers are usually your competition
    • Peering relationships may require periodic renegotiation

Peering struggles are by far the most contentious issues in the ISP world!

Peering agreements are often confidential.

AS Categories
  • Stub: an AS that has only a single connection to one other AS - carries only local traffic.
  • Multi-homed: an AS that has connections to more than one AS, but does not carry transit traffic
  • Transit: an AS that has connections to more than one AS, and carries both transit and local traffic (under certain policy restrictions)

AS Categories

[Figure: three example topologies built from AS1, AS2, and AS3, illustrating a stub AS, a multi-homed AS, and a transit AS.]

Policy with BGP
  • BGP provides capability for enforcing various policies
  • Policies are not part of BGP: they are provided to BGP as configuration information
  • BGP enforces policies by choosing paths from multiple alternatives and controlling advertisement to other AS’s

Examples of BGP Policies
  • A multi-homed AS refuses to act as transit
    • Limit path advertisement
  • A multi-homed AS can become transit for some AS’s
    • Only advertise paths to some AS’s
  • An AS can favor or disfavor certain AS’s for traffic transit from itself

Routing Information Bases (RIB)
  • Routes are stored in RIBs
  • Adj-RIBs-In: routing info that has been learned from other routers (unprocessed routing info)
  • Loc-RIB: local routing information selected from Adj-RIBs-In (routes selected locally)
  • Adj-RIBs-Out: info to be advertised to peers (routes to be advertised)

Architecture of Dynamic Routing
  • IGP = Interior Gateway Protocol
    • Metric based: OSPF, IS-IS, RIP, EIGRP (Cisco)
  • EGP = Exterior Gateway Protocol
    • Policy based: BGP
  • The Routing Domain of BGP is the entire Internet

[Figure: AS 1 and AS 2, with OSPF and EIGRP shown as interior protocols and BGP running between the ASes.]

Four Types of BGP Messages
  • Open : Establish a peering session.
  • Keep Alive : Handshake at regular intervals.
  • Notification : Shuts down a peering session.
  • Update : Announcing new routes or withdrawing previously announced routes.

announcement = prefix + attribute values

Two Types of BGP Neighbor Relationships
  • External Neighbor (eBGP): in a different Autonomous System
  • Internal Neighbor (iBGP): in the same Autonomous System
  • iBGP is routed using an Interior Gateway Protocol (IGP)!

[Figure: an eBGP session between AS1 and AS2, and an iBGP session inside an AS.]

iBGP Peers Must be Fully Meshed
  • iBGP is needed to avoid routing loops within an AS
  • Injecting external routes into IGP does not scale and causes BGP policy information to be lost
  • BGP does not provide “shortest path” routing
  • iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors.

[Figure: an eBGP update passed on as iBGP updates.]

Important BGP attributes
  • LocalPREF
    • Local preference policy to choose “most” preferred route
  • Multi-exit Discriminator (MED)
    • Which peering point to choose?
  • Import Rules
    • What route advertisements do I accept?
  • Export Rules
    • Which routes do I forward to whom?

Implementing Customer/Provider and Peer/Peer relationships

Two parts:
  • Enforce transit relationships
    • Outbound route filtering
  • Enforce order of route preference
    • provider < peer < customer
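
The two parts can be sketched as follows. This is an illustration under stated assumptions: the export rule shown (customer routes go to everyone, peer and provider routes go only to customers) is the common way of enforcing "no transit between peers", and the numeric local preference values are examples chosen only to respect provider < peer < customer. The names `LOCAL_PREF` and `should_export` are hypothetical.

```python
CUSTOMER, PEER, PROVIDER = "customer", "peer", "provider"

# Part 2: order of route preference (provider < peer < customer).
# The exact numbers are illustrative; only their ordering matters here.
LOCAL_PREF = {PROVIDER: 80, PEER: 90, CUSTOMER: 100}


def should_export(learned_from: str, send_to: str) -> bool:
    """Part 1: outbound route filtering that enforces transit relationships."""
    if learned_from == CUSTOMER:
        # Customer routes are announced to everyone: the customer pays for transit.
        return True
    # Routes learned from peers or providers are only passed down to customers,
    # so the AS never carries transit traffic between two peers or providers.
    return send_to == CUSTOMER
```

With these rules a route learned from one peer is never announced to another peer, which matches the "peers do not provide transit between peers" slide.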

Import Routes

[Figure: an ISP importing provider routes, peer routes, and customer routes from its providers, peers, and customers. The legend distinguishes provider, peer, customer, and ISP routes.]

Export Routes

[Figure: an ISP exporting routes to its providers, peers, and customers, with filters that block selected routes. The legend distinguishes provider, peer, customer, and ISP routes.]

BGP Common Header
  • Marker (security and message delineation): 16 bytes
  • Length: 2 bytes
  • Type: 1 byte
    • Types: OPEN, UPDATE, NOTIFICATION, KEEPALIVE
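
A short sketch of decoding this 19-byte header with Python's `struct` module. The field sizes come from the slide; the numeric type codes (1 to 4) are taken from RFC 1771 rather than from the slide, and the function name `parse_header` is an assumption for the example.

```python
import struct

HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type

# Numeric type codes as defined in RFC 1771 (not listed on the slide).
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes):
    """Return (marker, length, type_name) for the common BGP header."""
    if len(data) < HEADER_LEN:
        raise ValueError("a BGP header needs at least 19 bytes")
    # "!" selects network byte order with no padding between fields.
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    return marker, length, MESSAGE_TYPES.get(msg_type, "UNKNOWN")


# A KEEPALIVE consists of the header alone, so its length field is 19.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
```

Calling `parse_header(keepalive)` yields the all-ones marker, a length of 19, and the type name `"KEEPALIVE"`.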

BGP OPEN message
  • Marker (security and message delineation)
  • Length
  • Type: OPEN
  • Version
  • My autonomous system
  • Hold time
  • BGP identifier
  • Parameter length
  • Optional parameters <type, length, value>

My AS: ID assigned to that AS

Hold timer: max interval between KEEPALIVE or UPDATE messages; a zero interval implies no keep_alive.

BGP ID: IP address of one interface (same for all messages)
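
As a sketch, the OPEN fields that follow the common header can be decoded like this. The field widths (1-byte version, 2-byte AS, 2-byte hold time, 4-byte identifier, 1-byte parameter length) follow RFC 1771 and are not stated on the slide; `parse_open_body` is a hypothetical helper name.

```python
import socket
import struct


def parse_open_body(body: bytes) -> dict:
    """Decode the OPEN fields that come after the 19-byte common header."""
    if len(body) < 10:
        raise ValueError("OPEN body needs at least 10 bytes")
    version, my_as, hold_time, bgp_id, opt_len = struct.unpack("!BHH4sB", body[:10])
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,  # zero means no KEEPALIVEs are exchanged
        "bgp_id": socket.inet_ntoa(bgp_id),  # identifier rendered as dotted quad
        "optional_parameters": body[10:10 + opt_len],  # raw <type, length, value> bytes
    }


# Example body: BGP-4 speaker in AS 65000, hold time 180 s, no optional parameters.
example_body = struct.pack("!BHH4sB", 4, 65000, 180, socket.inet_aton("138.39.1.1"), 0)
```

The AS number and address in `example_body` are reused from other slides purely as sample values.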

BGP UPDATE message
  • Marker (security and message delineation)
  • Length
  • Type: UPDATE
  • Withdrawn routes length
  • Withdrawn routes (variable)
  • Path attribute length
  • Path attributes (variable)
  • Network layer reachability information (NLRI) (variable)

  • Many prefixes may be included in an UPDATE, but they must share the same attributes.
  • An UPDATE message may report multiple withdrawn routes.

BGP UPDATE Message
  • List of withdrawn routes
  • Network layer reachability information
    • List of reachable prefixes
  • Path attributes
    • Origin
    • Path
    • Metrics
  • All prefixes advertised in a message have same path attributes

NLRI
  • Network Layer Reachability Information
    • list of IP address prefixes, each encoded as follows:
      • Length (1 byte)
      • Prefix (variable)
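
A minimal sketch of decoding this encoding for IPv4. It assumes, following RFC 1771, that the 1-byte length counts bits and that the prefix occupies the smallest whole number of bytes that holds them; `parse_nlri` is a hypothetical name.

```python
from typing import List


def parse_nlri(data: bytes) -> List[str]:
    """Decode a sequence of <length, prefix> pairs into 'a.b.c.d/len' strings."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]              # prefix length in bits
        nbytes = (bits + 7) // 8    # bytes actually present on the wire
        raw = data[i + 1:i + 1 + nbytes]
        if bits > 32 or len(raw) < nbytes:
            raise ValueError("malformed NLRI")
        # Pad the omitted trailing bytes with zeros to get a full IPv4 address.
        octets = list(raw) + [0] * (4 - nbytes)
        prefixes.append(".".join(str(o) for o in octets) + "/" + str(bits))
        i += 1 + nbytes
    return prefixes
```

For instance, the bytes `16, 135, 207` decode to `135.207.0.0/16`, so a /16 costs three bytes on the wire instead of five.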

Path attributes
  • Type-Length-Value encoding
    • Attribute type (2 bytes)
    • Attribute length (1-2 bytes)
    • Attribute value (variable length)
  • Attribute type field
    • Attribute flags (1 byte)
    • Attribute type code (1 byte)
  • Flags: optional vs. well-known, transitive, partial, extended length
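
The flags byte can be unpacked with simple bit masks. The bit positions used here (optional in the highest bit, then transitive, partial, extended length) follow RFC 1771; the function names are assumptions for the sketch.

```python
def decode_attribute_flags(flags: int) -> dict:
    """Split the 1-byte attribute flags field into its four defined bits."""
    return {
        "optional": bool(flags & 0x80),         # 0 means well-known
        "transitive": bool(flags & 0x40),
        "partial": bool(flags & 0x20),
        "extended_length": bool(flags & 0x10),  # selects a 2-byte length field
    }


def attribute_length_size(flags: int) -> int:
    """Number of bytes used by the attribute length field (1 or 2)."""
    return 2 if flags & 0x10 else 1
```

A flags value of `0x40` describes a well-known transitive attribute with a 1-byte length, and `0xC0` an optional transitive one.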

BGP NOTIFICATION message
  • Marker (security and message delineation)
  • Length
  • Type: NOTIFICATION
  • Error code
  • Error sub-code
  • Data

  • Used for error notification
    • TCP connection is closed immediately after notification

BGP KEEPALIVE message
  • Marker (security and message delineation)
  • Length
  • Type: KEEPALIVE

  • Sent periodically to peers to ensure connectivity.
  • If hold_time is zero, messages are not sent.
  • Sent in place of an UPDATE message

Path Selection Criteria
  • Information based on path attributes
  • Attributes + external (policy) information
  • Examples:
    • Hop count
    • Policy considerations
      • Preference for AS
      • Presence or absence of certain AS
    • Path origin
    • Link dynamics

Route Selection Summary
  • Enforce relationships:
    • Highest Local Preference
  • Traffic engineering:
    • Shortest ASPATH
    • Lowest MED
    • i-BGP < e-BGP
    • Lowest IGP cost to BGP egress
  • Throw up hands and break ties:
    • Lowest router ID
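
The ordering above maps naturally onto a sort key. The sketch below is a simplification under stated assumptions: the `Route` fields are hypothetical, and MED is compared across all candidates, whereas real implementations normally compare MED only between routes from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Route:
    local_pref: int
    as_path: List[int]
    med: int
    learned_via_ebgp: bool
    igp_cost: int   # IGP cost to the BGP egress router
    router_id: int


def preference_key(route: Route) -> Tuple:
    """Smaller tuples are better; tuple order mirrors the selection steps."""
    return (
        -route.local_pref,                    # highest local preference
        len(route.as_path),                   # shortest ASPATH
        route.med,                            # lowest MED
        0 if route.learned_via_ebgp else 1,   # e-BGP preferred over i-BGP
        route.igp_cost,                       # lowest IGP cost to egress
        route.router_id,                      # lowest router ID breaks ties
    )


def best_route(candidates: List[Route]) -> Route:
    return min(candidates, key=preference_key)
```

Because local preference comes first, a longer path with local preference 100 beats a shorter one with 90; because MED precedes IGP cost, a lower MED wins even when its egress is farther away.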

Back to Frank …
  • Local preference only used in iBGP
  • Higher Local preference values are more preferred

[Figure: peer, provider, and customer links among AS 1, AS 2, AS 3, and AS 4 for prefix 13.13.0.0/16, with routes assigned local pref = 80, local pref = 90, and local pref = 100.]

Backup Links with Local Preference (Outbound Traffic)
  • Primary link: Set Local Pref = 100 for all routes from AS 1
  • Backup link: Set Local Pref = 50 for all routes from AS 1
  • Forces outbound traffic to take primary link, unless link is down.

[Figure: AS 65000 connected to AS 1 by a primary link and a backup link.]

We’ll talk about inbound traffic soon …

ASPATH Attribute

[Figure: prefix 135.207.0.0/16 is originated by AS 6341 (AT&T Research) and propagates through AS 7018 (AT&T), AS 1239 (Sprint), AS 3549 (Global Crossing), AS 1755 (Ebone), and AS 1129 (Global Access), as observed at AS 12654 (RIPE NCC RIS project). The AS path grows at each hop: 6341; 7018 6341; 1239 7018 6341; 3549 7018 6341; 1755 1239 7018 6341; 1129 1755 1239 7018 6341.]

COMMUNITY Attribute to the Rescue!
  • AS 3: normal customer local pref is 100, peer local pref is 90
  • Customer import policy at AS 3:
    • If 3:90 in COMMUNITY then set local preference to 90
    • If 3:80 in COMMUNITY then set local preference to 80
    • If 3:70 in COMMUNITY then set local preference to 70

[Figure: customer AS 2 announces 192.0.2.0/24 to its two providers, AS 1 and AS 3, over a primary and a backup link. One announcement carries ASPATH = 2; the other carries ASPATH = 2 and COMMUNITY = 3:70.]
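
AS 3's customer import policy can be sketched as a small lookup. The community strings and preference values are the ones on the slide; the default of 100 is AS 3's normal customer local preference, and the function name is hypothetical. The rules are applied in the order listed, so if several tags were present the last matching rule would win.

```python
from typing import Iterable

DEFAULT_CUSTOMER_LOCAL_PREF = 100

# (community, local preference) rules in the order they appear in the policy.
COMMUNITY_RULES = [("3:90", 90), ("3:80", 80), ("3:70", 70)]


def customer_import_local_pref(communities: Iterable[str]) -> int:
    """Return the local preference AS 3 assigns to a customer route."""
    tags = set(communities)
    local_pref = DEFAULT_CUSTOMER_LOCAL_PREF
    for community, value in COMMUNITY_RULES:
        if community in tags:
            local_pref = value
    return local_pref
```

Tagging the backup announcement with 3:70 drops it below AS 3's peer preference of 90, so AS 3 would prefer a route to the customer learned from a peer over the backup link.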

Hot Potato Routing: Go for the Closest Egress Point
  • This router has two BGP routes to 192.44.78.0/24.
  • Hot potato: get traffic off of your network as soon as possible. Go for egress 1!

[Figure: a router with IGP distances 56 and 15 to egress 2 and egress 1.]

Getting Burned by the Hot Potato
  • Many customers want their provider to carry the bits!

[Figure: a heavy-content web farm on a low-bandwidth customer backbone attached to a high-bandwidth provider backbone; locations SFF, NYC, and San Diego; a tiny HTTP request triggers a huge HTTP reply. Link costs shown: 2865, 17, 56, 15.]

Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute)
  • Prefer lower MED values
  • This means that MEDs must be considered BEFORE IGP distance!
  • Note 1: some providers will not listen to MEDs
  • Note 2: MEDs need not be tied to IGP distance

[Figure: the heavy-content web farm network announces 192.44.78.0/24 with MED = 56 at one exit and MED = 15 at the other. Link costs shown: 2865, 17, 56, 15.]

MED
  • MED is typically used in provider/subscriber scenarios
  • It can lead to unfairness if used between ISPs because it may force one ISP to carry more traffic:
    • ISP1 ignores MED from ISP2
    • ISP2 obeys MED from ISP1
    • ISP2 ends up carrying traffic most of the way

[Figure: ISP1 and ISP2 interconnected at SF and NY.]

Policies Can Interact Strangely (“Route Pinning” Example)

[Figure: four panels (1 to 4) showing a customer with a backup link.]
  • Panel 2: Install backup link using community
  • Panel 3: Disaster strikes primary link and the backup takes over
  • Panel 4: Primary link is restored but some traffic remains pinned to backup

Path Attributes
  • Categories (recall flags):
    • well-known mandatory (passed on)
    • well-known discretionary (passed on)
    • optional transitive (passed on)
    • optional non-transitive (if unrecognized, not passed on)
  • Optional attributes allow for BGP extensions

Path attribute message format (repeated)
  • Attribute flags: O T P E 0
    • O: optional or well-known
    • T: transitive or local
    • P: partially evaluated
    • E: length in 1 or 2 bytes
  • Attribute type code: Origin, AS_path, Next hop, etc.

ORIGIN path attribute
  • Well-known, mandatory attribute.
  • Describes how a prefix was generated at the origin AS. Possible values:
    • IGP: prefix learned from IGP
    • EGP: prefix learned through EGP
    • INCOMPLETE: none of the above (often seen for static routes)

Next hop path attribute
  • Well-known, mandatory attribute
  • NEXT_HOP: IP address of border router to be used as next hop
  • Usually, next hop is the router sending the UPDATE message
  • Useful when some routers do not speak BGP

Other Attributes
  • ORIGIN
    • Source of route (IGP, EGP, other)
  • NEXT_HOP
    • Address of next hop router to use
    • Used to direct traffic to non-BGP router
  • Check out http://www.cisco.com for full explanation

Outline
  • External BGP (E-BGP)
  • Internal BGP (I-BGP)
  • Multi-Homing
  • Stability Issues

I-BGP

Internal BGP (I-BGP)
  • Same messages as E-BGP
  • Different rules about re-advertising prefixes:
    • Prefix learned from E-BGP can be advertised to I-BGP neighbor and vice-versa, but
    • Prefix learned from one I-BGP neighbor cannot be advertised to another I-BGP neighbor
    • Reason: no AS PATH within the same AS and thus danger of looping.

Internal BGP (I-BGP)
  • R3 can tell R1 and R2 prefixes from R4
  • R3 can tell R4 prefixes from R1 and R2
  • R3 cannot tell R2 prefixes from R1
  • R2 can only find these prefixes through a direct connection to R1
  • Result: I-BGP routers must be fully connected (via TCP)!
    • contrast with E-BGP sessions that map to physical links

[Figure: R1, R2, and R3 in AS1 connected by I-BGP sessions; R3 connected to R4 in AS2 by an E-BGP session.]
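
The re-advertisement rule reduces to a single condition, sketched below. The string labels `"ebgp"` and `"ibgp"` and the function name are assumptions for the example.

```python
def may_readvertise(learned_via: str, neighbor_type: str) -> bool:
    """Decide whether a prefix may be passed on to a given neighbor.

    learned_via:   "ebgp" or "ibgp", the kind of session the prefix arrived on
    neighbor_type: "ebgp" or "ibgp", the kind of session to the candidate neighbor
    """
    # The only forbidden combination is I-BGP to I-BGP: the AS path does not
    # change inside an AS, so it cannot detect a loop there.
    return not (learned_via == "ibgp" and neighbor_type == "ibgp")
```

In the R1 to R4 example, R3 may pass R4's prefixes (learned via E-BGP) to R1 and R2, but may not pass R1's prefixes (learned via I-BGP) to R2, which is why R2 needs its own session with R1.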

Link Failures
  • Two types of link failures:
    • Failure on an E-BGP link
    • Failure on an I-BGP Link
  • These failures are treated completely differently in BGP
  • Why?

Failure on an E-BGP Link
  • If the link R1-R2 goes down
    • The TCP connection breaks
    • BGP routes are removed
  • This is the desired behavior

[Figure: an E-BGP session between R1 in AS1 and R2 in AS2 over a single physical link, with interface addresses 138.39.1.1/30 and 138.39.1.2/30.]

Failure on an I-BGP Link
  • If link R1-R2 goes down, R1 and R2 should still be able to exchange traffic
  • The indirect path through R3 must be used
  • Thus, E-BGP and I-BGP must use different conventions with respect to TCP endpoints

[Figure: an I-BGP connection between R1 (138.39.1.1/30) and R2 (138.39.1.2/30) over a physical link, with R3 offering an indirect path.]

Outline
  • External BGP (E-BGP)
  • Internal BGP (I-BGP)
  • Multi-Homing
  • Stability Issues
  • Scalability Issues

Multi-homing
  • With multi-homing, a single network has more than one connection to the Internet.
  • Improves reliability and performance:
    • Can accommodate link failure
    • Bandwidth is sum of links to Internet
  • Challenges
    • Getting policy right (MED, etc..)
    • Addressing

Multi-homing to a Single Provider: Case 1
  • Easy solution:
    • Use IMUX or Multi-link PPP
  • Hard solution:
    • Use BGP
    • Makes assumptions about traffic (same number of prefixes can be reached from both links)

[Figure: ISP router R1 and Customer router R2.]

Multi-homing to a single provider: Case 2
  • If multiple prefixes, may use MED
    • good if traffic load from prefixes is equal
  • If single prefix, load may be unequal
    • break down the prefix and advertise different prefixes over different links

[Figure: ISP router R1 and Customer routers R2 and R3; prefixes 138.39/16 and 204.70/16.]

Multi-homing to a single provider: Case 3
  • For ISP-> customer traffic, same as before:
    • use MED
    • good if traffic load to prefixes is equal
  • For customer -> ISP traffic:
    • R3 alternates links
    • multiple default routes

[Figure: ISP routers R1 and R2 and Customer router R3; prefixes 138.39/16 and 204.70/16.]

Multi-homing to a single provider: Case 4
  • Most reliable approach
    • no equipment sharing
  • Customer -> ISP:
    • same as case 2
  • ISP -> customer:
    • same as case 3

[Figure: ISP routers R1 and R2 and Customer routers R3 and R4; prefixes 138.39/16 and 204.70/16.]

Multi-homing to Multiple Providers
  • Major issues:
    • Addressing
    • Aggregation
  • Customer address space:
    • Delegated by ISP1
    • Delegated by ISP2
    • Delegated by ISP1 and ISP2
    • Obtained independently
  • Advantages and disadvantages?

[Figure: Customer connected to ISP1 and ISP2, with ISP3 also shown.]

Address Space from one ISP
  • Customer uses address space from one ISP, e.g., ISP1
  • ISP1 advertises /16 aggregate
  • Customer advertises /24 route to ISP2
  • ISP2 relays route to ISP1 and ISP3
  • ISP2-3 use /24 route
  • ISP1 routes directly
  • Problems with traffic load?

[Figure: ISP1, ISP2, ISP3, and the Customer; the aggregate 138.39/16 and the customer prefix 138.39.1/24 are shown.]

Pitfalls
  • ISP1 aggregates to a /19 at border router to reduce internal tables.
  • ISP1 still announces /16.
  • ISP1 hears /24 from ISP2.
  • ISP1 routes packets for customer to ISP2!
  • Workaround: ISP1 must inject /24 into I-BGP.

[Figure: ISP1, ISP2, ISP3, and the Customer; prefixes 138.39/16, 138.39.0/19, and 138.39.1/24 are shown.]

Address Space from Both ISPs
  • ISP1 and ISP2 continue to announce aggregates
  • Load sharing depends on traffic to two prefixes
  • Lack of reliability: if ISP1 link goes down, part of customer becomes inaccessible.
  • Customer may announce prefixes to both ISPs, but still problems with longest match as in case 1.

[Figure: ISP1, ISP2, ISP3, and the Customer, which holds prefixes 204.70.1/24 and 138.39.1/24.]

Address Space Obtained Independently
  • Offers the most control, but at the cost of aggregation.
  • Still need to control paths
    • suppose ISP1 large, ISP2-3 small
    • customer advertises long path to ISP1, but local-pref attribute used to override
    • ISP3 learns shorter path from ISP2

[Figure: ISP1, ISP2, ISP3, and the Customer.]

Outline
  • External BGP (e-BGP)
  • Internal BGP (i-BGP)
  • Multi-Homing
  • Stability Issues
  • Scalability Issues

Convergence in the real-world?
  • [Labovitz99] Experimental results from a two-year study which measured 150,000 BGP faults injected into peering sessions at several IXPs
  • Found
    • Internet averages 3 minutes to converge after failover
    • Some multihomed failovers (short to long ASPath) require 15 minutes

Signs of Routing Instability
  • Record of BGP messages at major exchanges; packet loss increased 30 times and delay 4 times.
  • Discovered orders of magnitude more updates than expected
    • Bulk were duplicate withdrawals
      • Stateless implementation of BGP – did not keep track of information passed to peers
      • Impact of few implementations
    • Strong frequency (30/60 sec) components
      • Interaction with other local routing/links etc.

30 Second Bursts

How Long Does BGP Take to Adapt to Changes?

Thanks to Abha Ahuja and Craig Labovitz for this plot.

Route Flap Storm
  • Overloaded routers fail to send Keep_Alive messages and are marked as down
  • I-BGP peers find alternate paths
  • Overloaded router re-establishes peering session
  • Must send large updates
  • Increased load causes more routers to fail!

Route Flap Dampening
  • Routers now give higher priority to BGP/Keep_Alive to avoid problem
  • Associate a penalty with each route
    • Increase when route flaps
    • Exponentially decay penalty with time
  • When penalty reaches threshold, suppress route
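
A minimal sketch of the penalty mechanism. All numeric values (penalty per flap, suppress threshold, half-life) are hypothetical defaults chosen for the example, and the class name is an assumption. Real implementations also use a separate, lower reuse threshold before un-suppressing a route, which is omitted here.

```python
class DampenedRoute:
    """Tracks a flap penalty that decays exponentially with time."""

    def __init__(self, penalty_per_flap: float = 1000.0,
                 suppress_threshold: float = 2000.0,
                 half_life: float = 900.0) -> None:
        self.penalty_per_flap = penalty_per_flap
        self.suppress_threshold = suppress_threshold
        self.half_life = half_life  # seconds for the penalty to halve
        self.penalty = 0.0
        self.last_update = 0.0

    def _decay(self, now: float) -> None:
        elapsed = max(0.0, now - self.last_update)
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def flap(self, now: float) -> None:
        """Record one flap at time `now` (seconds)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap

    def is_suppressed(self, now: float) -> bool:
        """True while the decayed penalty is at or above the threshold."""
        self._decay(now)
        return self.penalty >= self.suppress_threshold
```

With these sample values, three flaps within twenty seconds push the penalty past the threshold, and two half-lives of quiet bring it back below.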

BGP Limitations: Oscillations

[Figure sequence (seven slides): AS 0, AS 1, and AS 2 are connected to each other and each has a route to destination R. Each AS's table is drawn as a triple of candidate paths with * marking the selected one, starting from (*R,1R,2R) at AS 0, (0R,*R,2R) at AS 1, and (0R,1R,*R) at AS 2. After withdrawals (W) of the routes to R, the ASes successively select and announce alternate paths (01R, 10R, 20R, 12R, 21R), removing entries marked "-" as they are withdrawn, for example (-,*1R,2R), (*0R,-,2R), (*0R,1R,-), (01R,*1R,-), (-,-,*2R), (*01R,10R,-), (-,-,*20R), (-,*12R,-), (*01R,-,-), (-,*12R,21R), and (-,-,-).]

BGP Oscillations
  • Can possibly explore every possible path through the network: (n-1)! combinations
  • Limit between update messages (MinRouteAdver) reduces exploration
    • Forces router to process all outstanding messages
  • Typical Internet failover times
    • New/shorter link: 60 seconds
      • Results in simple replacement at nodes
    • Down link: 180 seconds
      • Results in search of possible options
    • Longer link: 120 seconds
      • Results in replacement or search based on length

Problems
  • Routing table size
    • Need an entry for all paths to all networks
  • Required memory= O((N + M*A) * K)
    • N: number of networks
    • M: mean AS distance (in terms of hops)
    • A: number of AS’s
    • K: number of BGP peers

Routing Table Size
  • Problem reduced with CIDR

Networks   Mean AS Distance   Number of AS’s   BGP Peers/Net   Memory
2,100      5                  59               3               27,000
4,000      10                 100              6               108,000
10,000     15                 300              10              490,000
100,000    20                 3,000            20              1,040,000
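
The formula O((N + M*A) * K) can be turned into a rough byte estimate. The sketch below assumes 4 bytes per network and 2 bytes per AS number; these constants are not stated on the slides and are chosen because they reproduce the first three rows of the table. The function name is hypothetical.

```python
def bgp_memory_bytes(networks: int, mean_as_distance: int, num_ases: int,
                     peers_per_net: int, bytes_per_network: int = 4,
                     bytes_per_as: int = 2) -> int:
    """Estimate routing table memory as (N + M*A) * K with per-item sizes.

    bytes_per_network and bytes_per_as are assumed encodings, not slide values.
    """
    network_part = networks * bytes_per_network
    path_part = mean_as_distance * num_ases * bytes_per_as
    return (network_part + path_part) * peers_per_net
```

Under these assumptions the first row gives 26,970 (about 27,000) and the second and third give exactly 108,000 and 490,000. The fourth row computes to 10,400,000, ten times the 1,040,000 listed, which may point to a dropped digit in the table, although the slides do not say.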

Outline
  • External BGP (e-BGP)
  • Internal BGP (i-BGP)
  • Multi-Homing
  • Stability Issues
  • Scalability Issues

Big and Getting Bigger
  • Scaling the iBGP mesh
    • Confederations
    • Route Reflectors
  • BGP Table Growth
    • Address aggregation (CIDR)
    • Address allocation
  • AS number allocation and use
  • Dynamics of BGP
    • Inherent vs. accidental oscillation
    • Rate limiting and route flap dampening
    • Lots and lots of noise
    • Slow convergence time

Scale

iBGP Mesh Does Not Scale
  • N border routers means N(N-1)/2 peering sessions
  • Each router must have N-1 iBGP sessions configured
  • The addition of a single iBGP speaker requires configuration changes to all other iBGP speakers
  • Size of iBGP routing table can be order N larger than number of best routes (remember alternate routes!)
  • Each router has to listen to update noise from each neighbor
  • Currently four solutions:
    • (0) Buy bigger routers!
    • (1) Break AS into smaller ASes
    • (2) BGP Route reflectors
    • (3) BGP confederations

[Figure: an eBGP update fanning out as iBGP updates.]
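
To make the growth concrete, here is a small sketch of the two counts from the slide. The function names are assumptions for the example.

```python
def full_mesh_sessions(n: int) -> int:
    """Total iBGP sessions in a full mesh of n routers: N(N-1)/2."""
    return n * (n - 1) // 2


def sessions_per_router(n: int) -> int:
    """iBGP sessions each router must have configured: N-1."""
    return n - 1
```

Going from 10 to 100 routers raises the total from 45 to 4,950 sessions, while each router goes from 9 to 99 configured neighbors.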
BGP Table Growth

Thanks to Geoff Huston. http://www.telstra.net/ops/bgptable.html on August 8, 2001

Large BGP Tables Considered Harmful
  • Routing tables must store best routes and alternate routes
  • Burden can be large for routers with many alternate routes (route reflectors for example)
  • Routers have been known to die
  • Increases CPU load, especially during session reset

Moore’s Law may save us in theory. But in practice it means spending money to upgrade equipment …

Deaggregation Due to Multihoming May be a Leading Cause
  • AS 2 is “punching a hole” in the CIDR block of AS 1
  • If AS 1 does not announce the more specific prefix, then most traffic to AS 2 will go through AS 3 because it is a longer match

[Figure: customer AS 2 (12.2.0.0/16) with two providers, AS 1 and AS 3. The aggregate 12.0.0.0/8 and the more specific 12.2.0.0/16 are both shown.]
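
The "longer match" effect can be demonstrated with Python's standard `ipaddress` module. The forwarding table below is a hypothetical view from some third-party router that hears only the /8 aggregate via AS 1 and the /16 via AS 3; the function name is an assumption.

```python
import ipaddress
from typing import Dict, Optional


def longest_prefix_match(destination: str, table: Dict[str, str]) -> Optional[str]:
    """Return the next hop of the most specific prefix containing destination."""
    address = ipaddress.ip_address(destination)
    best_network = None
    best_next_hop = None
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (
            best_network is None or network.prefixlen > best_network.prefixlen
        ):
            best_network, best_next_hop = network, next_hop
    return best_next_hop


# Hypothetical table: AS 1 announces only its aggregate, AS 3 the customer's /16.
table = {"12.0.0.0/8": "via AS 1", "12.2.0.0/16": "via AS 3"}
```

A destination inside 12.2.0.0/16 is sent via AS 3 even though the /8 also covers it, while other addresses in 12.0.0.0/8 still go via AS 1.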

How Many ASNs are there?

Thanks to Geoff Huston. http://www.telstra.net/ops on June 23, 2001

When will we run out of ASNs?

[Figure: ASN growth projected against the limit of 64,511, with candidate exhaustion dates marked “2005?” and “2007?”.]

What is to be done?
  • Make ASNs larger than 16 bits
    • How about 32 bits?
    • See Internet Draft: “BGP support for four-octet AS number space” (draft-ietf-idr-as4bytes-03.txt)
    • Requires protocol change and wide deployment
  • Change the way ASNs are used
    • Allow multihomed, non-transit networks to use private ASNs
    • Uses ASE (AS number Substitution on Egress)
    • See Internet Draft: “Autonomous System Number Substitution on Egress” (draft-jhaas-ase-00.txt)
    • Works at edge, requires protocol change (for loop prevention)
    • Makes some kinds of debugging harder!

AS Graphs Can Be Fun

The subgraph showing all ASes that have more than 100 neighbors in the full graph of 11,158 nodes. July 6, 2001. Point of view: AT&T route-server

AS Graphs Do Not Show Topology!
  • The AS graph may look like this. Reality may be closer to this…
  • BGP was designed to throw away information!

BGP Dynamics
  • How many updates are flying around the Internet?
  • How long does it take routes to change?

The goals of (1) fast convergence, (2) minimal updates, and (3) path redundancy are at odds

Daily Update Count

A Few Bad Apples …

  • Most prefixes are stable most of the time.
  • On this day, about 83% of the prefixes were not updated.
  • Typically, 80% of the updates are for less than 5% of the prefixes.

[Figure axis: percent of BGP table prefixes.]

Thanks to Madanlal Musuvathi for this plot. Data source: RIPE NCC

Two BGP Mechanisms for Squashing Updates
  • Rate limiting on sending updates
    • Send batch of updates every MinRouteAdvertisementInterval seconds (+/- random fuzz)
    • Default value is 30 seconds
    • A router can change its mind about best routes many times within this interval without telling neighbors
    • Effective in dampening oscillations inherent in the vectoring approach
  • Route Flap Dampening
    • Punish routes for misbehaving
    • Must be turned on with configuration

Two Main Factors in Delayed Convergence
  • Rate limiting timer slows everything down
  • BGP can explore many alternate paths before giving up or arriving at a new path
    • No global knowledge in vectoring protocols

Q: Why All the Updates?
  • Networks come, networks go
  • There’s always a router rebooting somewhere
  • Hardware failure, flaky interface cards, backhoes digging, floods in Houston, …

This is “normal”: exactly what dynamic routing is designed for…

Q: Why All the Updates?
  • Misconfiguration
  • Route flap dampening not widely used
  • BGP exploring many alternate paths
  • Software bugs in implementation of routing protocols
  • BGP session resets due to congestion or lack of interoperability: BGP sessions are brittle. One malformed update is enough to reset session and flap 100K routes. (Consequence of incremental approach)
  • IGP instability exported by use of MEDs or IGP tie breaker
  • Sub-optimal vendor implementation choices
  • Secret sauce routing algorithms attempting fancy-dancy tricks
  • Weird policy interactions (MED oscillation, BAD GADGETS??)
  • Gnomes, sprites, and fairies
  • ….

A: NO ONE REALLY KNOWS …

IGP Tie Breaking Can Export Internal Instability to the Whole Wide World

[Figure: prefix 192.44.78.0/24 from AS 1 reaches AS 4 through AS 2 and AS 3. IGP costs 56, 15, and 10 are shown; as an internal cost flaps, the announcement for 192.44.78.0/24 alternates between ASPATH = 4 2 1 and ASPATH = 4 3 1, and the FLAP propagates onward.]

MEDs Can Export Internal Instability

[Figure: the heavy-content web farm network announces 192.44.78.0/24 with MED = 56 OR 10 at one exit, changing as an internal cost flaps, and MED = 15 at the other, so the FLAP propagates into the neighboring network. Link costs shown: 2865, 17, 10, 56, 15.]

Implementation Does Matter!

[Plot annotations: stateless withdraws widely deployed; stateful withdraws widely deployed.]

Thanks to Abha Ahuja and Craig Labovitz for this plot.

How Long Will Interdomain Routing Continue to Scale?

A quote from some recent email:

... the existing interdomain routing infrastructure is rapidly nearing the end of its useful lifetime. It appears unlikely that mere tweaks of BGP will stave off fundamental scaling issues, brought on by growth, multihoming and other causes.

Is this true or false? How can we tell?

Research required…

Next Lecture: Multicasting
  • How to send data to a group of receivers.
  • Assigned reading
    • Multicast Routing in Datagram Internetworks and Extended LANs
    • Next Branch Multicast (NBM) routing protocol
    • Chapter 4, multicasting.
