Netreview detecting when interdomain routing goes wrong l.jpg
This presentation is the property of its rightful owner.
Sponsored Links
1 / 22

NetReview: Detecting when interdomain routing goes wrong PowerPoint PPT Presentation


  • 42 Views
  • Uploaded on
  • Presentation posted in: General

NetReview: Detecting when interdomain routing goes wrong. Ioannis Avramopoulos Deutsche Telekom Laboratories. Peter Druschel MPI-SWS. Andreas Haeberlen MPI-SWS / Rice. Jennifer Rexford Princeton. Motivation. This is just the tip of the iceberg

Download Presentation

NetReview: Detecting when interdomain routing goes wrong

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Netreview detecting when interdomain routing goes wrong l.jpg

NetReview: Detecting when interdomain routing goes wrong

Ioannis AvramopoulosDeutsche Telekom Laboratories

Peter DruschelMPI-SWS

Andreas HaeberlenMPI-SWS / Rice

Jennifer RexfordPrinceton

© 2009 Andreas Haeberlen, MPI-SWS


Motivation l.jpg

Motivation

  • This is just the tip of the iceberg

  • A considerable fraction of Internet prefixes is affected by routing problems every day

YouTube outage underscores big Internet problem

BGP (Border Gateway Protocol) routing

© 2009 Andreas Haeberlen, MPI-SWS


Bgp and its problems l.jpg

I know how toget to AS A

BGP and its problems

E

B

L

  • ASes exchange routing information via BGP

  • BGP routing is plagued with many problems:

    • Misconfigurations

I

A

A

F

M

C

J

D

G

Autonomoussystems(ASes)

N

H

K

, bugs

, instabilities,

, attacks by spammers

hijacks, oscillation, equipment failures, policy conflicts, ...

© 2009 Andreas Haeberlen, MPI-SWS


Approach fault detection l.jpg

Approach: Fault detection

  • Goals:

    • Reliably detect each routing problem, and

    • link it to the AS that caused it

  • Benefits:

    • ASes can respond to problems quickly

    • No need to diagnose faults manually

    • Works for a very broad class of problems

    • Provides an incentive for reliable routing

    • Easy to deploy incrementally

© 2009 Andreas Haeberlen, MPI-SWS


Challenges in bgp fault detection l.jpg

Challenges in BGP fault detection

  • Idea: Upload all router logs to a central entity, who inspects them for problems

    • Sufficient to find almost any routing problem

  • Why wouldn't this work in practice?

    • Privacy: Logs contain sensitive information

    • Reliability: Logs may be inaccurate (bugs, hackers)

    • Automation: Can't manually inspect that much data

    • Deployability: Can't assume global deployment

    • Decentralization: ASes wouldn't accept a single detector entity

© 2009 Andreas Haeberlen, MPI-SWS


Netreview from 10 000 feet l.jpg

NetReview from 10,000 feet

C

E

  • Border routers maintain logs of all BGP messages

    • Logs are tamper-evident  can reliably detect & obtain proof if faulty routers omit, forge, or modify log entries

  • Neighbors periodically audit each other's logs and check them for routing problems

    • If a problem is found, auditor can prove its existence to a third party

A

F

D

Logs of BGPmessages

B

© 2009 Andreas Haeberlen, MPI-SWS


Outline l.jpg

Outline

  • Introduction

    • Motivation: Internet routing problems

    • Approach: Fault detection

  • What is a BGP fault?

  • The NetReview system

  • Practical challenges

  • Evaluation

  • Summary

© 2009 Andreas Haeberlen, MPI-SWS


Bgp routing policies l.jpg

BGP routing policies

  • How do ASes decide what to announce via BGP?

  • Each AS has a routing policy, which is based on:

    • Peering agreements: Customer/provider, ...

    • Best practices: Limited path length, ...

    • Internal goals: Choose the shortest/cheapest path, ...

    • Address assignments: IP address prefixes, ...

C

E

A

D

F

C's provider

B

G

© 2009 Andreas Haeberlen, MPI-SWS


What is a bgp fault l.jpg

What is a BGP fault?

  • Expected behavior of the AS := Combination of its peering agreements, best practices, internal goals, ...

  • BGP fault := The BGP messages sent by the AS do not conform to its expected behavior

  • How do we know what BGP messages the AS sent?

    • Need a complete+accurate message trace even if some routers are faulty in arbitrary, unknown ways

    • Requires a robust+secure tracing mechanism

  • How do we know what its expected behavior is?

    • Different for every AS  need a specification

© 2009 Andreas Haeberlen, MPI-SWS


Bgp rules l.jpg

C

E

A

F

D

B

G

BGP rules

  • For example, D might specify the following:

    • "I will filter out routes with excessive paths" (best practice)

    • "I will act as C's provider" (peering agreement)

    • "I will prefer routes through B, if available" (internal)

  • Some rules may be confidential, but the AS need not reveal all of them to each auditor

"Rules"

© 2009 Andreas Haeberlen, MPI-SWS


Outline11 l.jpg

Outline

  • Introduction

  • What is a BGP fault?

  • The NetReview system

  • Practical challenges

  • Evaluation

  • Summary

© 2009 Andreas Haeberlen, MPI-SWS


The tamper evident log l.jpg

Message

ACK

B's log

H4

Recv(M)

H3

Hash chain

Send(Z)

H2

Recv(Y)

H1

H0

Send(X)

The tamper-evident log

  • Based on the tamper-evident log in PeerReview [SOSP'07]

    • If router omits, modifies, or forges entries, neighbors can detect this and obtain evidence

  • Log entries form a hash chain

    • Messages include signed hash

    • Tampering breaks the hash chain and is thus detectable

  • Messages are acknowledged

    • Detects if message is ignored

  • Neighbors gossip about the hash values they've seen

Hash(log)

B

A

Hash(log)

© 2009 Andreas Haeberlen, MPI-SWS


Writing rules l.jpg

Writing rules

D

AS1 AS2

D

D AS1 AS2

D

  • Rules are predicates on the AS's routing state

    • Declarative; easy to get correct

    • Even simple rules can be very powerful

D AS3

ownPrefixes

D

D

Describes everything that S-BGPcan check, and more!

© 2009 Andreas Haeberlen, MPI-SWS


Auditing and rule evaluation l.jpg

Auditing and rule evaluation

E

  • To audit a neighboring AS:

    • Auditor requests the logs from each border router

    • Auditor checks logs for inconsistencies and tampering

    • Auditor locally replays the logs  series of routing states

    • Auditor evaluates the rules over each routing state

    • If a rule is violated during some time interval, auditor extracts verifiable evidence from the logs

Auditor

Routing state

D

Time

Rule violated in this interval

© 2009 Andreas Haeberlen, MPI-SWS


Outline15 l.jpg

Outline

  • Introduction

  • What is a BGP fault?

  • The NetReview system

  • Practical challenges

    • Incentives for incremental deployment

    • Partial deployment

    • Working without a certificate authority

    • Using existing routers

  • Evaluation

  • Summary

© 2009 Andreas Haeberlen, MPI-SWS


Incremental deployment l.jpg

Incremental deployment

  • What is the smallest useful deployment?

    • One AS can find bugs, misconfigurations, ...

    • Two adjacent ASes can check peering agreements, ...

  • What are the incentives for deployment?

    • Reliable ASes can attract more customers

    • Logs can be used for root-cause analysis

?

© 2009 Andreas Haeberlen, MPI-SWS


Outline17 l.jpg

Outline

  • Introduction

  • What is a BGP fault?

  • The NetReview system

  • Practical challenges

  • Evaluation

  • Summary

© 2009 Andreas Haeberlen, MPI-SWS


Experimental setup l.jpg

Experimental setup

Internet

AS 1

  • Synthetic network of 35 Zebra BGP daemons

  • Default routing policies (Gao-Rexford)

  • Injected real BGP trace (Equinix) to get scale

  • Results in this talk are from AS 5 (92% of Internet ASes have degree five or less)

Tier 1

AS 3

AS 2

AS 4

AS 6

AS 5

Tier 2

Stubs

AS 10

AS 8

AS 9

AS 7

© 2009 Andreas Haeberlen, MPI-SWS


Evaluation functionality check l.jpg

Evaluation: Functionality check

  • Fault injection experiment with five rules based on common routing problems:

    • No origin misconfiguration

    • Export customer routes

    • Honor NO_ADVERTISE community

    • Consistent path length

    • Backup link

  • NetReview detected all the injected faults

    • Also produced diagnostic information, such as time when the fault occurred, and prefixes that were affected

© 2009 Andreas Haeberlen, MPI-SWS


Evaluation overhead l.jpg

Evaluation: Overhead

  • Processing power: 15-minute log segment can be checked in 41.5s on a P4

    • A single commodity PC is sufficient for small networks

  • Storage space: 710kB/minute, ≈356 GB/year

    • Fits comfortably on asingle hard disk

  • Bandwidth: 420kbps, including BGP updates

    • Insignificant compared to typical traffic volume

© 2009 Andreas Haeberlen, MPI-SWS


Related work l.jpg

Related Work

  • Fault prevention

    • Secure routing protocols: S-BGP, soBGP, SPV, ...

    • Trusted monitors: N-BGP

  • Heuristic fault detection

    • Anomaly detection

    • Root-cause analysis

  • Accountability

    • PeerReview, AIP, AudIt, ...

© 2009 Andreas Haeberlen, MPI-SWS


Summary l.jpg

Summary

  • NetReview:A fault detection system for interdomain routing

    • Automatically detects a wide variety of routing problems

    • Links each problem to the responsible AS

    • Not a heuristic - produces proof of each fault

  • NetReview is practical

    • Easy to deploy incrementally

    • No PKI required

    • Reasonable overhead

Thank you!

© 2009 Andreas Haeberlen, MPI-SWS


  • Login