FlowRoute: Inferring Forwarding Table Updates Using Passive Flow-level Measurements


  1. Amogh Dhamdhere (CAIDA/UCSD), amogh@caida.org, with Lee Breslau, Nick Duffield, Cheng Ee, Alexandre Gerber, Carsten Lund and Shubho Sen (AT&T Labs-Research). FlowRoute: Inferring Forwarding Table Updates Using Passive Flow-level Measurements

  2. Motivation • Routing protocol performance during routing events can affect end-to-end performance • Transient loops and packet losses may occur during routing reconvergence • Network operators need to monitor routing protocol performance • Do routers respond as expected? • Update their forwarding tables in a timely manner? • Update their forwarding tables to the expected state? IMC 2010, Melbourne Australia

  3. Monitoring Routing Events • Control plane monitors (e.g., OSPFmon, BGPmon) • Monitor the control plane • Cannot measure when a router implemented a change in its forwarding table • Active probing • Can only monitor paths that are probed • Spatial and temporal resolution limited by placement of probes and probing frequency

  4. FlowRoute • A data-plane monitoring tool to work in conjunction with control plane monitors • Infer forwarding table updates using flow-level measurements • Works offline, for after-the-fact forensics and analysis • No additional overhead on routers • Uses flow-level measurements (e.g., Netflow) that are already collected

  5. Basic Method • Single-packet flows f1 and f2 towards D • f1 seen at N1 at time T1, with R as previous hop: N1 is R’s next hop towards D at T1 • f2 seen at N2 at time T2, with R as previous hop: N2 is R’s next hop towards D at T2 • Therefore R’s next hop towards D changed in [T1, T2] • [Figure: R forwards f1 to N1 at T1 and f2 to N2 at T2]

  6. Routing Flow Records • [Figure: routers Rp, R and Rn in sequence, with incoming interface i and outgoing interface o at R, and link propagation delay δ] • R sees a flow towards destination D from tf to tl • Netflow record: (R, i, o, tf, tl, D) • Map outgoing interface o to next hop router Rn • Map incoming interface i to previous hop router Rp • Subtract link propagation delays • Duplicate first packet timestamp • Resulting routing flow records: (Rp, tf-δ, tl-δ, D, R) and (R, tf, tf, D, Rn) • One flow record at R produces two routing flow records, giving the routing state of R and Rp
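The conversion on slide 6 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `FlowRecord` and `RFR` types and the `neighbor_of` and `delay_of` lookup tables are hypothetical names, and the sketch assumes δ is the propagation delay of the link from Rp to R.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """A Netflow-style record (R, i, o, tf, tl, D) as listed on slide 6."""
    router: str
    in_if: str
    out_if: str
    t_first: float
    t_last: float
    dest: str


@dataclass(frozen=True)
class RFR:
    """Routing flow record in slide 6 order: (router, start, end, D, next hop)."""
    router: str
    t_start: float
    t_end: float
    dest: str
    next_hop: str


def to_rfrs(rec, neighbor_of, delay_of):
    """Turn one flow record at R into two routing flow records.

    neighbor_of maps (router, interface) to the adjacent router (hypothetical table).
    delay_of maps (from_router, to_router) to a link delay (hypothetical table).
    """
    prev_hop = neighbor_of[(rec.router, rec.in_if)]
    next_hop = neighbor_of[(rec.router, rec.out_if)]
    # Assumption: delta is the propagation delay of the link Rp -> R.
    delta = delay_of[(prev_hop, rec.router)]
    # Rp forwarded the flow to R over the whole interval, shifted back by delta.
    upstream = RFR(prev_hop, rec.t_first - delta, rec.t_last - delta, rec.dest, rec.router)
    # For R itself, the first-packet timestamp is duplicated as start and end.
    local = RFR(rec.router, rec.t_first, rec.t_first, rec.dest, next_hop)
    return upstream, local
```

For example, a record (R, i, o, 10.0, 12.0, D) with a 0.5 s link delay yields (Rp, 9.5, 11.5, D, R) and (R, 10.0, 10.0, D, Rn).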

  7. Inferring Forwarding Table Updates • Collect Netflow records from all routers • Convert to Routing Flow Records (RFRs) for offline processing • Given RFRs (R, T1, T2, N1, D) and (R, T3, T4, N2, D) with T2 < T3: R changed next hop towards D in the time window [T2, T3], the “range” of the forwarding table update • [Figure: timeline with N1 over [T1, T2] followed by N2 over [T3, T4]]

  8. Inferring Forwarding Table Updates • Collect Netflow records from all routers • Convert to Routing Flow Records (RFRs) for offline processing • Given RFRs (R, T1, T2, N1, D) and (R, T3, T4, N2, D) with T2 > T3: the routing flow records overlap, which could be due to Equal Cost Multi-Path (ECMP) • [Figure: timeline with N1 over [T1, T2] overlapping N2 over [T3, T4]]
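The two cases on slides 7 and 8 can be sketched as a single comparison of two RFRs for the same router and destination. The `RFR` tuple and `compare` function are hypothetical names; the field order follows slides 7 and 8. The slides do not say what happens when T2 equals T3, so treating that case as a zero-width update range is an assumption of this sketch.

```python
from typing import NamedTuple, Optional, Tuple


class RFR(NamedTuple):
    """Routing flow record in the order used on slides 7 and 8."""
    router: str
    t_start: float
    t_end: float
    next_hop: str
    dest: str


def compare(a: RFR, b: RFR) -> Tuple[str, Optional[Tuple[float, float]]]:
    """Classify a pair of RFRs for the same router and destination."""
    if (a.router, a.dest) != (b.router, b.dest):
        raise ValueError("RFRs must share router and destination")
    # Order the pair so that `first` starts no later than `second`.
    first, second = sorted((a, b), key=lambda r: r.t_start)
    if first.next_hop == second.next_hop:
        return ("no_change", None)
    # Assumption: T2 == T3 is treated as a zero-width update range.
    if first.t_end <= second.t_start:
        return ("update", (first.t_end, second.t_start))
    # T2 > T3: the records overlap, which could be due to ECMP.
    return ("overlap", None)
```

With RFRs (R, 1, 2, N1, D) and (R, 3, 4, N2, D), the result is an update with range (2, 3). If the first record instead ended at time 3.5, the pair would be reported as overlapping.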

  9. ECMP • Router R can forward flows destined to D to either N1 or N2 • RFRs generated at N1 and N2 can overlap, which looks like an inconsistency • Non-overlapping RFRs can appear as a routing change for every flow • [Figure: R forwards f1 to N1 during [T1, T2] and f2 to N2 during [T3, T4], both towards D]

  10. Filtering ECMP • Observation: In 99% of next hop changes due to ECMP, a router routes fewer than 20 flows towards one next hop before routing a flow towards an equal-cost next hop • Filtering heuristic: Declare a routing change only if more than 20 flows were routed to the old next hop before a flow is routed to the new next hop • Conservative: May miss routing changes that occur before 20 flows are forwarded to the old next hop
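One way to read the filtering heuristic on slide 10 is as a run-length check over the time-ordered next hops observed for one router and destination. The sketch below takes that reading as an assumption; `filtered_changes` and its `threshold` parameter are hypothetical names, not part of FlowRoute.

```python
def filtered_changes(next_hops, threshold=20):
    """Return next hop changes that survive the ECMP filter.

    next_hops: time-ordered next hops, one entry per observed flow.
    A change is declared only if more than `threshold` consecutive flows
    went to the old next hop before a flow went to the new one.
    Returns a list of (flow_index, old_hop, new_hop) tuples.
    """
    changes = []
    run_hop = None  # next hop of the current run of flows
    run_len = 0     # number of consecutive flows seen towards run_hop
    for idx, hop in enumerate(next_hops):
        if hop == run_hop:
            run_len += 1
            continue
        if run_hop is not None and run_len > threshold:
            changes.append((idx, run_hop, hop))
        run_hop, run_len = hop, 1
    return changes
```

A sequence of 25 flows to N1 followed by flows to N2 yields one change at index 25, while a sequence alternating between N1 and N2 yields none, since no run exceeds the threshold.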

  11. Sampling • Both packet and flow sampling occur in high-speed networks • Sampling does not affect correctness of inferred ranges • Sampling affects the width of ranges: more aggressive sampling gives lower temporal resolution • More discussion in the paper
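The effect of sampling on range width can be illustrated with a toy example. This sketch uses synthetic data and deterministic thinning (keeping every tenth flow) as a stand-in for real packet or flow sampling; `update_range` is a hypothetical helper.

```python
def update_range(flows):
    """Infer the update range from time-ordered (timestamp, next_hop) pairs.

    Assumes the flows contain exactly one next hop change.
    Returns (time of last flow to the old hop, time of first flow to the new hop).
    """
    old_hop = flows[0][1]
    last_old = max(t for t, hop in flows if hop == old_hop)
    first_new = min(t for t, hop in flows if hop != old_hop)
    return (last_old, first_new)


# Synthetic data: one flow per second; the next hop changes between t=49 and t=50.
flows = [(float(t), "N1" if t < 50 else "N2") for t in range(100)]

# Deterministic thinning: keep every tenth flow, starting at index 3.
sampled = flows[3::10]

full_range = update_range(flows)       # (49.0, 50.0)
sampled_range = update_range(sampled)  # (43.0, 53.0)
```

Both ranges contain the true change time, so the inference stays correct, but the sampled range is ten times wider.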

  12. Timely Forwarding Table Updates • [Figure: forwarding table update ranges plotted against an OSPF event “cluster”] • All ranges overlap with the OSPF event cluster

  13. Delayed Forwarding Table Updates • [Figure: some forwarding table updates consistent with OSPF events, others delayed w.r.t. OSPF events] • Such behavior is not detectable using a control plane monitor alone!
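The comparison between update ranges and OSPF event clusters on slides 12 and 13 can be sketched as an interval check. The slides do not give the exact matching rule, so the rule below is an assumption: a range counts as timely if it overlaps the cluster interval and as delayed if it lies entirely after the cluster. `classify_range` and its labels are hypothetical names.

```python
def classify_range(update_range, cluster):
    """Compare an inferred update range with an OSPF event cluster.

    update_range: (start, end) of the inferred forwarding table update range.
    cluster: (start, end) of the OSPF event cluster.
    """
    r_start, r_end = update_range
    c_start, c_end = cluster
    # Overlap: the update may have happened while the OSPF events were occurring.
    if r_start <= c_end and c_start <= r_end:
        return "timely"
    # The whole range lies after the cluster: the update came after the events ended.
    if r_start > c_end:
        return "delayed"
    # The whole range lies before the cluster (not discussed on the slides).
    return "before_cluster"
```

For a cluster spanning (11, 15), the range (10, 12) is classified as timely and the range (20, 22) as delayed.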

  14. Delayed Forwarding Table Updates • Used FlowRoute on a 2-month dataset • 2666 OSPF event clusters • 97010 time ranges consistent with OSPF event clusters • 117 ranges that showed delayed forwarding table updates • Two routers showed delayed updates 14 times in the 2-month dataset • Subsequently retired from the network

  15. Loops • Delayed forwarding table updates can cause transient loops • Example in the paper of how this can happen • 392 instances of 1-hop loops during the 2-month dataset • Mostly short-lived (sub-second) • A few loops lasted tens of seconds • Long-lived loops were due to delayed updates by one or more routers

  16. Summary • FlowRoute: A data plane monitor to work in conjunction with control plane monitors for forensics and analysis of forwarding table updates • Used to study forwarding table updates in a tier-1 ISP network • Found cases of delayed forwarding table updates due to buggy routers • Also found transient loops during routing convergence and spikes in link utilization

  17. Thanks! amogh@caida.org www.caida.org/~amogh

  18. Practical Issues • What should be the destination? Can be either destination IP address, prefix, or MPLS tunnel endpoint • Need to observe sufficient flow volume • We choose MPLS tunnel endpoint • Sampling • Both packet and flow sampling occur in high-speed networks • Sampling does not affect correctness of inferred ranges • Affects the width of the ranges: more aggressive sampling gives lower temporal resolution

  19. Existing Approaches • Control plane monitors (e.g., OSPFmon, BGPmon) • Monitor the control plane, cannot measure when a router implemented a change in its forwarding table • Collect and process router logs • Large volume of data, transporting and processing is hard • Limited by polling frequency, e.g., 5 minutes with SNMP • Active probing • Spatial and temporal resolution limited by placement of probes and probing frequency

  20. Delayed Forwarding Table Updates • Used FlowRoute on a 2-month dataset • 2666 OSPF event clusters • 97010 time ranges consistent with OSPF event clusters • 58 clusters, 117 ranges that showed delayed forwarding table updates • Two routers showed delayed updates 14 times in the 2-month dataset • Subsequently retired from the network
