1 / 46

Router Design

Router Design. (Nick Feamster) February 11, 2008. Today’s Lecture. The design of big, fast routers Partridge et al. , A 50 Gb/s IP Router Design constraints Speed Size Power consumption Components Algorithms Lookups and packet processing (classification, etc.) Packet queuing

rswan
Download Presentation

Router Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Router Design (Nick Feamster)February 11, 2008

  2. Today’s Lecture • The design of big, fast routers • Partridge et al., A 50 Gb/s IP Router • Design constraints • Speed • Size • Power consumption • Components • Algorithms • Lookups and packet processing (classification, etc.) • Packet queuing • Switch arbitration

  3. What’s In A Router • Interfaces • Input/output of packets • Switching fabric • Moving packets from input to output • Software • Routing • Packet processing • Scheduling • Etc.

  4. What a Router Chassis Looks Like Cisco CRS-1 Juniper M320 19” 17” Capacity: 1.2Tb/sPower: 10.4kWWeight: 0.5 TonCost: $500k Capacity: 320 Gb/sPower: 3.1kW 6ft 3ft 2ft 2ft

  5. What a Router Line Card Looks Like 1-Port OC48 (2.5 Gb/s)(for Juniper M40) 4-Port 10 GigE(for Cisco CRS-1) 10in 2in Power: about 150 Watts 21in

  6. Big, Fast Routers: Why Bother? • Faster link bandwidths • Increasing demands • Larger network size (hosts, routers, users)

  7. Summary of Routing Functionality • Router gets packet • Looks at packet header for destination • Looks up routing table for output interface • Modifies header (TTL, IP header checksum) • Passes packet to output interface

  8. Data Hdr Data Hdr IP Address Next Hop Generic Router Architecture Header Processing Lookup IP Address Update Header Queue Packet Address Table Buffer Memory 1M prefixes Off-chip DRAM 1M packets Off-chip DRAM Question: What is the difference between this architecture and that in today’s paper?

  9. Innovation #1: Each Line Card Has the Routing Tables • Prevents central table from becoming a bottleneck at high speeds • Complication: Must update forwarding tables on the fly. • How does the BBN router update tables without slowing the forwarding engines?

  10. Data Data Data Hdr Hdr Hdr Header Processing Header Processing Header Processing Lookup IP Address Lookup IP Address Lookup IP Address Update Header Update Header Update Header Address Table Address Table Address Table Data Data Hdr Hdr Data Hdr Generic Router Architecture Buffer Manager Buffer Memory Buffer Manager Interconnection Fabric Buffer Memory Buffer Manager Buffer Memory

  11. CPU Buffer Memory Route Table CPU Line Interface Line Interface Line Interface Memory MAC MAC MAC Typically <0.5Gb/s aggregate capacity First Generation Routers Off-chip Buffer Shared Bus Line Interface

  12. Fwding Cache Second Generation Routers CPU Buffer Memory Route Table Line Card Line Card Line Card Buffer Memory Buffer Memory Buffer Memory Fwding Cache Fwding Cache MAC MAC MAC Typically <5Gb/s aggregate capacity

  13. Fwding Table Third Generation Routers “Crossbar”: Switched Backplane Line Card CPU Card Line Card Local Buffer Memory Local Buffer Memory Line Interface CPU Routing Table Memory Fwding Table MAC MAC Typically <50Gb/s aggregate capacity

  14. Innovation #2: Switched Backplane • Every input port has a connection to every output port • During each timeslot, each input connected to zero or one outputs • Advantage:Exploits parallelism • Disadvantage:Need scheduling algorithm

  15. Head-of-Line Blocking Problem: The packet at the front of the queue experiences contention for the output queue, blocking all packets behind it. Input 1 Output 1 Input 2 Output 2 Input 3 Output 3 Maximum throughput in such a switch: 2 – sqrt(2) M.J. Karol, M. G. Hluchyj, and S. P. Morgan, “Input Versus Output Queuing on a Space-Division Packet Switch,” IEEE Transactions On Communications, Vol. Com-35, No. 12, December 1987, pp. 1347-1356.

  16. Speedup • What if the crossbar could have a “speedup”? Key result: Given a crossbar with 2x speedup, any maximal matching can achieve 100% throughput. I.e., does as well as a switch with Nx speedup. S.-T. Chuang, A. Goel, N. McKeown, and B. Prabhakarm, “Matching Output Queueing with a Combined Input Output Queued Switch”, Proceedings of INFOCOM,1998.

  17. Combined Input-Output Queuing • Advantages • Easy to build • 100% can be achieved with limited speedup • Disadvantages • Harder to design algorithms • Two congestion points • Flow control at destination input interfaces output interfaces Crossbar

  18. Solution: Virtual Output Queues • Maintain N virtual queues at each input • one per output Input 1 Output 1 Output 2 Input 2 Output 3 Input 3 N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand, “Achieving 100% Throughput in an Input-Queued Switch,” IEEE Transactions on Communications, Vol. 47, No. 8, August 1999, pp. 1260-1267.

  19. Router Components and Functions • Route processor • Routing • Installing forwarding tables • Management • Line cards • Packet processing and classification • Packet forwarding • Switched bus (“Crossbar”) • Scheduling

  20. Crossbar Switching • Conceptually:N inputs, N outputs • Actually, inputs are also outputs • In each timeslot, one-to-one mapping between inputs and outputs. • Goal: Maximal matching Traffic Demands Bipartite Match L11(n) Maximum Weight Match LN1(n)

  21. Early Crossbar Scheduling Algorithm • Wavefront algorithm Problems: Fairness, speed, …

  22. Alternatives to the Wavefront Scheduler • PIM: Parallel Iterative Matching • Request: Each input sends requests to all outputs for which it has packets • Grant: Output selects an input at random and grants • Accept: Input selects from its received grants • Problem: Matching may not be maximal • Solution: Run several times • Problem: Matching may not be “fair” • Solution: Grant/accept in round robin instead of random

  23. Processing: Fast Path vs. Slow Path • Optimize for common case • BBN router: 85 instructions for fast-path code • Fits entirely in L1 cache • Non-common cases handled on slow path • Route cache misses • Errors (e.g., ICMP time exceeded) • IP options • Fragmented packets • Mullticast packets

  24. PEs Switch LineCards Recent Trends: Programmability • NetFPGA: 4-port interface card, plugs into PCI bus(Stanford) • Customizable forwarding • Appearance of many virtual interfaces (with VLAN tags) • Programmability with Network processors(Washington U.)

  25. Scheduling and Fairness • What is an appropriate definition of fairness? • One notion: Max-min fairness • Disadvantage: Compromises throughput • Max-min fairness gives priority to low data rates/small values • Is it guaranteed to exist? • Is it unique?

  26. Max-Min Fairness • A flow rate x is max-min fair if any rate x cannot be increased without decreasing some y which is smaller than or equal to x. • How to share equally with different resource demands • small users will get all they want • large users will evenly split the rest • More formally, perform this procedure: • resource allocated to customers in order of increasing demand • no customer receives more than requested • customers with unsatisfied demands split the remaining resource

  27. Example • Demands: 2, 2.6, 4, 5; capacity: 10 • 10/4 = 2.5 • Problem: 1st user needs only 2; excess of 0.5, • Distribute among 3, so 0.5/3=0.167 • now we have allocs of [2, 2.67, 2.67, 2.67], • leaving an excess of 0.07 for cust #2 • divide that in two, gets [2, 2.6, 2.7, 2.7] • Maximizes the minimum share to each customer whose demand is not fully serviced

  28. How to Achieve Max-Min Fairness • Take 1: Round-Robin • Problem: Packets may have different sizes • Take 2: Bit-by-Bit Round Robin • Problem: Feasibility • Take 3: Fair Queuing • Service packets according to soonest “finishing time” Adding QoS: Add weights to the queues…

  29. Why QoS? • Internet currently provides one single class of “best-effort” service • No assurances about delivery • Existing applications are elastic • Tolerate delays and losses • Can adapt to congestion • Future “real-time” applications may be inelastic

  30. IP Address Lookup Challenges: • Longest-prefix match (not exact). • Tables are large and growing. • Lookups must be fast.

  31. 128.9.16.14 IP Lookups find Longest Prefixes 128.9.176.0/24 128.9.16.0/21 128.9.172.0/21 142.12.0.0/19 65.0.0.0/8 128.9.0.0/16 0 232-1 Routing lookup:Find the longest matching prefix (aka the most specific route) among all prefixes that match the destination address.

  32. IP Address Lookup Challenges: • Longest-prefix match (not exact). • Tables are large and growing. • Lookups must be fast.

  33. Address Tables are Large

  34. IP Address Lookup Challenges: • Longest-prefix match (not exact). • Tables are large and growing. • Lookups must be fast.

  35. Lookups Must be Fast Year Line 40B packets (Mpkt/s) Cisco CRS-1 1-Port OC-768C (Line rate: 42.1 Gb/s) 1997 622Mb/s 1.94 OC-12 1999 2.5Gb/s 7.81 OC-48 2001 10Gb/s 31.25 OC-192 2003 40Gb/s 125 OC-768 Still pretty rare outside of research networks

  36. IP Address Lookup: Binary Tries Example Prefixes: 0 1 a) 00001 b) 00010 c) 00011 d) 001 e) 0101 g f d f) 011 g) 100 h i h) 1010 e i) 1100 j) 11110000 a b c j

  37. IP Address Lookup: Patricia Trie ExamplePrefixes 0 1 a) 00001 b) 00010 c) 00011 d) 001 e) 0101 g f d j Skip 5 1000 f) 011 g) 100 h i h) 1010 e i) 1100 j) 11110000 a b c Problem: Lots of (slow) memory lookups

  38. Address Lookup: Direct Trie • When pipelined, one lookup per memory access • Inefficient use of memory 0000……0000 1111……1111 24 bits 0 224-1 8 bits 0 28-1

  39. Faster LPM: Alternatives • Content addressable memory (CAM) • Hardware-based route lookup • Input = tag, output = value • Requires exact match with tag • Multiple cycles (1 per prefix) with single CAM • Multiple CAMs (1 per prefix) searched in parallel • Ternary CAM • (0,1,don’t care) values in tag match • Priority (i.e., longest prefix) by order of entries Historically, this approach has not been very economical.

  40. Faster Lookup: Alternatives • Caching • Packet trains exhibit temporal locality • Many packets to same destination • Cisco Express Forwarding

  41. Lookup limited by memory bandwidth. Lookup uses high-degree trie. State of the art: 10Gb/s line rate. Scales to: 40Gb/s line rate. IP Address Lookup: Summary

  42. Fourth-Generation: Collapse the POP High Reliability and Scalability enable “vertical” POP simplification Reduces CapEx, Operational cost Increases network stability

  43. Limit today ~2.5Tb/s • Electronics • Scheduler scales <2x every 18 months • Opto-electronic conversion Switch Linecards Fourth-Generation Routers

  44. Multi-rack routers Switch fabric Linecard In WAN Out In WAN Out

  45. Future: 100Tb/s Optical Router Optical Switch Electronic Linecard #1 Electronic Linecard #625 160-320Gb/s 160-320Gb/s 40Gb/s • Line termination • IP packet processing • Packet buffering • Line termination • IP packet processing • Packet buffering 40Gb/s 160Gb/s Arbitration 40Gb/s Request 40Gb/s Grant (100Tb/s = 625 * 160Gb/s) McKeown et al., Scaling Internet Routers Using Optics, ACM SIGCOMM 2003

  46. Challenges with Optical Switching • Mis-sequenced packets • Pathological traffic patterns • Rapidly configuring switch fabric • Failing components

More Related