
QoS in an Ethernet world


Presentation Transcript


  1. QoS in an Ethernet world. Bill Lynch, Founder & CTO

  2. QoS: Why is it needed? (Or is it?) What does it do? (Or not do?) Gotchas… Why is it hard to deploy?

  3. Triple play data networks [Network diagram: VPN A and VPN B CE sites, headends, and a computational particle physicist connect through Edge PEs, an IP or MPLS or λ core, and Distribution PEs to broadband homes and a centralized headend offering VOD, CONF, and data services; interface content mirroring for security requirements.] Edge: • High-speed Ethernet edge • Assured QoS • DoS prevention. Centralized headend: • Video, voice, data over Ethernet • QoS across thousands of subscribers • SLAs and differential pricing

  4. Triple play data characteristics. Voice: many connections, low BW per connection, latency/jitter requirements. Video: few sources, higher BW, latency requirements. Data: many connections, unpredictable BW, best effort generally okay. Computational particle physicist: very high peak BW and duration, very few connections.

  5. Router QoS [Diagram: a router with eight physical ports.]

  6. Router QoS [Diagram: a router with eight physical ports.] QoS == which packet goes first. It only matters under congestion.
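
Not from the deck, but a minimal sketch of the point, in Python with two assumed priority levels: under strict-priority scheduling, "which packet goes first" is only a real decision when more than one queue holds a backlog, which is exactly the congested case.

```python
from collections import deque

# Two queues, 0 = high priority, 1 = low priority (assumed levels).
queues = {0: deque(), 1: deque()}

def enqueue(priority, packet):
    queues[priority].append(packet)

def dequeue():
    """Return the next packet to send, highest priority first."""
    for prio in sorted(queues):
        if queues[prio]:
            return queues[prio].popleft()
    return None  # link idle: no backlog, so QoS had nothing to decide

enqueue(1, "bulk-data")
enqueue(0, "voice")
assert dequeue() == "voice"  # under congestion, the voice packet wins
```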

  7. Router QoS [Diagram: a router with eight physical ports.] Inherent packet jitter. Worse: N simultaneous arrivals. Bad: per hop! Worse: bigger MTU.

  8. Inherent jitter (per hop!) [Diagram: a path across FE, GE, OC-12, OC-12, and OC-192 links.] Fundamental conclusion: QoS is more important at the edge. The edge is also more likely to congest.
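
A back-of-envelope check of the per-hop claim, assuming a 1500-byte MTU and nominal link rates (the slide's diagram gives no numbers): a priority packet that arrives just after a full-MTU packet started transmitting waits one MTU serialization time, and N simultaneous arrivals multiply the wait.

```python
MTU_BITS = 1500 * 8  # assumed MTU; the slides give no figure

links = {"FE": 100e6, "GE": 1e9, "OC-12": 622e6, "OC-192": 9.95e9}

for name, rate_bps in links.items():
    per_hop_us = MTU_BITS / rate_bps * 1e6
    print(f"{name:>6}: {per_hop_us:7.1f} us/hop, "
          f"{per_hop_us * 8 / 1000:6.3f} ms if 8 packets arrive at once")

# FE: 120 us/hop vs OC-192: ~1.2 us/hop -- the slow edge links dominate
# the jitter budget, which is why QoS matters most at the edge.
```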

  9. Gotchas… • Already no guarantees, from simultaneous arrivals alone… but hope the total worst case is < 10 ms? • And what if your router isn't perfect?

  10. What is Queue Sharing? [Diagram: multiple physical ports feeding shared HI and LO queues.] Queue sharing is when multiple physical or switch-fabric connections must share queues. Example: each input linecard has two queues for each output linecard. All packets in a shared queue are treated equally.

  11. What is Head of Line Blocking? [Diagram: physical ports feeding shared HI and LO queues.] When an output linecard becomes congested, traffic backs up on the input linecard. Traffic control (W/RED) must be performed at the input VOQ.
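
A minimal WRED sketch (thresholds and drop probabilities are assumed, not taken from the deck): the early-drop probability ramps linearly between a minimum and maximum threshold of the averaged queue depth, with a separate profile per drop precedence.

```python
import random

PROFILES = {  # precedence -> (min_threshold, max_threshold, max_prob)
    "green":  (40, 64, 0.05),
    "yellow": (20, 48, 0.20),
}

def wred_drop(avg_depth, precedence):
    """Decide whether to early-drop an arriving packet."""
    min_th, max_th, max_p = PROFILES[precedence]
    if avg_depth < min_th:
        return False                    # below min threshold: never drop
    if avg_depth >= max_th:
        return True                     # above max threshold: always drop
    p = max_p * (avg_depth - min_th) / (max_th - min_th)
    return random.random() < p          # linear ramp in between
```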

  12. What is Head of Line Blocking? [Diagram: physical ports feeding shared HI and LO queues.] The output linecard cannot process all of the offered traffic. Because all traffic in a shared queue (VOQ) is treated equally, traffic on the uncongested port is affected as well.
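
A toy model of the effect, with assumed numbers: packets for the uncongested port sit behind packets for the congested port in the same shared queue, so a stalled head packet blocks both.

```python
from collections import deque

# 10 packets for each port, interleaved into one shared queue.
shared_queue = deque()
for i in range(10):
    shared_queue.append(("congested-port", i))
    shared_queue.append(("uncongested-port", i))

delivered = {"congested-port": 0, "uncongested-port": 0}
for _ in range(40):  # plenty of service slots
    if not shared_queue:
        break
    port, _seq = shared_queue[0]
    if port == "congested-port" and delivered[port] >= 3:
        continue  # congested output accepts no more: the head blocks everyone
    shared_queue.popleft()
    delivered[port] += 1

print(delivered)  # {'congested-port': 3, 'uncongested-port': 3}
# The uncongested port delivered only 3 of its 10 packets, even though
# its own output had capacity to spare: head-of-line blocking.
```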

  13. Queue Sharing Test Results. The congested port (Flows C, D, E) remained at 100% throughput; the uncongested port (Flows A, B) was penalized because of queue sharing.

  14. The effects of Queue Sharing. In the presence of queue sharing, congestion can severely affect the performance of non-congested ports. Congestion is caused by: topology changes, routing instability, denial of service attacks, high service demand, and misconfiguration of systems or devices.

  15. Output Queued Architectures - PRO/8000 [Diagram: eight physical ports around a centralized shared-memory switch fabric.] Only one queuing location exists in the entire system. 36,000 unique hardware queues. Protected bandwidth on a queue. Incoming packets are immediately placed into a unique output queue.

  16. Output Queued Architectures - PRO/8000 [Diagram: eight physical ports around a centralized shared-memory switch fabric.] Only one queuing location exists in the entire system. Over 36,000 unique hardware queues. Bandwidth is protected on a per-queue basis. Incoming packets are immediately placed into a unique output queue.

  17. Output Queued Architectures - PRO/8000 [Diagram: eight physical ports around a centralized shared-memory switch fabric.] Traffic control (W/RED) is performed on each output queue individually. Protected bandwidth for every single queue.
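
The deck doesn't describe the scheduler behind "protected bandwidth for every single queue"; one standard way to get that property is deficit round-robin, sketched here with assumed queue names and quanta.

```python
from collections import deque

queues = {"voice": deque(), "video": deque(), "data": deque()}
quantum = {"voice": 300, "video": 1500, "data": 600}  # bytes per round
deficit = {name: 0 for name in queues}

def drr_round(send):
    """One round: each queue may send up to its banked deficit, so no
    queue can starve another no matter how overloaded the rest are."""
    for name, q in queues.items():
        deficit[name] += quantum[name]
        while q and len(q[0]) <= deficit[name]:
            pkt = q.popleft()
            deficit[name] -= len(pkt)
            send(name, pkt)
        if not q:
            deficit[name] = 0  # idle queues don't hoard credit

queues["voice"].append(b"v" * 200)
queues["data"].append(b"d" * 1200)
drr_round(lambda name, pkt: print(name, len(pkt)))  # voice 200
drr_round(lambda name, pkt: print(name, len(pkt)))  # data 1200 (deficit banked)
```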

  18. PRO/8812 Test Results. The congested port (Flows C, D, E) remained at 100% throughput; the uncongested port (Flows A, B) also remained at 100% throughput.

  19. Triple play data characteristics. Voice: many connections, low BW per connection, latency/jitter requirements. Video: few sources, higher BW, latency requirements. Data: many connections, unpredictable BW, best effort generally okay. Computational particle physicist: very high peak BW and duration, very few connections.

  20. Network QoS architectures

  21. QoS Deployment Issues. Political: peers; QoS is end to end. Equipment: many queues/port, many shapers/port, fast diffserv/remarking, computational expense. Operational: must deploy everywhere, must police at the edge. Commercial: easier short-term solutions to problems, cheaper alternatives. Applications: not tuned or QoS-aware; QoS not 'required' for the application. Geographical: last-mile technologies, single-provider networks, green-field deployments.
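
The "fast diffserv/remarking" the slide lists is done in hardware at line rate, but the bit-level operation itself is simple; here it is in Python on the IPv4 TOS byte, where the DSCP occupies the top 6 bits and ECN the bottom 2 (per RFC 2474; the remarking policy shown is an assumption).

```python
EF, AF11, BE = 46, 10, 0  # standard DSCP code points

def remark(tos_byte, new_dscp):
    """Rewrite the 6-bit DSCP field, preserving the 2 ECN bits."""
    return ((new_dscp & 0x3F) << 2) | (tos_byte & 0x03)

# An out-of-profile EF packet remarked to best effort, ECN untouched:
assert remark(0b10111001, BE) == 0b00000001
```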

  22. Summary • Triple play requires QoS • Services drive quality • Most routers aren’t perfect • Shared queues mean you can’t provision a port independently • Political and deployment problems remain • Some geographic areas better suited

  23. Architecture: never underestimate the power of Moore's Law. SC: 297 sq mm (17.26mm x 17.26mm), 30.5M transistors, 47M contacts, 50 KBytes of memory. LCU: 425 sq mm (20.17mm x 21.07mm), 137M transistors, 188M contacts, 950 KBytes of memory. NPU: 429 sq mm (20.17mm x 21.29mm), 156M transistors, 265M contacts, 1.2 MBytes of memory. Striper: 429 sq mm (20.17mm x 21.29mm), 214M transistors, 400M contacts, 2.6 MBytes of memory. GA: 389 sq mm (19.05mm x 20.4mm), 106M transistors, 188M contacts, 1.2 MBytes of memory. MCU: 225 sq mm (15.02mm x 15.02mm), 83M transistors, 136M contacts, 900 KBytes of memory.

  24. NPU – 40G QoS lookups [Block diagram: FTSRAM, LxU, PxU, IPA, PBU, pacman, QxU.] • VLIW systolic array • Packet advances every cycle • Named bypassing • > 200 processors • 4 ops/cycle/processor • 12 loads every cycle (1 Tb memory BW) • 36 loads/packet
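
A rough sanity check of those figures, assuming minimum-size Ethernet frames (64 bytes plus 20 bytes of preamble and inter-frame gap on the wire): at 40 Gbps the NPU must sustain roughly 60 Mpps, and at 36 loads per packet that is over 2 billion table loads per second, which is why the design issues 12 loads every cycle.

```python
line_rate_bps = 40e9
wire_bits = (64 + 20) * 8          # min frame + preamble/IFG (assumed)

pps = line_rate_bps / wire_bits    # worst-case packets per second
loads_per_sec = pps * 36           # slide: 36 loads/packet

print(f"{pps / 1e6:.1f} Mpps, {loads_per_sec / 1e9:.2f}G loads/s")
# ~59.5 Mpps, ~2.14G loads/s
```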

  25. NPU [Block diagram: FTSRAM, LxU, PxU, IPA, PBU, pacman, QxU.] • VLIW systolic array • Normal instruction set: arithmetic, logical, branch, load • Simple programming model • Deterministic performance

  26. Memory Controller – Service Level Queueing • High BW: 16 DRAM chips, independent memory banks, BW distributed across banks • 36K queues • Memory management: write-once multicast, preserve ordering
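
The slide says bandwidth is distributed across independent banks; a common way to do that (the policy below is an assumption, not Procket's documented scheme) is to stride each queue's buffer chunks across the banks with a per-queue rotation, so concurrent queues don't all hammer the same bank.

```python
N_BANKS = 16  # bank count per the slide; the mapping below is assumed

def bank_for(queue_id, chunk_index):
    """Stride chunks across banks, rotating the start bank per queue."""
    return (queue_id + chunk_index) % N_BANKS

# A packet split into fixed-size chunks touches each bank once per 16 chunks:
print([bank_for(queue_id=5, chunk_index=i) for i in range(16)])
```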

  27. Basic Router Architecture Elements [Diagram: linecard → switch fabric → linecard.] Three classes of switch fabric architecture: Input Queued (IQ), Output Queued (OQ), Combined Input/Output Queued (CIOQ).

  28. Input Queued (IQ) Fabrics [Diagram: input linecard → switch fabric → output linecard.] Input queued switch fabrics: inefficient use of memory; require complex scheduling.

  29. Combined Input/Output Queued (CIOQ) Fabrics [Diagram: input linecard → switch fabric → output linecard.] CIOQ switch fabrics: generally built with a point-to-point fabric in the middle (crossbar, multi-stage (Clos), torus); require complex scheduling; queues are shared to reduce complexity.

  30. Output Queued Fabrics [Diagram: input linecard → switch fabric → output linecard.] OQ switch fabrics: require extremely high-speed memory access; do not share queues; efficient multicast replication; protected bandwidth per queue.
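
Why the "extremely high speed memory access": in the worst case all N inputs write into the shared memory while all N outputs read from it, so the memory must sustain twice the aggregate line rate. Illustrative numbers only (the linecard count and rate below are assumptions):

```python
n_linecards = 12
line_rate_gbps = 40

# Every input can write and every output can read simultaneously.
mem_bw_gbps = 2 * n_linecards * line_rate_gbps
print(f"Shared memory must sustain {mem_bw_gbps} Gbps "
      f"(~{mem_bw_gbps / 1000:.2f} Tbps)")  # 960 Gbps
```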

  31. Terabit Centralized Shared Memory Routers. April 20, 2004. Bill Lynch, CTO.

  32. Whither QoS? April 20, 2004. Bill Lynch, CTO.

  33. Concurrent Services [Network diagram: VPN A and VPN B CE sites, headends, and a Research, Education, Grid, and Supercomputing site connect through Edge PEs, an IP/MPLS/λ core, and Distribution PEs to broadband homes and a centralized headend offering VOD, CONF, and data services; interface content mirroring for security requirements.] Edge: • High-speed Ethernet edge • Assured QoS • DoS prevention. Centralized headend: • Video, voice, data over Ethernet • QoS across thousands of subscribers • SLAs and differential pricing

  34. (More Bill's slides here) • (As much detail on the switch fabric and chips as you are comfortable sharing in a multi-vendor environment!) • No scheduling • 36K service-level queues • NPU for fast lookup, policing, shaping • SW abstraction based on the service performed, not on provided knobs • Many, many, many DRAM banks; however, half as many as CIOQ architectures • 40G NPU for line rate: policing, remarking, DA, AS, and other lookups • SW interface focused on service, not knobs.

  35. (Insert Bill's slides here) • Self-introduction • Problem statement (Bill): "Layer 3 QoS at the right scale and price is elusive." Throwing more bandwidth at lower layers only makes networking researchers commodity bandwidth brokers. That is fine for R&E, but commercially it is too expensive, so there appears to be a growing disconnect between R&E and commercial. It will be important not to slam the current L2/L1 vogue lest we upset the locals :) • Numerous commercial implementations starting now: single-network countries, high BW to the home, triple play • Assertion (Bill): "System architecture greatly contributes to the proper operation of network-wide QoS." Current system architectures are completely unfocused on network-wide QoS and focused instead on per-hop behaviors. This forces networkers to tweak 100 knobs to get the desired behavior. Why not architect the system to protect a flow through the router, so that behaviors are predictable in every circumstance? • End to end: any problem is exacerbated by TCP.

  36. Abilene Network Map Source: http://abilene.internet2.edu/new/upgrade.html

  37. Internet Growth Predictions “117% YEARLY GROWTH THROUGH 2006” “VIDEO WILL DRIVE TRAFFIC GROWTH OVER THE NEXT 10 YEARS” Source: Yankee Group April 2004

  38. Network Reference Design [Diagram: a concurrent-services edge with intradomain QoS, a single-element core (cluster), and interdomain QoS toward peers.]

  39. PRO/8000™ Concurrent Services Routers • Highest performance and density: 960 Gbps, 2 per rack • Ultra-compact: 80 Gbps, 8 per rack

  40. PRO/8000 Series Logical Architecture [Diagram: line cards with media adapters around the Procket VLSI forwarding plane, route processors (1+1), switch cards (2+1), and the control plane (CP).] • Fully redundant switch cards and route processors • All components hot-swappable in-service • No single point of failure • Strictly non-blocking

  41. Basic Router Architecture Elements [Diagram: linecard → switch fabric → linecard.] Three classes of switch fabric architecture: Input Queued (IQ), Output Queued (OQ), Combined Input/Output Queued (CIOQ).

  42. Input Queued (IQ) Fabrics [Diagram: input linecard → switch fabric → output linecard.] Input queued switch fabrics: inefficient use of memory; require complex scheduling.

  43. Combined Input/Output Queued (CIOQ) Fabrics [Diagram: input linecard → switch fabric → output linecard.] CIOQ switch fabrics: generally built with a point-to-point fabric in the middle (crossbar, multi-stage (Clos), torus); require complex scheduling; queues are shared to reduce complexity.

  44. Output Queued Fabrics [Diagram: input linecard → switch fabric → output linecard.] OQ switch fabrics: require extremely high-speed memory access; do not share queues; efficient multicast replication; protected bandwidth per queue.

  45. What is Queue Sharing? [Diagram: multiple physical ports feeding shared HI and LO queues.] Queue sharing is when multiple physical or switch-fabric connections must share queues. Example: each input linecard has two queues for each output linecard. All packets in a shared queue are treated equally.

  46. What is Head of Line Blocking? [Diagram: physical ports feeding shared HI and LO queues.] When an output linecard becomes congested, traffic backs up on the input linecard. Traffic control (W/RED) must be performed at the input VOQ.

  47. What is Head of Line Blocking? [Diagram: physical ports feeding shared HI and LO queues.] The output linecard cannot process all of the offered traffic. Because all traffic in a shared queue (VOQ) is treated equally, traffic on the uncongested port is affected as well.

  48. Queue Sharing Test Results. The congested port (Flows C, D, E) remained at 100% throughput; the uncongested port (Flows A, B) was penalized because of queue sharing. Traffic on adjacent ports was dropped!
