
OR Project Group II: Packet Buffer Proposal


  1. OR Project Group II: Packet Buffer Proposal Da Chuang, Isaac Keslassy, Sundar Iyer, Greg Watson, Nick McKeown, Mark Horowitz E-mail: stchuang@stanford.edu Optical Router Project: http://klamath.stanford.edu/or/

  2. Outline • Load-Balancing Background • Mis-sequencing Problem • Datapath Architecture • First stage – Segmentation • Second stage – Main Buffering • Third stage – Reassembly

  3. 100Tb/s router [Figure: 625 electronic linecards, each at 160 Gb/s, attached to a central switch fabric with request/grant arbitration; every linecard performs line termination, IP packet processing, and packet buffering] (100Tb/s = 625 * 160Gb/s)

  4. Load-Balanced Switch [Figure: N external inputs connect through a load-balancing cyclic shift to N internal inputs, which connect through a switching cyclic shift to N external outputs]
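
  The two cyclic shifts on this slide are deterministic permutations that change every time slot. A minimal sketch follows, assuming the standard load-balanced switch schedule (external input i connects to middle linecard (i + t) mod N at slot t, and middle linecard j connects to output (j + t) mod N); the small N and the function names are illustrative only, not from the proposal.

    N = 4  # tiny example; the proposal uses N = 625 linecards

    def load_balancing_shift(ext_input, t, n=N):
        """First stage: external input -> internal (middle) input at time slot t."""
        return (ext_input + t) % n

    def switching_shift(mid_input, t, n=N):
        """Second stage: internal input -> external output at time slot t."""
        return (mid_input + t) % n

    # Over N consecutive slots every external input visits every middle linecard
    # exactly once, which spreads traffic uniformly regardless of the input pattern.
    for t in range(N):
        print(t, [load_balancing_shift(i, t) for i in range(N)])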

  5. 160 Gb/s Linecard [Figure: linecard datapath at rate R: lookup/processing and segmentation into fixed-size packets in the input block, load-balancing to the intermediate input block holding VOQs 1..N, switching, and reassembly in the output block]

  6. Outline • Load-Balancing Background • Mis-sequencing Problem • Datapath Architecture • First stage – Segmentation • Second stage – Main Buffering • Third stage – Reassembly

  7. Problem: Unbounded Mis-sequencing [Figure: external inputs, internal inputs, and external outputs, with a spanning set of permutations at each stage; cells of the same flow can traverse different middle linecards and so can leave out of order]

  8. Preventing Mis-sequencing [Figure: N inputs, N middle-stage linecards, N outputs] • Uniform Frame Spreading: • Group cells into frames of N cells each (frame building) • Spread each frame across all middle linecards • Each middle stage receives the same type of packets => has the same queue occupancy state (a toy sketch of this rule follows below)
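
  A toy model of the Uniform Frame Spreading rule described above, assuming cells are grouped per external output; N, the class name, and the queue layout are illustrative, not from the slides.

    from collections import defaultdict

    N = 3  # middle-stage linecards (625 in the proposal); tiny for illustration

    class FrameBuilder:
        """Hold cells per output until a full frame of N cells exists, then
        spread the frame across all N middle-stage linecards (one cell each)."""
        def __init__(self, n=N):
            self.n = n
            self.pending = defaultdict(list)        # cells waiting, keyed by output
            self.middle = [[] for _ in range(n)]    # queues at the middle stages

        def arrive(self, cell, output):
            self.pending[output].append(cell)
            if len(self.pending[output]) == self.n:              # frame complete
                frame = [self.pending[output].pop(0) for _ in range(self.n)]
                for k, c in enumerate(frame):                    # one cell per middle stage
                    self.middle[k].append((output, c))

    fb = FrameBuilder()
    for seq in range(2 * N):
        fb.arrive(f"cell{seq}", output=0)
    print(fb.middle)  # every middle stage now holds the same occupancy for output 0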

  9. Outline • Load-Balancing Background • Mis-sequencing Problem • Datapath Architecture • First stage – Segmentation • Second stage – Main Buffering • Third stage – Reassembly

  10. Three stages on a linecard [Figure: 1st stage Segmentation/Frame Building, 2nd stage Main Buffering, 3rd stage Reassembly, replicated on linecards 1..N; external links at rate R, internal links at R/N]

  11. Technology Assumptions in 2005 • DRAM Technology • Access Time ~ 40 ns • Size ~ 1 Gbit • Memory Bandwidth ~ 16 Gb/s (16 data pins) • On-chip SRAM Technology • Access Time ~ 2.5 ns • Size ~ 64 Mbit • Serial Link Technology • Bandwidth ~ 10 Gb/s • >100 serial links per chip

  12. First Stage [Figure: variable-size packets arriving at rate R are segmented into 128-byte cells; each cell is striped into eight 16-byte slices (bytes 0-15, 16-31, and so on), one slice per frame-building lane, over links at R/8]

  13. Segmentation Chip (1st stage) [Figure: variable-size packets at rate R are segmented into 128-byte cells and striped into eight 16-byte slices over output links at R/8] • Incoming: 16x10 Gb/s • Outgoing: 8x2x10 Gb/s • On-chip Memory: N x 1500 bytes = 7.2 Mbits 3.2ns SRAM

  14. Frame Building Chip (1st stage) [Figure: each frame-building chip handles one 16-byte slice (bytes 0-15 shown) at R/8 in and R/8 out, with queues 1..N] • Incoming: 2x10 Gb/s • Outgoing: 2x10 Gb/s • On-chip Memory: N^2 x 16 bytes = 48 Mbits 3.2ns SRAM
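
  A quick numeric check, my own rather than from the slides, of the two on-chip memory figures above, assuming N = 625 linecards, one maximum-size 1500-byte packet buffered per destination in the segmentation chip, and N^2 entries of 16 bytes in the frame-building chip (binary megabits).

    N = 625  # linecards in the 100 Tb/s configuration

    seg_bits = N * 1500 * 8      # segmentation chip: N x 1500 bytes
    frame_bits = N * N * 16 * 8  # frame-building chip: N^2 x 16 bytes

    print(f"segmentation buffer   ~ {seg_bits / 2**20:.1f} Mbit")    # ~7.2 Mbit
    print(f"frame-building buffer ~ {frame_bits / 2**20:.1f} Mbit")  # ~47.7 Mbit, quoted as 48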

  15. Three stages on a linecard [Figure: repeats slide 10: 1st stage Segmentation/Frame Building, 2nd stage Main Buffering, 3rd stage Reassembly]

  16. Packet Buffering Problem Packet buffers for a 160Gb/s router linecard [Figure: a buffer manager in front of 40 Gbits of buffer memory; write rate R and read rate R, one 128-byte cell every 6.4 ns in each direction]
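
  A short worked check of the numbers on this slide: the 6.4 ns figure follows directly from the cell size and line rate, and the 40 Gbit buffer is consistent with the common RTT x R sizing rule if one assumes a 250 ms round-trip time (the RTT value is my assumption, not stated on the slide).

    R = 160e9            # line rate in bit/s
    cell_bits = 128 * 8  # one 128-byte cell

    print(f"one cell every {cell_bits / R * 1e9:.1f} ns")  # 6.4 ns

    rtt = 0.25  # assumed round-trip time for the buffer rule of thumb
    print(f"RTT * R = {R * rtt / 1e9:.0f} Gbit")           # 40 Gbit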

  17. Memory Technology • Use SRAM? + Fast enough random access time, but • too low density to store 40 Gbits of data • Use DRAM? + High density means we can store the data, but • can't meet the random access time
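
  The trade-off on this slide can be checked against the 2005 technology assumptions from slide 11; the arithmetic below is mine.

    cell_time_ns = 6.4
    access_budget_ns = cell_time_ns / 2   # one write and one read must fit in each cell time
    print(access_budget_ns)               # 3.2 ns: SRAM (~2.5 ns) meets it, DRAM (~40 ns) does not

    buffer_gbit = 40
    sram_chip_mbit = 64
    print(buffer_gbit * 1000 / sram_chip_mbit)  # ~625 SRAM chips just to hold 40 Gbits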

  18. Hybrid Memory Hierarchy [Figure: a large DRAM holds the body of each FIFO; a small head SRAM caches the FIFO heads and a small tail SRAM caches the FIFO tails; arriving packets enter the tail cache and departing packets leave the head cache at rate R, while an arbiter/scheduler, driven by requests, moves data between SRAM and DRAM in blocks of b bytes per queue]
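
  A highly simplified sketch of this hierarchy, written by me for illustration: each FIFO keeps its newest cells in a tail cache and its oldest cells in a head cache (both SRAM), while the bulk of the queue sits in DRAM and is only touched in blocks of b cells, so the DRAM is never asked for single-cell random accesses. The class and method names are invented.

    from collections import deque

    class HybridFIFO:
        """One queue in the SRAM/DRAM hierarchy: cells enter the tail cache,
        spill to DRAM in blocks of b cells, and are prefetched back into the
        head cache in blocks of b cells for line-rate reads."""
        def __init__(self, b):
            self.b = b
            self.tail_sram = deque()   # newest cells (SRAM)
            self.dram = deque()        # body of the FIFO, block transfers only
            self.head_sram = deque()   # oldest cells (SRAM), ready to depart

        def write(self, cell):
            self.tail_sram.append(cell)
            if len(self.tail_sram) >= self.b:        # a full block: spill it to DRAM
                for _ in range(self.b):
                    self.dram.append(self.tail_sram.popleft())

        def read(self):
            if not self.head_sram and self.dram:     # refill the head cache from DRAM
                for _ in range(min(self.b, len(self.dram))):
                    self.head_sram.append(self.dram.popleft())
            if not self.head_sram and self.tail_sram:  # short queue: bypass DRAM
                self.head_sram.append(self.tail_sram.popleft())
            return self.head_sram.popleft() if self.head_sram else None

    q = HybridFIFO(b=4)
    for i in range(10):
        q.write(i)
    print([q.read() for _ in range(10)])  # cells depart in FIFO order: 0..9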

  19. SRAM/DRAM results • How much SRAM buffering, given: • DRAM Trc = 40 ns • Write and read a 128-byte cell every 6.4 ns • Let Q = 625, b = 2*40ns/6.4ns = 12.5 • Two options [Iyer] • Zero latency: Qb(2 + lnQ) = 61k cells = 66 Mbits • Some latency: Q(b-1) = 7.5k cells = 7.5 Mbits
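
  The two bounds can be evaluated directly; the script below is my own re-derivation with Q = 625 and b = 2 x 40 ns / 6.4 ns = 12.5, and it lands in the same ballpark as the rounded figures on the slide (the exact constants depend on which form of the bound from [Iyer] is used).

    import math

    Q = 625             # FIFOs, one per output linecard
    b = 2 * 40 / 6.4    # DRAM block size in cells = 12.5
    cell_bits = 128 * 8

    zero_latency = Q * b * (2 + math.log(Q))   # zero-latency bound, in cells
    some_latency = Q * (b - 1)                 # bound when some read latency is tolerated

    print(f"zero latency: {zero_latency / 1e3:.0f}k cells, "
          f"{zero_latency * cell_bits / 2**20:.0f} Mbit of SRAM")
    print(f"some latency: {some_latency / 1e3:.1f}k cells, "
          f"{some_latency * cell_bits / 2**20:.1f} Mbit of SRAM")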

  20. Outline • Load-Balancing Background • Mis-sequencing Problem • Datapath Architecture • First stage – Segmentation • Second stage – Main Buffering • Third stage – Reassembly

  21. Problem Statement [Figure: a queue manager between a 160 Gb/s write stream and a 160 Gb/s read stream, buffering into 40 Gb of DRAM; one 128-byte cell every 6.4 ns in each direction]

  22. Second Stage [Figure: the main buffering stage on each linecard; traffic arrives from the N planes at R/N and is buffered per 16-byte slice (bytes 0-15, 16-31, and so on) over links at R/8]

  23. Queue Manager Chip (2nd stage) [Figure: one queue manager chip handling a single 16-byte slice at R/8 in and R/8 out, with 5 x 1Gb DRAMs attached over interfaces at R/4] • Incoming: 2x10 Gb/s • Outgoing: 2x10 Gb/s • 35 pins/DRAM x 5 DRAMs = 175 pins • SRAM/DRAM Memory: Q(b-1) = 2.8 Mbits 3.2ns SRAM • SRAM linked list = 1 Mbit 3.2ns SRAM
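
  A rough arithmetic check of the DRAM configuration, mine rather than the slides', assuming the 8 queue-manager chips per linecard listed on slide 27 and the 5 x 1 Gbit DRAMs with 35-pin interfaces per chip shown above.

    qm_chips = 8         # queue-manager chips per linecard (slide 27)
    drams_per_chip = 5   # 1 Gbit devices per chip
    pins_per_dram = 35

    print(qm_chips * drams_per_chip, "Gbit of DRAM per linecard")  # 40, matches slide 16
    print(drams_per_chip * pins_per_dram, "DRAM pins per chip")    # 175, matches this slide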

  24. Outline • Load-Balancing Background • Mis-sequencing Problem • Datapath Architecture • First stage – Segmentation • Second stage – Main Buffering • Third stage – Reassembly

  25. Three stages on a linecard [Figure: repeats slide 10: 1st stage Segmentation/Frame Building, 2nd stage Main Buffering, 3rd stage Reassembly]

  26. Third stage [Figure: the reassembly chip collects the eight 16-byte slices from links at R/8 and reassembles variable-size packets at rate R] • Incoming: 8x2x10 Gb/s • Outgoing: 16x10 Gb/s • On-chip Memory: N x 1500 bytes = 7.2 Mbits 3.2ns SRAM

  27. Linecard Datapath Requirements • 1st stage • 1 segmentation chip • 8 frame building chips • 2nd stage • 8 queue manager chips • 40 1 Gb DRAMs • 3rd stage • 1 reassembly chip • Total chip count • 18 ASIC chips • 40 1 Gb DRAMs
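
  A trivial tally of the parts list above, included only to make the totals explicit; the figures are copied from this slide.

    asics = {"segmentation": 1, "frame building": 8, "queue manager": 8, "reassembly": 1}
    print(sum(asics.values()), "ASIC chips")            # 18
    print(asics["queue manager"] * 5, "x 1 Gbit DRAM")  # 40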
