
The Crosspoint Queued Switch

Yossi Kanizo (Technion, Israel). Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Politecnico di Torino, Italy).

Presentation Transcript


  1. The Crosspoint Queued Switch Yossi Kanizo (Technion, Israel) Joint work with Isaac Keslassy (Technion, Israel) and David Hay (Politecnico di Torino, Italy)

  2. Typical Switch Architectures • IQ – Input Queued • CICQ – Combined Input and Crosspoint Queued • Assumes instantaneous closed loop [Figure labels: Linecards, Switch Fabric]

  3. Single-Rack Router • Instantaneous closed loop → works in a single rack • Problem: multi-rack routers [Figure labels: Linecards, Switch Fabric]

  4. Current Router Architectures • Optical links, 10s of meters • Is the closed loop still instantaneous? [Source: N. McKeown]

  5. Time Trends [Figure; axis unit: ns]

  6. Hiding Propagation Delays • Traditional solutions: • Increase time-slot → poor switch performance • Hide propagation delays using buffers → impractical amount of buffering • Proposed solution: closed loop → open loop • Performance degradation vs. instantaneous closed loop

  7. Outline • CQ: Open-loop switch architecture • Performance Evaluation • Analytical results • Simulations → CQ performance degradation is not significant

  8. Proposed Architecture: The Crosspoint-Queued (CQ) Switch • No queues in the linecards • Buffering only inside the fabric • Independent output schedulers • Drops with full buffers [Figure labels: Linecards, Switch Core, 10s of meters]
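
The four properties on this slide can be sketched as a small data structure. This is a toy illustration, not the authors' implementation: the class name `CQSwitch`, its methods, and the choice of LQF (longest queue first, named on a later slide) as the per-output scheduler are assumptions made for the example.

```python
from collections import deque


class CQSwitch:
    """Toy N x N crosspoint-queued switch (hypothetical sketch).

    One FIFO of capacity B sits at every crosspoint; linecards hold no queues.
    """

    def __init__(self, n, b):
        self.n, self.b = n, b
        # xp[inp][out] is the buffer at the crosspoint of input inp and output out.
        self.xp = [[deque() for _ in range(n)] for _ in range(n)]
        self.dropped = 0

    def arrive(self, inp, out, cell):
        """Open loop: the linecard sends without asking; a full crosspoint drops."""
        q = self.xp[inp][out]
        if len(q) >= self.b:
            self.dropped += 1
            return False
        q.append(cell)
        return True

    def depart(self, out):
        """Scheduler for one output, independent of all other outputs.

        Assumed policy: LQF, ties broken by lowest input index.
        Returns the served cell, or None if every crosspoint is empty.
        """
        q = max((self.xp[i][out] for i in range(self.n)), key=len)
        return q.popleft() if q else None


sw = CQSwitch(n=2, b=1)
sw.arrive(0, 1, "a")   # accepted
sw.arrive(0, 1, "b")   # crosspoint (0, 1) is full, so this cell is dropped
sw.arrive(1, 1, "c")   # accepted at a different crosspoint
```

Note that `arrive` never consults the scheduler and `depart` never signals the inputs, which is the sense in which the sketch is open loop.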

  9. CQ Properties • Open loop • No communication overhead • No linecard queues • No linecard queue management • “Router on a chip” • Buffering and switch fabric on same chip

  10. Why not 10 years ago? • No need: single rack • No technology: SRAM density • Moore’s law: density doubling every 2.5 years • Aggressive 128x128 CQ switch: 4 cells of 64 bytes per crosspoint → 64 cells today • Conservative buffer requirements • TCP Stanford model with smaller buffer needs [Appenzeller, Keslassy and McKeown ’04]
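
The buffer arithmetic behind this slide can be checked with a short calculation. It is a sketch under stated assumptions: the function names are hypothetical, the total counts only payload bytes (no pointers or control state), and reading "4 cells → 64 cells" as "ten years ago versus today" is an interpretation of the slide title rather than something the slide states outright.

```python
import math


def cq_buffer_bytes(n_ports, cells_per_crosspoint, cell_bytes=64):
    """Total crosspoint memory: N*N crosspoints, each holding B cells."""
    return n_ports * n_ports * cells_per_crosspoint * cell_bytes


def years_for_growth(factor, doubling_years=2.5):
    """Years needed for density to grow by `factor` at one doubling per period."""
    return math.log2(factor) * doubling_years


then = cq_buffer_bytes(128, 4)    # 128x128 switch, 4 cells per crosspoint
now = cq_buffer_bytes(128, 64)    # same switch, 64 cells per crosspoint
print(then // 2**20, now // 2**20, years_for_growth(64 / 4))
# prints: 4 64 10.0
```

Under these assumptions the same silicon budget that held 4 MiB of crosspoint buffers grows to 64 MiB after four density doublings, i.e. ten years at 2.5 years per doubling.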

  11. Outline • CQ: Our open-loop switch architecture • Performance Evaluation • Analytical results • Simulations

  12. 100% Throughput as B → ∞ • Throughput bounds: OQ(2B-1) ≤ CQ(B) ≤ OQ(NB) • The lower and upper OQ bounds both reach 100% throughput as B → ∞, so CQ(B) does as well • Buffer size B, LQF scheduling algorithm

  13. Uniform Traffic, B=1 • Uniform traffic model: • At each time-slot, at each of the N inputs: Bernoulli IID packet arrivals with probability r • Each packet is destined for one of the N outputs uniformly at random • Theorem: Under uniform traffic and B=1, the performance of the switch is independent of the specific work-conserving scheduling algorithm • Intuition: Symmetry

  14. Uniform Traffic, B=1 • Theorem: The throughput and waiting time of a CQ switch with B=1 are given in closed form in terms of q=1-r/N [formulas not preserved in this transcript] • Throughput goes to 100% as N goes to infinity • Proof: Based on Z-transform
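
The uniform-traffic model of the last two slides is simple enough to simulate directly. The sketch below is not the authors' simulator: the function `simulate_cq`, its parameters, the arrivals-before-departures ordering within a slot, and the definition of throughput as departed cells over offered cells are all assumptions. The `"rr"` option is a plain round-robin, not the exhaustive round-robin modelled later in the talk.

```python
import random


def simulate_cq(n, b, load, slots, scheduler="random", seed=0):
    """Simulate an n x n CQ switch with b cells per crosspoint.

    Returns departed / offered cells under Bernoulli IID uniform traffic.
    """
    arr_rng = random.Random(seed)        # drives arrivals only
    sch_rng = random.Random(seed + 1)    # drives the random scheduler only
    q = [[0] * n for _ in range(n)]      # q[i][j]: cells at crosspoint (i, j)
    rr = [0] * n                         # round-robin pointer per output
    offered = departed = 0
    for _ in range(slots):
        # Arrival phase: each input gets a cell w.p. `load`, uniform destination.
        for i in range(n):
            if arr_rng.random() < load:
                j = arr_rng.randrange(n)
                offered += 1
                if q[i][j] < b:
                    q[i][j] += 1         # otherwise the cell is dropped
        # Departure phase: each output serves one non-empty crosspoint.
        for j in range(n):
            busy = [i for i in range(n) if q[i][j] > 0]
            if not busy:
                continue
            if scheduler == "lqf":
                i = max(busy, key=lambda k: q[k][j])
            elif scheduler == "rr":
                i = min(busy, key=lambda k: (k - rr[j]) % n)
                rr[j] = (i + 1) % n
            else:
                i = sch_rng.choice(busy)
            q[i][j] -= 1
            departed += 1
    return departed / offered if offered else 0.0


for n in (4, 16, 32):
    row = [round(simulate_cq(n, 1, 0.9, 2000, s), 3) for s in ("random", "rr", "lqf")]
    print(n, row)
```

If the sketch matches the model, the three schedulers should give nearly equal throughput for B=1 at each N, differing only by sampling noise, and the figure should rise with N, in line with the theorem and the limit stated on the slides.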

  15. Models for larger buffers • Approximate Performance Analysis • Model for exhaustive round-robin scheduling • Based on modifications to polling system with zero switch-over times • Model for random scheduling algorithm • Show 100% throughput as N→∞

  16. Trace-Driven Simulation • 32x32 CQ switch with different buffer sizes (in units of 64-byte packets) • Buffers of size 64 suffice to ensure 99% throughput for N=32

  17. Conclusions • CQ is open loop → allows multi-rack configuration • CQ provides easy scheduling • CQ is feasible to implement in a single chip • CQ shows good performance in simulations

  18. Thank You
