
Achieving 100% throughput Where we are in the course…




Presentation Transcript


  1. Achieving 100% throughput: Where we are in the course… • Switch model • Uniform traffic • Technique: Uniform schedule (easy) • Non-uniform traffic, but known traffic matrix • Technique: Non-uniform schedule (Birkhoff-von Neumann) • Unknown traffic matrix • Technique: Lyapunov functions (MWM) • Faster scheduling algorithms • Technique: Speedup (maximal matchings) • Technique: Memory and randomization (Tassiulas) • Technique: Twist architecture (buffered crossbar) • Accelerate scheduling algorithm • Technique: Pipelining • Technique: Envelopes • Technique: Slicing • No scheduling algorithm • Technique: Load-balanced router

  2. Buffered Crossbars With Performance Guarantees Taken from the 2004 Ph.D. defense of: Shang-Tse (Da) Chuang Department of Electrical Engineering, Stanford University, http://yuba.stanford.edu/~stchuang

  3. Motivation • Network operators want performance guarantees • Throughput guarantee • Delay guarantee • High performance routers use crossbars • Hard to build crossbar-based routers with guarantees • My talk: • How a crossbar with a small amount of internal buffering can give guarantees

  4. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation

  5. Generic Crossbar-Based Architecture [Figure labels: VOQs, scheduler, speedup of S]

  6. Admissible Traffic • Traffic Matrix • Traffic is admissible if
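
The formulas for this slide are missing from the transcript. A minimal sketch, assuming the standard definition used for crossbar switches: with traffic matrix entries λ_ij (the arrival rate from input i to output j), traffic is admissible when no input and no output is oversubscribed, that is, every row sum and every column sum is below 1. The function name `is_admissible` and the example matrices are illustrative, not taken from the slides.

```python
from typing import Sequence


def is_admissible(rates: Sequence[Sequence[float]]) -> bool:
    """Return True if no input (row) and no output (column) is oversubscribed.

    rates[i][j] is the assumed arrival rate, in cells per time slot,
    from input i to output j of an N x N switch.
    """
    n = len(rates)
    rows_ok = all(sum(row) < 1.0 for row in rates)
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1.0 for j in range(n))
    return rows_ok and cols_ok


# Every row and column sums to 0.9, so this matrix is admissible.
uniform = [[0.3, 0.3, 0.3]] * 3
# Output 0 is offered 1.2 cells per slot, so this matrix is not admissible.
hotspot = [[0.6, 0.1], [0.6, 0.1]]
```

Here `is_admissible(uniform)` holds and `is_admissible(hotspot)` does not, even though each input in `hotspot` is individually below full load.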

  7. Throughput Guarantee • 100% Throughput • An algorithm delivers 100% throughput if, for any admissible traffic, the average backlog is finite [Figure labels: scheduler, speedup of S]

  8. Previous Work [Timeline figure, 1985 to 2005] • Heuristics: Wave Front Arbiter [Tamir], Parallel Iterative Matching [Anderson et al.], iSLIP [McKeown] • Theoretically Proven: Maximal Matching S=2 [Dai, Prabhakar], Longest Port First [Mekkittikul et al.], Maximum Weight Matching [McKeown et al.]

  9. Maximal Matching Has Become Hard • TTX Switch Fabric • Uses maximal matching • Speedup less than 2 • Consumes up to 8kW • Limited to ~2.5Tb/s • No 100% throughput guarantee

  10. Traditional Crossbar • Crossbar Requirements • An input can send at most one cell • An output can receive at most one cell • Scheduling Problem • Must overcome two constraints simultaneously • New Crossbar • Relieve contention • Remove dependency between inputs and outputs

  11. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  12. Buffered Crossbar • Arrival Phase • Scheduling Phases – Speedup of 2 • Departure Phase

  13. Scheduling Phase • Input Schedule • Each input selects in parallel a cell for an empty crosspoint • Output Schedule • Each output selects in parallel a cell from a full crosspoint

  14. Example of Input/Output Scheduling • Round-robin Policy • Each input schedules in a round-robin order • Each output schedules in a round-robin order
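
The input and output schedules described on the two preceding slides can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the talk: each crosspoint is assumed to hold at most one cell, cells are tracked only as counts, and the class name `BufferedCrossbar` and its fields are hypothetical.

```python
from typing import List


class BufferedCrossbar:
    """N x N crossbar with an assumed one-cell buffer at every crosspoint."""

    def __init__(self, n: int) -> None:
        self.n = n
        # voq[i][j]: number of cells queued at input i for output j
        self.voq = [[0] * n for _ in range(n)]
        # xpoint[i][j]: True when the crosspoint buffer (i, j) holds a cell
        self.xpoint = [[False] * n for _ in range(n)]
        self.in_ptr = [0] * n   # round-robin pointer of each input
        self.out_ptr = [0] * n  # round-robin pointer of each output

    def input_schedule(self) -> None:
        """Each input moves one cell into an empty crospoint buffer, round-robin."""
        for i in range(self.n):
            for k in range(self.n):
                j = (self.in_ptr[i] + k) % self.n
                if self.voq[i][j] > 0 and not self.xpoint[i][j]:
                    self.voq[i][j] -= 1
                    self.xpoint[i][j] = True
                    self.in_ptr[i] = (j + 1) % self.n
                    break

    def output_schedule(self) -> List[int]:
        """Each output drains one full crosspoint, round-robin.

        Returns the number of cells (0 or 1) delivered to each output.
        """
        delivered = [0] * self.n
        for j in range(self.n):
            for k in range(self.n):
                i = (self.out_ptr[j] + k) % self.n
                if self.xpoint[i][j]:
                    self.xpoint[i][j] = False
                    delivered[j] = 1
                    self.out_ptr[j] = (i + 1) % self.n
                    break
        return delivered
```

Note that the inputs never consult the outputs and vice versa: if two inputs both hold a cell for output 0, both can move it into their own crosspoint in the same input schedule, and output 0 then drains the two crosspoints over successive output schedules.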

  15. Previous Work • Buffered Crossbar Simulations [Rojas-Cessa et al. 2001] • 32x32 switch, Uniform Bernoulli Traffic, Round-Robin, S=1

  16. 100% Throughput • Theorem 1 • A buffered crossbar with speedup of 2 delivers 100% throughput for any admissible Bernoulli iid traffic using any work-conserving input/output schedules.

  17. Intuition of Proof • When a flow is backed up, the service for its backlog exceeds the arrivals [Figure: switch diagram annotated with rates below 1-ε, ε, and 2-ε]

  18. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  19. Work Conservation • Work-conserving Property • If there is a cell for a given output in the system, that output is busy. Output Queued (OQ) Switch

  20. Emulating an OQ switch • Under identical inputs, the departure time of every cell from both switches is identical

  21. Input Priority List [Figure: cells labeled with departure times] • Label each cell with its departure time • Arrange input cells into an input priority list • Output selects crosspoint with earliest departure time

  22. Input Priority List [Figure labels: bad guys, good guy] • Label each cell with its departure time • Arrange input cells into an input priority list • Output selects crosspoint with earliest departure time

  23. Definitions [Figure labels: 2 bad guys, 2 good guys] • Output Margin – cells at its output with earlier departure time • Input Margin – cells ahead in input priority list destined to different outputs • Total Margin – Output Margin minus Input Margin
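
The three margin definitions translate directly into counting functions. The sketch below is a hedged illustration: the `Cell` type, the function names, and the way waiting cells are stored (`at_output` maps an output port to the departure times of the cells already at that output) are assumptions, and each cell is assumed to appear only once in the priority list.

```python
from typing import Dict, List, NamedTuple


class Cell(NamedTuple):
    output: int     # destination output port
    departure: int  # departure time in the reference OQ switch


def output_margin(cell: Cell, at_output: Dict[int, List[int]]) -> int:
    """Count cells at the cell's output that depart earlier than it does."""
    return sum(1 for d in at_output.get(cell.output, []) if d < cell.departure)


def input_margin(cell: Cell, priority_list: List[Cell]) -> int:
    """Count cells ahead of it in the input priority list bound for other outputs."""
    ahead = priority_list[: priority_list.index(cell)]
    return sum(1 for c in ahead if c.output != cell.output)


def total_margin(cell: Cell, priority_list: List[Cell],
                 at_output: Dict[int, List[int]]) -> int:
    """Total Margin = Output Margin minus Input Margin."""
    return output_margin(cell, at_output) - input_margin(cell, priority_list)


# Illustrative state: index 0 is the head of the input priority list.
priority_list = [Cell(1, 2), Cell(0, 5), Cell(2, 3), Cell(0, 7)]
at_output = {0: [1, 4, 6], 1: [1]}
```

For `Cell(0, 7)` the output margin is 3 (departure times 1, 4, 6), the input margin is 2 (the cells for outputs 1 and 2 ahead of it), and the total margin is 1. For `Cell(2, 3)` the total margin is -2, since nothing waits at output 2 and two cells for other outputs sit ahead of it.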

  24. Emulation of FIFO OQ Switch • Scheduling Phase • Crosspoint is full – Output Margin will increase by one • Crosspoint is empty – Input Margin will decrease by one • Total Margin increases by two

  25. Emulation of FIFO OQ Switch • Arrival Phase • Input Margin might increase by one • Departure Phase • Output Margin will decrease by one • Total Margin decreases by at most two

  26. Emulation of FIFO OQ Switch • Lemma 1 • For every time slot, total margin does not decrease
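
Lemma 1 appears to follow from adding up the per-phase changes stated on the two preceding slides. A sketch of that accounting for a single queued cell, where TM(t) is an assumed notation for the cell's total margin at the start of time slot t and a speedup of 2 gives two scheduling phases per slot:

```latex
\begin{aligned}
\Delta TM_{\text{scheduling}} &= +2
  && \text{(two scheduling phases, } +1 \text{ each)} \\
\Delta TM_{\text{arrival}} + \Delta TM_{\text{departure}} &\ge -1 - 1 = -2
  && \text{(input margin } +1 \text{ at most, output margin } -1\text{)} \\
TM(t+1) - TM(t) &\ge 2 - 2 = 0
\end{aligned}
```

The gain from scheduling therefore covers the worst-case loss from one arrival and one departure, which is why the argument relies on a speedup of 2.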

  27. FIFO Insertion Policy • Arrival Phase • Cell for non-empty VOQ, insert behind cells for same output • Cell for empty VOQ, insert at head of input priority list
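
A minimal sketch of the insertion rule, under two assumptions: the input priority list contains every cell queued at that input (so a VOQ is non-empty exactly when the list holds a cell for that output), and "behind cells for same output" means immediately after the last such cell. The `Cell` tuple layout and the name `fifo_insert` are illustrative.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (output port, departure time)


def fifo_insert(priority_list: List[Cell], new_cell: Cell) -> None:
    """Insert an arriving cell into the input priority list (index 0 is the head)."""
    out = new_cell[0]
    last_same = -1
    for idx, cell in enumerate(priority_list):
        if cell[0] == out:
            last_same = idx  # remember the last queued cell for the same output
    if last_same >= 0:
        # Non-empty VOQ: place the cell directly behind the cells for its output.
        priority_list.insert(last_same + 1, new_cell)
    else:
        # Empty VOQ: place the cell at the head of the priority list.
        priority_list.insert(0, new_cell)


plist: List[Cell] = [(1, 2), (0, 5), (2, 3)]
fifo_insert(plist, (0, 7))  # VOQ for output 0 is non-empty
fifo_insert(plist, (3, 4))  # VOQ for output 3 is empty
```

After the two calls, `plist` is `[(3, 4), (1, 2), (0, 5), (0, 7), (2, 3)]`: the cell for output 0 sits behind the earlier cell for output 0, and the cell for the previously empty VOQ is at the head.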

  28. FIFO Insertion Policy • Lemma 2 • An arriving cell will have a non-negative total margin

  29. Emulation of FIFO OQ Switch • Theorem 2 • A buffered crossbar with speedup of 2 can exactly emulate a FIFO OQ switch. • Result was shown independently • B. Magill, C. Rohrs, R. Stevenson, “Output-Queued Switch Emulation by Fabrics With Limited Memory,” IEEE Journal on Selected Areas in Communications, pp. 606-615, May 2003. • Theorem 3 • A buffered crossbar with speedup of 2 can be work-conserving with a distributed algorithm.

  30. Summary • Buffered crossbars • Use crosspoint buffers to relieve contention • Inputs and outputs schedule independently and in parallel • Performance guarantees • Throughput – any work-conserving input/output schedule • Work Conservation – simple insertion policy

  31. Relevant Papers • Crossbars • Shang-Tse Chuang, Ashish Goel, Nick McKeown, Balaji Prabhakar, “Matching Output Queuing with a Combined Input Output Queued Switch,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 6, pp. 1030-1039, Dec. 1999. • Buffered Crossbars • Shang-Tse Chuang, Sundar Iyer, Nick McKeown, “Practical Algorithms for Performance Guarantees in Buffered Crossbars,” in preparation for IEEE/ACM Transactions on Networking.

  32. Thank you!
