
Buffered Crossbars With Performance Guarantees

EE384Y, Thursday, April 29, 2004. Shang-Tse (Da) Chuang, Department of Electrical Engineering, Stanford University, http://yuba.stanford.edu/~stchuang.


Presentation Transcript


  1. Buffered Crossbars With Performance Guarantees • EE384Y • Thursday, April 29, 2004 • Shang-Tse (Da) Chuang • Department of Electrical Engineering, Stanford University • http://yuba.stanford.edu/~stchuang

  2. Motivation • Network operators want performance guarantees • Throughput guarantee • Delay guarantee • High performance routers use crossbars • Hard to build crossbar-based routers with guarantees • My talk: • How a crossbar with a small amount of internal buffering can give guarantees

  3. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  4. Generic Crossbar-Based Architecture • [Figure: VOQs at the inputs, a crossbar with speedup of S, and a scheduler]

  5. Admissible Traffic • Traffic Matrix • Traffic is admissible if
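The equations on this slide did not survive in the transcript. As a minimal sketch, assuming the usual definition (entry (i, j) of the traffic matrix is the arrival rate from input i to output j, and traffic is admissible when no input row and no output column sums to 1 or more), an admissibility check could look like this. The name `is_admissible` is hypothetical.

```python
from typing import Sequence


def is_admissible(rates: Sequence[Sequence[float]]) -> bool:
    """Return True if no input (row) or output (column) is oversubscribed.

    Assumption: rates[i][j] is the arrival rate from input i to output j,
    in cells per time slot, and admissibility means every row sum and
    every column sum is strictly below 1.
    """
    n = len(rates)
    if any(len(row) != n for row in rates):
        raise ValueError("traffic matrix must be square")
    rows_ok = all(sum(row) < 1.0 for row in rates)
    cols_ok = all(sum(rates[i][j] for i in range(n)) < 1.0 for j in range(n))
    return rows_ok and cols_ok
```

A matrix such as `[[0.4, 0.5], [0.5, 0.4]]` passes, while `[[0.6, 0.1], [0.5, 0.1]]` fails because output 0 is offered a total rate of 1.1.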

  6. Throughput Guarantee • 100% Throughput • An algorithm delivers 100% throughput if, for any admissible traffic, the average backlog is finite • [Figure: crossbar with speedup of S and a scheduler]

  7. Previous Work • [Figure: timeline, 1985 to 2005] • Heuristics: Wave Front Arbiter [Tamir]; Parallel Iterative Matching [Anderson et al.]; iSLIP [McKeown] • Theoretically Proven: Maximal Matching, S=2 [Dai, Prabhakar]; Longest Port First [Mekkittikul et al.]; Maximum Weight Matching [McKeown et al.]

  8. Maximal Matching Has Become Hard • TTX Switch Fabric • Uses maximal matching • Speedup less than 2 • Consumes up to 8kW • Limited to ~2.5Tb/s • No 100% throughput guarantee

  9. Traditional Crossbar • Crossbar Requirements • An input can send at most one cell • An output can receive at most one cell • Scheduling Problem • Must overcome two constraints simultaneously • New Crossbar • Relieve contention • Remove dependency between inputs and outputs

  10. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  11. Buffered Crossbar • Arrival Phase • Scheduling Phases – Speedup of 2 • Departure Phase

  12. Scheduling Phase • Input Schedule • Each input selects in parallel a cell for an empty crosspoint • Output Schedule • Each output selects in parallel a cell from a full crosspoint

  13. Example of Input/Output Scheduling • Round-robin Policy • Each input schedules in a round-robin order • Each output schedules in a round-robin order
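The input and output schedules described above can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: each crosspoint holds one cell, the input schedule runs before the output schedule within a phase, and a round-robin pointer advances to one past the port just served. All names (`scheduling_phase`, `voq`, `xpt`, `in_ptr`, `out_ptr`) are hypothetical.

```python
def scheduling_phase(voq, xpt, in_ptr, out_ptr):
    """Run one scheduling phase of an N x N buffered crossbar, in place.

    voq[i][j]  : number of cells queued at input i for output j
    xpt[i][j]  : True if crosspoint (i, j) currently holds a cell
    in_ptr[i]  : round-robin pointer of input i (an output index)
    out_ptr[j] : round-robin pointer of output j (an input index)
    Returns the (input, output) pairs delivered to outputs in this phase.
    """
    n = len(voq)

    # Input schedule: each input independently picks one cell whose
    # crosspoint is empty, scanning outputs in round-robin order.
    for i in range(n):
        for k in range(n):
            j = (in_ptr[i] + k) % n
            if voq[i][j] > 0 and not xpt[i][j]:
                voq[i][j] -= 1
                xpt[i][j] = True
                in_ptr[i] = (j + 1) % n
                break

    # Output schedule: each output independently picks one full
    # crosspoint in its column, scanning inputs in round-robin order.
    delivered = []
    for j in range(n):
        for k in range(n):
            i = (out_ptr[j] + k) % n
            if xpt[i][j]:
                xpt[i][j] = False
                out_ptr[j] = (i + 1) % n
                delivered.append((i, j))
                break
    return delivered
```

With two inputs that each hold one cell for output 0, both cells enter their crosspoints in the first phase, and output 0 drains them one per phase. No input has to wait for a matching decision involving the other input.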

  14. Previous Work • Buffered Crossbar Simulations [Rojas-Cessa et al. 2001] • 32x32 switch, Uniform Bernoulli Traffic, Round-Robin, S=1

  15. 100% Throughput • Theorem 1 • A buffered crossbar with speedup of 2 delivers 100% throughput for any admissible Bernoulli iid traffic using any work-conserving input/output schedules.

  16. Intuition of Proof • [Figure: rates labeled <1-ε, ε, and <1-ε; (1-ε) + ε + (1-ε) = 2-ε] • When a flow is backed up, the service for this backlog exceeds the arrivals

  17. Intuition of Proof • Qij = Queue Length • Bij = 0 if buffer empty, 1 if buffer full

  18. Intuition of Proof • Recall • If Qij > 0, then for Xij: • Expected increase is 2 • Expected decrease: if Bij = 1, then in the output schedule one B*j will decrease; if Bij = 0, then in the input schedule one Qi* will decrease • Thus expected decrease is 2

  19. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  20. Work Conservation • Work-conserving Property • If there is a cell for a given output in the system, that output is busy • [Figure: Output Queued (OQ) Switch]

  21. Emulating an OQ Switch • Under identical inputs, the departure time of every cell from both switches is identical • [Figure: the two switches compared, marked "?"]

  22. Input Priority List • [Figure: cells labeled with departure times] • Label each cell with its departure time • Arrange input cells into an input priority list • Output selects crosspoint with earliest departure time

  23. Input Priority List • [Figure: cells labeled with departure times; some cells marked "bad guys", one marked "good guy"] • Label each cell with its departure time • Arrange input cells into an input priority list • Output selects crosspoint with earliest departure time

  24. Definitions • [Figure: example cell with 2 bad guys and 2 good guys] • Output Margin – number of cells at its output with earlier departure time • Input Margin – number of cells ahead in input priority list destined to different outputs • Total Margin – Output Margin minus Input Margin
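The three margins can be computed directly from these definitions. This is a minimal sketch with hypothetical names. It assumes a cell is identified by its (output, departure time) pair and that index 0 is the head of the input priority list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    output: int      # destination output port
    departure: int   # departure time in the emulated OQ switch


def input_margin(cell, input_list):
    """Count cells ahead of `cell` in its input priority list
    that are destined to a different output."""
    position = input_list.index(cell)
    return sum(1 for c in input_list[:position] if c.output != cell.output)


def output_margin(cell, output_queue):
    """Count cells already at the destination output that have an
    earlier departure time than `cell`."""
    return sum(1 for c in output_queue if c.departure < cell.departure)


def total_margin(cell, input_list, output_queue):
    """Total margin = output margin minus input margin."""
    return output_margin(cell, output_queue) - input_margin(cell, input_list)
```

For the input list `[Cell(1, 2), Cell(0, 5), Cell(2, 3), Cell(0, 6)]` and an output-0 queue holding `Cell(0, 1)` and `Cell(0, 4)`, the cell `Cell(0, 6)` has two cells for other outputs ahead of it and two earlier cells at its output, so its total margin is 0.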

  25. Emulation of FIFO OQ Switch • [Figure: cells labeled with departure times] • Scheduling Phase • Crosspoint is full – Output Margin will increase by one • Crosspoint is empty – Input Margin will decrease by one • Total Margin increases by two

  26. Emulation of FIFO OQ Switch • [Figure: cells labeled with departure times] • Arrival Phase • Input Margin might increase by one • Departure Phase • Output Margin will decrease by one • Total Margin decreases by at most two

  27. Emulation of FIFO OQ Switch • [Figure: cells labeled with departure times] • Lemma 1 • For every time slot, total margin does not decrease

  28. FIFO Insertion Policy • [Figure: cells labeled with departure times] • Arrival Phase • Cell for non-empty VOQ, insert behind cells for same output • Cell for empty VOQ, insert at head of input priority list
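The insertion rule above can be sketched on a plain Python list. The assumptions here are illustrative: a cell is an (output, departure time) tuple, index 0 is the head of the input priority list, and "behind cells for same output" means directly behind the last such cell. The function name `fifo_insert` is hypothetical.

```python
def fifo_insert(priority_list, cell):
    """Insert an arriving cell into an input priority list, in place.

    priority_list : list of (output, departure_time) tuples, head at index 0
    cell          : the arriving (output, departure_time) tuple
    """
    output = cell[0]
    last_same_output = -1
    for index, queued in enumerate(priority_list):
        if queued[0] == output:
            last_same_output = index

    if last_same_output == -1:
        # The VOQ for this output is empty: the cell goes to the head.
        priority_list.insert(0, cell)
    else:
        # Otherwise it goes directly behind the last cell for that output.
        priority_list.insert(last_same_output + 1, cell)
```

A cell inserted at the head has no cells ahead of it, so its input margin is zero at arrival.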

  29. FIFO Insertion Policy • [Figure: cells labeled with departure times] • Lemma 2 • An arriving cell will have a non-negative total margin

  30. Emulation of FIFO OQ Switch • Theorem 2 • A buffered crossbar with speedup of 2 can exactly emulate a FIFO OQ switch. • Result was shown independently • B. Magill, C. Rohrs, R. Stevenson, “Output-Queued Switch Emulation by Fabrics With Limited Memory,” IEEE Journal on Selected Areas in Communications, pp. 606-615, May 2003. • Theorem 3 • A buffered crossbar with speedup of 2 can be work-conserving with a distributed algorithm.

  31. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  32. Delay Guarantees • [Figure: one output, many logical FIFO queues (1 to m); constrained traffic; weighted fair queueing sorts packets] • [Figure: one output, single Push In First Out (PIFO) queue; constrained traffic is pushed in] • PIFO models • Weighted Fair Queueing • Weighted Round Robin • Strict priority, etc.
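A PIFO queue lets a cell be pushed in at any position but only removes cells from the head. The sketch below is one way to model that, assuming each cell is pushed with a numeric rank, where a smaller rank departs first and ties depart in arrival order. The class name `PIFO` and the use of a binary heap are illustrative choices, not taken from the talk.

```python
import heapq
import itertools


class PIFO:
    """Push-In First-Out queue: insert anywhere by rank, remove from the head."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: earlier arrival first

    def push(self, rank, cell):
        """Insert `cell` at the position implied by `rank` (smaller = sooner)."""
        heapq.heappush(self._heap, (rank, next(self._arrival), cell))

    def pop(self):
        """Remove and return the cell at the head of the queue."""
        _rank, _arrival, cell = heapq.heappop(self._heap)
        return cell

    def __len__(self):
        return len(self._heap)
```

Strict priority falls out by using the priority class as the rank. A fair-queueing discipline could, for example, push each cell with a computed finish time as its rank.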

  33. Achieving Delay Guarantees in Crossbars • Theorem 4 • A crossbar switch with a speedup of 2 can exactly emulate an OQ switch which provides delay guarantees. • Theorem 5 • A crossbar switch with a speedup of 2-1/N is necessary and sufficient to exactly emulate an NxN FIFO OQ switch.

  34. Contents • Throughput Guarantees • Buffered Crossbar - 100% Throughput • Buffered Crossbar - Work Conservation • Delay Guarantees • Traditional Crossbar – Emulating an OQ Switch • Buffered Crossbar – Emulating an OQ Switch

  35. Emulation of PIFO OQ Switch • [Figure: cells labeled with departure times] • Crosspoint Blocking • A cell in the crosspoint has a larger departure time • Swap Phase • If an arriving cell has a smaller departure time than the cell in the crosspoint, swap the two cells

  36. PIFO Insertion Policy • [Figure: cells labeled with departure times] • Arrival Phase • Insert cell directly behind cell with departure time just earlier • If cell has earliest departure time, then insert at head of input priority list
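A minimal sketch of this insertion rule follows. It rests on an assumption the slide does not state: "departure time just earlier" is taken among cells for the same output, by analogy with the FIFO insertion policy. Cells are (output, departure time) tuples, index 0 is the head of the input priority list, and the name `pifo_insert` is hypothetical.

```python
def pifo_insert(priority_list, cell):
    """Insert an arriving cell into an input priority list, in place.

    Assumption: the predecessor is the queued cell for the same output
    whose departure time is the latest one still earlier than the
    arriving cell's. If no such cell exists, insert at the head.
    """
    output, departure = cell
    predecessor_index = -1
    predecessor_departure = None
    for index, (queued_output, queued_departure) in enumerate(priority_list):
        if queued_output != output or queued_departure >= departure:
            continue
        if predecessor_departure is None or queued_departure > predecessor_departure:
            predecessor_index = index
            predecessor_departure = queued_departure

    # predecessor_index is -1 when the cell has the earliest departure
    # time for its output, which places it at index 0 (the head).
    priority_list.insert(predecessor_index + 1, cell)
```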

  37. PIFO Emulation • Theorem 6 • A buffered crossbar with speedup of 3 can exactly emulate an OQ switch with delay guarantees.

  38. Header Scheduling Architecture • [Figure: Input Linecard, Buffered Crossbar, Output Linecard, and Header Scheduler; signals labeled Headers and Grants]

  39. Header Scheduling • [Figure: headers labeled with departure times] • Schedule headers instead of cells • Headers are converted into grants in output schedule • Grants are sent back to the input

  40. Grant Stream • [Figure: Input Linecard with GrantFIFO, Buffered Crossbar, Output Linecard, and Header Scheduler; signals labeled Headers and Grants] • Input can receive N grants in one scheduling phase • Bounded to p+N-1 grants over p consecutive phases
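The bound above can be checked against a recorded grant trace. This sketch assumes the trace is a list with one entry per scheduling phase giving the number of grants an input received in that phase. The function name `respects_grant_bound` is hypothetical.

```python
def respects_grant_bound(grants_per_phase, n):
    """Return True if every window of p consecutive phases carries
    at most p + n - 1 grants, for an n x n switch."""
    total_phases = len(grants_per_phase)
    for start in range(total_phases):
        grants_in_window = 0
        for end in range(start, total_phases):
            grants_in_window += grants_per_phase[end]
            p = end - start + 1
            if grants_in_window > p + n - 1:
                return False
    return True
```

For N = 3, the trace `[3, 1, 1, 1]` satisfies the bound (a burst of N grants followed by one per phase), while `[3, 3]` does not, since 6 grants in 2 phases exceeds 2 + 3 - 1 = 4.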

  41. Counter Example • [Figure: Cells To Output Queue, Crosspoints, GrantFIFO, and Grants shown for phases p=1 to p=6]

  42. Modified Buffered Crossbar • Modified Buffered Crossbar • N cells per crosspoint – requires N^3 cell buffers • N cells per output – requires N^2 cell buffers • Theorem 7 • A modified buffered crossbar with speedup of 2 can emulate an OQ switch with delay guarantees with a fixed delay of N scheduling phases.
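The two buffer counts follow from simple multiplication. The sketch below assumes an N x N crossbar has N^2 crosspoints, and that "N cells per output" means N cell buffers provisioned for each of the N output columns. The function name `cell_buffers` is hypothetical.

```python
def cell_buffers(n):
    """Return (per_crosspoint_total, per_output_total) cell buffers
    for an n x n modified buffered crossbar."""
    per_crosspoint_total = (n * n) * n  # n*n crosspoints, n cells each
    per_output_total = n * n            # n outputs, n cells each
    return per_crosspoint_total, per_output_total
```

For a 32x32 switch this gives 32,768 cell buffers with N cells per crosspoint, against 1,024 with N cells per output.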

  43. Summary • Buffered crossbars • Uses crosspoints to relieve contention • Inputs and outputs schedule independently and in parallel • Performance guarantees • Throughput – any work-conserving input/output schedule • Work Conservation – simple insertion policy • Delay – header scheduling

  44. Relevant Papers • Crossbars • Shang-Tse Chuang, Ashish Goel, Nick McKeown, Balaji Prabhakar, “Matching Output Queuing with a Combined Input Output Queued Switch,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 6, pp. 1030-1039, Dec. 1999. • Buffered Crossbars • Shang-Tse Chuang, Sundar Iyer, Nick McKeown, “Practical Algorithms for Performance Guarantees in Buffered Crossbars,” Stanford HPNG Technical Report TR03-HPNG-061501.
