Stress-Resistant Scheduling Algorithms for CIOQ Switches

Stress Resistant Scheduling Algorithms for CIOQ Switches Prashanth Pappu Applied Research Laboratory Washington University in St Louis “Stress Resistant Scheduling Algorithms for CIOQ switches”, Prashanth Pappu, Jon Turner. To appear in ICNP 2003.

ControlProcessor Switch Fabric IPP IPP OPP IPP OPP OPP IPP IPP OPP OPP IPP OPP LineCard LineCard LineCard LineCard LineCard LineCard Anatomy of a Router Port processor queue packets and make routing decisions Line cards encode data for transmission on target physical layer. Control processor – routing protocols and monitoring functions. Prashanth Pappu

Input Ports Switching Fabric Output Ports Output Queuing • Queuing is done only at output ports. • Maximizes throughput. • Contentions between packets - only at output ports. • Speedup=N, impractical but ideal model. Prashanth Pappu

… … Input Ports Switching Fabric Output Ports Centralized Scheduler Combined Input Output Queuing (CIOQ) • Use of VOQs. • Crossbar configured by centralized scheduler. • Bipartite graph matching problem. Prashanth Pappu

Stability results • Maximum size matching (MSM) – stable for i.i.d, uniform, admissible traffic. • Maximum weight matching (MWM) – stable for independent, admissible traffic. • Too complex, O(N5/2) and O(N3logN). • Switch with 10 Gb/s links has < 40 ns to make scheduling decision. • Maximal size matching algorithms – Parallel Iterative Matching (PIM) and iterative SLIP (iSLIP). Prashanth Pappu

Parallel iterative Matching (PIM) • Iterative matching algorithm • each unmatched input sends request to every output for which it has a queued cell. • unmatched outputs randomly pick a request and send grant. • if input receives multiple grants, it picks one randomly. • O(log N) convergence. Prashanth Pappu

iterative SLIP (iSLIP) • Iterative matching algorithm • unmatched inputs send requests to unmatched outputs (for which they have cells) • unmatched outputs pick a request that appears next in a fixed round-robin order from an input pointer. (input pointer is updated only in first iteration) • if input gets multiple grants, it picks one that appears next in a fixed round robin order from an output pointer. (update of output pointer) • Desynchronization effect. • Simple to implement but do not perform well under extreme traffic conditions. Prashanth Pappu

Worst case results • Critical Cells First (CCF) can emulate output queuing with speedup =2. • Lowest occupancy output first algorithm is work conserving with speedup =2. • Can be augmented with timestamps to emulate output queuing. (speedup=3) Prashanth Pappu

LOOFA • Iterative matching algorithm • unmatched inputs send requests to outputs with lowest occupancy (for which they have queued cells) • outputs pick a request randomly and send grant to input • O(N) iterations to perform correctly. • Work conserving with speedup of 2. • Significant result but not practical. Prashanth Pappu

Traffic in IP networks • Unregulated nature of IP networks can cause sustained overloads. • use of slow congestion control mechanisms • limited route diversity makes congested links common • use of route selection mechanisms not guided by session bandwidth needs • sudden route changes causing rapid traffic shifts • malicious users • How do practical scheduling algorithms perform in these conditions? Prashanth Pappu

Solution • We use targeted stress tests to • study performance of practical scheduling algorithms under extreme conditions • study performance of work conserving scheduling algorithms under speedups < 2 • design stress resistant scheduling algorithms which maintain throughput under uniform traffic and stress tests and can still be implemented at high speeds. Prashanth Pappu

Miss fraction • Previous work use average queuing delay as a metric. • Not useful under inadmissible traffic conditions. • Miss fraction miss fraction = 1 – NA/NI • Determines relative loss in throughput. Prashanth Pappu

phase 1 phase 2 phase 3 phase 4 Stress Test Adversary approach in overloading (stressing) various outputs. Output with empty queues have cells queued at various inputs. Inputs with cells for an empty output also have cells queued for other outputs. Test can be varied by changing number of participating inputs or phases. Prashanth Pappu

Stress Test (Example) PIM (speedup =1.5). Stress test with 3 participating inputs, 4 phases. Prashanth Pappu

PIM (under uniform traffic) Average Queuing delays Miss fraction Prashanth Pappu

iSLIP (under uniform traffic) Average Queuing delays Miss fraction Prashanth Pappu

Stress Tests Test A (Worst case for PIM(4), speedup=2) Test B (Worst case for LOOFA, speedup=2) Prashanth Pappu

Stress resistant algorithms • Better performance of LOOFA suggests, ordering outputs is the key. • Complete ordering can make algorithms too complex to implement. • But traffic conditions are persistent and change slowly, use approximate ordering schemes. • Lowest Layer Selection (LLS) heuristic which achieves a coarser ordering of outputs. • Odd-even sorting which achieves approximate ordering but converges to ideal ordering under persistent traffic conditions. Prashanth Pappu

Lowest Layer Selection • achieves coarser ordering • bigger layers for larger queue lengths • beyond a queue limit all outputs are treated equal • number of layers independent of N. • algorithms give priority to outputs in lowest layer in accept phase. • priority encoder or N-way minimum finding circuit can be used on a grant vector. Prashanth Pappu

Lowest Layer Selection - Random (LLS-R) • Iterative matching algorithm • each unmatched input sends request to every output for which it has a queued cell. • unmatched outputs randomly pick a request and send grant. • if input receives multiple grants, it picks one randomly from lowest layer. • O(log N) convergence still holds. Prashanth Pappu

Lowest Layer Selection –SLIP (LLS-S) • Iterative matching algorithm • unmatched inputs send requests to unmatched outputs (for which they have cells) • unmatched outputs pick a request that appears next in a fixed round-robin order from an input pointer. (input pointer is updated only in first iteration) • if input gets multiple grants, it picks one that appears next in the lowest layer in a fixed round robin order from an output pointer. (update of output pointer) • Both LLS-R and LLS-S have the same performance as PIM and iSLIP under uniform traffic. Prashanth Pappu

Stress Test Miss fractions for LLS-R, LLS-S (using 16 layers) and LOOFA. Test B Test A Prashanth Pappu

Stress Test Miss fractions for LLS-S and LLS-R (single iteration) with varying layers. LLS-S (Test A) LLS-R (Test A) Prashanth Pappu

Approximate LOOFA (A-LOOFA) • LOOFA is complex but can be used as the basis for a practical algorithm (with similar performance) Prashanth Pappu

Approximate LOOFA (A-LOOFA) • Matching in A-LOOFA is accomplished using a simple combinational circuit. • O(N) but constant factor is determined by gate delays (2N times delay in each block). • .13 um ASIC process, gate delays are 25-50 ps. Match can be completed in 3.2-6.4 ns. Prashanth Pappu

A-LOOFA • Columns are ordered using odd-even sort. • for all even j < N, swap Bjand qjwith Bj+1and qj+1, if qj > qj+1. • Similarly, for all odd j < N-1 • Rows are ordered using a permutation based on perfect shuffle (to ensure fairness). • for all even i<N, generate a pseudo random bit xi. • if xi = 0, values in row i are moved to row i/2 and those in i+1 are moved to (N+i)/2. • else, values in row i are moved to row (N+i)/2 and values in row i+1 are moved to row i/2. Prashanth Pappu

A-LOOFA performance Test A Test B Prashanth Pappu

Stress-Resistant Scheduling Algorithms for CIOQ Switches

Stress-Resistant Scheduling Algorithms for CIOQ Switches

Presentation Transcript

Scheduling Algorithms

Scheduling Algorithms

Pipelined Two Step Iterative Matching Algorithms for CIOQ Crossbar Switches

Enabling Class of Service for CIOQ Switches with Maximal Weighted Algorithms

Stress-Prone and Stress-Resistant Personalities

Scheduling algorithms for CIOQ switches

Scheduling Crossbar Switches

Optimal Algorithms for Task Scheduling

Enabling Class of Service for CIOQ Switches with Maximal Weighted Algorithms

Localized Asynchronous Packet Scheduling for Buffered Crossbar Switches

Scheduling Algorithms

CPU Scheduling Algorithms

Memory Management Algorithms for DSM Switches

optimal scheduling algorithms for input-queued switches

Scheduling algorithms for CIOQ switches

Scheduling Algorithms

CPU Scheduling Algorithms

Scheduling Algorithms

Approximation Algorithms for Scheduling

Scheduling Algorithms

Hierarchical Scheduling Algorithms

Scheduling Crossbar Switches