Handout # 7: Input-queued Switches – Head of Line Blocking, Scheduling

CSC 2203 – Packet Switch and Network Architectures • Handout # 7: Input-queued Switches – Head of Line Blocking, Scheduling • Professor Yashar Ganjali, Department of Computer Science, University of Toronto • yganjali@cs.toronto.edu • http://www.cs.toronto.edu/~yganjali

Announcements • Reading for next week: [6] and [7] • Final project proposal • Due: 5PM Friday October 12th. • Try to be as specific as you can; and • Start as soon as you can. • Presentations • Each team presents 1 paper • Preferably related to their own project • Talk to me before choosing a paper • Each presentation 25 mins (including Q&A). • Volunteers for next week? University of Toronto – Fall 2012

Where We Are • We have studied output-queued and shared-memory switches • Why they provide ideal performance (work-conserving) • Why they can hardly be implemented (speed-up) • We have also studied techniques for • Output link scheduling • Fairness • Parallelism

Next • We will now study input-queued switches • Why they solve the speed-up problem • A first problem: head-of-line blocking reduces throughput • Solution: virtual output queues • A second problem: arbitration between virtual output queues • Solution: scheduling algorithms

Outline – Part I • Head-of-Line Blocking • HoL Blocking in Small Switches • 58% Throughput

Input-Queued Switch: How It Works • Packets are queued at the inputs. • The switch matches inputs and outputs…


Input-Queued Switch: Speed-Up Advantage • At most one packet leaves from each input (arrives to each output) ⇒ speed-up = 1, not N

Head-of-Line Blocking • Packets queued behind a blocked head-of-line packet are blocked too, even when their own outputs are idle. • The switch is NOT work-conserving!

Glimpse: Virtual Output Queues

Question: Do More Lanes Help? • Answer: It depends on the scheduling. • [Figures: (1) head-of-line blocking; (2) VOQs with bad scheduling; (3) good scheduling, which depends on the traffic matrix]

Assumptions • As in analysis of OQ switch: • Time is slotted • At each time-slot, at each of the N inputs: Bernoulli IID packet arrivals with probability ρ • Each packet is destined for one of the N outputs uniformly at random • By symmetry, consider some given output • Scheduling: at each time-slot the output picks one of the HoL packets destined to it uniformly at random (u.a.r.) • Problem: What throughput ρ can we get?

HoL Blocking in 2x2 Switch

Balls-and-Bins Model • Saturated switch • Assume infinite number of packets in each queue • They are all destined to some output u.a.r. (random coloring of packets) • Balls-and-bins model • N outputs → N bins • N HoL packets → N balls • At each time-slot • Remove one ball from each non-empty bin • Assign free balls to bins independently and u.a.r.
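The balls-and-bins rules above translate almost directly into a simulation. The following is a minimal sketch, not part of the original handout; the function name `saturated_throughput` and its parameters `n_slots` and `seed` are illustrative assumptions.

```python
import random

def saturated_throughput(n_ports, n_slots=50_000, seed=0):
    """Estimate per-output throughput of a saturated N x N FIFO input-queued
    switch using the balls-and-bins model (hypothetical helper)."""
    rng = random.Random(seed)
    # bins[j] = number of HoL packets (balls) currently destined to output j
    bins = [0] * n_ports
    for _ in range(n_ports):
        bins[rng.randrange(n_ports)] += 1
    served = 0
    for _ in range(n_slots):
        # Remove one ball from each non-empty bin (one departure per busy output).
        free = 0
        for j in range(n_ports):
            if bins[j] > 0:
                bins[j] -= 1
                free += 1
        served += free
        # Each freed input exposes a new HoL packet with a u.a.r. destination.
        for _ in range(free):
            bins[rng.randrange(n_ports)] += 1
    return served / (n_slots * n_ports)
```

Calling `saturated_throughput(2)` should give an estimate close to the 75% derived for the 2x2 switch, and larger `n_ports` values should drift down toward the 58% limit named in the outline.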

Markov Chain • There are three states for the bin occupancy: (2,0), (1,1), (0,2) • E.g., (2,0) means both HoL packets are destined to first output • We get a Markov chain on the states (2,0), (1,1), (0,2)

Transition Probabilities in Markov Chain • Transition from (2,0): to (2,0) with probability 1/2, to (1,1) with probability 1/2

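Transition Probabilities in Markov Chain • From (2,0): to (2,0) with probability 1/2, to (1,1) with 1/2 • From (1,1): to (2,0) with 1/4, to (1,1) with 1/2, to (0,2) with 1/4 • From (0,2): to (1,1) with 1/2, to (0,2) with 1/2 • Equilibrium state distribution: π = {¼, ½, ¼} • Output throughput = 1 − P(output empty) = 75%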
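The equilibrium distribution of the 2x2 chain can be checked numerically. This is a small sketch using plain power iteration; the matrix `P` follows from the balls-and-bins rules, while the helper name `stationary` and the iteration count are assumptions made here.

```python
# Transition matrix of the 2x2 chain, states ordered (2,0), (1,1), (0,2).
P = [
    [0.50, 0.50, 0.00],  # from (2,0): one free ball, lands in either bin
    [0.25, 0.50, 0.25],  # from (1,1): two free balls
    [0.00, 0.50, 0.50],  # from (0,2): mirror image of (2,0)
]

def stationary(P, iterations=200):
    """Approximate the stationary distribution by repeated multiplication pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = stationary(P)
# Output 1 is idle only in state (0,2), so its throughput is 1 - pi[(0,2)].
throughput_output_1 = 1.0 - pi[2]
```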

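Side Note: State Collapse • Symmetric Markov chain • State collapse: (2,0) and (1,1), where (2,0) now also stands for (0,2) • Collapsed chain: every transition, including the self-loops, has probability 1/2 • Equilibrium (collapsed) state distribution: (1/2, 1/2) ⇒ get real state distribution by splitting the collapsed (2,0) mass equally between (2,0) and (0,2)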

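3x3 Switch • Markov chain with the following states: (3,0,0), (0,3,0), (0,0,3), (2,1,0), (2,0,1), (1,2,0), (0,2,1), (0,1,2), (1,0,2), (1,1,1) • State collapse into: (3,0,0), (2,1,0) and (1,1,1) • From (3,0,0): to (3,0,0) with probability 1/3, to (2,1,0) with 2/3 • From (2,1,0): to (3,0,0) with 1/9, to (2,1,0) with 2/3, to (1,1,1) with 2/9 • From (1,1,1): to (3,0,0) with 1/9, to (2,1,0) with 2/3, to (1,1,1) with 2/9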

3x3 Switch • Equilibrium (collapsed) state distribution: π = (1/7, 2/3, 4/21) for (3,0,0), (2,1,0), (1,1,1) • Per-output throughput: (1 × 1/7 + 2 × 2/3 + 3 × 4/21) / 3 = 43/63 ≈ 68% • 75% for 2x2, 68% for 3x3… but state space explosion for large N
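For small N the collapsed chain can be built mechanically from the balls-and-bins rules instead of by hand. The sketch below enumerates every placement of the free balls, so its cost grows quickly with N, which illustrates the state space explosion. The names `collapsed_chain` and `throughput` are hypothetical, and power iteration is used only as a convenient approximation.

```python
from collections import defaultdict
from itertools import product

def collapsed_chain(n):
    """Transition probabilities between collapsed states (sorted occupancy tuples)."""
    def canon(bins):
        return tuple(sorted(bins, reverse=True))

    start = canon([n] + [0] * (n - 1))
    chain, todo = {}, [start]
    while todo:
        state = todo.pop()
        if state in chain:
            continue
        # Remove one ball from each non-empty bin.
        after = [b - 1 if b > 0 else 0 for b in state]
        free = n - sum(after)
        row = defaultdict(float)
        # Assign each free ball to a bin independently and u.a.r.
        for choice in product(range(n), repeat=free):
            bins = list(after)
            for j in choice:
                bins[j] += 1
            row[canon(bins)] += 1.0 / n ** free
        chain[state] = dict(row)
        todo.extend(row)
    return chain

def throughput(n, iterations=500):
    """Return (per-output throughput, stationary distribution) for an n x n switch."""
    chain = collapsed_chain(n)
    pi = {s: 1.0 / len(chain) for s in chain}
    for _ in range(iterations):
        nxt = defaultdict(float)
        for s, p in pi.items():
            for t, q in chain[s].items():
                nxt[t] += p * q
        pi = dict(nxt)
    # Each non-empty bin corresponds to one busy output in that state.
    busy = sum(p * sum(1 for b in s if b > 0) for s, p in pi.items())
    return busy / n, pi
```

For example, `throughput(3)` returns the per-output throughput of the 3x3 switch together with the distribution over (3,0,0), (2,1,0) and (1,1,1).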

Method #2: Recurrence Equations • Consider a given bin (output) • Let X_t be the number of balls in this bin • Number of HoL packets for this output • Let A_t be the number of arrivals to this bin • Let B_t be the number of departures from all bins • The recurrence equation is: X_{t+1} = max(0, X_t − 1) + A_{t+1}

Method #2: Recurrence Equations • The only queues with new HoL packets are those from which HoL packets left at the last time-slot • A_{t+1} is the sum of B_t Bernoulli I.I.D. variables, each equal to 1 with probability 1/N: A_{t+1} ~ Binomial(B_t, 1/N)

Method #2: Recurrence Equations • Steady-state: E[B] is N times the per-output throughput ρ, i.e. E[B] = ρN • As N → ∞, the binomial goes to Poisson, with mean (ρN) × (1/N) = ρ (approximation)

Method #2: Recurrence Equations • Same equations lead to same results (cf. OQ switch): E[X] = ρ + ρ²/(2(1 − ρ)) • When the switch is saturated, there are N balls for N bins: E[X] = 1 • Hence ρ = 2 − √2 ≈ 0.586, i.e. about 58% throughput
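The step from E[X] = 1 to a throughput value can be spelled out as follows. This is a sketch under the slide's large-N approximation, with the additional assumption that the arrivals A_{t+1} are Poisson with mean ρ and independent of X_t; the indicator U_t is notation introduced here.

```latex
\begin{align*}
X_{t+1} &= X_t - U_t + A_{t+1}, \qquad U_t = \mathbf{1}\{X_t > 0\} \\
\text{(take expectations)}\quad & E[U] = E[A] = \rho \\
\text{(square, then take expectations)}\quad & 0 = E[U] + E[A^2] - 2E[X] + 2\rho E[X] - 2\rho^2,
  \qquad E[A^2] = \rho + \rho^2 \\
\Rightarrow\quad E[X] &= \frac{2\rho - \rho^2}{2(1-\rho)} = \rho + \frac{\rho^2}{2(1-\rho)} \\
E[X] = 1 \;\Rightarrow\; & \rho^2 - 4\rho + 2 = 0 \;\Rightarrow\; \rho = 2 - \sqrt{2} \approx 0.586
\end{align*}
```

The other root, 2 + √2, is discarded because a throughput cannot exceed 1.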

HoL Blocking vs. OQ Switch • [Figure: delay versus load (0% to 100%) for an IQ switch with HoL blocking and for an OQ switch]

Where We Are • We introduced input-queued switches. • We saw that HoL blocking reduces throughput. • We use VOQs to solve the HoL blocking problem.

Next • Scheduling in Input-Queued Switches • Uniform Traffic • Maximum Size Matching (MSM) • Maximum Weighted Matching (MWM) • Maximal matching with speedup • Heuristic algorithms (PIMs, iSLIP, …)

Outline • Uniform traffic • Uniform cyclic • Random permutation • Wait-until-full • Non-uniform traffic, known traffic matrix • Birkhoff-von-Neumann • Unknown traffic matrix • Maximum Size Matching • Maximum Weight Matching

VOQs: How Packets Move • [Figure: VOQs at the inputs and the scheduler]

Basic Switch Model • [Figure: N×N switch with schedule S(n); arrivals Ai(n) at input i are split into per-output arrivals Ai1(n), …, AiN(n), which feed VOQs Qi1(n), …, QiN(n) with departures Di1(n), …, DiN(n)]

Notations: Arrivals • Aij(n): packet arrivals at input i for output j at time-slot n • Aij(n) = 0 or 1 • λij = E[Aij(n)]: arrival rate • Λ = [λij]: traffic matrix • A = [Aij(n)] admissible iff: • For all i, Σj λij < 1: no input is oversubscribed • For all j, Σi λij < 1: no output is oversubscribed
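The admissibility condition amounts to a check on the row and column sums of the traffic matrix. A minimal sketch, assuming the matrix is given as a square list of lists of rates; the function name `is_admissible` is hypothetical.

```python
def is_admissible(rates):
    """True iff every row sum (input load) and column sum (output load) is < 1."""
    n = len(rates)
    inputs_ok = all(sum(rates[i][j] for j in range(n)) < 1 for i in range(n))
    outputs_ok = all(sum(rates[i][j] for i in range(n)) < 1 for j in range(n))
    return inputs_ok and outputs_ok

# Uniform 4x4 traffic with rate 0.2 per pair: each port carries load 0.8.
uniform = [[0.2] * 4 for _ in range(4)]
# Input 0 is oversubscribed: 0.6 + 0.5 > 1.
overloaded_input = [[0.6, 0.5], [0.1, 0.1]]
```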

Notations: Schedule • Qij(n): queue size of VOQ (i,j) • Q = [Qij(n)] • Sij(n): whether the schedule connects input i to output j • Sij(n) = 0 or 1 • No speedup: each input is connected to at most one output, each output to at most one input • We will assume that each input is connected to exactly one output, and each output to exactly one input, so S = [Sij(n)] is a permutation matrix
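To make the notation concrete, here is a sketch of how one time-slot might update the VOQs. The update order is an assumption, not something the handout states: departures are taken from the queue contents at the start of the slot, so a packet arriving in slot n cannot leave in the same slot. The helper names `is_permutation_matrix` and `step` are hypothetical.

```python
def is_permutation_matrix(S):
    """True iff S is 0/1 with exactly one 1 in every row and every column."""
    n = len(S)
    entries_ok = all(x in (0, 1) for row in S for x in row)
    rows_ok = all(sum(row) == 1 for row in S)
    cols_ok = all(sum(S[i][j] for i in range(n)) == 1 for j in range(n))
    return entries_ok and rows_ok and cols_ok

def step(Q, A, S):
    """Return (Q(n+1), D(n)) given queue sizes Q(n), arrivals A(n), schedule S(n)."""
    n = len(Q)
    # A scheduled pair (i, j) produces a departure only if VOQ (i, j) is non-empty.
    D = [[1 if S[i][j] == 1 and Q[i][j] > 0 else 0 for j in range(n)]
         for i in range(n)]
    Q_next = [[Q[i][j] - D[i][j] + A[i][j] for j in range(n)] for i in range(n)]
    return Q_next, D
```

With `Q = [[2, 0], [0, 1]]` and the identity schedule, both scheduled VOQs hold packets and are served; with the opposite permutation, both scheduled VOQs are empty and the slot is wasted, which is the kind of mismatch a scheduling algorithm tries to avoid.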

Scheduling Algorithm • What it does: determine S(n) • How: • Either using traffic matrix Λ, • Or, in most cases, using queue sizes Q(n) (because Λ is unknown) • Objective: 100% throughput • So that lines are fully utilized • Secondary objective: minimize packet delays/backlogs

What is “100% throughput”? • Work-conserving scheduler • Definition: If there are one or more packets in the system for an output, then the output is busy. • An output-queued switch is work-conserving. • Each output can be modeled as an independent single-server queue. • If λ < μ then E[Qij(n)] < C for some C. • Therefore, we say it achieves “100% throughput”. • For fixed-sized packets, work-conservation also minimizes average packet delay. • Q: What happens when packet sizes vary? • Non work-conserving scheduler • An input-queued switch is, in general, non work-conserving. • Q: What definitions make sense for “100% throughput”?

Common Definitions of 100% throughput • Listed from strongest to weakest: • Work-conserving • For all n, i, j: Qij(n) < C • For all n, i, j: E[Qij(n)] < C • Departure rate = arrival rate • The lecture focuses on one of these definitions (marked on the original slide).

Uniform Traffic • Definition: λij = λ for all i, j • i.e., all input-output pairs have the same traffic rate • Condition for admissible traffic: λ < 1/N • Example: Bernoulli traffic • λ = ρ/N • Arrivals at input i are Bernoulli(ρ) and i.i.d.

100% Throughput for Uniform Traffic • Nearly all algorithms in the literature can give 100% throughput when traffic is uniform • For example: • Uniform cyclic. • Random permutation. • Wait-until-full [simulations]. • Maximum size matching (MSM) [simulations]. • Maximal size matching (e.g. WFA, PIM, iSLIP) [simulations].

Uniform Cyclic Scheduling • [Figure: a 4×4 switch with ports A to D and 1 to 4, shown under three successive cyclic matchings] • Each (i, j) pair is served every N time slots: Geom/D/1 queue with arrival rate λ = ρ/N < 1/N and service rate 1/N • Stable for ρ < 1
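One simple way to realise a uniform cyclic schedule is to rotate the matching by one position per time-slot. The sketch below assumes 0-based port indices and the specific rotation j = (i + n) mod N, which is one possible choice rather than the one necessarily drawn on the slide; the function name is hypothetical.

```python
def uniform_cyclic_schedule(n_ports, slot):
    """Return match[i] = output connected to input i in the given time-slot."""
    return [(i + slot) % n_ports for i in range(n_ports)]

# Over N consecutive slots, every (input, output) pair is served exactly once.
N = 4
served = set()
for slot in range(N):
    match = uniform_cyclic_schedule(N, slot)
    for i, j in enumerate(match):
        served.add((i, j))
```

Because the schedule ignores both the traffic matrix and the queue sizes, each VOQ sees a fixed service opportunity once every N slots, which is what gives the Geom/D/1 behaviour described above.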

Wait Until Full • We don’t have to do much at all to achieve 100% throughput when arrivals are Bernoulli IID uniform. • Simulation suggests that the following algorithm leads to 100% throughput. • Wait-until-full: • If any VOQ is empty, do nothing (i.e. serve no queues). • If no VOQ is empty, pick a random permutation.
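The wait-until-full rule is short enough to write down directly. A minimal sketch, assuming the queue sizes are given as a square list of lists and that "pick a random permutation" means a uniformly random one; the function name and the `rng` parameter are hypothetical.

```python
import random

def wait_until_full(Q, rng=random):
    """Return None (serve nothing) if any VOQ is empty, else a random permutation.

    The permutation is returned as match[i] = output connected to input i.
    """
    n = len(Q)
    if any(Q[i][j] == 0 for i in range(n) for j in range(n)):
        return None
    match = list(range(n))
    rng.shuffle(match)
    return match
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible.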
