CS244 Spring 2014 Packet Switching Sachin Katti
Questions What are the main goals the designer is trying to accomplish when designing a network switch? What is the “ideal” switch?
What you said Henry Wang: Today, end hosts and routers can run so rapidly that the latency costs in networking have almost completely shifted to the speed of electric signals in wires, making the effects of further optimizations in switching minimal at best. Thus, no matter how brilliant the switching algorithm introduced by these authors may be, it is unlikely to make a large impact today, let alone in the next decade.
Output Queued Packet Switch [Figure: three ports, each taking in a packet (header H plus data), performing Lookup Address against its Forwarding Table and then Update Header; packets are then queued in Buffer Memory at the outputs.]
Input Queued Packet Switch [Figure: the same three ports, each performing Lookup Address against its Forwarding Table and then Update Header; packets are queued in Buffer Memory at the inputs.]
Output Queued Packet Switch The best that any queueing system can achieve.
Properties of OQ switches • They are “work conserving”. • Throughput is maximized. • Expected delay is minimized. • We can control packet delay. Broadly speaking: When possible, use an OQ design.
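Input Queued Packet Switch: Head of Line Blocking [Figure: input queued switch with FIFO queues, compared against an OQ switch.]
Input Queued Packet Switch: With Virtual Output Queues (VOQs) [Figure: input queued switch with VOQs, compared against an OQ switch.]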
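A minimal sketch of why VOQs help, assuming one time slot, unit-size packets, and a simple lowest-input-first greedy arbiter (the arbiter and the helper names `fifo_departures`, `to_voqs`, and `voq_departures` are illustrative assumptions, not taken from the slides). With a single FIFO per input, only the head-of-line packet is eligible; with one queue per output at each input, a packet for a free output can go even when the packet ahead of it is blocked.

```python
from collections import deque


def fifo_departures(fifos):
    """One slot of a FIFO input-queued switch.

    fifos[i] is a deque of destination ports for input i. Only the
    head-of-line packet of each input is eligible, and each output
    accepts at most one packet (lowest-numbered input wins).
    """
    taken = set()
    sent = []
    for i, q in enumerate(fifos):
        if q and q[0] not in taken:
            taken.add(q[0])
            sent.append((i, q.popleft()))
    return sent


def to_voqs(fifo, n_outputs):
    """Split one input's FIFO into one queue per output (its VOQs)."""
    voqs = [deque() for _ in range(n_outputs)]
    for dest in fifo:
        voqs[dest].append(dest)
    return voqs


def voq_departures(voqs):
    """One slot of an input-queued switch with VOQs.

    voqs[i][j] holds the packets at input i destined to output j.
    Greedy arbiter: each input takes the first free output for which
    it has a packet waiting.
    """
    taken = set()
    sent = []
    for i, per_output in enumerate(voqs):
        for j, q in enumerate(per_output):
            if q and j not in taken:
                taken.add(j)
                q.popleft()
                sent.append((i, j))
                break
    return sent
```

For example, with input 0 holding packets for outputs 0 then 1, and input 1 holding packets for outputs 0 then 2, the FIFO switch sends one packet in the slot (input 1 is blocked behind its packet for output 0), while the VOQ switch sends two.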
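Practical Goal Problem: Memory bandwidth (in an N-port OQ switch, each output buffer must be able to absorb up to N arriving packets per packet time). Therefore: Try to approximate OQ. In this paper, we are just looking at those switches that attempt to match Property 2: Maximize throughput.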
Questions • What is a virtual output queue (VOQ)? • How does a VOQ help? • What does the scheduler/arbiter do?
Parallel Iterative Matching [Figure: a 4x4 example showing two iterations (#1 and #2), each with three phases: Request, Grant, and Accept. The Grant and Accept phases each use uniform-at-random (uar) selection.]
PIM Properties • Guaranteed to find a maximal match in at most N iterations. • Inputs and outputs make decisions independently and in parallel. • In general, will converge to a maximal match in < N iterations. • How many iterations should we run?
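The request, grant, and accept phases can be sketched in software as follows. This is an illustrative model under stated assumptions, not the hardware implementation: `requests[i]` is assumed to be the set of outputs for which input i has a cell queued, and the function names `pim` and `is_maximal` are hypothetical.

```python
import random


def pim(requests, iterations, rng=random):
    """Parallel Iterative Matching, modelled sequentially.

    requests[i]: set of outputs for which input i has a queued cell.
    Returns a dict mapping each matched input to its output.
    """
    match = {}           # input -> output
    matched_out = set()
    for _ in range(iterations):
        # Request: each unmatched input requests every unmatched
        # output for which it has a cell.
        reqs_by_output = {}
        for i, outs in enumerate(requests):
            if i in match:
                continue
            for j in outs:
                if j not in matched_out:
                    reqs_by_output.setdefault(j, []).append(i)
        if not reqs_by_output:
            break  # nothing left to match: the match is maximal
        # Grant: each requested output picks one requesting input
        # uniformly at random.
        grants_by_input = {}
        for j, ins in reqs_by_output.items():
            grants_by_input.setdefault(rng.choice(ins), []).append(j)
        # Accept: each input that received grants picks one uniformly
        # at random.
        for i, outs in grants_by_input.items():
            j = rng.choice(outs)
            match[i] = j
            matched_out.add(j)
    return match


def is_maximal(requests, match):
    """True if no unmatched input still wants an unmatched output."""
    used = set(match.values())
    return not any(
        i not in match and j not in used
        for i, outs in enumerate(requests)
        for j in outs
    )
```

Every iteration that still has outstanding requests adds at least one pair to the match, which is why N iterations are always enough for N inputs. Calling `pim(requests, iterations=1)` and checking `is_maximal` over many random request patterns is one way to see how often a single iteration falls short.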
Parallel Iterative Matching [Plot: simulation of a 16-port switch under uniform iid traffic. Curves: FIFO, Maximum Size, Output Queued.]
Parallel Iterative Matching [Plot: simulation of a 16-port switch under uniform iid traffic. Curves: PIM with one iteration, FIFO, Maximum Size, Output Queued.]
Parallel Iterative Matching [Plot: simulation of a 16-port switch under uniform iid traffic. Curves: PIM with one iteration, PIM with four iterations.]
Parallel Iterative Matching: Number of iterations. Consider the n requests to output j: k requesting inputs receiving no other grants, and n-k requesting inputs receiving other grants.
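The setup above leads to a bound on how many requests survive one iteration. The following is a sketch of that argument, assuming output j grants uniformly at random among its n requesters and that a request counts as resolved once its input or its output is matched.

```latex
\begin{align*}
\Pr[\text{$j$ grants to one of the $k$ inputs}] &= \frac{k}{n}
  && \text{(that input accepts, so all $n$ requests are resolved)} \\
\Pr[\text{$j$ grants to one of the $n-k$ inputs}] &= 1 - \frac{k}{n}
  && \text{(those $n-k$ inputs get matched; at most $k$ requests remain)} \\
\mathbb{E}[\text{unresolved requests to } j] &\le \left(1 - \frac{k}{n}\right) k
  = \frac{k(n-k)}{n} \le \frac{n}{4}
\end{align*}
```

Summing over outputs, each iteration leaves at most a quarter of the outstanding requests unresolved in expectation. Starting from at most N^2 requests, the expected number remaining after t iterations is at most N^2 / 4^t, which falls below one once t exceeds log2 N. For a 16-port switch that is four iterations.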
What you said Manikanta Kotaru: I particularly liked the argument about how maximum matching leads to starvation. With increased processing power, coming up with maximum bipartite matching may no longer be so slow as the paper depicts. However, the argument reflects why the objective of maximizing the throughput may not be the most suitable objective function, and how the proposed algorithm may be the optimal algorithm when viewed under “right” objective function, which may involve tradeoff between several useful characteristics.
What you said • Raejoon Jung: First, I think the idea of separation of scheduling and forwarding contributes to the flexibility of the scheduling algorithm to support bandwidth guarantees and fairness. • Lisa Yan: I realize the impact the findings of this paper must have had in the key transition away from centralized networks. He only briefly discusses separating scheduling hardware from data forwarding hardware, but I believe that this insight was also key to the development of LAN technology.
Throughput “Maximize throughput” is equivalent to “remain stable for all non-oversubscribing traffic matrices”, i.e. λ < μ for every queue in the system, for all such traffic matrices. Observations: • Burstiness of arrivals does not affect throughput • When traffic is uniform, solution is trivial
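One way to make "non-oversubscribing" concrete is to check that no input and no output is offered more traffic than it can carry. This is a minimal sketch assuming `rates[i][j]` is the arrival rate from input i to output j, normalised so that each port's line rate is 1.0; the name `is_admissible` is hypothetical.

```python
def is_admissible(rates, capacity=1.0):
    """True if no input or output is oversubscribed.

    rates[i][j]: arrival rate from input i to output j, as a fraction
    of the line rate. Every row sum (load offered by an input) and
    every column sum (load offered to an output) must stay strictly
    below the port capacity.
    """
    n_outputs = len(rates[0])
    rows_ok = all(sum(row) < capacity for row in rates)
    cols_ok = all(
        sum(row[j] for row in rates) < capacity for j in range(n_outputs)
    )
    return rows_ok and cols_ok
```

A uniform matrix with every entry equal to 0.2 on a 4-port switch passes the check, while a matrix in which three inputs each send 0.4 to the same output fails on that output's column.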
Throughput 100% throughput is now known to be theoretically possible with: • Input queued switch, with VOQs, and • An arbiter to pick a permutation to maximize the total matching weight (e.g. weight is VOQ occupancy or packet waiting time) It is practically possible with: • IQ switch, VOQs, all running twice as fast • An arbiter running a maximal match (e.g. PIM) (Learn more in EE384x)
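To illustrate what "pick a permutation to maximize the total matching weight" means, here is a brute-force sketch that uses VOQ occupancy as the weight. It is for illustration on very small switches only: it tries all N! permutations, which no real arbiter would do, and the name `max_weight_match` is hypothetical.

```python
from itertools import permutations


def max_weight_match(occupancy):
    """Return (perm, weight) for the heaviest input-to-output permutation.

    occupancy[i][j]: number of cells queued at input i for output j.
    perm[i] is the output assigned to input i.
    """
    n = len(occupancy)

    def weight(perm):
        return sum(occupancy[i][perm[i]] for i in range(n))

    best = max(permutations(range(n)), key=weight)
    return list(best), weight(best)
```

With `occupancy = [[5, 1, 0], [6, 0, 0], [0, 0, 3]]`, the heaviest permutation sends input 0 to output 1, input 1 to output 0, and input 2 to output 2, for a total weight of 10. Serving input 0's longest queue first would reach a total of only 8.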
What you said • Alexander Valderrama: … I found that apparently high end switches can now handle hundreds or even thousands of connections. With that in mind I do wonder if the algorithms in this paper are still effective given that the author states they are only designed for switches, "in the range of 16 by 16 to 64 by 64". In particular when it comes to the parallel iterative matching algorithm they claim that their implementation stops at four iterations because "additional matches are hardly ever found after four iterations". However, with a couple thousand ports would the number of iterations be low enough that this algorithm could still feasibly work?
Questions • Why does the PIM paper talk about TDM scheduled traffic? • What about multicast? • Multiple priorities?
Question What else does a router need to do apart from switching packets?