
EE384x: Packet Switch Architectures I

a) Delay Guarantees with Parallel Shared Memory. b) Summary of Deterministic Analysis. Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University. nickm@stanford.edu http://www.stanford.edu/~nickm


Presentation Transcript


  1. EE384x: Packet Switch Architectures I
     a) Delay Guarantees with Parallel Shared Memory
     b) Summary of Deterministic Analysis
     Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University
     nickm@stanford.edu
     http://www.stanford.edu/~nickm

  2. Delay Guarantees
     • Problem:
       • How can we design a parallel output-queued router from slower parallel memories and provide delay guarantees?
     • This is difficult because:
       • The counting technique depends on being able to predict a cell’s departure time and schedule it (before, we assumed that the output queue is FCFS).
       • In policies such as strict priority, weighted fair queueing, etc., we don’t know a cell’s departure time when it arrives.

  3. Delay Guarantees
     [Figure: constrained traffic feeding one output, drawn two ways: as many logical FIFO queues (1 … m), and as a single PIFO queue with push-in.]
     • One output, many logical FIFO queues: weighted fair queueing sorts packets by finishing time.
     • One output, single PIFO queue: Push In First Out (PIFO).
     • PIFO models:
       • Weighted Fair Queueing
       • Weighted Round Robin
       • Strict priority
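A minimal sketch of the PIFO abstraction on this slide, assuming each cell carries a numeric rank (for example a WFQ finishing time or a strict-priority class). The class name `PIFO` and its methods are illustrative and do not come from the lecture.

```python
import bisect
import itertools


class PIFO:
    """Push-In First-Out queue: a cell may be pushed in at any position
    (chosen by its rank), but cells always depart from the head."""

    def __init__(self):
        self._entries = []              # kept sorted by (rank, arrival order)
        self._seq = itertools.count()   # tie-breaker: earlier arrival first

    def push(self, rank, cell):
        # Insert the cell at the position its rank dictates.
        bisect.insort(self._entries, (rank, next(self._seq), cell))

    def pop(self):
        # Departure is always from the head of the queue.
        _rank, _seq, cell = self._entries.pop(0)
        return cell

    def __len__(self):
        return len(self._entries)


# Strict priority modelled as a PIFO: rank = priority class (1 is highest).
q = PIFO()
q.push(2, "low-1")
q.push(1, "high-1")
q.push(2, "low-2")
q.push(1, "high-2")
order = [q.pop() for _ in range(len(q))]
```

Here the high-priority cells are pushed in ahead of the low-priority cells that arrived earlier, which is why a cell's departure time is not known when it arrives.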

  4. Theorem
     A parallel output-queued router can give delay guarantees (within a bounded error) with 4N – 2 memories that can perform at most one memory operation per time slot.

  5. Intuition for Theorem
     [Figure: example with N = 3; cells numbered in departure order and grouped by departure time, DT = 1, 2, 3; cells 1.5 and 2.5 are pushed in between existing cells.]
     • FIFO: one window of memories of size N-1 that can’t be used (the N-1 packets before the cell at the time of insertion).
     • PIFO: two windows of memories of size N-1 that can’t be used (the N-1 packets before the cell and the N-1 packets after the cell at the time of insertion).

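  6. Proof
     [Figure: at time t, cell C arrives with departure time DT = t + T; the cells that depart before C and after C are marked.]
     A packet cannot use the memories:
     • Used to write the N-1 arriving cells at t.
     • Used to read the N departing cells at t.
     • Before C: used to read the N-1 cells that depart before it.
     • After C: used to read the N-1 cells that depart after it.
     At most (N-1) + N + (N-1) + (N-1) = 4N-3 memories are excluded, so with 4N-2 memories at least one is always available.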

  7. With a PIFO per output
     [Figure: cells a1 to a4, b1 to b4, c1 to c4 arranged by departure time, DT = 1 … 4; cell a2’ is pushed in after the cells have been placed in memory, shifting the later a-cells.]
     • Relative order of (a3, b3) reversed after being placed in memory.
     • Therefore, departure is not in PIFO order.
     • By how much can the order differ?

  8. Permute departure order
     [Figure: cells a1 … ak, b1 … bk, …, N1 … Nk arranged by departure time, DT = 1 … k; cell a2’ is pushed in.]
     • Cells are correctly resequenced by each output.
     • Therefore, maximum delay is k-1 time slots.

  9. Summary: Routers with delay guarantees

     Switch               Fabric    # Mem.  Mem. BW     Total Memory BW  Switch BW  Algorithm
     Output-Queued        Bus       N       (N+1)R      N(N+1)R          NR         None
     Shared Mem.          Bus       1       2NR         2NR              2NR        None
     Input Queued         Crossbar  N       2R          2NR              NR         -
     CIOQ (Cisco)         Crossbar  2N      3R          6NR              2NR        Marriage
                                    2N      3R          6NR              3NR        Time Reserve
     PSM                  Bus       k       4NR/k       4NR              4NR        C. Sets
     DSM (Juniper)        Xbar      N       4R          4NR              5NR        Edge Color
                                    N       4R          4NR              8NR        C. Sets
                                    N       6R          6NR              6NR        C. Sets
     PPS - OQ             Clos      Nk      3R(N+1)/k   3N(N+1)R         6NR        C. Sets
     PPS - Shared Memory  Clos      Nk      6NR/k       6NR              6NR        C. Sets
                                    Nk      2NR/k       2NR              2NR        -

  10. Summary of OQ Switches
     • Output queued switches are ideal
       • Work-conserving.
       • Maximize throughput.
       • Minimize expected delay (for fixed length packets).
       • Permit delay guarantees for constrained traffic.
     • Output queued switches don’t scale well
       • Require N memory writes per time slot.
       • Memory bandwidth (dictated by the random-access time of a memory) is a bottleneck.
       • Parallelism is not straightforward.
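A rough worked example of the scaling problem, based on the (N+1)R memory bandwidth listed for the output-queued switch in the summary table (N writes plus one read per cell time). The port count, line rate, and cell size below are assumed values chosen for illustration, and the function name is hypothetical.

```python
def oq_memory_access_time_ns(n_ports, line_rate_bps, cell_bits):
    """Time available per memory operation in an output-queued switch.

    In one cell time the output's memory must absorb up to N writes
    (one from each input) and perform 1 read, i.e. N + 1 operations.
    """
    cell_time_ns = cell_bits / line_rate_bps * 1e9
    ops_per_cell_time = n_ports + 1
    return cell_time_ns / ops_per_cell_time


# Assumed values: 32 ports, 10 Gb/s per port, 64-byte (512-bit) cells.
access_ns = oq_memory_access_time_ns(n_ports=32, line_rate_bps=10e9, cell_bits=512)
```

Under these assumptions a cell time is 51.2 ns and the memory has about 1.55 ns per operation, which illustrates why the random-access time of a single memory becomes the bottleneck as N grows.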

  11. Summary of OQ Switches (2)
     • Parallelizing packet switches has problems
       • Resource conflicts.
       • Packet mis-sequencing.
     • Methods to analyze parallel OQ switches
       • Constraint Sets (based on pigeon-hole principle)
         • Parallel packet switches
         • Parallel shared memory
         • Distributed shared memory
         • Extension to PIFO
     • Parallel packet buffers
       • Hybrid SRAM-DRAM FIFO queues.
       • With and without lookahead buffer.
