
Lecture Note on Switch Architectures


yagerd


  1. Lecture Note on Switch Architectures

  2. Function of Switch

  3. Naive Way

  4. Bus-Based Switch • No buffering at input port processor (IPP) • Output port processor (OPP) buffers cells • Controller exchanges control messages with terminals and other controllers • Disadvantages: • For non-blocking operation, bus bandwidth must equal the sum of the external link bandwidths • IPPs and OPPs must operate at full bus bandwidth • Bus width increases with number of links

  5. Centralized Bus Arbitration • IPPs send requests to central arbiter • Requests may include: • Priority • Waiting time • OPP destination(s) • Length of IPP queue • Arbitration complexity is O(N^2) • A distributed version is preferred, but may degrade throughput

  6. Bus Arbitration Using Rotating Daisy Chain • Rotating token eliminates positional favoritism
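
The rotating-token idea on this slide can be sketched as a round-robin arbiter. This is a minimal illustration, not the circuit from the lecture: the function name `rotating_arbiter` and the list-of-booleans request format are assumptions made for the example.

```python
def rotating_arbiter(requests, token):
    """Grant the bus to the first requesting port at or after the token.

    requests: list of bools, one per IPP (hypothetical representation).
    token:    index of the port that currently has highest priority.
    Returns (granted_port or None, next_token).
    """
    n = len(requests)
    for offset in range(n):
        port = (token + offset) % n
        if requests[port]:
            # The token moves just past the winner, so the winner has the
            # lowest priority in the next round (no positional favoritism).
            return port, (port + 1) % n
    return None, token  # nobody requested; token stays where it is


# Ports 0, 2 and 3 request continuously; grants rotate among them.
token = 0
grants = []
for _ in range(4):
    port, token = rotating_arbiter([True, False, True, True], token)
    grants.append(port)
print(grants)  # [0, 2, 3, 0]
```

With a fixed daisy chain (token always 0), port 0 would win every round; rotating the starting point is what gives each requester a turn.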

  7. Ring Switch • Same bandwidth and complexity as bus switch • Avoids capacitive loading of bus, allowing higher clock frequencies • Control mechanisms • Token passing • Slotted ring with busy bit

  8. Shared Buffer Switch • Individual queues are rarely full • Shared memory needs twice the total external link bandwidth (one write and one read per cell) • Requires less memory • Better ability to handle bursty traffic

  9. Crossbar Switch

  10. Output Buffering • Efficient, but needs an N-times internal speedup

  11. Input Buffering • Multiple packets transmitted simultaneously to distinct outputs • Requires sophisticated arbitration • No speedup required • Head-of-line blocking
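
Head-of-line blocking can be seen in a small sketch of one FIFO queue per input. The contention rule (lowest-numbered input wins) and the function name `fifo_slot` are assumptions chosen to keep the example short.

```python
from collections import deque


def fifo_slot(queues):
    """Run one time slot of an input-queued switch with FIFO input queues.

    queues: list of deques; each entry is the destination output of a cell.
    Each output accepts at most one cell per slot. On contention the
    lowest-numbered input wins (an assumption for this sketch).
    Returns the list of (input, output) transfers made in this slot.
    """
    taken = set()
    sent = []
    for i, q in enumerate(queues):
        if q and q[0] not in taken:
            taken.add(q[0])
            sent.append((i, q.popleft()))
    return sent


# Input 0 holds a cell for output 0; input 1 holds cells for outputs 0 and 1.
queues = [deque([0]), deque([0, 1])]
slots = []
while any(queues):
    slots.append(fifo_slot(queues))
print(slots)
```

In the first slot output 1 stays idle even though input 1 holds a cell for it, because that cell is stuck behind the head-of-line cell for output 0. The three cells take three slots, while a scheduler that could look past the head of the queue would need only two.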

  12. Bi-partite Matching • Requires global information • Complexity is O(N^(5/2)) • Not suitable for hardware implementation • May lead to starvation
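
A maximum-size matching between inputs and outputs can be sketched with the simple augmenting-path method. Note that this is a swapped-in, easier-to-read algorithm, not the O(N^(5/2)) one the slide refers to; the request format (one set of wanted outputs per input) is an assumption.

```python
def max_matching(requests):
    """Maximum-size matching of inputs to outputs via augmenting paths.

    requests[i] is the set of outputs for which input i has queued cells
    (hypothetical representation). Returns a dict {input: output}.
    """
    input_of_output = {}

    def try_assign(i, visited):
        for o in sorted(requests[i]):
            if o in visited:
                continue
            visited.add(o)
            # Take a free output, or re-route the input currently using it.
            if o not in input_of_output or try_assign(input_of_output[o], visited):
                input_of_output[o] = i
                return True
        return False

    for i in range(len(requests)):
        try_assign(i, set())
    return {i: o for o, i in input_of_output.items()}


print(max_matching([{0, 1}, {0}, {1, 2}]))
print(max_matching([{0, 1}, {0}]))
```

The second call illustrates the starvation bullet: as long as both inputs keep requesting, the only maximum matching sends input 0 to output 1, so input 0's cells for output 0 are never served.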

  13. Desired Arbitration Algorithms • High throughput • Low backlog in each input queue • Close to 100% for each input and output • Starvation free • No queue is held indefinitely • Simple to implement

  14. Options to Build High Performance Switches • Bufferless crossbar • Buffered crossbar • Shared buffer

  15. Bufferless Crossbar • Centralized arbitrator is required • Arbitration complexity is O(N^2) • O(log2 N) iterations of arbitration needed for high throughput • Synchronization in all elements • Single point of failure: central arbitrator • Complex line interface

  16. Buffered Crossbar • Simple scheduling algorithms • Ingress: O(1) • Egress: O(N) • Inefficient use of memory • Memory increases linearly with number of ports

  17. Shared Memory • No central arbitrator needed • Reduced memory requirements • Distributed flow control • Fewer timing constraints • Simpler line card interface

  18. Comparisons (1)

  19. Comparison (2) Assume 10G for each port and packet size is 64 bytes.
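
The assumption on this slide (10G per port, 64-byte packets) fixes the time budget per packet. A small worked calculation, with the shared-memory case using the "one write plus one read per packet" rule from the shared buffer slide; the packet-wide memory access and the 16-port figure are assumptions made only for illustration.

```python
def packet_time_ns(rate_gbps, size_bytes):
    """Time to transfer one packet on one port, in nanoseconds.

    bits / (Gbit/s) gives nanoseconds directly.
    """
    return size_bytes * 8 / rate_gbps


def shared_memory_access_ns(ports, rate_gbps, size_bytes):
    """Time budget per memory access in a shared-memory switch.

    Assumes the memory is wide enough to move a whole packet per access
    and must do one write and one read per port in each packet time.
    """
    return packet_time_ns(rate_gbps, size_bytes) / (2 * ports)


print(packet_time_ns(10, 64))               # 51.2 ns per 64-byte packet at 10G
print(shared_memory_access_ns(16, 10, 64))  # budget per access for 16 ports
```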

  20. Scaling Number of Ports • A single larger switch is less expensive, more reliable, easier to maintain, and offers better performance, but • O(n^2) complexity • Board-level buses limited by capacitive loading • Port multiplexing • Buffered multistage routing • Dynamic routing: Benes network • Static routing: Clos network • Bufferless multistage routing • Deflection routing

  21. Port Multiplexing • High-speed core can handle high-speed as well as low-speed links • Sharing of common circuitry • Reduced complexity in interconnection network • Better queueing performance for bursty traffic • Less fragmentation of bandwidth and memory

  22. Dynamic Routing – Benes Network • Network expanded by adding stages on left and right • 2k-1 stages of d-port switch elements support d^k ports • Traffic distribution on first k-1 stages • Routing on last k stages • Internal load  external load • Traffic may be out of order: need re-sequencing
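
The size relations for a Benes network (2k-1 stages of d-port elements supporting d^k ports) can be tabulated with a few lines of arithmetic. The function name and the returned dictionary keys are assumptions for this sketch.

```python
def benes_dimensions(d, k):
    """Size of a Benes network built from d-port switch elements.

    Returns stage and port counts for 2k-1 stages supporting d**k ports.
    """
    ports = d ** k
    return {
        "stages": 2 * k - 1,
        "ports": ports,
        "elements_per_stage": ports // d,  # each element terminates d links
        "distribution_stages": k - 1,      # first k-1 stages spread traffic
        "routing_stages": k,               # last k stages route to the output
    }


print(benes_dimensions(2, 3))   # 8 ports from 5 stages of 2-port elements
print(benes_dimensions(8, 3))   # 512 ports from 5 stages of 8-port elements
```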

  23. Static Routing - Clos Network • All traffic of a connection follows the same path • r ≥ 2d-1 to be strictly non-blocking
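
The standard strict-sense non-blocking condition for a three-stage Clos network is r ≥ 2d-1. The sketch below assumes d is the number of external ports on each first-stage and third-stage switch and r is the number of middle-stage switches; the slide does not define the symbols, so this reading is an assumption.

```python
def min_middle_switches(d):
    """Smallest number of middle-stage switches for strict non-blocking.

    Worst case: the d-1 other inputs of the ingress switch and the d-1
    other outputs of the egress switch occupy 2(d-1) distinct middle
    switches, so one more is needed for the new connection.
    """
    return 2 * d - 1


def clos_strictly_nonblocking(d, r):
    """True if r middle switches suffice for d ports per edge switch."""
    return r >= min_middle_switches(d)


print(min_middle_switches(4))           # 7
print(clos_strictly_nonblocking(4, 6))  # False
print(clos_strictly_nonblocking(4, 7))  # True
```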

  24. Deflection Routing

  25. Basic Architectural Components: Forwarding Decision • [Figure: three forwarding decision blocks, each with a forwarding table, feed an interconnect followed by output scheduling; labels 1 to 3 mark the three components]

  26. ATM Switches: Direct Lookup • [Figure: the VCI is used directly as the memory address; the memory data is (Port, VCI)]

  27. Ethernet Switches: Hashing • [Figure: 48-bit search data is hashed by CRC-16 to a 16-bit memory address; entries that hash to the same address are kept in linked lists; the lookup returns a hit indication, the associated data, and a log2 N bit address] • Advantages • Simple • Expected lookup time can be small • Disadvantages • Non-deterministic lookup time • Inefficient use of memory
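
A hashed address table with chaining can be sketched as follows. Two assumptions: Python's `binascii.crc_hqx` (a 16-bit CRC-CCITT) stands in for the CRC-16 on the slide, and a dictionary of lists stands in for the 2^16-entry memory with linked lists. The class name `MacTable` is hypothetical.

```python
import binascii


class MacTable:
    """Hash table keyed by 48-bit MAC address, collisions kept in chains."""

    def __init__(self):
        # Sparse stand-in for a memory with 2**16 bucket heads.
        self.buckets = {}

    def _index(self, mac):
        # 16-bit CRC of the 6-byte address selects the bucket.
        return binascii.crc_hqx(mac, 0)

    def learn(self, mac, port):
        chain = self.buckets.setdefault(self._index(mac), [])
        for entry in chain:
            if entry[0] == mac:
                entry[1] = port  # address moved to another port
                return
        chain.append([mac, port])

    def lookup(self, mac):
        # Walk the chain; its length is what makes lookup time
        # non-deterministic.
        for key, port in self.buckets.get(self._index(mac), []):
            if key == mac:
                return port
        return None


table = MacTable()
table.learn(bytes.fromhex("0050569a0001"), 3)
print(table.lookup(bytes.fromhex("0050569a0001")))  # 3
print(table.lookup(bytes.fromhex("0050569a0002")))  # None
```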

  28. Per-Packet Processing in IP Routers 1. Accept packet arriving on an incoming link. 2. Look up packet destination address in the forwarding table to identify outgoing port(s). 3. Manipulate packet header, e.g., update header checksum. 4. Send packet to the outgoing port(s). 5. Classify and buffer packet in the queue. 6. Transmit packet onto outgoing link.

  29. Forwarding Engine: Next Hop Computation • [Figure: inside the IP router, the forwarding engine takes the destination address from the header of the incoming packet, looks it up in the forwarding table (destination, next hop), and returns the next hop link]

  30. Lookup and Forwarding Engine • [Figure: a packet consists of header and payload; the router passes the destination address as lookup data to the forwarding engine, which returns the outgoing port]

  31. Example Forwarding Table • IP prefix: 1-32 bits • Prefix length

  32. Longest Matching Prefix • Destination 128.9.16.14 matches multiple prefixes • Prefixes shown on the address line from 0 to 2^32-1: 128.9.176.0/24, 128.9.16.0/21, 128.9.172.0/21, 142.12.0.0/19, 65.0.0.0/8, 128.9.0.0/16 • Routing lookup: find the longest matching prefix (the most specific route) among all prefixes that match the destination address
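
The lookup on this slide can be reproduced with a linear scan over the prefixes, using Python's standard `ipaddress` module. This is only a reference sketch of the definition, not a fast lookup structure; `strict=False` is needed because 128.9.172.0/21 as written on the slide has host bits set.

```python
import ipaddress


def longest_prefix_match(address, prefixes):
    """Return the most specific prefix (as written) that contains address."""
    addr = ipaddress.ip_address(address)
    best, best_len = None, -1
    for prefix in prefixes:
        # strict=False tolerates prefixes whose host bits are not all zero.
        net = ipaddress.ip_network(prefix, strict=False)
        if addr in net and net.prefixlen > best_len:
            best, best_len = prefix, net.prefixlen
    return best


PREFIXES = [
    "128.9.176.0/24", "128.9.16.0/21", "128.9.172.0/21",
    "142.12.0.0/19", "65.0.0.0/8", "128.9.0.0/16",
]
print(longest_prefix_match("128.9.16.14", PREFIXES))  # 128.9.16.0/21
```

Both 128.9.0.0/16 and 128.9.16.0/21 contain 128.9.16.14; the /21 wins because it is longer.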

  33. Longest Prefix Matching Problem • 2-dimensional search • Prefix Length • Prefix Value • Performance Metrics • Lookup time • Storage space • Update time • Preprocessing time

  34. Required Lookup Rates • 1998-99: OC12c, 0.622 Gbps line rate, 1.94 Mpps with 40B packets • 1999-00: OC48c, 2.5 Gbps, 7.81 Mpps • 2000-01: OC192c, 10.0 Gbps, 31.25 Mpps • 2002-05: OC768c, 40.0 Gbps, 125 Mpps • 31.25 Mpps leaves about 33 ns per lookup • Memory access times: DRAM 50-80 ns, SRAM 5-10 ns
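
The packet rates on this slide follow from dividing the line rate by the size of a minimum 40-byte packet. A short calculation, with function names chosen for this sketch:

```python
def lookup_rate_mpps(line_rate_gbps, packet_bytes=40):
    """Packets per second (in millions) at a given line rate."""
    return line_rate_gbps * 1e9 / (packet_bytes * 8) / 1e6


def time_per_lookup_ns(mpps):
    """Time available for one lookup, in nanoseconds."""
    return 1e3 / mpps


for name, gbps in [("OC12c", 0.622), ("OC48c", 2.5),
                   ("OC192c", 10.0), ("OC768c", 40.0)]:
    rate = lookup_rate_mpps(gbps)
    print(name, round(rate, 2), "Mpps,",
          round(time_per_lookup_ns(rate), 1), "ns per lookup")
```

At 10 Gbps the budget works out to 32 ns, close to the figure of about 33 ns quoted on the slide. That is shorter than a DRAM access of 50-80 ns but longer than an SRAM access of 5-10 ns.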

  35. Size of Forwarding Table • [Plot: number of prefixes versus year, 1995 to 2000] • Growth of about 10,000 prefixes per year • Renewed growth due to multi-homing of enterprise networks

  36. Trees and Tries • Binary search tree: N entries, depth log2 N, branches on < and > comparisons • Binary search trie: branches on bit values 0 and 1; example entries 010 and 111
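
A binary trie that branches on one bit per level can be sketched with nested dictionaries. The class name, the bit-string input format, and the extra prefix 01 are assumptions for the example; the entries 010 and 111 come from the slide.

```python
class BinaryTrie:
    """One-bit-per-level trie supporting longest-prefix lookup."""

    def __init__(self):
        self.root = {}

    def insert(self, bits, next_hop):
        """bits is a string of '0'/'1' characters, e.g. '010'."""
        node = self.root
        for b in bits:
            node = node.setdefault(b, {})
        node["hop"] = next_hop

    def lookup(self, bits):
        """Return the next hop of the longest stored prefix of bits."""
        node = self.root
        best = node.get("hop")
        for b in bits:
            node = node.get(b)
            if node is None:
                break
            # Remember the deepest node seen so far that holds an entry.
            best = node.get("hop", best)
        return best


trie = BinaryTrie()
trie.insert("010", "A")
trie.insert("111", "B")
trie.insert("01", "C")      # extra, shorter prefix added for illustration
print(trie.lookup("0101"))  # A
print(trie.lookup("0110"))  # C
print(trie.lookup("1000"))  # None
```

The lookup visits at most one node per address bit, so its cost depends on the address length rather than on the number of entries.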

  37. Typical Profile of Forwarding Table • [Plot: number of prefixes versus prefix length] • Most prefixes are 24 bits or shorter

  38. Basic Architectural Components: Interconnect • [Figure: three forwarding decision blocks, each with a forwarding table, feed an interconnect followed by output scheduling; labels 1 to 3 mark the three components]

  39. First-Generation Routers • [Figure: central CPU and buffer memory; three line interfaces, each with DMA and MAC]

  40. Second-Generation Routers • [Figure: central CPU and buffer memory; three line cards, each with DMA, local buffer memory, and MAC]

  41. Third-Generation Routers • [Figure: a CPU card and line cards, each line card with local buffer memory and MAC]

  42. Switching Goals

  43. Circuit Switches • A switch that can handle N calls has N logical inputs and N logical outputs • N up to 200,000 • Moves 8-bit samples from an input to an output port • Samples have no headers • Destination of a sample depends on the time at which it arrives at the switch • In practice, input trunks are multiplexed • Multiplexed trunks carry frames, i.e., sets of samples • Extract samples from frame and, depending on position in frame, switch to output • Each incoming sample has to get to the right output line and the right slot in the output frame

  44. Blocking in Circuit Switches • Can’t find a path from input to output • Internal blocking • slot in output frame exists, but no path • Output blocking • no slot in output frame is available

  45. Time Division Switching • Time division switching interchanges sample position within a frame: time slot interchange (TSI)
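
A time slot interchange can be sketched as a buffer that is written in arrival order and read out in a different order. The `schedule` list (output slot to input slot) and the function name are assumptions for this illustration.

```python
def time_slot_interchange(frame, schedule):
    """Reorder the samples of one frame.

    frame:    samples in arrival order, one per input time slot.
    schedule: schedule[out_slot] is the input slot whose sample should
              appear in that output slot (hypothetical format).
    """
    buffer = list(frame)                      # write sequentially
    return [buffer[src] for src in schedule]  # read in scheduled order


print(time_slot_interchange(["a", "b", "c", "d"], [2, 0, 3, 1]))
# ['c', 'a', 'd', 'b']
```

Every sample is written once and read once per frame, so the buffer memory must complete two accesses per time slot.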

  46. Scaling Issues with TSI

  47. Space Division Switching • Each sample takes a different path through the switch, depending on its destination
