
Interconnection Networks


aaron


Presentation Transcript


  1. Interconnection Networks

  2. Overview • Physical Layer and Message Switching • Network Topologies • Metrics • Deadlock & Livelock • Routing Layer • The Messaging Layer

  3. Interconnection Networks • Fabric for scalable, multiprocessor architectures • Distinct from traditional networking architectures such as Internet Protocol (IP) based systems • We are interested in applications to large clusters as well as embedded systems

  4. CLUX: A Beowulf Cluster • Images from the Clux cluster at http://www.fyslab.hut.fi/clux/ (Figure labels: interconnection network cables, Myrinet switch)

  5. The Practical Problem From: Ambuj Goyal, “Computer Science Grand Challenge – Simplicity of Design,” Computing Research Association Conference on "Grand Research Challenges" in Computer Science and Engineering, June 2002

  6. Example: Embedded Devices • picoChip: http://www.picochip.com/ • PACT XPP Technologies: http://www.pactcorp.com/ • Issues • Execution performance • Power dissipation • Number of chip types • Size and form factor

  7. Physical Layer and Message Switching

  8. Messaging Hierarchy • Routing Layer (Where?): destination decisions, i.e., which output port • Switching Layer (When?): when data is forwarded • Physical Layer (How?): synchronization of data transfer • This organization is distinct from traditional networking implementations • Emphasis is on low latency communication • Only recently have standards been evolving • Infiniband: http://www.infinibandta.org/home

  9. The Physical Layer • Data is transmitted based on a hierarchical data structuring mechanism • Messages → packets → flits → phits • While flits and phits are fixed size, packets and data may be variable sized • Flit: flow control digit • Phit: physical unit (physical digit), the amount of data transferred across a channel in one cycle (Figure labels: data, packets, header, checksum)
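A minimal sketch of the message → packet → flit → phit decomposition described on this slide. The sizes, the function name `decompose`, and the rule of one header flit per packet are illustrative assumptions, not values from the presentation; the checksum is ignored.

```python
from math import ceil


def decompose(message_bytes, packet_payload_bytes=64, flit_bytes=8, phit_bytes=2):
    """Return (packets, flits, phits) needed to carry one message.

    Assumptions (hypothetical, for illustration only):
      - packets are variable sized, up to packet_payload_bytes of payload
      - each packet carries exactly one extra header flit
      - flits and phits are fixed size
    """
    # Split the message into per-packet payload sizes.
    payloads = []
    remaining = message_bytes
    while remaining > 0:
        chunk = min(remaining, packet_payload_bytes)
        payloads.append(chunk)
        remaining -= chunk

    # One header flit plus enough data flits to cover each payload.
    flits = sum(1 + ceil(p / flit_bytes) for p in payloads)
    # Every flit crosses the physical channel as a fixed number of phits.
    phits = flits * ceil(flit_bytes / phit_bytes)
    return len(payloads), flits, phits


if __name__ == "__main__":
    print(decompose(200))
```

With these assumed sizes, the last packet of a 200-byte message is shorter than the others, which is the "variable sized packets, fixed size flits" point made above.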

  10. Flow Control • Flow control digit: synchronized transfer of a unit of information • Based on buffer management • Asynchronous vs. synchronous flow control • Flow control occurs at multiple levels • message flow control • physical flow control • Mechanisms • Credit based flow control
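Credit-based flow control, named on this slide, can be sketched as follows. This is a toy single-link model under stated assumptions: the class name `CreditLink` and its methods are hypothetical, and credit return is modeled as instantaneous rather than as a message travelling back over the link.

```python
from collections import deque


class CreditLink:
    """Toy model of one link with credit-based flow control.

    The sender holds one credit per free buffer slot at the receiver.
    Sending a flit consumes a credit; draining a flit at the receiver
    returns one (instantly, which is a simplifying assumption).
    """

    def __init__(self, buffer_slots):
        self.credits = buffer_slots   # sender-side view of free receiver slots
        self.rx_buffer = deque()      # receiver-side flit buffer

    def send(self, flit):
        """Try to send a flit; return False if no credit is available."""
        if self.credits == 0:
            return False              # sender must stall: buffer would overflow
        self.credits -= 1
        self.rx_buffer.append(flit)
        return True

    def drain(self):
        """Receiver forwards one flit onward and returns a credit."""
        if not self.rx_buffer:
            return None
        flit = self.rx_buffer.popleft()
        self.credits += 1
        return flit


if __name__ == "__main__":
    link = CreditLink(buffer_slots=2)
    print(link.send("f0"), link.send("f1"), link.send("f2"))
```

The design point is that the sender never transmits into a full buffer, so flits are never dropped for lack of space.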

  11. Switching Layer • Comprised of three sets of techniques • switching techniques • flow control • buffer management • Organization and operation of routers are largely determined by the switching layer • Connection Oriented vs. Connectionless communication

  12. Generic Router Architecture Wire delay Switching delay Routing delay

  13. Virtual Channels • Each virtual channel is a pair of unidirectional channels • Independently managed buffers multiplexed over the physical channel • De-couples buffers from physical channels • Originally introduced to break cyclic dependencies • Improves performance through reduction of blocking delay • Virtual lanes vs. virtual channels • As the number of virtual channels increases, the increased channel multiplexing has two effects • decrease in header delay • increase in average data flit delay • Impact on router performance • switch complexity

  14. Circuit Switching • Hardware path setup by a routing header or probe • End-to-end acknowledgment initiates transfer at full hardware bandwidth • Source routing vs. distributed routing • System is limited by signaling rate along the circuits (Timing diagram: link busy vs. time for header probe, acknowledgment, and data, showing t_r, t_s, t_setup, t_data)

  15. Packet Switching • Blocking delays in circuit switching avoided in packet switched networks → full link utilization in the presence of data • Increased storage requirements at the nodes • Packetization and in-order delivery requirements • Buffering • use of local processor memory • central queues (Timing diagram: link busy vs. time for message header and message data, showing t_r, t_packet)

  16. Virtual Cut-Through • Messages cut-through to the next router when feasible • In the absence of blocking, messages are pipelined • pipeline cycle time is the larger of intra-router and inter-router flow control delays • When the header is blocked, the complete message is buffered • High load behavior approaches that of packet switching (Timing diagram: packet header and message cutting through the router, link busy vs. time, showing t_r, t_s, t_w, t_blocking)

  17. Wormhole Switching • Messages are pipelined, but buffer space is on the order of a few flits • Small buffers + message pipelining → small compact buffers • Supports variable sized messages • Messages cannot be interleaved over a channel: routing information is only associated with the header • Base Latency is equivalent to that of virtual cut-through (Timing diagram: header flit and single flit, link busy vs. time, showing t_r, t_s, t_wormhole)
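The timing symbols on the four switching slides (t_r routing delay, t_s switching delay, t_w wire delay) suggest a no-contention base latency model. The sketch below follows the commonly used textbook form of that model as an assumption; the presentation itself does not state the formulas. The parameter names are hypothetical: `D` is the number of hops, `L` the message length in bits, `W` the channel width in bits, and `t_cycle` the per-phit transfer time on an established circuit.

```python
from math import ceil


def circuit_latency(D, L, W, t_r, t_s, t_w, t_cycle):
    """Circuit switching: path setup followed by an uninterrupted data transfer.

    Assumes the probe is routed at every hop and the acknowledgment
    retraces the path, hence the factor 2 on the switch and wire delays.
    """
    t_setup = D * (t_r + 2 * (t_s + t_w))
    t_data = t_cycle * ceil(L / W)
    return t_setup + t_data


def packet_latency(D, L, W, t_r, t_s, t_w):
    """Packet (store-and-forward) switching.

    The whole packet, plus one header phit, is received at each hop
    before it is forwarded, so the transfer time is paid D times.
    """
    return D * (t_r + (t_s + t_w) * ceil((L + W) / W))


def cut_through_latency(D, L, W, t_r, t_s, t_w):
    """Virtual cut-through without blocking.

    Only the header pays the per-hop cost; the body is pipelined at a
    cycle time equal to the larger of the switch and wire delays.
    """
    return D * (t_r + t_s + t_w) + max(t_s, t_w) * ceil(L / W)


# The slides state that wormhole base latency equals that of virtual cut-through.
wormhole_latency = cut_through_latency


if __name__ == "__main__":
    args = dict(D=3, L=128, W=16, t_r=1, t_s=1, t_w=1)
    print(circuit_latency(t_cycle=1, **args))
    print(packet_latency(**args))
    print(cut_through_latency(**args))
```

Under this model the distance term multiplies the message length only for packet switching, which is why pipelined techniques are much less sensitive to hop count at low load.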

  18. Comparison of Switching Techniques • Packet switching and virtual cut-through • consume network bandwidth proportional to network load • predictable demands • VCT behaves like wormhole at low loads and like packet switching at high loads • link level error control for packet switching • Wormhole switching • provides low latency • lower saturation point • higher variance of message latency than packet or VCT switching • Virtual channels • blocking delay vs. data delay • router flow control latency • Optimistic vs. conservative flow control

  19. Saturation

  20. Network Topologies

  21. Motivation • Crossbars provide full connectivity among ports, but cost and complexity grow quadratically in the number of ports • Buses provide minimal connectivity and do not provide scalable performance • Network topologies span a spectrum of solutions that trade-off cost, performance (latency & bandwidth), reliability, and implementation complexity

  22. Direct Networks • Fixed degree • Modular • Topologies • Meshes • Multidimensional tori • Special case of tori – the binary hypercube
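A small sketch of how neighbors are found in the direct topologies listed on this slide. The function names are hypothetical. Nodes of a k-ary n-dimensional torus are written as coordinate tuples, and hypercube nodes as n-bit integers.

```python
def hypercube_neighbors(node, n):
    """Neighbors of a node in a binary n-cube: flip one address bit at a time."""
    return [node ^ (1 << d) for d in range(n)]


def torus_neighbors(coord, k):
    """Neighbors of a node in a k-ary torus with len(coord) dimensions.

    Each dimension contributes a -1 and a +1 neighbor, with wraparound
    modulo k. For k = 2 the two coincide, which gives the hypercube.
    """
    neighbors = []
    for d, c in enumerate(coord):
        for step in (-1, 1):
            nxt = list(coord)
            nxt[d] = (c + step) % k
            neighbors.append(tuple(nxt))
    return neighbors


if __name__ == "__main__":
    print(hypercube_neighbors(0b0000, 4))
    print(torus_neighbors((0, 0), 4))
```

The node degree is fixed by the topology (n for the hypercube, 2n for a torus with k > 2) and does not grow with k, which is the "fixed degree" property noted above for meshes and tori.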

  23. Indirect Networks • Indirect networks • uniform base latency • centralized or distributed control • Engineering approximations to direct networks • Multistage network (forward and backward connections) • Fat tree network: bandwidth increases as you go up the tree (Figure: multistage network with ports labeled 0000 to 1111, and a fat tree)

  24. Specific MINs • Switch sizes and interstage interconnect establish distinct MINs • Majority of interesting MINs have been shown to be topologically equivalent (Figure: 8-port multistage networks with ports labeled 000 to 111)

  25. Metrics

  26. Evaluation Metrics • Latency • Message transit time • Determined by switching technique and traffic patterns • Node degree (channel width) • Number of input/output channels • This metric is determined by packaging constraints • pin/wiring constraints • Diameter • Path diversity • A measure of reliability

  27. Evaluation Metrics bisection • Bisection bandwidth • This is minimum bandwidth across any bisection of the network • Bisection bandwidth is a limiting attribute of performance
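Two of the metrics above have simple closed forms for k-ary n-cubes. The sketch below assumes bidirectional links, counts bisection width in links rather than in bits per second, and assumes an even k for the torus; the function names are hypothetical.

```python
def torus_diameter(k, n):
    """Longest shortest path in a k-ary n-cube (torus): at most k // 2 hops per dimension."""
    return n * (k // 2)


def mesh_diameter(k, n):
    """Longest shortest path in a k-ary n-dimensional mesh (no wraparound links)."""
    return n * (k - 1)


def torus_bisection_width(k, n):
    """Bidirectional links cut when a torus is split in half across one dimension.

    Assumes even k. Each of the k**(n-1) rings along that dimension
    is cut in two places.
    """
    return 2 * k ** (n - 1)


def mesh_bisection_width(k, n):
    """Same cut for a mesh: each of the k**(n-1) lines is cut once (even k assumed)."""
    return k ** (n - 1)


if __name__ == "__main__":
    for k, n in [(32, 2), (10, 3)]:
        print(k, n, k ** n, torus_diameter(k, n), torus_bisection_width(k, n))
```

For roughly equal node counts (1024 vs. 1000), the 3-dimensional network has the smaller diameter and the larger bisection width, at the cost of a higher node degree.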

  28. Constant Resource Analysis: Bisection Width

  29. Constant Resource Analysis: Pin out

  30. Latency Under Contention • 32-ary 2-cube vs. 10-ary 3-cube

  31. Deadlock and Livelock

  32. Deadlock and Livelock • Deadlock freedom can be ensured by enforcing constraints • For example, following dimension order routing in 2D meshes (Figure labels: router, virtual channel)
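Dimension order routing in a 2D mesh, mentioned on this slide, can be sketched as follows. The function name `xy_route` and the coordinate representation are illustrative assumptions. The offset in X is fully corrected before any hop in Y, so a packet never turns from the Y dimension back into X.

```python
def xy_route(src, dst):
    """Return the list of nodes visited from src to dst in a 2D mesh.

    Nodes are (x, y) tuples. The X offset is corrected first, then the
    Y offset, which is the routing constraint that rules out cyclic
    channel dependencies in a mesh.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                 # phase 1: travel along X only
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                 # phase 2: travel along Y only
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    print(xy_route((0, 0), (2, 1)))
```

The route is deterministic: it depends only on the source and destination pair, not on network load.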

  33. Occurrence of Deadlock • Deadlock is caused by dependencies between buffers (Figure: four messages, numbered 1 to 4, blocked in a cycle)

  34. Deadlock in a Ring Network

  35. Deadlock Avoidance: Principle • Deadlock is caused by dependencies between buffers

  36. Routing Constraints on Virtual Channels • Add multiple virtual channels to each physical channel • Place routing restrictions between virtual channels

  37. Break Cycles

  38. Channel Dependence Graph
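The ring deadlock, the virtual channel restriction, and the channel dependence graph from the preceding slides can be tied together in one sketch. It builds the dependence graph of a unidirectional ring and checks it for cycles. The dateline rule used here (switch from virtual channel 0 to virtual channel 1 after crossing the link that leaves node n-1) is one common scheme, chosen as an assumption; the slides do not say which restriction they use. The function names are hypothetical.

```python
def ring_cdg(n, use_dateline):
    """Channel dependence graph of a unidirectional n-node ring.

    A channel is (source_node, virtual_channel). An edge a -> b means
    some message holds channel a while requesting channel b.
    """
    deps = {}
    for s in range(n):
        for d in range(n):
            if s == d:
                continue
            vc, prev, node = 0, None, s
            while node != d:
                chan = (node, vc)
                deps.setdefault(chan, set())
                if prev is not None:
                    deps[prev].add(chan)
                prev = chan
                # Assumed restriction: after the wraparound link, use VC 1.
                if use_dateline and node == n - 1:
                    vc = 1
                node = (node + 1) % n
    return deps


def has_cycle(deps):
    """Depth-first search for a cycle in the dependence graph."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {c: WHITE for c in deps}

    def visit(c):
        color[c] = GREY
        for nxt in deps[c]:
            if color[nxt] == GREY:
                return True            # back edge: cyclic dependency
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[c] = BLACK
        return False

    return any(color[c] == WHITE and visit(c) for c in deps)


if __name__ == "__main__":
    print(has_cycle(ring_cdg(4, use_dateline=False)))
    print(has_cycle(ring_cdg(4, use_dateline=True)))
```

Without the restriction, the dependencies follow the ring all the way around and form a cycle. With it, the chain ends on virtual channel 1 before reaching the wraparound link again, so the graph is acyclic.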

  39. Routing Layer

  40. Routing Protocols • Routing Algorithms • Number of Destinations: Unicast Routing, Multicast Routing • Routing Decisions: Centralized Routing, Source Routing, Distributed Routing, Multiphase Routing • Implementation: Table Lookup, Finite State Machine • Adaptivity: Deterministic Routing, Adaptive Routing • Progressiveness: Progressive, Backtracking • Minimality: Profitable, Misrouting • Number of Paths: Complete, Partial • Source: J. Duato, S. Yalamanchili, and L. Ni, “Interconnection Networks,” Morgan Kaufmann 2003.

  41. Key Routing Categories • Deterministic • The path is fixed by the source destination pair • Source Routing • Path is looked up prior to message injection • May differ each time the network and NIs are initialized • Adaptive routing • Path is determined by run-time network conditions • Unicast • Single source to single destination • Multicast • Single source to multiple destinations

  42. Generic Router Architecture (Figure labels: physical input channels, input queues (virtual channels), mux, switch, address decoder, output queues (virtual channels), mux, physical output channels, from/to local processor)

  43. Software Layer

  44. The Message Layer • Message layer background • Cluster computers • Myrinet SAN • Design properties • End-to-End communication path • Injection • Network transmission • Ejection • Overall performance

  45. Cluster Computers • Cost-effective alternative to supercomputers • Number of commodity workstations • Specialized network hardware and software • Result: Large pool of host processors (Figure: four nodes, each with CPU, memory, I/O bus, and network interface, attached to a network) Courtesy of C. Ulmer

  46. Myrinet • Descendant of Caltech Mosaic project • Wormhole network • Source routing • High-speed, Ultra-reliable network • Configurable topology: Switches, NICs, and cables (Figure: CPU and NI pairs connected through crossbar switches) Courtesy of C. Ulmer

  47. Myrinet Switches & Links • 16 Port crossbar chip • 2.0+2.0 Gbps per port • ~300 ns Latency • Line card • 8 Network ports • 8 Backplane ports • Backplane cabinet • 17 line card slots • 128 Hosts (Figure: line cards, each with a 16 port Xbar serving 8 hosts over fiber, connected to the backplane) Courtesy of C. Ulmer

  48. Myrinet NI Architecture • Custom RISC CPU • 33-200MHz • Big endian • gcc is available • SRAM • 1-9MB • No CPU cache • DMA Engines • PCI / SRAM • SRAM / Tx • Rx / SRAM (Figure: network interface card with LANai processor: RISC CPU, SRAM, host DMA to PCI, SAN DMA to Tx and Rx) Courtesy of C. Ulmer

  49. Message Layers Courtesy of C. Ulmer

  50. Cluster Message Layer • “Message Layer” Communication Software • Message layers are enabling technology for clusters • Enable cluster to function as single image multiprocessor system • Responsible for transferring messages between resources • Hide hardware details from end users (Figure: a pool of CPUs joined by the message layer) Courtesy of C. Ulmer
