Multiprocessor System Interconnects

olenb

Presentation Transcript


  1. Multiprocessor System Interconnects • Allows fast communication among processors and shared memory, I/O, and peripheral devices • IPMN: processors to shared memory • PION: processors to I/O and peripheral devices • IPCN: processors to processors EENG-630

  2. Network Characteristics • Timing: synchronous (global clock) or asynchronous (handshaking) • Switching: circuit (path granted) or packet (compete for path) • Control: centralized (global controller) or distributed (local devices) • Topology: bus system, crossbar, or multistage

  4. Hierarchical Bus System • Consists of a hierarchy of buses connecting various components • Each bus formed with signal, control, and power lines • Different buses perform different interconnection functions

  6. Local Bus • Implemented on printed-circuit boards • Provides a common communication path among components on the same board • Examples: memory bus on a memory board, data bus on an I/O board • Consists of signal and utility lines

  7. Backplane and I/O Bus • Backplane: printed circuit on which many connectors are used to plug in functional boards • System bus on the backplane provides a common communication path among all plug-in boards • I/O bus: made of coaxial cables with taps connecting disks, printers, and tape units to a processor through an I/O controller

  8. Hierarchical Cache/Bus Architecture • Leaf nodes are processors and their private caches • Divided into clusters, each connected by a cluster bus • Intercluster bus connects the clusters • Second-level caches are used between the cluster buses and the intercluster bus • Each cluster operates as a single-bus system • Most memory requests are satisfied at the lower-level caches

  9. Hierarchical Cache/Bus Architecture (cont.) • Second-level caches are used to extend consistency from each cluster to the upper level • Upper-level caches form another level of shared memory • Bridges between clusters allow transactions initiated on a local bus to be completed on a remote bus

  11. Network Stages • Single stage: recirculating network • Cheaper, but more passes needed • Crossbar switch and multiport memory organization • Multistage: more than one stage of switch boxes • Should connect any input to any output • May have the same pattern at each stage • Examples: Omega, Flip, and Baseline

  12. Blocking vs. Nonblocking • Blocking: simultaneous connections of some multiple I/O pairs may result in conflicts • Examples: Omega, Baseline, Banyan • Most multistage networks are blocking • May need multiple passes • Nonblocking: can perform all possible connections by rearranging its connections • Connection path can always be established • Examples: Benes and Clos • May require more stages

  13. Crossbar Networks • Every input port can be connected to a free output port without blocking • Single-stage network with unary switches • Requires n × m crosspoint switches • If n = m, can implement all n! permutations without blocking

  14. Crosspoint Switch Design • Only one switch per column can be connected at a time, so extra hardware is needed to resolve conflicts • Each crosspoint has the complexity of a bus • Requires extensive hardware, which limits size to n ≤ 16 • Multiple switches per row can be connected
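
The one-switch-per-column rule can be illustrated with a minimal sketch. It assumes columns correspond to memory modules and rows to processors, and it uses a fixed-priority policy (lowest processor id wins); the slides name neither the policy nor the function `arbitrate_crossbar`, which is hypothetical.

```python
def arbitrate_crossbar(requests):
    """Grant at most one processor per memory-module column in a cycle.

    requests: dict mapping processor id -> requested memory module id.
    Returns (granted, deferred): granted maps module -> winning processor,
    deferred lists processors that must retry in a later cycle.
    """
    granted = {}
    deferred = []
    # Fixed priority is an assumption: lower processor ids are served first.
    for proc in sorted(requests):
        module = requests[proc]
        if module not in granted:
            granted[module] = proc   # close this crosspoint
        else:
            deferred.append(proc)    # column already in use this cycle
    return granted, deferred


granted, deferred = arbitrate_crossbar({0: 2, 1: 2, 2: 0, 3: 1})
print(granted, deferred)
```

Here processors 0 and 1 both request module 2, so only one of them is granted the column and the other is deferred.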

  16. Crossbar Limitations • Can deliver at most n words to at most n processors in each memory cycle • Memory modules can be n-way interleaved to allow overlapped access • Offers the highest bandwidth: n data transfers per cycle • Cost-effective only for small multiprocessors with a few processors accessing a few memory modules
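
As a rough illustration of n-way interleaving, the sketch below assumes low-order interleaving, where consecutive word addresses fall in consecutive modules. The slides do not say which interleaving scheme is meant, and `interleaved_location` is a hypothetical helper.

```python
def interleaved_location(addr, n_modules):
    """Map a word address to (module, offset) under low-order interleaving.

    Low-order interleaving is an assumption made for this sketch.
    """
    return addr % n_modules, addr // n_modules


# Consecutive addresses land in different modules, so accesses can overlap.
for addr in range(6):
    print(addr, interleaved_location(addr, 4))
```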

  17. Multiport Memory • Moves all crosspoint arbitration and switching functions to the memory controller • Memory module becomes more expensive • Only one of n processor requests is honored at a time • A compromise between the low-cost, low-performance bus system and the high-cost, high-performance crossbar • A contention bus is time-shared, whereas a multiport memory must resolve conflicts among processors

  19. Multiport Limitations • Expensive when m and n become large • Typically, n = 4 processors, m = 16 modules • Not scalable • Needs large number of interconnection cables and connectors when configuration becomes large

  20. Routing in Omega Networks • With n inputs, there are log2 n stages • Route by destination code • If the ith high-order bit is 0, take the upper output at stage i; if it is 1, take the lower output • Can have conflicts: it is a blocking network • May need several passes • Can implement n^(n/2) permutations in one pass, out of n! possible permutations
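
The destination-code rule can be sketched as a small simulation. It assumes 2×2 switches, a perfect shuffle in front of every stage, and n a power of two; the function names `omega_route` and `has_conflict` are illustrative, not from the slides.

```python
def omega_route(src, dst, n):
    """Return the output line reached after each stage when routing src -> dst.

    Assumes n is a power of two and each stage is a perfect shuffle
    followed by a column of 2x2 switches.
    """
    k = n.bit_length() - 1           # number of stages, log2(n)
    pos = src
    path = []
    for stage in range(k):
        # Perfect shuffle: rotate the k-bit line number left by one bit.
        pos = ((pos << 1) | (pos >> (k - 1))) & (n - 1)
        # Stage i examines the ith high-order destination bit:
        # 0 selects the upper switch output, 1 the lower one.
        bit = (dst >> (k - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        path.append(pos)
    return path


def has_conflict(perm):
    """True if two messages of the permutation need the same line at some stage."""
    n = len(perm)
    paths = [omega_route(s, d, n) for s, d in enumerate(perm)]
    stages = n.bit_length() - 1
    return any(len({p[t] for p in paths}) < n for t in range(stages))


print(omega_route(0b010, 0b110, 8))
print(has_conflict(list(range(8))))
print(has_conflict([0, 4, 2, 3, 1, 5, 6, 7]))
```

The identity permutation passes in one pass, while the second permutation blocks because inputs 0 and 4 both need the same first-stage output.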

  23. Routing in Butterfly Networks • Constructed with crossbar switches • With m × m crossbar switches: number of stages = log_m n, number of switches per stage = n/m • No broadcast connections allowed • Larger Butterfly networks can be constructed modularly by using more stages
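
The sizing rule above is easy to check numerically. This is a minimal sketch that assumes n is an exact power of m; `butterfly_size` is a hypothetical helper name.

```python
def butterfly_size(n, m):
    """Return (stages, switches_per_stage) for n inputs built from m x m switches."""
    stages, ports = 0, 1
    while ports < n:        # integer loop avoids floating-point log errors
        ports *= m
        stages += 1
    if ports != n:
        raise ValueError("n must be a power of m")
    return stages, n // m


for n in (64, 512):
    stages, per_stage = butterfly_size(n, 8)
    print(n, stages, per_stage, stages * per_stage)
```

Going from 64 to 512 inputs with 8×8 switches adds one stage, which is the modular growth the slide refers to.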

  25. Hot-Spot Problem • Occurs when network traffic is nonuniform • A memory module is accessed excessively by many processors at the same time • Degrades network performance • A combining mechanism can merge multiple requests to the same location • The atomic read-modify-write primitive Fetch&Add(x, e) performs parallel memory updates using the combining network

  26. Fetch&Add(x, e) • Implements an N-way synchronization with a complexity independent of N • x is an integer variable in shared memory • e is an integer increment • Executed by a single processor: Fetch&Add(x, e) { temp ← x; x ← temp + e; return temp }

  27. Fetch&Add(x, e) (cont.) • If N processors attempt Fetch&Add on the same x simultaneously, memory is updated only once, following a serialization principle • The sum of the N increments is produced in any arbitrary serialization of the requests • The values returned to the N requests are all unique • Net result is similar to a sequential execution of N Fetch&Adds
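
The combining behavior can be sketched in software. This is only a model under stated assumptions: the serialization order is taken to be the order of the `increments` list, memory is a plain dict, and both function names are hypothetical.

```python
def fetch_and_add(memory, addr, e):
    """Single-processor Fetch&Add: return the old value, then add e."""
    temp = memory[addr]
    memory[addr] = temp + e
    return temp


def combined_fetch_and_add(memory, addr, increments):
    """Model N simultaneous Fetch&Adds combined into one memory update.

    The serialization order is assumed to be the list order.
    """
    running = memory[addr]
    returned = []
    for e in increments:
        returned.append(running)   # value this request would have seen
        running += e
    memory[addr] = running         # memory is written only once
    return returned


mem = {"x": 10}
print(combined_fetch_and_add(mem, "x", [1, 2, 3]), mem["x"])
```

The returned values and the final contents match what three sequential calls to `fetch_and_add` would produce in the same order.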

  29. Message-Passing Mechanisms • Store-and-forward routing • Wormhole routing • Virtual channels • Deadlock situations • Deterministic and adaptive routing algorithms

  30. Message Formats • Message: logical unit for internode communication • Packet: basic unit containing the destination address for routing • Packets carry a sequence number for reassembly • Flits: flow control digits of packets • Store-and-forward routing operates on packets • Wormhole routing operates on flits

  31. Packets and Flits • Header flits contain routing information and the sequence number • Flit length is affected by network size • Packet length is determined by the routing scheme and network implementation • Both lengths also depend on channel bandwidth, router design, network traffic, etc.

  32. Message Format

  33. Store-and-Forward Routing • Packets are the basic unit • Each node has a packet buffer • When a packet reaches an intermediate node, it is first stored in the buffer, then sent on when the output channel and the next node's buffer are ready • Latency is directly proportional to the distance between source and destination

  34. Wormhole Routing • Flits are the basic unit • Transmission passes through a sequence of routers • All flits of the same packet are pipelined • All data flits follow the header flit • Whole packets can be interleaved, but flits of different packets cannot be mixed • Latency is almost independent of distance

  36. Asynchronous Pipelining • Pipelining of flits is asynchronous • A 1-bit ready/request (R/A) line is used between adjacent routers • When the receiving router D is ready to receive a flit, R/A = 0 • When the sending router S is ready, it sets R/A = 1 and transmits flit i • While the flit is being received, R/A stays high • The cycle repeats for the remaining flits

  37. Handshaking Protocol

  38. Latency Analysis • L = packet length (bits), W = channel bandwidth (bits/s) • D = distance between source and destination, F = flit length (bits) • Store-and-forward: T_SF = (L/W)(D + 1) • Wormhole: T_WH = L/W + (F/W) × D • Store-and-forward: controlled by software • Wormhole: controlled by hardware
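
A short numeric sketch of the two latency formulas makes the difference concrete. The parameter values below are assumptions chosen for illustration, not figures from the slides.

```python
def store_and_forward_latency(L, W, D):
    """T_SF = (L / W) * (D + 1): the whole packet is retransmitted at every hop."""
    return (L / W) * (D + 1)


def wormhole_latency(L, W, F, D):
    """T_WH = L / W + (F / W) * D: only the header flit pays the per-hop cost."""
    return L / W + (F / W) * D


# Illustrative values (assumed): 1024-bit packet, 16-bit flits, 2**20 bits/s channel.
L, F, W = 1024, 16, 2**20
for D in (1, 4, 16):
    t_sf = store_and_forward_latency(L, W, D)
    t_wh = wormhole_latency(L, W, F, D)
    print(D, t_sf, t_wh)
```

Because F is much smaller than L, the wormhole term (F/W) × D grows slowly with D, which is why wormhole latency is described as almost independent of distance.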

  40. Virtual Channels • A virtual channel is a logical link between two nodes, formed by a flit buffer in the source, a physical channel between them, and a flit buffer in the receiver • The physical channel is time-shared by the virtual channels • Sharing of the physical channel by a set of virtual channels is conducted by time-multiplexing on a flit-by-flit basis
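
Flit-by-flit time-multiplexing can be sketched as follows. Round-robin service is an assumption, since the slides do not name a scheduling policy, and `multiplex_flits` is a hypothetical helper.

```python
from collections import deque


def multiplex_flits(virtual_channels):
    """Interleave flits from several virtual channels onto one physical channel.

    virtual_channels: list of flit lists, one per virtual channel.
    Returns the transmission order as (virtual_channel_id, flit) pairs.
    Round-robin order is an assumption made for this sketch.
    """
    queues = [deque(vc) for vc in virtual_channels]
    schedule = []
    while any(queues):                      # an empty deque is falsy
        for vc_id, queue in enumerate(queues):
            if queue:
                schedule.append((vc_id, queue.popleft()))
    return schedule


print(multiplex_flits([["a0", "a1", "a2"], ["b0"], ["c0", "c1"]]))
```

Each virtual channel keeps its own flit order, while the physical channel carries one flit per time slot.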

  43. Deadlock Avoidance • Channels may be unidirectional or bidirectional • Combining two unidirectional channels into one bidirectional channel will increase utilization rate and double channel bandwidth • Arbitration is more complex for bidirectional channels • High-speed multiplexing is required for implementing a large number of virtual channels

  45. Packet Collision Resolution • Moving a flit between adjacent nodes requires: a source buffer holding the flit, a channel being allocated, and a receiver buffer accepting the flit • Arbitration decisions: which packet will be allocated the channel, and what to do with the rejected packet

  46. Buffering with Virtual Cut-Through Routing • Rejected packet is temporarily stored in a buffer • Requires a buffer large enough to hold an entire packet • Does not waste allocated resources • Best case: behaves like wormhole routing • Worst case: behaves like store-and-forward

  47. Blocking and Detour Policies • Blocking: block the rejected packet rather than abandon it; economical, but leaves resources idle • Discard: drop the blocked packet; wastes the resources already used • Detour: misroute the packet to a detour channel; flexible, but wastes channel resources and may cause livelock

  49. Dimension-Order Routing • Deterministic: path completely determined by source and destination • Adaptive: path depends on network conditions • Dimension-order: requires the selection of successive channels to follow a specific order based on the dimensions of the network • Examples: X-Y routing (2D mesh), E-cube routing (hypercube)
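
Both examples of dimension-order routing can be sketched in a few lines. The X-before-Y order for the mesh and the lowest-dimension-first order for the hypercube are assumptions of this sketch, and the function names are illustrative.

```python
def xy_route(src, dst):
    """X-Y routing on a 2D mesh: correct the X coordinate fully, then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def ecube_route(src, dst, n_dims):
    """E-cube routing on an n_dims-dimensional hypercube.

    Differing address bits are corrected from dimension 0 upward
    (the traversal order is an assumption of this sketch).
    """
    node = src
    path = [node]
    for dim in range(n_dims):
        if ((node ^ dst) >> dim) & 1:
            node ^= 1 << dim          # cross the link in this dimension
            path.append(node)
    return path


print(xy_route((0, 0), (2, 1)))
print([format(v, "04b") for v in ecube_route(0b0110, 0b1101, 4)])
```

Since each route depends only on the source and destination addresses, both schemes are deterministic in the sense used above.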
