
Protocols and software for exploiting Myrinet clusters



Presentation Transcript


  1. Protocols and software for exploiting Myrinet clusters. Congduc Pham and the main contributors P. Geoffray, L. Prylli, B. Tourancheau, R. Westrelin

  2. Parallel machines and clusters (figure labels: Cplant; standalone workstation)

  3. Pros for clusters
     • Large supercomputers are expensive and have a short useful life span.
     • The performance of workstations and PCs is improving rapidly.
     • Communication bandwidth between workstations is increasing as new networking technologies and protocols are deployed in LANs and WANs.
     • Workstation clusters are easier to integrate into existing networks than special-purpose parallel computers.
     • Using clusters of workstations as a distributed computing resource is very cost-effective: the system can be grown or updated incrementally.

  4. No polemical discussion, just a statement… (figure from R. Buyya; labels: Mainframe, PC, Workstation, Mini Computer, 1984, Vector Supercomputer; GigaEthernet, Giganet, SCI, Myrinet, …)

  5. The Myrinet technology
     • Switch
       • full crossbar
       • wormhole source routing
       • small latency
     • Network interface
       • embedded RISC processor
       • programmable
       • local memory
       • several DMA engines
     • Current specifications
       • up to 200 MHz processor
       • up to 8 MB local memory
       • 64-bit/66 MHz PCI bus (528 MB/s peak)
       • 250 MB/s full-duplex links
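The 528 MB/s PCI peak quoted on the slide follows directly from the bus width and clock. The short sketch below reproduces that arithmetic and compares it with the 250 MB/s link rate; the function name is illustrative, and the "one word per clock cycle, no protocol overhead" model is the usual simplifying assumption behind a peak figure, not a measured value.

```python
def bus_peak_mb_per_s(width_bits: int, clock_mhz: int) -> int:
    """Peak throughput of a parallel bus that moves one word per clock cycle.

    Assumes no arbitration or protocol overhead, so this is an upper bound.
    """
    bytes_per_cycle = width_bits // 8
    # (bytes per cycle) * (million cycles per second) = MB/s (decimal megabytes)
    return bytes_per_cycle * clock_mhz


pci_peak = bus_peak_mb_per_s(64, 66)  # 64-bit/66 MHz PCI, figures from the slide
link_rate = 250                       # MB/s per Myrinet link, from the slide

print(f"PCI peak: {pci_peak} MB/s")
print(f"PCI peak / link rate: {pci_peak / link_rate:.2f}")
```

Under this model the host bus peak is roughly twice the rate of one link, which suggests that on such hardware the software path, rather than the bus, is the more likely limit.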

  6. The raw performance is here, but…
     • traditional communication software fails to deliver the hardware performance to the applications
     (figure labels: Myrinet, 200 mph; traditional communication layers, 40 mph and 35 mph; optimized communication layers, 180 mph and 175 mph)

  7. Going faster by taking shortcuts

  8. Our communication architecture
     • Provides a complete suite for high-performance communications, with a focus on Myrinet-based clusters
     • Viewed as layers, but bypasses the OS as much as possible
     • Programmable NICs break the traditional spatial distribution of tasks
     (layer diagram labels: MPI-BIP, BIP, BIP-SMP, Myrinet physical layer)

  9. BIP, the lowest protocol level (layer diagram labels: MPI-BIP, BIP, BIP-SMP, Myrinet)
     • Basic Interface for Parallelism
       • very basic API
       • provides a library, a kernel module and an MCP
       • definitely not for the end user
     • Optimizations for
       • latency
       • maximum throughput
       • the throughput increase
     • The implementation performs
       • reduction of the data critical path
       • distinction between small and large messages
       • burst or write combining for host → NIC transfers
       • optimal cache usage
       • cache snooping for NIC → host transfers (monitoring of the PCI bus)
       • buffer alignment
       • optimal fragment size…
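One way to picture why a protocol distinguishes small from large messages is a two-path cost model: one path with a low fixed setup cost but low bandwidth, and one with a higher setup cost but high bandwidth. The sketch below computes where the two paths cross over. All numbers and names are hypothetical and chosen only for illustration; they are not BIP measurements or BIP parameters.

```python
# Hypothetical path parameters (NOT measured BIP values).
LOW_SETUP_PATH = {"setup_us": 5.0, "bandwidth_mb_s": 40.0}        # cheap to start, slow per byte
HIGH_BANDWIDTH_PATH = {"setup_us": 20.0, "bandwidth_mb_s": 120.0}  # costly to start, fast per byte


def transfer_time_us(size_bytes: float, path: dict) -> float:
    """Linear cost model: fixed setup plus size / bandwidth (1 MB/s == 1 byte/us)."""
    return path["setup_us"] + size_bytes / path["bandwidth_mb_s"]


def crossover_bytes(low: dict, high: dict) -> float:
    """Message size at which both paths take the same time.

    Solves low.setup + n / low.bw == high.setup + n / high.bw for n.
    """
    setup_gap = high["setup_us"] - low["setup_us"]
    per_byte_gap = 1.0 / low["bandwidth_mb_s"] - 1.0 / high["bandwidth_mb_s"]
    return setup_gap / per_byte_gap


def pick_path(size_bytes: float) -> str:
    """Choose the faster path for a given message size under this model."""
    t_low = transfer_time_us(size_bytes, LOW_SETUP_PATH)
    t_high = transfer_time_us(size_bytes, HIGH_BANDWIDTH_PATH)
    return "low-setup" if t_low <= t_high else "high-bandwidth"


print(f"crossover near {crossover_bytes(LOW_SETUP_PATH, HIGH_BANDWIDTH_PATH):.0f} bytes")
print(pick_path(64), pick_path(65536))
```

Below the crossover the fixed cost dominates, so the cheap-to-start path wins; above it the per-byte cost dominates. A real implementation would tune the threshold empirically rather than derive it from a linear model.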

  10. BIP, small message strategy
     • Avoids handshakes between the host and the NIC
     • Uses PIO to a NIC FIFO on the sending side and an extra memory copy on the receiving side

  11. BIP, large message strategy
     • Uses DMA on both the send side and the receive side: higher bandwidth, and the CPU is offloaded
     • Zero-copy mechanism, pipelined transmission
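The benefit of pipelined transmission, and the reason a fragment size has to be chosen with care, can be illustrated with a small timing model. The sketch below assumes three equal pipeline stages (sender-side DMA, wire, receiver-side DMA), a fixed per-fragment overhead, and full-size fragments throughout. The stage count, overhead and bandwidth are made-up values for illustration, not BIP parameters.

```python
import math


def pipelined_time_us(size_bytes: int, frag_bytes: int, stages: int = 3,
                      per_frag_overhead_us: float = 2.0,
                      bandwidth_mb_s: float = 125.0) -> float:
    """Time for a message cut into fragments to flow through a linear pipeline.

    Simplifications (assumptions): every stage takes the same time per fragment,
    and the last fragment is charged as if it were full-size.
    """
    n_frags = math.ceil(size_bytes / frag_bytes)
    # 1 MB/s == 1 byte/us, so frag_bytes / bandwidth is in microseconds.
    stage_time = per_frag_overhead_us + frag_bytes / bandwidth_mb_s
    # The first fragment needs `stages` steps; each later one adds one step.
    return (n_frags + stages - 1) * stage_time


def best_fragment(size_bytes: int, candidates: list) -> int:
    """Pick the candidate fragment size with the lowest modelled time."""
    return min(candidates, key=lambda f: pipelined_time_us(size_bytes, f))


message = 64 * 1024
sizes = [256, 1024, 4096, 16384, 65536]
for f in sizes:
    print(f"{f:6d} B fragments: {pipelined_time_us(message, f):8.1f} us")
print("best candidate:", best_fragment(message, sizes))
```

With these illustrative numbers, very small fragments pay the per-fragment overhead too often, while a single huge fragment gets no overlap between stages, so an intermediate size comes out ahead.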

  12. BIP-SMP: a low-level layer for SMP machines
     • SMP machines (2 or 4 processors) are viewed as the architectures with the best performance/price ratio
     • BIP-SMP provides
       • management of concurrent accesses to the NIC
       • low-latency intra-node communications
       • inter-node communication equivalent to BIP
       • total transparency for applications and end users
     (figure labels: 0, 1, 2, 3)

  13. BIP-SMP: Moving data between processes

  14. MPI-BIP: the communication middleware
     • MPI-BIP adds high-level features to BIP
     • based on the MPICH implementation
     • provides a portable and widely used API
     • implements credit-based flow control for small messages
     • request FIFO for multiple non-blocking operations
     • provides segmentation/reassembly features to avoid timeouts
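Credit-based flow control, mentioned above for small messages, can be sketched generically: the sender holds a number of credits that stands for free receive buffers, spends one credit per message, and queues messages when it has none left until the receiver grants more. The class below is a minimal illustration of that general idea; its name, methods and the in-memory "wire" are assumptions made for the example and do not reflect the MPI-BIP implementation.

```python
from collections import deque


class CreditSender:
    """Toy sender that may only transmit while it holds credits (illustrative only)."""

    def __init__(self, credits: int):
        self.credits = credits   # receive buffers the peer is assumed to have free
        self.pending = deque()   # messages waiting for a credit
        self.wire = []           # stand-in for messages actually transmitted

    def send(self, msg):
        """Queue a message and transmit as much as the credits allow."""
        self.pending.append(msg)
        self._drain()

    def grant(self, n: int = 1):
        """Called when the receiver reports that it has freed n buffers."""
        self.credits += n
        self._drain()

    def _drain(self):
        # Each transmitted message consumes exactly one credit, in FIFO order.
        while self.credits > 0 and self.pending:
            self.wire.append(self.pending.popleft())
            self.credits -= 1


sender = CreditSender(credits=2)
for m in ["a", "b", "c"]:
    sender.send(m)
print(sender.wire, list(sender.pending))  # third message waits for a credit
sender.grant()
print(sender.wire, list(sender.pending))
```

The point of the scheme is that the receiver is never handed more small messages than it has buffers for, so nothing has to be dropped or retransmitted.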

  15. Working with the BIP software suite
     • Installation
       • run configure
     • Compilation and linkage
       • several libraries: bip, bip-smp, mpi
       • compile with bipcc
     • Submitting jobs and monitoring nodes
       • run myristat to find out which nodes are available
       • run bipconf to configure the virtual machine
       • use bipload to launch programs

  16. WebCM: a high level management tool • web-based management tool • integrates existing solutions into a common framework

  17. The WebCM user interface
     • graphical interface for myristat and bipconf
     • allows submission of jobs through batch packages
     • shows the user's virtual machine definition and the user's running processes
     • new functionality is added by incorporating new software packages

  18. Latency: BIP and MPI-BIP

  19. Throughput: BIP and MPI-BIP

  20. BIP-SMP: intra-node communications

  21. BIP-SMP: inter-node communications

  22. What runs on our clusters?
     • Genomic simulation
     • Fluid dynamics
     • Discrete event parallel simulation
     • Distributed shared memory system
     • Want to know more?
       • getting the distribution
       • getting the documentation
       http://resam.univ-lyon1.fr
