
Introduction to Message Passing


  1. Introduction to Message Passing CMSC 34000 Lecture 3 1/11/05

  2. Class goals • Parallel hardware and software issues • First look at actual algorithm (trapezoidal rule for numerical integration) • Introduction to message passing in MPI • Networks and the cost of communication

  3. Parallel Hardware (Flynn’s Taxonomy) • SISD • MIMD • SIMD • MISD • S = Single, M = Multiple; I = Instruction stream; D = Data stream

  4. Von Neumann & Modern • CPU (control and arithmetic) • Memory (main & registers) • Data/instruction transfer bottleneck • Pipelining (multiple instructions operating simultaneously) • Vectorizing (a single instruction acts on a vector register) • Cache -- memory hierarchy

  5. SIMD / MIMD • SIMD: single CPU for control; many (scalar) ALUs with registers; one clock • MIMD: many CPUs, each with its own control and ALU; memory may be “shared” or “distributed”; synchronized?

  6. Shared memory MIMD • Bus (contention) • Switch (expensive) • Cache-coherency?

  7. Distributed memory MIMD • General interconnection network (e.g. CS Linux system) • Each processor has its own memory • To share information, processors must pass (and receive) messages that go over the network. • Topology is very important

  8. Different mesh topologies • Totally connected • Linear array/ring • Hypercube • Mesh/Torus • Tree / hypertree • Ethernet… • And others

  9. Issues to consider • Routing (shortest path = best-case cost of single message) • Contention - multiple messages between different processors must share a wire • Programming: would like libraries that hide all this (somehow)

  10. Numerical integration • Approximate the definite integral ∫_a^b f(x) dx • Using quadrature: the trapezoidal rule gives ∫_a^b f(x) dx ≈ (b − a) [f(a) + f(b)] / 2 • Repeated subdivision: split [a, b] into n subintervals of width h = (b − a)/n and sum the trapezoid approximations

  11. Finite differences • On (0,1): approximate derivatives by difference quotients at equally spaced grid points • At endpoints: apply the boundary conditions

  12. System of Equations • Algebraic system of equations at each point • “Nearest neighbor stencil” • A row of the matrix couples a point only to its two nearest neighbors, so the system is tridiagonal

  13. Parallel strategy: Integration • Divide [a,b] into p intervals • Approximation on each subinterval • Sum approximations over each processor • How do we communicate? • Broadcast / reduction
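
A minimal C sketch of this broadcast/reduce strategy (the course's actual example used pyMPI; the integrand f, the interval [0,1], and n here are made up for illustration):

    #include <mpi.h>
    #include <stdio.h>

    /* hypothetical integrand; the course example may use a different f */
    double f(double x) { return x * x; }

    /* composite trapezoidal rule on [left, right] with n subintervals of width h */
    double trap(double left, double right, int n, double h) {
        double sum = (f(left) + f(right)) / 2.0;
        for (int i = 1; i < n; i++)
            sum += f(left + i * h);
        return sum * h;
    }

    int main(int argc, char *argv[]) {
        int rank, size;
        double a = 0.0, b = 1.0, total;
        int n = 1024;                       /* total number of subintervals */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* broadcast the problem description from process 0 to everyone
           (already known here; shown to illustrate the pattern) */
        MPI_Bcast(&a, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(&b, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

        /* each process integrates its own piece of [a, b] */
        double h = (b - a) / n;
        int local_n = n / size;             /* assumes size divides n evenly */
        double local_a = a + rank * local_n * h;
        double local_b = local_a + local_n * h;
        double local_sum = trap(local_a, local_b, local_n, h);

        /* reduction: sum the partial results onto process 0 */
        MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("integral ~= %f\n", total);

        MPI_Finalize();
        return 0;
    }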

  14. Parallel strategy: Finite differences • How do we multiply the matrix by a vector (needed in Krylov subspace methods)? • Each processor owns: • A range of points • A range of matrix rows • A range of vector entries • To multiply by a vector (linear array) • Share the values at endpoints with neighbors
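
A hedged sketch of the endpoint exchange on a linear array of processes, using MPI_Sendrecv so each process swaps boundary vector entries with its left and right neighbors (the layout and variable names are illustrative, not from the course code):

    #include <mpi.h>

    /* Each process owns x[1..local_n]; x[0] and x[local_n+1] are ghost
       entries to be filled with the neighbors' boundary values. */
    void exchange_endpoints(double *x, int local_n, int rank, int size) {
        int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
        int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

        /* send my last owned entry right, receive left neighbor's into x[0] */
        MPI_Sendrecv(&x[local_n], 1, MPI_DOUBLE, right, 0,
                     &x[0],       1, MPI_DOUBLE, left,  0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* send my first owned entry left, receive right neighbor's into x[local_n+1] */
        MPI_Sendrecv(&x[1],           1, MPI_DOUBLE, left,  0,
                     &x[local_n + 1], 1, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

With the ghost entries filled, each process can apply its rows of the nearest-neighbor stencil to its own range of vector entries without further communication.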

  15. SPMD (Integration) • Single program running on multiple data • Summation over intervals • Particular points are different • Instances of program can talk to each other • All must share information at the same time • Synchronization

  16. MPI • Message Passing Interface • Developed in the 1990s • Standard for: • Passing messages • Collective communication • Logical topologies • Etc.

  17. Integration in MPI • Python bindings developed by Pat Miller (LLNL) • Ignore data types, memory size for now • Look at sample code

  18. Two versions • Explicit send and receive: O(p) communication cost (at best) • Broadcast sends to all processes • Reduce collects information onto a single process • Run-time depends on topology and implementation
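
A fragment suggesting what the explicit send/receive version might look like, assuming it replaces the MPI_Reduce call in the trapezoid sketch above (rank, size, and local_sum come from that sketch); process 0 performs p − 1 receives, which is where the O(p) cost comes from:

    /* explicit version: process 0 gathers the p-1 partial sums itself */
    if (rank == 0) {
        double total = local_sum, incoming;
        for (int src = 1; src < size; src++) {
            MPI_Recv(&incoming, 1, MPI_DOUBLE, src, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            total += incoming;
        }
        printf("integral ~= %f\n", total);
    } else {
        MPI_Send(&local_sum, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }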

  19. Fundamental model of a message • Processor p “sends” • Processor q “receives” • Information needed: • Address (to read from/write to) • Amount of data being sent • Type of data • Tag to screen the messages • How much data actually received?
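
An illustrative fragment (the buffer and the rank variables p and q are hypothetical) showing where each piece of information appears in a matched MPI_Send / MPI_Recv pair, and how MPI_Get_count answers "how much data was actually received?":

    double buf[100];
    MPI_Status status;
    int count;

    if (rank == p) {
        /* address = buf, amount = 100, type = MPI_DOUBLE, tag = 42, destination = q */
        MPI_Send(buf, 100, MPI_DOUBLE, q, 42, MPI_COMM_WORLD);
    } else if (rank == q) {
        /* receive up to 100 doubles tagged 42 from p */
        MPI_Recv(buf, 100, MPI_DOUBLE, p, 42, MPI_COMM_WORLD, &status);
        /* how much data was actually received? */
        MPI_Get_count(&status, MPI_DOUBLE, &count);
    }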

  20. MPI Fundamentals • MPI_COMM_WORLD • MPI_Init() // import mpi • MPI_Comm_size() // mpi.size • MPI_Comm_rank() // mpi.rank • MPI_Finalize() // N/A
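
A minimal C skeleton using just these calls; the trailing comments repeat the pyMPI equivalents from the slide:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int size, rank;
        MPI_Init(&argc, &argv);                    /* import mpi   */
        MPI_Comm_size(MPI_COMM_WORLD, &size);      /* mpi.size     */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);      /* mpi.rank     */
        printf("process %d of %d\n", rank, size);
        MPI_Finalize();                            /* N/A in pyMPI */
        return 0;
    }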

  21. MPI Fundamentals • send / receive • non-blocking versions • broadcast • reduce • other collective ops
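
A short sketch of the non-blocking variants, assuming neighbor ranks left and right and a value local_sum as in the earlier fragments; MPI_Isend/MPI_Irecv start the transfers and MPI_Waitall completes them, so local work can overlap communication:

    double outgoing = local_sum, incoming;
    MPI_Request reqs[2];

    /* start the receive and send without waiting for them to finish */
    MPI_Irecv(&incoming, 1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&outgoing, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... do local work that does not need 'incoming' ... */

    /* block until both operations complete */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);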

  22. Communication costs over a network • Send, broadcast, reduce • Linear array • Point-to-point • Binary tree
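
A rough cost model (an assumption for intuition, not from the slides): if one point-to-point message of m words costs about t_s + m·t_w (startup time plus per-word time), then a broadcast or reduce relayed along a linear array needs roughly p − 1 such steps, i.e. O(p), while the same operation organized along a binary tree needs roughly log2 p steps, i.e. O(log p). The actual constants depend on the topology and on how the MPI implementation schedules the messages.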

  23. Getting started with MPI on CS machines • Machines available (Debian unstable): bombadil, clark, guts, garfield • mpich (installed on our system already) • mpicc is the compiler (mpic++, mpif77, etc.) • mpirun -np x -machinefile hosts executable args • download pyMPI & build in home directory • http://sourceforge.net/projects/pympi/ • ./configure --prefix=/home/<you> • builds out of the box (fingers crossed)
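
For example, with hypothetical file names (trap.c for the integration code and hosts listing the machines above):

    mpicc trap.c -o trap
    mpirun -np 4 -machinefile hosts trap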
