An introduction
Sponsored Links
This presentation is the property of its rightful owner.
1 / 67

An Introduction PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

An Introduction. Background The message-passing model Parallel computing model Communication between process Source of further MPI information Basic of MPI message passing Fundamental concepts Simple examples in C Extended point-to-pint operations Non-blocking communication modes.

Download Presentation

An Introduction

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

An Introduction

  • Background

    • The message-passing model

    • Parallel computing model

    • Communication between process

    • Source of further MPI information

  • Basic of MPI message passing

    • Fundamental concepts

    • Simple examples in C

  • Extended point-to-pint operations

    • Non-blocking communication

    • modes

Companion Material

  • Online examples available at


    contains a relative complete reference on how to program on MPI and examples.

The Message-Passing Model

  • A process is (traditionally) a program counter and address space

  • Processes may have multiple threads(program counters and associated stacks) sharing a single address space. MPI is for communication among processes, which have separate address spaces.

  • Interprocess communication consists of

    • Synchronization/Asynchronization

    • Movement of data from one process’s address space to another’s

Synchronous Communication

  • A synchronous communication does not complete until the message has been received

    • A FAX or registered mail

Asynchronous Communication

  • An asynchronous communication completes as soon as the message is on the way.

    • A post card or email

What is MPI?

  • A message-passing library specification

    • Extended message-passing model

    • Not a language or compiler specification

    • Not a specific implementation or product

  • For parallel computers, clusters, and heterogeneous networks

  • Full-featured

  • Designed to permit the development of parallel software libraries

  • Designed to provide access to advanced parallel hardware for

    • End users

    • Library writers

    • Tool developers

  • What is message passing?

    • Data transfer plus synchronization

    • Requires cooperation of sender and receiver

    • Cooperation not always apparent in code

    MPI Sources

    • The standard itself:

      • At

      • All MPI official release, in both postscript and HTML

  • Books:

    • Using MPI: Portable parallel programming with the message-passing interface, by Gropp, Lusk, and Skjellum, MIT Press, 1994

    • MPI: The Complete Reference, by Snir, Otto, Huss-Lederman, Walker,and Dongarra, MIT Press, 1996

    • Designing and building Parallel programs, by Ian Foster, Addison-Wesley, 1995

    • Parallel Programming with MPI, by Peter Pacheco, Morgan-Kaufmann, 1997

    • MPI: The Complete Reference Vol 1 and 2, MIT Press, 1998

  • Other information on web:

    • At

    • Pointers to lot of stuff, including other talks and tutorials, a FAQ, other MPI pages

  • Features of MPI

    • General

      • Communications combine context and group for message security

      • Thread safety

  • Point-to-point communication

    • Structured buffers and derived datatypes, heterogeneity.

    • Modes: normal(blocking and non-blocking), synchronous, ready(to allow access to fast protocol), buffered

  • Collective

    • Both built-in and user-defined collective operations.

    • Large number of data movement routines.

    • Subgroups defined directly or by topology

  • Features NOT in MPI

    • Process Management

    • Remote memory transfer

    • Threads

    • Virtual shared memory

    Why use MPI?

    • MPI provides a powerful, efficient, and portable way to express parallel programs.

    • MPI was explicitly designed to enable libraries which may eliminate the need for many users to learn (much of) MPI.

    • Portable

    • Expressive

    • Good way to learn about subtle issues in parallel computing

    Is MPI large or small?

    • MPI is large(125 functions)

      • MPI’s extensive functionality requires many functions.

      • Number of functions not necessarily a measure of complexity.

  • MPI is small(6 functions)

    • Many parallel programs can be written with just 6 basic functions.

  • MPI is just right

    • One can access flexibility when it is required.

    • One need not master all parts of MPI to use it.

    • MPI is whatever size you like

  • Skeleton MPI Program

    #include <mpi.h>

    main( int argc, char** argv ) {

    MPI_Init( &argc, &argv );

    /* main part of the program */

    Use MPI function call depend on your data partition and parallization architecture



    Initializing MPI

    • The first MPI routine called in any MPI program must be the initialization routine MPI_INIT

    • MPI_INIT is called once by every process, before any other MPI routines

      int mpi_Init( int *argc, char **argv );

    A minimal MPI program(c)

    #include “mpi.h”

    #include <stdio.h>

    int main(int argc, char *argv[])


    MPI_Init(&argc, &argv);

    printf(“Hello, world!\n”);


    Return 0;



    • #include “mpi.h” provides basic MPI definitions and types.

    • MPI_Init starts MPI

    • MPI_Finalize exits MPI

    • Note that all non-MPI routines are local; thus “printf” run on each process

    Notes on C

    • In C:

      • mpi.h must be included by using #include mpi.h

      • MPI functions return error codes or MPI_SUCCESS

    Error handling

    • By default, an error causes all processes to abort.

    • The user can cause routines to return(with an error code) instead.

    • A user can also write and install custom error handlers.

    • Libraries might want to handle errors differently from applications.

    Running MPI Programs

    • The MPI-1 standard does not specify how to run an MPI program, just as the Fortran standard does not specify how to run a Fortran program.

    • In general, starting an MPI program is dependent on the implementation of MPI you are using, and might require various scripts, program arguments, and/or environment variables.

    • Common way: mpirun –np 2 hello

    Finding out about the environment

    • Two important questions that arise early in a parallel program are:

      -How many processes are participating in this computation?

      -Which one am I?

    • MPI provides functions to answer these questions:

      -MPI_Comm_size reports the number of processes.

      -MPI_Comm_rank reports the rank, a number between 0 and size-1, identifying the calling process.

    Better Hello(c)

    #include “mpi.h”

    #include <stdio.h>

    int main(int argc, char *argv[])


    int rank, size;

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Comm_size(MPI_COMM_WORLD, &size);

    printf(“I am %d of\n”, rank, size);


    return 0;


    Some basic concepts

    • Processes can be collected into groups.

    • Each message is sent in a context, and must be received in the same context.

    • A group and context together form a communicator.

    • A process is identified by its rank in the group associated with a communicator.

    • There is a default communicator whose group contains all initial processes, called MPI_COMM_WORLD.

    MPI Datatypes

    • The data in a message to sent or received is described by a triple(address, count, datatype)

    • An MPI datatype is recursively defined as:

      • Predefined, corresponding to a datatype from the language(e.g., MPI_INT, MPI_DOUBLE_PRECISION)

      • A contiguous array of MPI datatypes

      • A strided block of datatypes

      • An indexed array of blocks of datatypes

      • An arbitrary structure of datatypes

    • There are MPI functions to construct custom datatypes, such an array of (int, float) pairs, or a row of a matrix stored columnwise.

    Why datatypes

    • Since all data is labeled by type, an MPI implementation can support communication between processes on machines with very different memory representations and lengths of elementary datatypes(heterogeneous communication).

    • Specifying application-oriented layout of data in memory

      • Reduces memory-to-memory copies in the implementation

      • Allows the use of special hardware(scatter/gather) when available.

    Process 1


    Process 0


    MPI basic send/receive

    • We need to fill in the details in

    • Things that need specifying:

      • How will “data” be described?

      • How will processes be identified?

      • How will the receiver recognize/screen messages?

      • What will it mean for these operation to complete?

    MPI basic (blocking) send

    • MPI_Send(void *start, int count,MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

    • The message buffer is described by (start, count, datatype).

    • The target process is specified by dest, which is the rank of the target process in the communicator specified by comm.

    • When this function returns, the data has been delivered to the system and the buffer can be reused. The message may not have been received by the target process.

    MPI_Send primitive

    • MPI_Send performs a standard-mode, blocking send.

    • The send buffer specified by MPI_Send consists of count successive entries of the type indicated by datatype, starting with the entry at address “buf”.

    • The count may be zero, in which case the data part of the message is empty.

    Basic MPI Datatype

    MPI datatypeC datatype

    MPI_CHARsigned char

    MPI_SIGNED_CHARsigned char

    MPI_UNSIGNED_CHARunsigned char

    MPI_SHORTsigned short

    MPI_UNSIGNED_SHORTunsigned short

    MPI_INTsigned int

    MPI_UNSIGNEDunsigned int

    MPI_LONGsigned long

    MPI_UNSIGNED_LONGunsigned long

    Basic MPI Datatype(continue)



    MPI_LONG_DOUBLElong double

    MPI basic (blocking) receive

    • MPI_RECV(void *start, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

      • Waits until a matching (on source and tag) message is received from the system, and the buffer can be used.

      • Source is rank in communicator specified by comm, or MPI_ANY_SOURCE.

      • Status contains further information

      • Receiving fewer than count occurrences of datatype is OK, but receiving more is an error.

    More comment on send and receive

    • A receive operation may accept messages from an arbitrary sender, but a send operation must specify a unique receiver.

    • Source equals destination is allowed, that is, a process can send a message to itself.

    MPI_ISEND and MPI_IRECV primitive

    MPI_ISEND(buf, count, datatype, dest, tag, comm, request)

    • Nonblocking communication use request objects to identify communication operations and link the posting operation with the completion operation.

    • request is a request handle which can be used to query the status of the communication or wait for its completion.

    More on nonblocking send and receive

    • A nonblocking send call indicates that the system may start copying data out of the send buffer. The sender must not access any part of the send buffer after a nonblocking send operation is posted, until the complete-send returns.

    • A nonblocking receive indicates that the system may start writing data into the receive buffer. The receiver must not access any part of the receive buffer after a nonblocking receive operation is posted, until the complete-receive returns.

    MPI_RECV primitive

    • MPI_RECV performs a standard-mode, blocking receive.

    • The receive buffer consists of storage sufficient to contain count consecutive entries of the type specified by datatype, starting at address buf.

    • An overflow error occurs if all incoming data don’t fit into the receive buffer.

    • The receiver can specify a wildcard value for souce(MPI_ANY_SOURCE) and/or a wildcard value for tag(MPI_ANY_TAG), indicating that any source and/or tag are acceptable.

    MPI tags

    • Message are sent with an accompanying user-defined integer tag, to assist the receiving process in identifying the message.

    • Message can be screened at the receiving end by specifying a specific tag, or not screened by specifying MPI_ANY_TAG as the tag in a receive.

    • Some non-MPI message-passing systems have called tags”message types”. MPI calls them tags to avoid confusion with datatype.

    Retrieving further information

    • Status is a data structure allocated in the user’s program.

    • In C:

      int recvd_tag, recvd_from, recvd_count;

      MPI_Status status;

      MPI_Recv(…, MPI_ANY_SOURCE, MPI_ANY_TAG, …, &status)

      recvd_tag = status.MPI_TAG;

      recvd_from = status.MPI_SOURCE;

      MPI_Get_count(&status, datatype, &recvd_count);

    Tags and contexts

    • Separation of messages used to be accomplished by use of tags, but

      • This requires libraries to be aware of tags used by other libraries

      • This can be defeated by use of “wild card” tags

  • Contexts are different from tags

    • No wild cards allowed

    • Allocated dynamically by the system when a library sets up a communicator for its own use.

  • User-defined tags still provided in MPI for user convenience in organizing application

  • Use MPI_Comm_split to create new communicators

  • MPI is simple

    • Many parallel programs can be written using just these six functions, only two of which are non-trivial;

      • MPI_INIT




      • MPI_SEND

      • MPI_RECV

  • Point-to-point(send/recv) isn’t the only way…

  • Alternative set of 6 functions fro simplified MPI

    • MPI_INIT






    Introduction to collective operations in MPI

    • Collective operations are called by all processes in a communicator

    • MPI_Bcast distributes data from one process(the root) to all others in a communicator.

      Syntax: MPI_Bcast(void *message, int count, MPI_Datatype datatype, int root, MPI_Comm comm)

    • MPI_Reduce combines data from all processes in communicator or and returns it to one process

      Syntax: MPI_Reduce(void *message, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

    • In many numerical algorithm, send/receive can be replaced by Bcast/Reduce, improving both simplicity and efficiency.

    • MPI_Datatype values


    • MPI_OP Values


    Example: PI in c –1

    #include “mpi.h”

    #include <math.h>

    int main(int argc, char *argv[])


    int done = 0, n, myid, numprocs, I, rc;

    double PI25DT = 3.141592653589793238462643;

    double mypi, pi, h, sum, x, a;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);

    MPI_COMM_rank(MPI_COMM_WORLD, &myid);

    while (!done)


    if (myid == 0)


    printf(“Enter the number of intervals: (0 quits) “);

    scanf(“%d”, &n);


    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (n == 0)




    Example: PI in c - 2

    h = 1.0 / (double)n;

    sum = 0.0;

    for (i = myid + 1; i <= n; i += numprocs)


    x = h * ((double)i – 0.5);

    sum += 4.0 / (1.0 + x * x);


    mypi = h * sum;

    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (myid == 0)

    printf(“pi is approximately %.16f, Error is %.16f\n”, pi, fabs(pi – PI25DT));



    return 0;


    Compile and run the code

    • mpicc –o pi pi.c

    • Run the code by using:

      mpirun –np # of procs –machinefile XXX pi

      -machinefile tells MPI to run the program on the machines of XXX.

    One more example

    • Simple summation of numbers using MPI

    • This program adds numbers that stored in the data file "rand_data.txt" in parallel.

    • This program is taken from book "Parallel Programming" by Barry Wilkinson and Michael Allen.


    #include "mpi.h"

    #include <stdio.h>

    #include <math.h>

    #define MAXSIZE 100

    int main(int argc, char **argv) {

    int myid, numprocs;

    int data[MAXSIZE], i, x, low, high, myresult=0, result;

    char fn[255];

    FILE *fp;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &numprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    if(0 == myid) {

    /* open input file and intialize data */

    strcpy(fn, getenv("PWD"));

    strcat(fn, "/rand_data.txt");


    if( NULL == (fp = fopen(fn, "r")) ) {

    printf("Can't open the input file: %s\n\n", fn);



    for(i=0; i<MAXSIZE; i++) {

    fscanf(fp, "%d", &data[i]); }



    /* broadcast data */


    /* add portion of data */

    x = MAXSIZE/numprocs; /* must be an integer */

    low = myid * x;

    high = low + x;

    for(i=low; i<high; i++) {

    myresult += data[i];



    printf("I got %d from %d\n", myresult, myid);

    /* compute global sum */

    MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if(0 == myid) {

    printf("The sum is %d.\n", result);




    Communication models

    MPI provides multiple modes for sending messages:

    • Synchronous mode(MPI_Ssend): the send does not complete until a matching receive has begun. (Unsafe programs become incorrect and usually deadlock within an MPI_Ssend)

    • Buffered mode(MPI_Bsend): the user supplies the buffer to system for its use. (User supplies enough memory to make unsafe program safe)

    • Read mode(MPI_Rsend): user guarantees that matching

    Deadlocks in blocking operations

    • Send a large message from process 0 to process 1

      • If there is insufficient storage at the destination, the send must wait for the user to provide the memory space(through a receive)

  • What happens with

    Process 0Process 1



  • This is called “unsafe” because it depends on the availability of system buffers.

  • Some solutions to the “unsafe” problem

    • Order the operations more carefully

      Process 0Process 1



      Use non-blocking operations:

      Process 0Process 1




    Non-blocking operations

    • Non-blocking operations return immediately “request Handles” that can be waited on and queried:

      • MPI_Isend(start, count, datatype, dest, tag, comm, request)

      • MPI_Irecv(start, count, datatype, dest, tag, comm, request)

      • MPI_Wait(request, status)

  • Without waiting: MPI_Test(request, flag, status)

  • When to use MPI

    • Portability and Performance

    • Irregular data structure

    • Building tools for others

    • Need to manage memory on a per processor basis

    When not to use MPI

    • Solution (e.g., library) already exists


    • Require fault tolerance

      • Socket

    Program with MPI and play with it

    • MPICH-1.2.4 for windows 2000 has installed in ECE226.

    • On every machine, please refer to c:\Program Files\MPICH\www\nt to find the HTML help page on how to run and program in the environment of Visual C++ 6.0

    • Examples have been installed under c:\Program Files\MPICH\SDK\Examples

    How to run the example

    • Open the MSDEV workspace file found in MPICH\SDK\Examples\nt\examples.dsw

    • Build the Debug target of the cpi project

    • Copy MPICH\SDK\Examples\nt\Debug\cpi.exe to a shared directory. (use copy/paste to \\pearl\files\mpi directory)Open a command prompt and change to the directory where you placed cpi.exe

    • Execute mpirun.exe –np 4 cpi

    • In order to set path in DOS, in this case, use command: set PATH=%PATH%;c:\Program Files\MPICH\mpd\bin

    Create your own project

    • Open MS Developer Studio - Visual C++

    • Create a new project with whatever name you want in whatever directory you want.  The easiest one is a Win32 console application with no files in it.

    Create your own project


    3. Finish the new project wizard.

    4. Go to Project->Settings or hit Alt F7 to bring up the project settings dialog box.

    5. Change the settings to use the multithreaded libraries. Change the settings for both Debug and Release targets.




    6. Set the include path for all target configurations: This should be c:\Program Files\MPICH\SDK\include


    7. Set the lib path for all target configurations: This should be c:\Program Files\MPICH\SDK\lib


    8. Add the ws2_32.lib library to all configurations (This is the Microsoft Winsock2 library.  It's in your default library path).Add mpich.lib to the release target and mpichd.lib to the debug target.



    9. Close the project settings dialog box.

    10. Add your source files to the project

    Useful MPI function to test your program

    MPI_Get_processor_name(name, resultlen)

    • name is a unique specifier for the actual node. (string)

    • resultlen is length of the result returned in name(integer)

      This routine returns the name of the processor on which it was called at the moment of the call. The number of characters actually written is returned in the output argument resultlen.

  • Login