


An Introduction

  • Background

    • The message-passing model

    • Parallel computing model

    • Communication between processes

    • Sources of further MPI information

  • Basics of MPI message passing

    • Fundamental concepts

    • Simple examples in C

  • Extended point-to-point operations

    • Non-blocking communication modes



Companion Material

  • Online examples available at http://www.mcs.anl.gov/mpi/tutorial

  • http://www.netlib.org/utk/papers/mpi-book/mpi-book.html

    contains a relatively complete reference on how to program with MPI, along with examples.



The Message-Passing Model

  • A process is (traditionally) a program counter and address space

  • Processes may have multiple threads (program counters and associated stacks) sharing a single address space. MPI is for communication among processes, which have separate address spaces.

  • Interprocess communication consists of

    • Synchronization (synchronous or asynchronous)

    • Movement of data from one process’s address space to another’s



Synchronous Communication

  • A synchronous communication does not complete until the message has been received

    • A FAX or registered mail



Asynchronous Communication

  • An asynchronous communication completes as soon as the message is on the way.

    • A post card or email



What is MPI?

  • A message-passing library specification

    • Extended message-passing model

    • Not a language or compiler specification

    • Not a specific implementation or product

  • For parallel computers, clusters, and heterogeneous networks

  • Full-featured

  • Designed to permit the development of parallel software libraries

  • Designed to provide access to advanced parallel hardware for

    • End users

    • Library writers

    • Tool developers



    What is message passing?

    • Data transfer plus synchronization

    • Requires cooperation of sender and receiver

    • Cooperation not always apparent in code



    MPI Sources

    • The standard itself:

      • At http://www.mpi-forum.org

      • All official MPI releases, in both PostScript and HTML

  • Books:

    • Using MPI: Portable parallel programming with the message-passing interface, by Gropp, Lusk, and Skjellum, MIT Press, 1994

    • MPI: The Complete Reference, by Snir, Otto, Huss-Lederman, Walker, and Dongarra, MIT Press, 1996

    • Designing and Building Parallel Programs, by Ian Foster, Addison-Wesley, 1995

    • Parallel Programming with MPI, by Peter Pacheco, Morgan-Kaufmann, 1997

    • MPI: The Complete Reference Vol 1 and 2, MIT Press, 1998

  • Other information on web:

    • At http://www.mcs.anl.gov/mpi

    • Pointers to lots of material, including other talks and tutorials, a FAQ, and other MPI pages



    Features of MPI

    • General

      • Communications combine context and group for message security

      • Thread safety

  • Point-to-point communication

    • Structured buffers and derived datatypes, heterogeneity.

    • Modes: normal (blocking and non-blocking), synchronous, ready (to allow access to fast protocol), buffered

  • Collective

    • Both built-in and user-defined collective operations.

    • Large number of data movement routines.

    • Subgroups defined directly or by topology



    Features NOT in MPI

    • Process Management

    • Remote memory transfer

    • Threads

    • Virtual shared memory



    Why use MPI?

    • MPI provides a powerful, efficient, and portable way to express parallel programs.

    • MPI was explicitly designed to enable libraries which may eliminate the need for many users to learn (much of) MPI.

    • Portable

    • Expressive

    • Good way to learn about subtle issues in parallel computing



    Is MPI large or small?

    • MPI is large (125 functions)

      • MPI’s extensive functionality requires many functions.

      • Number of functions not necessarily a measure of complexity.

  • MPI is small (6 functions)

    • Many parallel programs can be written with just 6 basic functions.

  • MPI is just right

    • One can access flexibility when it is required.

    • One need not master all parts of MPI to use it.

    • MPI is whatever size you like



    Skeleton MPI Program

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* main part of the program: which MPI calls appear here depends on
           your data partitioning and parallelization strategy */

        MPI_Finalize();
        return 0;
    }



    Initializing MPI

    • The first MPI routine called in any MPI program must be the initialization routine MPI_INIT.

    • MPI_INIT is called exactly once by every process, before any other MPI routine:

      int MPI_Init(int *argc, char ***argv);


    A Minimal MPI Program (C)

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        MPI_Init(&argc, &argv);
        printf("Hello, world!\n");
        MPI_Finalize();
        return 0;
    }


    Commentary

    • #include "mpi.h" provides basic MPI definitions and types.

    • MPI_Init starts MPI.

    • MPI_Finalize exits MPI.

    • Note that all non-MPI routines are local; thus printf runs on each process.


    Notes on C

    • In C:

      • mpi.h must be included with #include "mpi.h" (or #include <mpi.h>)

      • MPI functions return an error code, or MPI_SUCCESS on success



    Error handling

    • By default, an error causes all processes to abort.

    • The user can cause routines to return (with an error code) instead, as sketched below.

    • A user can also write and install custom error handlers.

    • Libraries might want to handle errors differently from applications.
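
    • A minimal sketch (not from the original slides): switch MPI_COMM_WORLD to the predefined MPI_ERRORS_RETURN handler and check return codes yourself.

      #include "mpi.h"
      #include <stdio.h>

      int main(int argc, char *argv[])
      {
          int rc, size;

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          /* Return error codes instead of aborting.
             (MPI_Errhandler_set is the MPI-1 name used by MPICH-1.2.x;
             MPI-2 renamed it MPI_Comm_set_errhandler.) */
          MPI_Errhandler_set(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

          /* rank 'size' does not exist, so this send is erroneous */
          rc = MPI_Send(NULL, 0, MPI_INT, size, 0, MPI_COMM_WORLD);
          if (rc != MPI_SUCCESS)
              printf("MPI_Send returned error code %d\n", rc);

          MPI_Finalize();
          return 0;
      }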



    Running MPI Programs

    • The MPI-1 standard does not specify how to run an MPI program, just as the Fortran standard does not specify how to run a Fortran program.

    • In general, starting an MPI program is dependent on the implementation of MPI you are using, and might require various scripts, program arguments, and/or environment variables.

    • Common way: mpirun -np 2 hello



    Finding out about the environment

    • Two important questions that arise early in a parallel program are:

      -How many processes are participating in this computation?

      -Which one am I?

    • MPI provides functions to answer these questions:

      -MPI_Comm_size reports the number of processes.

      -MPI_Comm_rank reports the rank, a number between 0 and size-1, identifying the calling process.


    Better Hello (C)

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        printf("I am %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }



    Some basic concepts

    • Processes can be collected into groups.

    • Each message is sent in a context, and must be received in the same context.

    • A group and context together form a communicator.

    • A process is identified by its rank in the group associated with a communicator.

    • There is a default communicator whose group contains all initial processes, called MPI_COMM_WORLD.



    MPI Datatypes

    • The data in a message to be sent or received is described by a triple (address, count, datatype).

    • An MPI datatype is recursively defined as:

      • Predefined, corresponding to a datatype from the language (e.g., MPI_INT, MPI_DOUBLE_PRECISION)

      • A contiguous array of MPI datatypes

      • A strided block of datatypes

      • An indexed array of blocks of datatypes

      • An arbitrary structure of datatypes

    • There are MPI functions to construct custom datatypes, such as an array of (int, float) pairs, or a row of a matrix stored columnwise (see the sketch below).
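
    • A sketch (not from the original slides) of the "row of a matrix stored columnwise" case: MPI_Type_vector describes N elements separated by a stride of N, so the whole row can be sent with a single MPI_Send. The helper send_row is hypothetical, for illustration only.

      #include "mpi.h"

      #define N 4

      /* a[col][row]: the matrix is stored column by column */
      void send_row(double a[N][N], int row, int dest, MPI_Comm comm)
      {
          MPI_Datatype rowtype;

          /* N blocks of 1 double each, with a stride of N elements between blocks */
          MPI_Type_vector(N, 1, N, MPI_DOUBLE, &rowtype);
          MPI_Type_commit(&rowtype);

          MPI_Send(&a[0][row], 1, rowtype, dest, 0, comm);

          MPI_Type_free(&rowtype);
      }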



    Why datatypes

    • Since all data is labeled by type, an MPI implementation can support communication between processes on machines with very different memory representations and lengths of elementary datatypes (heterogeneous communication).

    • Specifying application-oriented layout of data in memory

      • Reduces memory-to-memory copies in the implementation

      • Allows the use of special hardware (scatter/gather) when available.


    MPI basic send/receive

      Process 0: Send(data)   -------->   Process 1: Receive(data)

    • We need to fill in the details in this simple picture.

    • Things that need specifying:

      • How will “data” be described?

      • How will processes be identified?

      • How will the receiver recognize/screen messages?

      • What will it mean for these operations to complete?



    MPI basic (blocking) send

    • MPI_Send(void *start, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

    • The message buffer is described by (start, count, datatype).

    • The target process is specified by dest, which is the rank of the target process in the communicator specified by comm.

    • When this function returns, the data has been delivered to the system and the buffer can be reused. The message may not have been received by the target process.
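
    • A minimal sketch of the call (not from the original slides): rank 0 sends ten integers to rank 1 with the arbitrary tag 99. This is only a fragment from inside an initialized MPI program; the matching receive is sketched after the MPI_Recv slide below.

      int data[10];
      /* ... fill data ... */

      /* buffer described by (data, 10, MPI_INT); dest = 1, tag = 99 */
      MPI_Send(data, 10, MPI_INT, 1, 99, MPI_COMM_WORLD);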



    MPI_Send primitive

    • MPI_Send performs a standard-mode, blocking send.

    • The send buffer specified by MPI_Send consists of count successive entries of the type indicated by datatype, starting with the entry at address "buf".

    • The count may be zero, in which case the data part of the message is empty.



    Basic MPI Datatype

    MPI datatype            C datatype

    MPI_CHAR                char
    MPI_SIGNED_CHAR         signed char
    MPI_UNSIGNED_CHAR       unsigned char
    MPI_SHORT               signed short
    MPI_UNSIGNED_SHORT      unsigned short
    MPI_INT                 signed int
    MPI_UNSIGNED            unsigned int
    MPI_LONG                signed long
    MPI_UNSIGNED_LONG       unsigned long


    Basic MPI Datatype (continued)

    MPI datatype            C datatype

    MPI_FLOAT               float
    MPI_DOUBLE              double
    MPI_LONG_DOUBLE         long double



    MPI basic (blocking) receive

    • MPI_RECV(void *start, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

      • Waits until a matching (on source and tag) message is received from the system, and the buffer can be used.

      • Source is rank in communicator specified by comm, or MPI_ANY_SOURCE.

      • Status contains further information

      • Receiving fewer than count occurrences of datatype is OK, but receiving more is an error.
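
    • A self-contained sketch pairing MPI_Send and MPI_Recv (not from the original slides; assumes at least two processes):

      #include "mpi.h"
      #include <stdio.h>

      int main(int argc, char *argv[])
      {
          int rank, i, data[10];
          MPI_Status status;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank == 0) {
              for (i = 0; i < 10; i++)
                  data[i] = i;
              MPI_Send(data, 10, MPI_INT, 1, 99, MPI_COMM_WORLD);
          } else if (rank == 1) {
              /* matches on source 0 and tag 99; status records the envelope */
              MPI_Recv(data, 10, MPI_INT, 0, 99, MPI_COMM_WORLD, &status);
              printf("rank 1 received %d ... %d\n", data[0], data[9]);
          }

          MPI_Finalize();
          return 0;
      }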



    More comment on send and receive

    • A receive operation may accept messages from an arbitrary sender, but a send operation must specify a unique receiver.

    • Source equals destination is allowed, that is, a process can send a message to itself.



    MPI_ISEND and MPI_IRECV primitive

    MPI_ISEND(buf, count, datatype, dest, tag, comm, request)

    • Nonblocking communication uses request objects to identify communication operations and link the posting operation with the completion operation.

    • request is a request handle which can be used to query the status of the communication or wait for its completion.



    More on nonblocking send and receive

    • A nonblocking send call indicates that the system may start copying data out of the send buffer. The sender must not access any part of the send buffer after a nonblocking send operation is posted, until the complete-send returns.

    • A nonblocking receive indicates that the system may start writing data into the receive buffer. The receiver must not access any part of the receive buffer after a nonblocking receive operation is posted, until the complete-receive returns.



    MPI_RECV primitive

    • MPI_RECV performs a standard-mode, blocking receive.

    • The receive buffer consists of storage sufficient to contain count consecutive entries of the type specified by datatype, starting at address buf.

    • An overflow error occurs if the incoming data does not fit into the receive buffer.

    • The receiver can specify a wildcard value for source (MPI_ANY_SOURCE) and/or a wildcard value for tag (MPI_ANY_TAG), indicating that any source and/or tag is acceptable.



    MPI tags

    • Messages are sent with an accompanying user-defined integer tag, to assist the receiving process in identifying the message.

    • Messages can be screened at the receiving end by specifying a specific tag, or not screened by specifying MPI_ANY_TAG as the tag in a receive.

    • Some non-MPI message-passing systems have called tags "message types". MPI calls them tags to avoid confusion with datatypes.



    Retrieving further information

    • Status is a data structure allocated in the user’s program.

    • In C:

      int recvd_tag, recvd_from, recvd_count;

      MPI_Status status;

      MPI_Recv(…, MPI_ANY_SOURCE, MPI_ANY_TAG, …, &status)

      recvd_tag = status.MPI_TAG;

      recvd_from = status.MPI_SOURCE;

      MPI_Get_count(&status, datatype, &recvd_count);



    Tags and contexts

    • Separation of messages used to be accomplished by use of tags, but

      • This requires libraries to be aware of tags used by other libraries

      • This can be defeated by use of "wild card" tags

  • Contexts are different from tags

    • No wild cards allowed

    • Allocated dynamically by the system when a library sets up a communicator for its own use.

  • User-defined tags still provided in MPI for user convenience in organizing application

  • Use MPI_Comm_split to create new communicators (see the sketch below)
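
  • A sketch of MPI_Comm_split (not from the original slides): split MPI_COMM_WORLD into two halves by rank; processes that pass the same color end up in the same new communicator.

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int world_rank, world_size, color, new_rank;
        MPI_Comm half_comm;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);

        /* lower half gets color 0, upper half color 1; rank is reused as the ordering key */
        color = (world_rank < world_size / 2) ? 0 : 1;
        MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &half_comm);

        MPI_Comm_rank(half_comm, &new_rank);
        printf("world rank %d has rank %d in its half\n", world_rank, new_rank);

        MPI_Comm_free(&half_comm);
        MPI_Finalize();
        return 0;
    }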



    MPI is simple

    • Many parallel programs can be written using just these six functions, only two of which are non-trivial:

      • MPI_INIT

      • MPI_FINALIZE

      • MPI_COMM_SIZE

      • MPI_COMM_RANK

      • MPI_SEND

      • MPI_RECV

  • Point-to-point (send/recv) isn't the only way…


    Alternative set of 6 functions for simplified MPI

    • MPI_INIT

    • MPI_FINALIZE

    • MPI_COMM_SIZE

    • MPI_COMM_RANK

    • MPI_BCAST

    • MPI_REDUCE



    Introduction to collective operations in MPI

    • Collective operations are called by all processes in a communicator

    • MPI_Bcast distributes data from one process (the root) to all others in a communicator.

      Syntax: MPI_Bcast(void *message, int count, MPI_Datatype datatype, int root, MPI_Comm comm)

    • MPI_Reduce combines data from all processes in a communicator and returns it to one process.

      Syntax: MPI_Reduce(void *message, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

    • In many numerical algorithms, send/receive can be replaced by Bcast/Reduce, improving both simplicity and efficiency.


    MPI_Datatype and MPI_Op values

    • MPI_Datatype values

      MPI_CHAR, MPI_SHORT, MPI_INT, MPI_LONG, MPI_FLOAT, MPI_DOUBLE, MPI_LONG_DOUBLE, MPI_UNSIGNED_CHAR, MPI_UNSIGNED_SHORT, MPI_UNSIGNED, MPI_UNSIGNED_LONG, MPI_BYTE, MPI_PACKED

    • MPI_OP Values

      MPI_MAX, MPI_MIN, MPI_SUM, MPI_PROD, MPI_LAND, MPI_BAND, MPI_LOR, MPI_BOR, MPI_LXOR, MPI_BXOR, MPI_MAXLOC, MPI_MINLOC
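
    • A sketch of one of the less obvious reduction operations (not from the original slides): MPI_MAXLOC works on value/index pairs such as MPI_DOUBLE_INT and returns the maximum value together with the rank that owns it.

      #include "mpi.h"
      #include <stdio.h>

      int main(int argc, char *argv[])
      {
          int rank;
          struct { double value; int rank; } local, global;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          local.value = (double)(rank * rank);   /* stand-in for a locally computed result */
          local.rank  = rank;

          /* MPI_MAXLOC over MPI_DOUBLE_INT: maximum of .value, carrying .rank along */
          MPI_Reduce(&local, &global, 1, MPI_DOUBLE_INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);

          if (rank == 0)
              printf("max %.1f found on rank %d\n", global.value, global.rank);

          MPI_Finalize();
          return 0;
      }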


    Example: PI in C - 1

    #include "mpi.h"
    #include <stdio.h>
    #include <math.h>

    int main(int argc, char *argv[])
    {
        int done = 0, n, myid, numprocs, i;
        double PI25DT = 3.141592653589793238462643;
        double mypi, pi, h, sum, x;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);

        while (!done)
        {
            if (myid == 0)
            {
                printf("Enter the number of intervals: (0 quits) ");
                scanf("%d", &n);
            }
            MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
            if (n == 0)
                break;
            /* the loop body continues on the next slide */


    Example: PI in C - 2

            h = 1.0 / (double)n;
            sum = 0.0;
            for (i = myid + 1; i <= n; i += numprocs)
            {
                x = h * ((double)i - 0.5);
                sum += 4.0 / (1.0 + x * x);
            }
            mypi = h * sum;

            MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

            if (myid == 0)
                printf("pi is approximately %.16f, Error is %.16f\n",
                       pi, fabs(pi - PI25DT));
        }   /* end of while (!done) */

        MPI_Finalize();
        return 0;
    }



    Compile and run the code

    • mpicc -o pi pi.c

    • Run the code by using:

      mpirun -np <number of procs> -machinefile XXX pi

      -machinefile tells MPI to run the program on the machines listed in the file XXX.



    One more example

    • Simple summation of numbers using MPI

    • This program adds the numbers stored in the data file "rand_data.txt" in parallel.

    • This program is taken from the book "Parallel Programming" by Barry Wilkinson and Michael Allen.



    Continue….

    #include "mpi.h"

    #include <stdio.h>

    #include <math.h>

    #define MAXSIZE 100

    int main(int argc, char **argv) {

    int myid, numprocs;

    int data[MAXSIZE], i, x, low, high, myresult=0, result;

    char fn[255];

    FILE *fp;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &numprocs); MPI_Comm_rank(MPI_COMM_WORLD, &myid);

    if(0 == myid) {

    /* open input file and intialize data */

    strcpy(fn, getenv("PWD"));

    strcat(fn, "/rand_data.txt");



    Continue…

            if (NULL == (fp = fopen(fn, "r"))) {
                printf("Can't open the input file: %s\n\n", fn);
                exit(1);
            }
            for (i = 0; i < MAXSIZE; i++) {
                fscanf(fp, "%d", &data[i]);
            }
            fclose(fp);
        }

        /* broadcast data */
        MPI_Bcast(data, MAXSIZE, MPI_INT, 0, MPI_COMM_WORLD);

        /* add portion of data */
        x = MAXSIZE / numprocs;   /* must be an integer */
        low = myid * x;
        high = low + x;
        for (i = low; i < high; i++) {
            myresult += data[i];
        }



    Continue…

    printf("I got %d from %d\n", myresult, myid);

    /* compute global sum */

    MPI_Reduce(&myresult, &result, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if(0 == myid) {

    printf("The sum is %d.\n", result);

    }

    MPI_Finalize();

    }



    Communication models

    MPI provides multiple modes for sending messages:

    • Synchronous mode (MPI_Ssend): the send does not complete until a matching receive has begun. (Unsafe programs become incorrect and usually deadlock within an MPI_Ssend.)

    • Buffered mode (MPI_Bsend): the user supplies a buffer to the system for its use. (The user supplies enough memory to make an unsafe program safe.)

    • Ready mode (MPI_Rsend): the user guarantees that a matching receive has already been posted. (This allows access to fast protocols.)
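
    • A sketch of buffered mode (not from the original slides; the helper buffered_send is hypothetical): the user attaches buffer space that includes MPI_BSEND_OVERHEAD for bookkeeping, and MPI_Bsend then completes locally.

      #include "mpi.h"
      #include <stdlib.h>

      void buffered_send(int *data, int n, int dest, int tag)
      {
          /* user-supplied buffer: payload plus MPI's per-message overhead */
          int bufsize = n * (int)sizeof(int) + MPI_BSEND_OVERHEAD;
          void *buf = malloc(bufsize);

          MPI_Buffer_attach(buf, bufsize);
          MPI_Bsend(data, n, MPI_INT, dest, tag, MPI_COMM_WORLD);   /* completes locally */
          MPI_Buffer_detach(&buf, &bufsize);   /* blocks until the buffer can be reclaimed */
          free(buf);
      }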



    Deadlocks in blocking operations

    • Send a large message from process 0 to process 1

      • If there is insufficient storage at the destination, the send must wait for the user to provide the memory space(through a receive)

  • What happens with

    Process 0              Process 1
    Send(1)                Send(0)
    Recv(1)                Recv(0)

  • This is called “unsafe” because it depends on the availability of system buffers.



    Some solutions to the “unsafe” problem

    • Order the operations more carefully

      Process 0              Process 1
      Send(1)                Recv(0)
      Recv(1)                Send(0)

    • Use non-blocking operations:

      Process 0              Process 1
      ISend(1)               ISend(0)
      IRecv(1)               IRecv(0)
      Waitall                Waitall
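
    • A sketch of the non-blocking exchange for two processes (not from the original slides; the helper exchange_with_partner is hypothetical and assumes exactly ranks 0 and 1):

      #include "mpi.h"

      /* each rank posts both operations, then waits on both */
      void exchange_with_partner(int rank, int *sendval, int *recvval)
      {
          int other = 1 - rank;              /* partner rank: 0 <-> 1 */
          MPI_Request reqs[2];
          MPI_Status  stats[2];

          MPI_Isend(sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[0]);
          MPI_Irecv(recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[1]);
          MPI_Waitall(2, reqs, stats);       /* safe regardless of system buffering */
      }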



    Non-blocking operations

    • Non-blocking operations return immediately, giving "request handles" that can be waited on and queried:

      • MPI_Isend(start, count, datatype, dest, tag, comm, request)

      • MPI_Irecv(start, count, datatype, source, tag, comm, request)

      • MPI_Wait(request, status)

  • Without waiting: MPI_Test(request, flag, status)
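
  • A sketch of overlapping computation with communication via MPI_Test (not from the original slides; do_some_work is a hypothetical placeholder for local computation):

    #include "mpi.h"

    extern void do_some_work(void);   /* hypothetical local computation */

    void overlap_recv(int *buf, int source)
    {
        MPI_Request request;
        MPI_Status  status;
        int flag = 0;

        MPI_Irecv(buf, 1, MPI_INT, source, 0, MPI_COMM_WORLD, &request);
        while (!flag) {
            do_some_work();                      /* keep computing while the message is in flight */
            MPI_Test(&request, &flag, &status);  /* flag becomes true once the receive has completed */
        }
    }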



    When to use MPI

    • Portability and Performance

    • Irregular data structure

    • Building tools for others

    • Need to manage memory on a per processor basis



    When not to use MPI

    • Solution (e.g., library) already exists

      • http://www.mcs.anl.gov/mpi/libraries.html

    • Require fault tolerance

      • Socket



    Program with MPI and play with it

    • MPICH-1.2.4 for Windows 2000 has been installed in ECE226.

    • On every machine, please refer to c:\Program Files\MPICH\www\nt to find the HTML help page on how to run and program in the environment of Visual C++ 6.0

    • Examples have been installed under c:\Program Files\MPICH\SDK\Examples



    How to run the example

    • Open the MSDEV workspace file found in MPICH\SDK\Examples\nt\examples.dsw

    • Build the Debug target of the cpi project

    • Copy MPICH\SDK\Examples\nt\Debug\cpi.exe to a shared directory (use copy/paste to the \\pearl\files\mpi directory). Open a command prompt and change to the directory where you placed cpi.exe.

    • Execute mpirun.exe -np 4 cpi

    • To set the path in DOS for this case, use the command: set PATH=%PATH%;c:\Program Files\MPICH\mpd\bin



    Create your own project

    • Open MS Developer Studio - Visual C++

    • Create a new project with whatever name you want in whatever directory you want.  The easiest one is a Win32 console application with no files in it.





    continue

    3. Finish the new project wizard.

    4. Go to Project->Settings or hit Alt F7 to bring up the project settings dialog box.

    5. Change the settings to use the multithreaded libraries. Change the settings for both Debug and Release targets.





    continue

    6. Set the include path for all target configurations: This should be c:\Program Files\MPICH\SDK\include



    continue

    7. Set the lib path for all target configurations: This should be c:\Program Files\MPICH\SDK\lib



    continue

    8. Add the ws2_32.lib library to all configurations (this is the Microsoft Winsock2 library; it is in your default library path). Add mpich.lib to the release target and mpichd.lib to the debug target.





    continue

    9. Close the project settings dialog box.

    10. Add your source files to the project



    Useful MPI function to test your program

    MPI_Get_processor_name(name, resultlen)

    • name is a unique specifier for the actual node. (string)

    • resultlen is the length of the result returned in name (integer)

      This routine returns the name of the processor on which it was called at the moment of the call. The number of characters actually written is returned in the output argument resultlen.
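
    • A minimal usage sketch (not from the original slides):

      #include "mpi.h"
      #include <stdio.h>

      int main(int argc, char *argv[])
      {
          char name[MPI_MAX_PROCESSOR_NAME];
          int rank, resultlen;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Get_processor_name(name, &resultlen);
          printf("Rank %d is running on %s (%d characters)\n", rank, name, resultlen);
          MPI_Finalize();
          return 0;
      }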

