
Parallel Processing


Presentation Transcript


  1. Javier Delgado Grid-Enabledment of Scientific Applications Professor S. Masoud Sadjadi Parallel Processing

  2. Parallel Processing - GCB Outline • Why parallel processing • Overview • The Message Passing Interface (MPI)‏ • Introduction • Basics • Examples • OpenMP • Alternatives to MPI

  3. Parallel Processing - GCB Why parallel processing? • Computationally-intensive scientific applications • Hurricane modelling • Bioinformatics • High-Energy Physics • Physical limits of one processor

  4. Parallel Processing - GCB Types of Parallel Processing • Shared Memory • e.g. Multiprocessor computer • Distributed Memory • e.g. Compute Cluster

  5. Parallel Processing - GCB Shared Memory • Advantages • No explicit message passing • Fast • Disadvantages • Scalability • Synchronization Source: http://kelvinscale.net

  6. Parallel Processing - GCB Distributed Memory • Advantages • Each processor has its own memory • Usually more cost-effective • Disadvantages • More programmer involvement • Slower

  7. Parallel Processing - GCB Combination of Both • Emerging trend • Best and worst of both worlds

  8. Parallel Processing - GCB Outline • Why parallel processing • Overview • The Message Passing Interface (MPI)‏ • Introduction • Basics • Examples • OpenMP • Alternatives to MPI

  9. Parallel Processing - GCB Message Passing • Standard for Distributed Memory systems • Networked workstations can communicate • De Facto specification: • The Message Passing Interface (MPI)‏ • Free MPI Implementations: • MPICH • OpenMPI • LAM-MPI

  10. Parallel Processing - GCB MPI Basics • Design Virtues • Defines communication, but not its hardware • Expressive • Performance • Concepts • No adding/removing of processors during computation • Same program runs on all processors • Single-Program, Multiple Data (SPMD)‏ • Multiple Instruction, Multiple Data (MIMD)‏ • Processes identified by “rank”
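
To make the SPMD idea concrete, here is a minimal hello-world sketch (my example, not from the original slides): every process runs the same source file, and the rank returned by MPI_Comm_rank is what lets rank 0 act as the master while the others act as workers.

      /* minimal SPMD sketch: same program on every process, behaviour chosen by rank */
      #include <stdio.h>
      #include "mpi.h"

      int main( int argc, char * argv[] )
      {
          int rank, size;

          MPI_Init(&argc, &argv);                 /* start MPI */
          MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id: 0 .. size-1 */

          if (rank == 0)
              printf("Master: %d processes in total\n", size);
          else
              printf("Worker with rank %d\n", rank);

          MPI_Finalize();
          return 0;
      }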

  11. Parallel Processing - GCB Communication Types • Standard • Synchronous (blocking send) • Ready • Buffered (asynchronous) • For non-blocking communication: • MPI_Wait – block until the operation completes • MPI_Test – returns true/false without blocking
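
A small sketch of how the non-blocking calls are used (my example, not from the slides; it assumes the job is started with at least two processes): rank 0 posts an MPI_Irecv, is free to keep computing, and then completes the receive with MPI_Test and, if necessary, MPI_Wait.

      /* non-blocking receive sketch: overlap communication with computation */
      #include <stdio.h>
      #include "mpi.h"

      int main( int argc, char * argv[] )
      {
          int rank, flag = 0, tag = 0;
          double value = 0.0;
          MPI_Request request;
          MPI_Status status;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank == 0)
          {
              MPI_Irecv(&value, 1, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD, &request);

              /* ... useful computation could happen here while the message is in flight ... */

              MPI_Test(&request, &flag, &status);   /* returns immediately; flag set if done */
              if (!flag)
                  MPI_Wait(&request, &status);      /* block until the receive completes */
              printf("Received %f\n", value);
          }
          else if (rank == 1)
          {
              value = 3.14;
              MPI_Send(&value, 1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD);
          }

          MPI_Finalize();
          return 0;
      }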

  12. Parallel Processing - GCB Message Structure • Send: data (variable name), data length, data type, destination, communication context, tag • Recv: data (variable name), data length, data type, status, communication context, tag
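
For reference, those fields map directly onto the classic C prototypes of the two calls (standard MPI, shown here for orientation):

      int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm);
      int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status);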

  13. Parallel Processing - GCB Data Types and Functions • Uses its own types for consistency • MPI_INT, MPI_CHAR, etc. • All Functions prefixed with “MPI_” • MPI_Init, MPI_Send, MPI_Recv, etc.

  14. Parallel Processing - GCB Our First Program: Numerical Integration • Objective: Calculate area under f(x) = x2 • Outline: • Define variables • Initialize MPI • Determine subset of program to calculate • Perform Calculation • Collect Information (at Master)‏ • Send Information (Slaves)‏ • Finalize

  15. Parallel Processing - GCB Our First Program • Download Link: • http://www.fiu.edu/~jdelga06/integration.c

  16. Parallel Processing - GCB Variable Declarations
      #include "mpi.h"
      #include <stdio.h>

      /* problem parameters */
      #define f(x) ((x) * (x))
      #define numberRects 50
      #define lowerLimit 2.0
      #define upperLimit 5.0

      int main( int argc, char * argv[] )
      {
          /* MPI variables */
          int dest, noProcesses, processId, src, tag;
          MPI_Status status;

          /* problem variables */
          int i;
          double area, x, height, lower, width, total, range;
          ...

  20. Parallel Processing - GCB MPI Initialization
      int main( int argc, char * argv[] )
      {
          ...
          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &noProcesses);
          MPI_Comm_rank(MPI_COMM_WORLD, &processId);
          ...

  21. Parallel Processing - GCB Calculation
      int main( int argc, char * argv[] )
      {
          ...
          /* adjust problem size for subproblem */
          range = (upperLimit - lowerLimit) / noProcesses;
          width = range / numberRects;
          lower = lowerLimit + range * processId;

          /* calculate area for subproblem */
          area = 0.0;
          for (i = 0; i < numberRects; i++)
          {
              x = lower + i * width + width / 2.0;
              height = f(x);
              area = area + width * height;
          }
          ...

  22. Parallel Processing - GCB Sending and Receiving
      int main( int argc, char * argv[] )
      {
          ...
          tag = 0;
          if (processId == 0)    /* MASTER */
          {
              total = area;
              for (src = 1; src < noProcesses; src++)
              {
                  MPI_Recv(&area, 1, MPI_DOUBLE, src, tag, MPI_COMM_WORLD, &status);
                  total = total + area;
              }
              fprintf(stderr, "The area from %f to %f is: %f\n", lowerLimit, upperLimit, total);
          }
          else    /* WORKER (i.e. compute node) */
          {
              dest = 0;
              MPI_Send(&area, 1, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
          };
          ...

  23. Parallel Processing - GCB Finalizing
      int main( int argc, char * argv[] )
      {
          ...
          MPI_Finalize();
          return 0;
      }
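
With a typical MPICH or Open MPI installation (an assumption about the environment, not something the slides specify), the complete program is compiled with the mpicc wrapper, e.g. mpicc integration.c -o integration, and launched with mpirun -np 4 ./integration; the value given to -np is arbitrary here and shows up inside the program as noProcesses.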

  24. Parallel Processing - GCB Communicators • MPI_COMM_WORLD – All processes involved • What if different workers have different tasks?
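
One standard answer is to create additional communicators. Below is a minimal sketch (my example, not from the slides) that uses MPI_Comm_split to put even-ranked and odd-ranked processes into two separate communicators, so each group can be given a different task.

      /* split MPI_COMM_WORLD into two task groups by rank parity */
      #include <stdio.h>
      #include "mpi.h"

      int main( int argc, char * argv[] )
      {
          int worldRank, groupRank;
          MPI_Comm taskComm;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &worldRank);

          /* processes passing the same "color" (0 or 1) end up in the same new communicator */
          MPI_Comm_split(MPI_COMM_WORLD, worldRank % 2, worldRank, &taskComm);
          MPI_Comm_rank(taskComm, &groupRank);

          printf("World rank %d has rank %d in its task communicator\n", worldRank, groupRank);

          MPI_Comm_free(&taskComm);
          MPI_Finalize();
          return 0;
      }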

  25. Parallel Processing - GCB Additional Functions • Data Management • MPI_Bcast (broadcast)‏ • Collective Computation • Min, Max, Sum, AND, etc. • Benefits: • Abstraction • Optimized Source: http://www.pdc.kth.se
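
As an illustration of both functions (my sketch, not part of the original example), the integration program above can be rewritten so that rank 0 broadcasts a run-time rectangle count with MPI_Bcast and the explicit send/receive loop from slide 22 collapses into a single MPI_Reduce. The command-line parameter is an addition of this sketch, not of the original code.

      #include <stdio.h>
      #include <stdlib.h>
      #include "mpi.h"

      #define f(x) ((x) * (x))
      #define lowerLimit 2.0
      #define upperLimit 5.0

      int main( int argc, char * argv[] )
      {
          int i, noProcesses, processId, numberRects = 50;
          double area = 0.0, total = 0.0, x, lower, width, range;

          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &noProcesses);
          MPI_Comm_rank(MPI_COMM_WORLD, &processId);

          /* rank 0 reads the rectangle count (if given) and broadcasts it to all processes */
          if (processId == 0 && argc > 1)
              numberRects = atoi(argv[1]);
          MPI_Bcast(&numberRects, 1, MPI_INT, 0, MPI_COMM_WORLD);

          range = (upperLimit - lowerLimit) / noProcesses;
          width = range / numberRects;
          lower = lowerLimit + range * processId;

          for (i = 0; i < numberRects; i++)
          {
              x = lower + i * width + width / 2.0;
              area = area + width * f(x);
          }

          /* collective computation: sum every partial area into "total" on rank 0 */
          MPI_Reduce(&area, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

          if (processId == 0)
              printf("The area from %f to %f is: %f\n", lowerLimit, upperLimit, total);

          MPI_Finalize();
          return 0;
      }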

  26. Parallel Processing - GCB Typical Problems • Designing • Debugging • Scalability

  27. Parallel Processing - GCB Scalability Analysis • Definition: Estimation of resource (computation and communication) requirements of a program as problem size and/or number of processors increases • Requires knowledge of communication time • Assume otherwise idle nodes • Ignore data requirements of node

  28. Parallel Processing - GCB Simple Scalability Example • Tcomm = time to send a message • Tcomm = s + rn • s = start-up time • r = time to send a single byte (i.e. 1/bandwidth) • n = number of bytes sent (for a single element, the size of its data type: int, double, etc.)
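
To get a feel for the formula, plug in some assumed numbers (illustrative only, not measurements from the slides): with s = 50 µs of start-up time and r = 0.01 µs per byte (roughly 100 MB/s of bandwidth), sending one 8-byte double costs Tcomm = 50 + 0.01 × 8 ≈ 50.08 µs. The start-up term dominates small messages, which is why it usually pays to batch data into fewer, larger messages.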

  29. Parallel Processing - GCB Simple Scalability Example • Matrix multiplication of two square matrices of size n × n • First matrix is broadcast to all nodes • Cost for the rest • Computation • n multiplications and (n − 1) additions per cell • n² × (2n − 1) = 2n³ − n² floating point operations • Communication • Send n elements to a worker node, and return the resulting n elements to the master node (2n elements) • After doing this for each column of the result matrix: n × 2n = 2n² elements

  30. Parallel Processing - GCB Simple Scalability Example • Therefore, the ratio of communication to computation is 2n² / (2n³ − n²) = 2 / (2n − 1) • As n becomes very large, the ratio approaches 1/n, so this problem is not severely affected by communication overhead

  31. Parallel Processing - GCB References • http://nf.apac.edu.au/training/MPIProg/mpi-slides/allslides.html • High Performance Linux Clusters. By Joseph D. Sloan. O'Reilly Press. • Using MPI, second edition. By Gropp, Lusk, and Skjellum. MIT Press.
