
What is a Supercomputer?

Introduction to Parallel Programming with C and MPI at MCSR, Part 1. The University of Southern Mississippi, April 8, 2010.


Presentation Transcript


  1. Introduction to Parallel Programming with C and MPI at MCSR, Part 1. The University of Southern Mississippi, April 8, 2010

  2. What is a Supercomputer? Loosely speaking, it is a “large” computer with an architecture that has been optimized for solving bigger problems faster than a conventional desktop, mainframe, or server computer. • Pipelining • Parallelism (lots of CPUs or computers)

  3. Supercomputers at MCSR: mimosa • 253 CPU Intel Linux Cluster – Pentium 4 • Distributed memory – 500MB – 1GB per node • Gigabit Ethernet

  4. Supercomputers at MCSR: redwood • 224 CPU Shared Memory Supercomputer • Intel Itanium 2 • Shared Memory: 1GB per node

  5. Supercomputers at MCSR: sequoia • 46 node Linux Cluster • 8 cores (CPUs) per node = 368 cores total • 2 GB memory per core (16 GB per node) • Shared memory intra-node • Distributed memory inter-node • Intel Xeon processors

  6. Supercomputers at MCSR: sequoia

  7. What is Parallel Computing? Using more than one computer (or processor) to complete a computational problem

  8. How May a Problem be Parallelized? Data Decomposition Task Decomposition

  9. Models of Parallel Programming • Message Passing Computing • Processes coordinate and communicate results via calls to message passing library routines • Programmers “parallelize” algorithm and add message calls • At MCSR, this is via MPI programming with C or Fortran • Sweetgum – Origin 2800 Supercomputer (128 CPUs) • Mimosa – Beowulf Cluster with 253 Nodes • Redwood – Altix 3700 Supercomputer (224 CPUs) • Shared Memory Computing • Processes or threads coordinate and communicate results via shared memory variables • Care must be taken not to modify the wrong memory areas • At MCSR, this is via OpenMP programming with C or Fortran on sweetgum

  10. Message Passing Computing at MCSR • Process Creation • Manager and Worker Processes • Static vs. Dynamic Work Allocation • Compilation • Models • Basics • Synchronous Message Passing • Collective Message Passing • Deadlocks • Examples

  11. Message Passing Process Creation • Dynamic • one process spawns other processes & gives them work • PVM • More flexible • More overhead - process creation and cleanup • Static • Total number of processes determined before execution begins • MPI

  12. Message Passing Processes • Often, one process will be the manager, and the remaining processes will be the workers • Each process has a unique rank/identifier • Each process runs in a separate memory space and has its own copy of variables

  13. Message Passing Work Allocation • Manager Process • Does initial sequential processing • Initially distributes work among the workers • Statically or Dynamically • Collects the intermediate results from workers • Combines into the final solution • Worker Process • Receives work from, and returns results to, the manager • May distribute work amongst themselves (decentralized load balancing)

  14. Message Passing Compilation • Compile/link programs w/ message passing libraries using regular (sequential) compilers • Fortran MPI example: include 'mpif.h' • C MPI example: #include "mpi.h"

  15. Message Passing Compilation

  16. Message Passing Models • SPMD – Single Program/Multiple Data • Single version of the source code used for each process • Manager executes one portion of the program; workers execute another; some portions executed by both • Requires one compilation per architecture type • MPI • MPMD – Multiple Program/Multiple Data • One source code for the master; another for the slave • Each must be compiled separately • PVM

  17. Message Passing Basics • Each process must first establish the message passing environment • Fortran MPI example: integer ierror; call MPI_INIT(ierror) • C MPI example: MPI_Init(&argc, &argv);

  18. Message Passing Basics • Each process has a rank, or id number • 0, 1, 2, … n-1, where there are n processes • With SPMD, each process must determine its own rank by calling a library routine • Fortran MPI example: integer comm, rank, ierror; call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror) • C MPI example: MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  19. Message Passing Basics • Each process has a rank, or id number • 0, 1, 2, … n-1, where there are n processes • Each process may use a library call to determine how many total processes it has to play with • Fortran MPI example: integer comm, size, ierror; call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror) • C MPI example: MPI_Comm_size(MPI_COMM_WORLD, &size);

  20. Message Passing Basics • Each process has a rank, or id number • 0, 1, 2, … n-1, where there are n processes • Once a process knows the size, it also knows the ranks (id #’s) of those other processes, and can send or receive a message to/from any other process. • Call signatures (shown with the Fortran-style trailing ierror argument; the C bindings omit it and return the error code instead): MPI_Send(buf, count, datatype, dest, tag, comm, ierror) and MPI_Recv(buf, count, datatype, source, tag, comm, status, ierror) • DATA: buf, count, datatype • ENVELOPE: dest or source, tag, comm • STATUS: status (receive only)

  21. MPI Send and Receive Arguments • Buf: starting location of data • Count: number of elements • Datatype: MPI_INTEGER, MPI_REAL, MPI_CHARACTER… in Fortran; MPI_INT, MPI_FLOAT, MPI_CHAR… in C • Destination: rank of process to whom msg being sent • Source: rank of sender from whom msg being received, or MPI_ANY_SOURCE • Tag: integer chosen by program to indicate type of message, or MPI_ANY_TAG • Communicator: id’s the process team, e.g., MPI_COMM_WORLD • Status: the result of the call (such as the # data items received)

  22. Synchronous Message Passing • Message calls may be blocking or nonblocking • Blocking Send • Waits to return until the message has been received by the destination process • This synchronizes the sender with the receiver • Nonblocking Send • Return is immediate, without regard for whether the message has been transferred to the receiver • DANGER: Sender must not change the variable containing the old message before the transfer is done. • MPI_Isend() is nonblocking

  23. Synchronous Message Passing • Locally Blocking Send • The message is copied from the send parameter variable to an intermediate buffer in the calling process • Returns as soon as the local copy is complete • Does not wait for receiver to transfer the message from the buffer • Does not synchronize • The sender’s message variable may safely be reused immediately • MPI_Send() is locally blocking (an implementation may instead wait for the matching receive when a message is too large to buffer)

  24. Synchronous Message Passing • Blocking Receive • The call waits until a message matching the given tag has been received from the specified source process. • MPI_Recv() is blocking. • Nonblocking Receive • If this process has a qualifying message waiting, retrieves that message and returns • If no messages have been received yet, returns anyway • Used if the receiver has other work it can be doing while it waits • Status tells the receiver whether the message was received • MPI_Irecv() is nonblocking • MPI_Wait() and MPI_Test() can be used to periodically check to see if the message is ready, and finally wait for it, if desired

  25. Collective Message Passing • Broadcast • Sends a message from one to all processes in the group • Scatter • Distributes each element of a data array to a different process for computation • Gather • The reverse of scatter…retrieves data elements into an array from multiple processes

  26. Collective Message Passing w/MPI • MPI_Bcast(): Broadcast from root to all other processes • MPI_Gather(): Gather values from a group of processes • MPI_Scatter(): Scatter a buffer in parts to a group of processes • MPI_Alltoall(): Send data from all processes to all processes • MPI_Reduce(): Combine values from all processes into a single value • MPI_Reduce_scatter(): Combine values and scatter the results

  27. Message Passing Deadlock • Deadlock can occur when all critical processes are waiting for messages that never come, or waiting for buffers to clear out so that their own messages can be sent • Possible Causes • Program/algorithm errors • Message and buffer sizes • Solutions • Order operations more carefully • Use nonblocking operations • Add debugging output statements to your code to find the problem

  28. Sample PBS Script
      sequoia% vi example.pbs
      #!/bin/bash
      #PBS -l nodes=4      # Mimosa
      #PBS -l ncpus=4      # Redwood
      #PBS -l ncpus=4      # Sequoia
      #PBS -l cput=0:5:0   # Request 5 minutes of CPU time
      #PBS -N example
      cd $PWD
      rm *.pbs.[eo]*
      icc -lmpi -o add_mpi.exe add_mpi.c   # Sequoia
      mpiexec -n 4 add_mpi.exe             # Sequoia
      sequoia% qsub example.pbs
      37537.sequoia.mcsr.olemiss.edu

  29. PBS: Querying Jobs

  30. MPI Programming Exercises • Hello World • sequential • parallel (w/MPI and PBS) • Add the prime numbers in an array of numbers • sequential • parallel (w/MPI and PBS)

  31. Log in to sequoia & get workshop files • A. Use secure shell to log in from your PC to hpcwoods: ssh trn_N8Y9@hpcwoods.olemiss.edu • B. Use secure shell to go from hpcwoods to your training account on sequoia: ssh tracct1@sequoia ssh tracct2@sequoia • C. Copy workshop files into your home directory by running: /usr/local/apps/ppro/prepare_mpi_workshop

  32. Examine, compile, and execute hello.c

  33. Examine hello_mpi.c

  34. Examine hello_mpi.c Add a preprocessor directive to include the header file for the MPI library calls.

  35. Examine hello_mpi.c Add function call to initialize the MPI environment

  36. Examine hello_mpi.c Add function call to find out how many parallel processes there are.

  37. Examine hello_mpi.c Add function call to find out which process this is – the MPI process ID of this process.

  38. Examine hello_mpi.c Add IF structure so that the manager/boss process can do one thing, and everyone else (the workers/servants) can do something else.

  39. Examine hello_mpi.c All processes, whether manager or worker, must finalize MPI operations.

  40. Compile hello_mpi.c Compile it. Why won’t this compile? You must link to the MPI library.

  41. Run hello_mpi.exe On 1 CPU On 2 CPUs On 4 CPUs

  42. hello_mpi.pbs

  43. hello_mpi.pbs

  44. hello_mpi.pbs

  45. hello_mpi.pbs

  46. hello_mpi.pbs

  47. hello_mpi.pbs
