
Tutorial on MPI Experimental Environment for ECE5610/CSC6220



  1. Tutorial on MPI Experimental Environment for ECE5610/CSC6220

  2. Outline • The WSU Grid cluster • How to login to the Grid • How to run your program on a single node • How to run your program on multiple nodes

  3. WSU Grid Cluster The WSU Grid Cluster is a high-performance computing system that hosts and manages research-related projects. The Grid currently has a combined processing power of 4,568 cores: 1,346 Intel cores and 3,222 AMD cores, with over 13.5TB of RAM and 1.2PB of disk space. The system is open to every researcher at WSU.

  4. Login to the Grid Host name: grid.wayne.edu, Port: 22 • Download putty.exe: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

  5. Login to the Grid • Use putty.exe to login Username: ab1234 (your AccessID) Password: your pipeline password

  6. Login to the Grid You can start writing an MPI program now!

  7. Start MPI Programming • MPI environment • Initialize and finalize • Know who I am and my community • Writing MPI programs • Similar to writing a C program • Call MPI functions • Compiling and running MPI programs • Compiler: mpicc • Execution: mpiexec • Example: copy hello.c to your home directory
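The compile-and-run cycle above can be sketched with two commands. This is a sketch, assuming mpicc and mpiexec are on your PATH (on the Grid you may need the full MPICH path shown in the job script on slide 17) and that hello.c is in the current directory:

```shell
# Compile with the MPI wrapper compiler, which adds the MPI
# include paths and libraries to the underlying C compiler.
mpicc hello.c -o hello

# Launch 4 processes of the program on the current node.
mpiexec -n 4 ./hello
```

Each of the 4 processes prints its own greeting, which is how the rank-order questions on slide 11 arise.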

  8. Initialize and Finalize the Environment • Initialize the MPI environment before calling any other MPI function: int MPI_Init(int *argc, char ***argv) • Finalize the MPI environment before terminating your program: int MPI_Finalize(void) • Both functions must be called by all processes, and no other MPI calls are allowed before MPI_Init or after MPI_Finalize.

  9. Finding out about the Environment • Two important questions that arise early in a parallel program are: • How many processes are participating in this computation? • Who am I? • MPI provides functions to answer these questions • MPI_Comm_size reports the number of processes. • MPI_Comm_rank reports the rank, a number between 0 and size-1, identifying the calling process.

  10. First Program hello.c: “Hello World!”

    #include "mpi.h"
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int myid, numprocs, namelen;
        char processor_name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Get_processor_name(processor_name, &namelen);
        printf("Hello world, I am process %d of %d on %s\n",
               myid, numprocs, processor_name);
        MPI_Finalize();
        return 0;
    }

  11. Compile and run your program • Questions: • Why is the rank order random? • Can we serialize the rank order?

  12. MPI Basic (Blocking) Send MPI_Send(start, count, datatype, dest, tag, comm) • The message buffer is described by (start, count, datatype). • The target process is specified by dest, which is the rank of the target process in the communicator specified by comm. • When this function returns, the data has been delivered to the system and the buffer can be reused. The message may not have been received by the target process.

  13. MPI Basic (Blocking) Receive MPI_Recv(start, count, datatype, source, tag, comm, status) • Waits until a matching (on source and tag) message is received from the system, after which the buffer can be used. • source is the rank in the communicator specified by comm, or MPI_ANY_SOURCE.
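The send/receive semantics above can be sketched as a minimal two-process exchange. This is a sketch, assuming an MPI installation and a launch with at least two processes (e.g. mpiexec -n 2 ./a.out); the value 42 and tag 0 are arbitrary choices for illustration:

```c
#include "mpi.h"
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* (start, count, datatype, dest, tag, comm) */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocks until a message matching source 0 and tag 0 arrives. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("Process 1 received %d from process 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```

Note that the receive specifies a concrete source and tag, so the match is deterministic; replacing them with MPI_ANY_SOURCE/MPI_ANY_TAG would accept any incoming message.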

  14. Processes Execution in Order I. Process i sends a message to process i+1. II. After receiving the message, process i+1 sends its own message.

  15. The Program hello_order.c

    #include "mpi.h"
    #include <stdio.h>
    #include <string.h>   /* for strcpy and strlen */

    int main(int argc, char *argv[])
    {
        int myid, numprocs, namelen;
        char processor_name[MPI_MAX_PROCESSOR_NAME];
        char message[100];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        MPI_Get_processor_name(processor_name, &namelen);

        if (myid == 0) {
            printf("Hello world, I am process %d of %d on %s\n",
                   myid, numprocs, processor_name);
            strcpy(message, "next");
            MPI_Send(message, strlen(message) + 1, MPI_CHAR, myid + 1, 99, MPI_COMM_WORLD);
        } else if (myid < numprocs - 1) {
            MPI_Recv(message, 100, MPI_CHAR, myid - 1, 99, MPI_COMM_WORLD, &status);
            printf("Hello world, I am process %d of %d on %s\n",
                   myid, numprocs, processor_name);
            MPI_Send(message, strlen(message) + 1, MPI_CHAR, myid + 1, 99, MPI_COMM_WORLD);
        } else {
            MPI_Recv(message, 100, MPI_CHAR, myid - 1, 99, MPI_COMM_WORLD, &status);
            printf("Hello world, I am process %d of %d on %s\n",
                   myid, numprocs, processor_name);
        }

        MPI_Finalize();
        return 0;
    }

  16. Results Discussion: parallel programs use message communication (here, a send/receive chain) to achieve a deterministic output order.

  17. Run programs on multiple nodes • Edit the job running script, job.sh:

    #!/bin/bash
    #PBS -l ncpus=4
    #PBS -l nodes=2:ppn=2
    #PBS -m ea
    #PBS -q mtxq
    #PBS -o grid.wayne.edu:~fb4032/tmp3/output_file.64
    #PBS -e grid.wayne.edu:~fb4032/tmp3/error_file.64
    /wsu/arch/x86_64/mpich/mpich-3.0.4-icc/bin/mpiexec \
        -machinefile $PBS_NODEFILE \
        -n 8 \
        /wsu/home/fb/fb40/fb4032/main

  • This job requests 2 nodes with 2 processors each, and it is submitted to the queue mtxq.

  18. Run programs on multiple nodes • -l specifies the resource list: ncpus - number of CPUs; nodes - number of nodes • -o specifies the location of the output file • -e specifies the location of the error file

  19. Execution on Multiple Nodes • Make sure you change the permissions of job.sh before you submit it.
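Changing the permissions can be done with chmod. A minimal sketch (the touch line is only there to make the example self-contained; on the Grid your job.sh already exists):

```shell
touch job.sh        # placeholder so the example runs standalone
chmod u+x job.sh    # add the execute bit for the owner
ls -l job.sh        # the mode column should now begin with -rwx for the user
```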

  20. Execution on Multiple Nodes • Use “qsub job.sh” to submit the job • Use “qme” to check the status of the job • Use “qdel 971324.vpbs1” to delete the job if necessary. (971324.vpbs1 is the job ID).

  21. Execution on Multiple Nodes • The output will be copied to the location specified in job.sh. • It is in ~/tmp3/output_file.64 in this case.

  22. Useful Links • Grid tutorial: http://www.grid.wayne.edu/resources/tutorials/index.html • Job scheduling on grid: http://www.grid.wayne.edu/resources/tutorials/pbs.html • Step by step to run jobs on Grid: http://www.grid.wayne.edu/resources/tutorials/jobs/index.html
