
OpenMP


Presentation Transcript


  1. OpenMP Presented by Kyle Eli

  2. OpenMP • Open • Open, Collaborative Specification • Managed by the OpenMP Architecture Review Board (ARB) • MP • Multi Processing

  3. OpenMP is… • A specification for using shared memory parallelism in Fortran and C/C++. • Compiler Directives • Library Routines • Environment Variables • Usable with • Fortran 77, Fortran 90, ANSI 89 C or ANSI C++ • Does not require Fortran 90 or C++

  4. OpenMP requires… • Platform support • Many operating systems, including Windows, Solaris, Linux, AIX, HP-UX, IRIX, OSX • Many CPU architectures, including x86, x86-64, PowerPC, Itanium, PA-RISC, MIPS • Compiler support • Many commercial compilers from vendors such as Microsoft, Intel, Sun, and IBM • GCC via GOMP • Should be included in GCC 4.2 • May already be available with some distributions

  5. OpenMP offers… • Consolidation of vendor-specific implementations • Single-source portability • Support for coarse grain parallelism • Allows for complex (coarse grain) code in parallel applications.

  6. OpenMP offers… • Scalability • Simple constructs with low overhead • However, still dependent on the application and algorithm. • Nested parallelism • May be executed on a single thread • Loop-level parallelism • However, no support for task parallelism • They’re working on it

  7. OpenMP compared to… • Message Passing Interface (MPI) • OpenMP is not a message passing specification • Less overhead • High Performance Fortran (HPF) • Not widely accepted • Focus on data parallelism

  8. OpenMP compared to… • Pthreads • Not targeted for HPC/scientific computing • No support for data parallelism • Requires lower-level programming • FORALL loops • Simple loops • Subroutine calls can’t have side-effects • Various parallel programming languages • May be architecture specific • May be application specific

  9. The OpenMP Model • Sequential code • Implemented in the usual way • Executes normally • Parallel code • Multiple threads created • Number of threads can be user-specified • Each thread executes the code in the parallel region

  10. Using OpenMP • Compiler directives • Begin with #pragma omp • In C/C++, the code region is defined by curly braces following the directive • Should be ignored by compilers that don’t understand OpenMP • Define how regions of code should be executed • Define variable scope • Synchronization

  11. Using OpenMP • Parallel region construct • #pragma omp parallel • Defines a region of parallel code • Causes a team of threads to be created • Each thread in the team executes the code in the region • Threads join at the implicit barrier when the region ends
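A minimal sketch of a parallel region in C, assuming an OpenMP-enabled compiler (e.g. gcc -fopenmp):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        /* A team of threads is created here; every thread runs the block. */
        #pragma omp parallel
        {
            int id = omp_get_thread_num();         /* this thread's id */
            int nthreads = omp_get_num_threads();  /* size of the team */
            printf("Hello from thread %d of %d\n", id, nthreads);
        }   /* implicit barrier: threads join before execution continues */
        return 0;
    }

The number of threads can be set with the OMP_NUM_THREADS environment variable or a num_threads clause; the output order is not deterministic.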

  12. Using OpenMP • Work-sharing directives • For • Sections • Single

  13. Using OpenMP • For construct • #pragma omp for • Loop parallelism • Iterations of the loop are divided amongst worker threads • Workload division can be user-specified • Branching out of the loop is not allowed
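A sketch of the For construct inside a parallel region; the schedule clause shown is one way to specify how iterations are divided:

    #include <omp.h>

    /* Add two arrays; loop iterations are split among the team's threads. */
    void vector_add(const double *a, const double *b, double *c, int n)
    {
        int i;
        #pragma omp parallel
        {
            #pragma omp for schedule(static)
            for (i = 0; i < n; i++)
                c[i] = a[i] + b[i];
        }   /* implicit barrier at the end of the for construct */
    }

The combined form #pragma omp parallel for does the same thing in one directive.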

  14. Using OpenMP • Sections construct • #pragma omp sections • Divides code into sections which are divided amongst worker threads • #pragma omp section • Used to define each section
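A sketch of the Sections construct, with two hypothetical placeholder tasks standing in for real work:

    #include <stdio.h>
    #include <omp.h>

    /* Hypothetical placeholder tasks; any independent pieces of work fit here. */
    static void compute_part_one(void) { printf("part one on thread %d\n", omp_get_thread_num()); }
    static void compute_part_two(void) { printf("part two on thread %d\n", omp_get_thread_num()); }

    void run_sections(void)
    {
        #pragma omp parallel sections
        {
            #pragma omp section
            compute_part_one();   /* executed by one thread of the team */

            #pragma omp section
            compute_part_two();   /* may run concurrently in another thread */
        }
    }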

  15. Using OpenMP • Single construct • #pragma omp single • Only one thread executes the code • Useful when code is not thread-safe • All other threads wait until execution completes
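A sketch of the Single construct, used here for setup that should happen exactly once:

    #include <stdio.h>
    #include <omp.h>

    void initialise_once(void)
    {
        #pragma omp parallel
        {
            /* Only one thread executes this block, e.g. non-thread-safe setup;
               the rest wait at the implicit barrier at the end of the construct. */
            #pragma omp single
            printf("set up once by thread %d\n", omp_get_thread_num());

            /* ...all threads continue here... */
        }
    }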

  16. Using OpenMP • Synchronization directives • Master • #pragma omp master • Code is executed only by the master thread • Critical • #pragma omp critical • Code is executed by only one thread at a time • Barrier • #pragma omp barrier • Threads will wait for all other threads to reach this point before continuing • Atomic • #pragma omp atomic • The following statement (which must be an assignment) is executed by only one thread at a time.
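A sketch that exercises the four directives above on a shared counter:

    #include <stdio.h>
    #include <omp.h>

    void synchronization_demo(void)
    {
        int hits = 0;                       /* shared by default */

        #pragma omp parallel
        {
            #pragma omp master
            printf("printed only by the master thread\n");

            #pragma omp barrier             /* wait for the whole team */

            #pragma omp critical
            {
                hits = hits + 1;            /* one thread at a time in this block */
            }

            #pragma omp atomic              /* the next update is performed atomically */
            hits += 1;
        }
        printf("hits = %d\n", hits);        /* 2 * number of threads */
    }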

  17. Using OpenMP • Synchronization Directives • Flush • #pragma omp flush • Thread-visible variables are written back to memory to present a consistent view across all threads • Ordered • #pragma omp ordered • Forces iterations of a loop to be executed in sequential order • Used with the For directive • Threadprivate • #pragma omp threadprivate • Causes global variables to be local and persistent to a thread across multiple parallel regions
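A sketch of Ordered and Threadprivate together (Flush is omitted, since it is rarely needed explicitly):

    #include <stdio.h>
    #include <omp.h>

    int counter = 0;
    #pragma omp threadprivate(counter)      /* persistent per-thread copy */

    void ordered_demo(int n)
    {
        int i;
        /* The ordered clause on the loop permits the ordered directive inside it. */
        #pragma omp parallel for ordered
        for (i = 0; i < n; i++) {
            counter++;                      /* updates this thread's private copy */
            #pragma omp ordered
            printf("iteration %d prints in sequential order\n", i);
        }
    }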

  18. Using OpenMP • Data Scope • By default, most variables are shared • Loop index and subroutine stack variables are private

  19. Using OpenMP • Data scoping attributes • Private • New object of the same type is created for each thread • Not initialized • Shared • Shared amongst all threads • Default • Allows specification of default scope (Private, Shared, or None) • Firstprivate • Variable is initialized with the value from the original object

  20. Using OpenMP • Data scoping attributes • Lastprivate • Original object is updated with data from last section or loop iteration • Copyin • Variable in each thread is initialized with the data from the original object in the master thread • Reduction • Each thread gets a private copy of the variable, and the reduction clause allows specification of an operator for combining the private copies into the final result
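A sketch of the Reduction clause, the most common of these attributes in practice:

    #include <omp.h>

    /* Each thread accumulates into a private copy of sum; the copies are
       combined with + when the loop finishes. */
    double array_sum(const double *a, int n)
    {
        double sum = 0.0;
        int i;
        #pragma omp parallel for default(shared) reduction(+:sum)
        for (i = 0; i < n; i++)
            sum += a[i];
        return sum;
    }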

  21. Using OpenMP

  22. OpenMP Example • A short OpenMP example…
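The example itself is not reproduced in the transcript; the sketch below, a common OpenMP demonstration that estimates pi by numerical integration, is a stand-in that combines the constructs covered above:

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        const int n = 1000000;
        const double h = 1.0 / n;
        double sum = 0.0;
        int i;

        #pragma omp parallel for reduction(+:sum)
        for (i = 0; i < n; i++) {
            double x = (i + 0.5) * h;       /* midpoint of interval i */
            sum += 4.0 / (1.0 + x * x);
        }

        printf("pi is approximately %.12f\n", h * sum);
        return 0;
    }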

  23. References • http://www.openmp.org • http://www.llnl.gov/computing/tutorials/openMP/

  24. MPI By Chris Van Horn

  25. What is MPI? • Message Passing Interface • More specifically a library specification for a message passing interface

  26. Why MPI? • What are the advantages of a message passing interface? • What could a message passing interface be used for?

  27. History of MPI • MPI 1.1 • Before MPI, everyone had to implement their own message passing interface • A committee of around 60 people from 40 organizations was formed

  28. MPI 1.1 • The standardization process began in April 1992 • Preliminary draft submitted November 1992 • Just meant to get the ball rolling

  29. MPI 1.1 Continued • Subcommittees were formed for the major component areas • Goal to produce standard by Fall 1993

  30. MPI Goals • Design an application programming interface (not necessarily for compilers or a system implementation library). • Allow efficient communication: avoid memory-to-memory copying, allow overlap of computation and communication, and offload to a communication co-processor where available. • Allow for implementations that can be used in a heterogeneous environment. • Allow convenient C and Fortran 77 bindings for the interface. • Assume a reliable communication interface: the user need not cope with communication failures. Such failures are dealt with by the underlying communication subsystem.

  31. MPI Goals Continued • Define an interface that is not too different from current practice, such as PVM, NX, Express, p4, etc., and provides extensions that allow greater flexibility. • Define an interface that can be implemented on many vendors' platforms, with no significant changes in the underlying communication and system software. • Semantics of the interface should be language independent. • The interface should be designed to allow for thread safety.

  32. MPI 2.0 • In March 1995 work began on extensions to MPI 1.1 • Forward Compatibility was preserved

  33. Goals of MPI 2.0 • Further corrections and clarifications for the MPI-1.1 document. • Additions to MPI-1.1 that do not significantly change its types of functionality (new datatype constructors, language interoperability, etc.). • Completely new types of functionality (dynamic processes, one-sided communication, parallel I/O, etc.) that are what everyone thinks of as "MPI-2 functionality." • Bindings for Fortran 90 and C++. The MPI-2 document specifies C++ bindings for both MPI-1 and MPI-2 functions, and extensions to the Fortran 77 binding of MPI-1 and MPI-2 to handle Fortran 90 issues. • Discussions of areas in which the MPI process and framework seem likely to be useful, but where more discussion and experience are needed before standardization (e.g. 0-copy semantics on shared-memory machines, real-time specifications).

  34. How MPI is used • An MPI program consists of autonomous processes • The processes communicate via calls to MPI communication primitives
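A minimal sketch of such a program: two autonomous processes, with rank 0 sending one integer to rank 1 via a point-to-point primitive (run with e.g. mpiexec -n 2 a.out):

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, value;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }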

  35. Features • Process Management • One Sided Communication • Collective Operations • I/O

  36. What MPI Does Not Do • Resource control • The Forum was not able to design a portable interface that would be appropriate for the broad spectrum of existing and potential resource and process controllers.

  37. Process Management • Can be tricky to implement properly • What to watch out for: • The MPI-2 process model must apply to the vast majority of current parallel environments. These include everything from tightly integrated MPPs to heterogeneous networks of workstations. • MPI must not take over operating system responsibilities. It should instead provide a clean interface between an application and system software.

  38. Warnings continued • MPI must continue to guarantee communication determinism, i.e., process management must not introduce unavoidable race conditions. • MPI must not contain features that compromise performance. • MPI-1 programs must work under MPI-2, i.e., the MPI-1 static process model must be a special case of the MPI-2 dynamic model.

  39. How Issues Addressed • MPI remains primarily a communication library. • MPI does not change the concept of communicator.

  40. One Sided Communication • Functions that establish communication between two sets of MPI processes that do not share a communicator. • When would one sided communication be useful?

  41. One Sided Communication • How are the two sets of processes going to communicate with each other? • Need some sort of rendezvous point.
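A server-side sketch of such a rendezvous, assuming the MPI-2 port functions (the client would call MPI_Comm_connect with the same port name, obtained out of band or via MPI_Publish_name/MPI_Lookup_name):

    #include "mpi.h"

    void accept_one_client(MPI_Comm *intercomm)
    {
        char port_name[MPI_MAX_PORT_NAME];

        MPI_Open_port(MPI_INFO_NULL, port_name);   /* the rendezvous point */
        /* ...make port_name known to the client out of band... */
        MPI_Comm_accept(port_name, MPI_INFO_NULL, 0, MPI_COMM_SELF, intercomm);
        MPI_Close_port(port_name);
        /* *intercomm now joins the two sets of processes */
    }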

  42. Collective Operations • Intercommunicator collective operations • All-To-All • All processes contribute to the result. All processes receive the result. • MPI_Allgather, MPI_Allgatherv • MPI_Alltoall, MPI_Alltoallv • MPI_Allreduce, MPI_Reduce_scatter • All-To-One • All processes contribute to the result. One process receives the result. • MPI_Gather, MPI_Gatherv • MPI_Reduce

  43. Collective Operations • One-To-All • One process contributes to the result. All processes receive the result. • MPI_Bcast • MPI_Scatter, MPI_Scatterv • Other • Collective operations that do not fit into one of the above categories. • MPI_Scan • MPI_Barrier
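A sketch of a one-to-all and an all-to-one collective; for brevity it uses the intracommunicator MPI_COMM_WORLD, though MPI-2 also defines these operations on intercommunicators:

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int rank, value, sum;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        value = (rank == 0) ? 100 : 0;
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);                 /* one-to-all */

        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);  /* all-to-one */
        if (rank == 0)
            printf("broadcast value = %d, sum of ranks = %d\n", value, sum);

        MPI_Finalize();
        return 0;
    }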

  44. I/O • Optimizations required for efficiency can only be implemented if the parallel I/O system provides a high-level interface
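A sketch of that high-level interface (MPI-IO): each rank writes its own block of integers into one shared file at a rank-dependent offset; the file name is illustrative:

    #include "mpi.h"

    void write_block(int *buf, int count)
    {
        int rank;
        MPI_File fh;
        MPI_Offset offset;

        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        offset = (MPI_Offset)rank * count * sizeof(int);

        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
        MPI_File_write_at(fh, offset, buf, count, MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }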

  45. MPI Implementations • Many different implementations; the most widely used are MPICH (MPI 1.1) and MPICH2 (MPI 2.0) • Developed at Argonne National Laboratory

  46. Examples • To run the program "ocean" with arguments "-gridfile" and "ocean1.grd" in C:

      char command[] = "ocean";
      char *argv[] = {"-gridfile", "ocean1.grd", NULL};
      MPI_Comm_spawn(command, argv, ...);

  • To run the program "ocean" with arguments "-gridfile" and "ocean1.grd" and the program "atmos" with argument "atmos.grd" in C:

      char *array_of_commands[2] = {"ocean", "atmos"};
      char **array_of_argv[2];
      char *argv0[] = {"-gridfile", "ocean1.grd", (char *)0};
      char *argv1[] = {"atmos.grd", (char *)0};
      array_of_argv[0] = argv0;
      array_of_argv[1] = argv1;
      MPI_Comm_spawn_multiple(2, array_of_commands, array_of_argv, ...);

  47. More Examples

      /* manager */
      #include "mpi.h"
      int main(int argc, char *argv[])
      {
          int world_size, universe_size, *universe_sizep, flag;
          MPI_Comm everyone;            /* intercommunicator */
          char worker_program[100];
          MPI_Init(&argc, &argv);
          MPI_Comm_size(MPI_COMM_WORLD, &world_size);
          if (world_size != 1) error("Top heavy with management");
          MPI_Attr_get(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE,
                       &universe_sizep, &flag);

  48. Example Continued

          if (!flag) {
              printf("This MPI does not support UNIVERSE_SIZE. How many\n\
      processes total?");
              scanf("%d", &universe_size);
          } else universe_size = *universe_sizep;
          if (universe_size == 1) error("No room to start workers");
          /*
           * Now spawn the workers. Note that there is a run-time determination
           * of what type of worker to spawn, and presumably this calculation must
           * be done at run time and cannot be calculated before starting
           * the program. If everything is known when the application is
           * first started, it is generally better to start them all at once
           * in a single MPI_COMM_WORLD.
           */
          choose_worker_program(worker_program);
          MPI_Comm_spawn(worker_program, MPI_ARGV_NULL, universe_size-1,
                         MPI_INFO_NULL, 0, MPI_COMM_SELF, &everyone,
                         MPI_ERRCODES_IGNORE);
          /*
           * Parallel code here. The communicator "everyone" can be used
           * to communicate with the spawned processes, which have ranks 0,..
           * MPI_UNIVERSE_SIZE-1 in the remote group of the intercommunicator
           * "everyone".
           */
          MPI_Finalize();
          return 0;
      }

  49. Yet More Example

      /* worker */
      #include "mpi.h"
      int main(int argc, char *argv[])
      {
          int size;
          MPI_Comm parent;
          MPI_Init(&argc, &argv);
          MPI_Comm_get_parent(&parent);
          if (parent == MPI_COMM_NULL) error("No parent!");
          MPI_Comm_remote_size(parent, &size);
          if (size != 1) error("Something's wrong with the parent");
          /*
           * Parallel code here.
           * The manager is represented as the process with rank 0 in (the remote
           * group of) MPI_COMM_PARENT. If the workers need to communicate among
           * themselves, they can use MPI_COMM_WORLD.
           */
          MPI_Finalize();
          return 0;
      }

  50. References • MPI Standards (http://www-unix.mcs.anl.gov/mpi/mpi-standard/mpi-report-2.0/mpi2-report.htm)
