
cc Compiler Parallelization Options


Presentation Transcript


  1. cc Compiler Parallelization Options CSE 260 Mini-project Fall 2001 John Kerwin

  2. Background • The Sun Workshop 6.1 cc compiler does not support OpenMP, but it does contain Multi Processing options similar to those in OpenMP. FOR MORE INFO... Chapter 4 of the C User's Guide at http://docs.sun.com/htmlcoll/coll.33.7/iso-8859-1/CUG/parallel.html discusses how the compiler can Parallelize Sun ANSI/ISO C Code. Slides from "Application Tuning on Sun Systems" by Ruud van der Pas at http://www.uni-koeln.de/RRZK/server/sunfire/Koln_Talk_Jun2001_Summary_HO.pdf contain a lot of useful information about compiler options.

  3. Three Ways to Enable Compiler Parallelization • -xautopar Automatic parallelization • Just compile and run on a multiprocessor. • Use a command such as "setenv PARALLEL 8" to set the number of processors at runtime. • -xexplicitpar Explicit parallelization only • Use pragmas similar to those used in OpenMP to guide the compiler. • -xparallel Automatic and explicit parallelization
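  As a sketch of the -xautopar workflow just described (the compiler flags and the setenv command come from these slides; the file name, array sizes, and the loop itself are illustrative):

     /* saxpy.c -- a loop whose iterations are independent, so the compiler
      * can parallelize it.  Hypothetical build and run sequence (csh syntax
      * for setenv, as on the slide):
      *
      *   cc -xO3 -xautopar -xloopinfo -o saxpy saxpy.c
      *   setenv PARALLEL 8        # number of processors to use at runtime
      *   ./saxpy
      */
     #include <stdio.h>

     #define N 1000000

     static double x[N], y[N];

     int main(void)
     {
         int i;
         double a = 2.0;

         for (i = 0; i < N; i++)          /* independent iterations: a good */
             y[i] = y[i] + a * x[i];      /* candidate for -xautopar        */

         printf("y[0] = %f\n", y[0]);
         return 0;
     }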

  4. -xautopar • Requires -xO3 or higher optimization • Includes -xdepend • -xdepend analyzes loops for inter-iteration data dependencies and restructures them if possible to allow different iterations of the loop to be executed in parallel. • -xautopar analyzes every loop in the program and generates parallel code for parallelizable loops.
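  The distinction -xdepend draws can be illustrated with two small loops (the function and variable names are made up for this sketch): the first has no cross-iteration dependence, the second carries one.

     /* No iteration reads a value written by another iteration, so the
      * iterations can safely run in parallel. */
     void independent(double *a, const double *b, int n)
     {
         int i;
         for (i = 0; i < n; i++)
             a[i] = b[i] * b[i];
     }

     /* Iteration i reads a[i-1], which iteration i-1 writes, so the loop
      * cannot be parallelized as written; the compiler must either
      * restructure it or leave it serial. */
     void carried(double *a, int n)
     {
         int i;
         for (i = 1; i < n; i++)
             a[i] = a[i] + a[i - 1];
     }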

  5. How Automatic Parallelization Works • At the beginning of the program, the master thread spawns slave threads to execute the parallel code. • The slave threads wait idly until the master thread encounters a parallelizable loop that is profitable to execute in parallel. • When it reaches such a loop, the master thread assigns different iterations of the loop to the slave threads, and all the threads synchronize at a barrier at the end of the loop.

  6. How Automatic Parallelization Works (continued) • The master thread uses an estimate of the granularity of each loop (the number of iterations versus the overhead of distributing work to threads and synchronizing) to determine whether or not it is profitable to execute the loop in parallel. • If it cannot determine the granularity of the loop at compile time, it generates both serial and parallel versions of the loop and calls the parallel version at runtime only if the number of iterations justifies the overhead.
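  The two-version idea can be pictured roughly as the following hand-written C; the threshold value is invented for illustration, and the real generated code dispatches the parallel version to the waiting slave threads rather than running it inline.

     #define PROFITABLE_TRIP_COUNT 1000   /* hypothetical threshold */

     void scale(double *a, int n, double s)
     {
         int i;
         if (n < PROFITABLE_TRIP_COUNT) {
             /* serial version: too few iterations to repay the overhead of
              * distributing work and synchronizing */
             for (i = 0; i < n; i++)
                 a[i] *= s;
         } else {
             /* parallel version: in the generated code the iterations are
              * handed to the slave threads and all threads meet at a barrier
              * after the loop; shown here as the same loop for brevity */
             for (i = 0; i < n; i++)
                 a[i] *= s;
         }
     }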

  7. How Effective is -xautopar? • Success or failure with -xautopar depends on • Type of application • Coding style • Quality of the compiler • The compiler may not be able to automatically parallelize the loops in the most efficient manner. • This can happen if: • The data dependency analysis is unable to determine whether or not it is safe to parallelize a loop. • The granularity is not high enough because the compiler lacks the information needed to parallelize the loop at the highest possible level. • Use the -xloopinfo option to print parallelization messages showing which loops were parallelized.
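  The granularity point is easiest to see with a nested loop. In this illustrative sketch (array sizes invented), parallelizing only the inner loop forces a fork/join for every outer iteration, whereas parallelizing the outer loop pays the overhead once.

     #define M 1000

     static double a[M][M], b[M][M];

     void smooth(void)
     {
         int i, j;
         for (i = 0; i < M; i++)          /* the profitable level to parallelize */
             for (j = 1; j < M - 1; j++)  /* parallelizing only this loop means a
                                             fork/join for every value of i */
                 a[i][j] = 0.5 * (b[i][j - 1] + b[i][j + 1]);
     }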

  8. -xexplicitpar • This is where explicit parallelization through pragmas comes into the picture. • -xexplicitpar allows the programmer to insert pragmas into the code to guide the compiler on how to parallelize certain loops. • The programmer is responsible for ensuring the pragmas are used correctly; otherwise, the results are undefined. • Use -xvpara to print compiler warnings about potentially misused pragmas.
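  A hedged sketch of explicit parallelization: the pragma asserts that the loop's iterations are independent (the MP taskloop syntax is shown on the next two slides). A plausible build line, based on the flags above, would be something like "cc -xO3 -xexplicitpar -xvpara relax.c"; if the pragma were placed on a loop with a real cross-iteration dependence, the results would be undefined, and -xvpara asks the compiler to warn about such cases.

     #define N 100000

     static double a[N], b[N];

     void relax(void)
     {
         int i;
     #pragma MP taskloop
         for (i = 1; i < N - 1; i++)
             a[i] = 0.5 * (b[i - 1] + b[i + 1]);   /* reads b, writes a[i] only:
                                                      iterations are independent */
     }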

  9. Examples of some Pragmas Similar to OpenMP Pragmas • Static Scheduling: all the iterations of the loop are uniformly distributed among all the participating processors.
     #pragma MP taskloop schedtype(static)
     for (i = 1; i < N-1; i++) { ... }
   similar to
     #pragma omp for schedule(static)

  10. Examples of some Pragmas Similar to OpenMP Pragmas • Dynamic Scheduling with a specified chunk_size:
     #pragma MP taskloop schedtype(self(120))
   similar to
     #pragma omp for schedule(dynamic, 120)
  • Guided Dynamic Scheduling with a minimum chunk_size:
     #pragma MP taskloop schedtype(gss(10))
   similar to
     #pragma omp for schedule(guided, 10)
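  The choice of schedtype matters when iterations do unequal amounts of work. In this illustrative triangular loop (the sizes and the gss chunk value are made up), a static split gives the first threads far more work than the last, while gss(10) hands out shrinking chunks on demand and keeps all processors busy.

     #define M 2000

     static double a[M][M];
     static double rowsum[M];

     void triangle(void)
     {
         int i;
     #pragma MP taskloop schedtype(gss(10))
         for (i = 0; i < M; i++) {
             int j;
             double s = 0.0;
             for (j = i; j < M; j++)      /* iteration i does M - i units of work */
                 s += a[i][j];
             rowsum[i] = s;
         }
     }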

  11. Speedup Using Static, Dynamic, and Guided MP Pragmas with 8 Processors

  12. Speedup from MPI, Pthreads, and Sun MP Programs with 8 Processors

  13. Time Spent Converting Serial Code to Parallel Code

  14. Coming Soon: OpenMP • OpenMP is supported in the Workshop 6.2 C compiler • #include <omp.h>
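  For comparison, the same kind of loop written with OpenMP directives, which the Workshop 6.2 C compiler is expected to accept (the flags needed to enable OpenMP are not given on the slide, so none are shown here):

     #include <stdio.h>
     #include <omp.h>

     #define N 1000000

     static double x[N], y[N];

     int main(void)
     {
         int i;

     #pragma omp parallel for schedule(static)
         for (i = 1; i < N - 1; i++)
             y[i] = 0.5 * (x[i - 1] + x[i + 1]);

         printf("ran with up to %d threads\n", omp_get_max_threads());
         return 0;
     }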
