Profiling S3D on Cray XT3 using TAU

Sameer Shende


Acknowledgements

  • Alan Morris [UO]

  • Kevin Huck [UO]

  • Allen D. Malony [UO]

  • Kenneth Roche [ORNL]

  • Bronis R. de Supinski [LLNL]


TAU Parallel Performance System

  • http://www.cs.uoregon.edu/research/tau/

  • Multi-level performance instrumentation

    • Multi-language automatic source instrumentation

  • Flexible and configurable performance measurement

  • Widely-ported parallel performance profiling system

    • Computer system architectures and operating systems

    • Different programming languages and compilers

  • Support for multiple parallel programming paradigms

    • Multi-threading, message passing, mixed-mode, hybrid




Program Database Toolkit (PDT)

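[Figure: PDT architecture. Application / library source is parsed by the C / C++ parser or the Fortran (F77/90/95) parser into an intermediate language (IL). The C / C++ IL analyzer and the Fortran IL analyzer produce Program Database Files, which tools read through DUCTAPE:]

  • PDBhtml: program documentation

  • SILOON: application component glue

  • CHASM: C++ / F90/95 interoperability

  • TAU_instr: automatic source instrumentation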


PAPI

  • Performance Application Programming Interface

    • The purpose of the PAPI project is to design, standardize and implement a portable and efficient API to access the hardware performance monitor counters found on most modern microprocessors.

  • Parallel Tools Consortium project

  • Developed by University of Tennessee, Knoxville

  • http://icl.cs.utk.edu/papi/


S3D - Building with TAU

  • Change name of compiler in build/make.XT3

    • ftn => tau_f90.sh

    • cc => tau_cc.sh

  • Set compile time environment variables

    • setenv TAU_MAKEFILE /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.tau-callpath-multiplecounters-mpi-papi-pdt-pgi

      • Choose callpath, PAPI counters, MPI profiling, PDT for source instrumentation

    • setenv TAU_OPTIONS '-optTauSelectFile=select.tau -optPreProcess'

      • Selective instrumentation file eliminates instrumentation in lightweight routines

      • Pre-process Fortran source code using cpp before compiling

  • Set runtime environment variables in the job submission script to control instrumentation and select PAPI counters:

    • export TAU_THROTTLE=1

    • export COUNTER1=GET_TIME_OF_DAY

    • export COUNTER2=PAPI_FP_INS

    • export COUNTER3=PAPI_L1_DCM

    • export COUNTER4=PAPI_RES_STL

    • export COUNTER5=PAPI_L2_DCM
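
The build and run steps above can be sketched in a POSIX shell as follows. This is a minimal illustration, not the actual S3D build: the file `make.demo` and its variable names `F90` and `CC` are hypothetical stand-ins for whatever `build/make.XT3` really contains. The wrapper names and counter settings are the ones listed on the slide.

```shell
#!/bin/sh
# Hypothetical stand-in for build/make.XT3; the real file's variable
# names may differ.
workdir=$(mktemp -d)
cat > "$workdir/make.demo" <<'EOF'
F90 = ftn
CC  = cc
EOF

# Swap the Cray compiler drivers for the TAU wrapper scripts.
sed -e 's/= *ftn$/= tau_f90.sh/' -e 's/= *cc$/= tau_cc.sh/' \
    "$workdir/make.demo" > "$workdir/make.tau"
cat "$workdir/make.tau"

# Runtime settings for the job script (sh syntax requires NAME=value).
export TAU_THROTTLE=1
export COUNTER1=GET_TIME_OF_DAY
export COUNTER2=PAPI_FP_INS
export COUNTER3=PAPI_L1_DCM
export COUNTER4=PAPI_RES_STL
export COUNTER5=PAPI_L2_DCM
```

Working on a copy keeps the original makefile untouched, so the uninstrumented build stays available for comparison.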


Selective Instrumentation in TAU

% cat select.tau

BEGIN_EXCLUDE_LIST

MCADIF

GETRATES

TRANSPORT_M::MCAVIS_NEW

MCEDIF

MCACON

CKYTCP

THERMCHEM_M::MIXCP

THERMCHEM_M::MIXENTH

THERMCHEM_M::GIBBSENRG_ALL_DIMT

CKRHOY

MCEVAL4

THERMCHEM_M::HIS

THERMCHEM_M::CPS

THERMCHEM_M::ENTROPY

END_EXCLUDE_LIST

BEGIN_INSTRUMENT_SECTION

loops routine="#"

END_INSTRUMENT_SECTION
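
A file with this layout can also be generated from a list of routine names. The sketch below is one possible way to do that, assuming a short list of three routines taken from the file above. The variable names `routines` and `select_file` are illustrative only. In the instrument section, `#` is used as the wildcard for routine names, so `loops routine="#"` requests loop-level instrumentation in every routine that is not excluded.

```shell
#!/bin/sh
# Routines to exclude; these three names come from the slide's list.
routines="MCADIF GETRATES THERMCHEM_M::MIXCP"
select_file=$(mktemp)

# Write the exclude list followed by the loop instrumentation request.
{
    echo "BEGIN_EXCLUDE_LIST"
    for r in $routines; do
        echo "$r"
    done
    echo "END_EXCLUDE_LIST"
    echo "BEGIN_INSTRUMENT_SECTION"
    echo 'loops routine="#"'
    echo "END_INSTRUMENT_SECTION"
} > "$select_file"

# Sanity check: count the entries between the exclude-list markers.
excluded=$(awk '
    /^END_EXCLUDE_LIST/   { inside = 0 }
    inside                { n++ }
    /^BEGIN_EXCLUDE_LIST/ { inside = 1 }
    END                   { print n + 0 }
' "$select_file")

cat "$select_file"
echo "excluded routines: $excluded"
```

The generated file would then be passed to the compiler wrappers through `-optTauSelectFile`, as in the TAU_OPTIONS setting shown earlier.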


TAU’s ParaProf Profile Browser - Manager

Derived metric: FLOPS = PAPI_FP_INS / wallclock time
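
As a rough worked example of this derived metric, the sketch below divides a floating-point instruction count by a wallclock time. The two input values are made up for illustration and are not measured S3D data. It also assumes the wallclock value is reported in microseconds. Note that PAPI_FP_INS counts floating-point instructions, so the result only approximates a floating-point operation rate.

```shell
#!/bin/sh
# Hypothetical totals for one routine (not measured S3D data).
fp_ins=2500000000      # PAPI_FP_INS count
wall_usec=5000000      # wallclock time, assumed to be in microseconds

# FLOPS = PAPI_FP_INS / wallclock seconds, reported here in MFLOP/s.
mflops=$(awk -v fp="$fp_ins" -v us="$wall_usec" \
    'BEGIN { printf "%.1f", fp / (us / 1e6) / 1e6 }')
echo "$mflops MFLOP/s"
```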


Main Window - 8 CPUs (MPI Ranks 0-7)

Some routines execute on different sets of processors











Callpath Thread Relations Window

[Figure annotations: parent, routine, children]



TAU’s ParaProf Profile Browser - Manager

Different sections of code within the same routine execute on odd and even processors!




Getting Access to TAU on Jaguar

  • set path=(/spin/proj/perc/TOOLS/tau_latest/x86_64/bin $path)

  • Choose a stub makefile (set in the TAU_MAKEFILE environment variable) from /spin/proj/perc/TOOLS/tau_latest/xt3/lib/Makefile.*

    • Makefile.tau-mpi-pdt-pgi (flat profile)

    • Makefile.tau-mpi-pdt-pgi-trace (event trace, for use with Vampir)

    • Makefile.tau-callpath-mpi-pdt-pgi (single metric, callpath profile)

  • Binaries of S3D can be found in:

    • ~sameer/scratch/S3D-BINARIES

      • withtau

        • papi, multiplecounters, mpi, pdt, pgi options

      • without_tau
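
The choice of stub makefile can be wrapped in a small helper, sketched below for a POSIX shell. The function name `tau_stub_makefile` and the keywords `flat`, `trace`, and `callpath` are hypothetical conveniences. Only the directory and makefile names come from the slide. In sh, the equivalent of the csh `set path=(...)` line is `PATH=/spin/proj/perc/TOOLS/tau_latest/x86_64/bin:$PATH`.

```shell
#!/bin/sh
TAU_LIB=/spin/proj/perc/TOOLS/tau_latest/xt3/lib

# Map a measurement kind to one of the stub makefiles listed above.
tau_stub_makefile() {
    case "$1" in
        flat)     echo "$TAU_LIB/Makefile.tau-mpi-pdt-pgi" ;;
        trace)    echo "$TAU_LIB/Makefile.tau-mpi-pdt-pgi-trace" ;;
        callpath) echo "$TAU_LIB/Makefile.tau-callpath-mpi-pdt-pgi" ;;
        *)        echo "unknown measurement kind: $1" >&2; return 1 ;;
    esac
}

# Select the callpath configuration for the next build.
TAU_MAKEFILE=$(tau_stub_makefile callpath) || exit 1
export TAU_MAKEFILE
echo "$TAU_MAKEFILE"
```

Rebuilding with a different stub makefile changes what is measured without touching the application source.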

