
TAU’s MPI Wrapper Interposition Library


osman


Presentation Transcript


  1. TAU’s MPI Wrapper Interposition Library
     • Uses standard MPI Profiling Interface
     • Provides name shifted interface
     • MPI_Send = PMPI_Send
     • Weak bindings
     • Interpose TAU’s MPI wrapper library between MPI and TAU
     • -lmpi replaced by -lTauMpi -lpmpi -lmpi
     • No change to the source code!
     • Just re-link the application to generate performance data
     • setenv TAU_MAKEFILE <dir>/<arch>/lib/Makefile.tau-mpi-[options]
     • Use tau_cxx.sh, tau_f90.sh and tau_cc.sh as compilers
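The link-line substitution on this slide can be sketched as a small shell function. This only illustrates the rewrite the slide describes; it is not how the tau_cc.sh family of scripts is implemented. The function name tau_link_flags and the sample object files are assumptions made for the example, and it is written for a POSIX sh rather than the csh implied by setenv.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name, not part of TAU): rewrite a link
# line so that the TAU wrapper library and the PMPI library come before the
# MPI library, as the slide describes.
tau_link_flags() {
    out=""
    for arg in "$@"; do
        case "$arg" in
            -lmpi) out="$out -lTauMpi -lpmpi -lmpi" ;;   # interpose the wrapper
            *)     out="$out $arg" ;;                    # pass everything else through
        esac
    done
    # Drop the leading space left by the accumulation above.
    printf '%s\n' "${out# }"
}

# Sample link line (object and program names are placeholders).
tau_link_flags f1.o f2.o -o app -lmpi -lm
# prints: f1.o f2.o -o app -lTauMpi -lpmpi -lmpi -lm
```

Because the wrapper library comes first on the link line, its MPI_* definitions are the ones the application resolves, and the wrappers can still reach the real implementation through the PMPI_* names.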

  2. Runtime MPI Shared Library Instrumentation
     • We can now interpose the MPI wrapper library for applications that have already been compiled
     • No re-compilation or re-linking necessary!
     • Uses LD_PRELOAD for Linux
     • On AIX, TAU uses MPI_EUILIB / MPI_EUILIBPATH
     • Simply compile TAU with MPI support and prefix your MPI program with tauex
         % mpirun -np 4 tauex a.out
     • Requires shared library MPI - does not work on XT3
     • Approach will work with other shared libraries
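The LD_PRELOAD mechanism mentioned on this slide can be sketched in a few lines of POSIX sh. This is not tauex itself, whose internals the slide does not show; the helper name prepend_preload and the library path are hypothetical. It shows only the general idea: a launcher places a wrapper library at the front of LD_PRELOAD so the dynamic loader resolves the wrapped symbols from it before the MPI shared library.

```shell
#!/bin/sh
# Hypothetical helper (not part of TAU): print the LD_PRELOAD value that puts
# one extra shared library in front of whatever was already preloaded.
prepend_preload() {
    lib=$1
    existing=${2:-}
    if [ -n "$existing" ]; then
        printf '%s:%s\n' "$lib" "$existing"
    else
        printf '%s\n' "$lib"
    fi
}

# Hypothetical path; substitute the wrapper library built for your system.
wrapper=/path/to/libwrapper.so

# Show the value a launcher would set before starting the program.
prepend_preload "$wrapper" "${LD_PRELOAD:-}"

# A launcher would then start the unmodified binary along these lines:
#   LD_PRELOAD=$(prepend_preload "$wrapper" "${LD_PRELOAD:-}") mpirun -np 4 ./a.out
```

Keeping any existing LD_PRELOAD entries after the wrapper means other preloaded libraries still load, while the wrapper is searched first.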

  3. TAU’s MPI Wrapper Interposition Library
     • Uses standard MPI Profiling Interface
     • Provides name shifted interface
     • MPI_Send = PMPI_Send
     • Weak bindings
     • Interpose TAU’s MPI wrapper library between MPI and TAU
     • -lmpi replaced by -lTauMpi -lpmpi -lmpi
     • No change to the source code! Just re-link the application to generate performance data
     • setenv TAU_MAKEFILE <dir>/<arch>/lib/Makefile.tau-mpi-[options]
     • Use tau_cxx.sh, tau_f90.sh and tau_cc.sh as compilers

  4. Automatic Instrumentation
     • We now provide compiler wrapper scripts
     • Simply replace mpxlf90 with tau_f90.sh
     • Automatically instruments Fortran source code and links with the TAU MPI wrapper libraries
     • Use tau_cc.sh and tau_cxx.sh for C/C++

     Before

         CXX = mpCC
         F90 = mpxlf90_r
         CFLAGS =
         LIBS = -lm
         OBJS = f1.o f2.o f3.o … fn.o

         app: $(OBJS)
                 $(CXX) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)

         .cpp.o:
                 $(CXX) $(CFLAGS) -c $<

     After

         CXX = tau_cxx.sh
         F90 = tau_f90.sh
         CFLAGS =
         LIBS = -lm
         OBJS = f1.o f2.o f3.o … fn.o

         app: $(OBJS)
                 $(CXX) $(LDFLAGS) $(OBJS) -o $@ $(LIBS)

         .cpp.o:
                 $(CXX) $(CFLAGS) -c $<

  5. I/O notes
     • Application file I/O performance often highly variable
       • depends on load on shared filesystem/network resources
       • and application/system configuration at time of measurement
       • tuning requires very careful extensive benchmarking
       • worst case performance very different from typical case
       • current tools don't deal well with this
     • Optimal I/O is no I/O!
       • preferable to eliminate non-essential I/O during measurement
       • configure tools to avoid intermediate measurement I/O (e.g., trace buffer flushes) where appropriate
       • configure measurement or analysis to exclude I/O phases
       • typically part of one-off application initialization/finalization cost which would be amortized in long production execution
