
Parallel Debugging with TotalView on Blue Horizon

Yifeng Cui and Laura C. Carrington (yfcui@sdsc.edu), San Diego Supercomputer Center


Presentation Transcript


  1. Parallel Debugging with TotalView on Blue Horizon
     Yifeng Cui and Laura C. Carrington
     yfcui@sdsc.edu
     San Diego Supercomputer Center

  2. Overview
     • Parallel Debugging
     • What is TotalView
     • Availability of TotalView
     • How to compile for TotalView
     • How to run TotalView on Blue Horizon
     • Navigating TotalView
     • Available Documentation
     • Lab Session

  3. Parallel Debugging: Why Use a Debugger?
     • To find out where the program crashed
     • To gain a better understanding of what the program is doing
     • To inspect the values of a distributed array
     • Use it as a last resort

  4. Parallel Debugging
     • All the problems of serial programming
     • Plus:
         • Increased difficulty in verifying the correctness of the program
         • Increased difficulty in debugging N parallel processes
     • New parallel problems:
         • Deadlock
         • Race conditions
         • Irreproducibility
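Irreproducibility can appear even without a race in the program logic. A minimal sketch in C, assuming the common case where a parallel reduction adds the same partial values in a different order from run to run: floating-point addition is not associative, so the order alone can change the answer. The function names and the array `demo_values` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical data, chosen so that the rounding difference is visible
 * in IEEE 754 double precision. */
const double demo_values[3] = { 1.0e16, -1.0e16, 1.0 };

/* Add the values left to right, as one reduction order might. */
double sum_forward(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Add the same values right to left, as another reduction order might. */
double sum_backward(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = n; i > 0; i--)
        s += x[i - 1];
    return s;
}
```

With strict IEEE 754 double arithmetic, `sum_forward(demo_values, 3)` gives 1.0 while `sum_backward(demo_values, 3)` gives 0.0, because the 1.0 is lost when it is added to a value near 1.0e16. A difference like this between two runs is not necessarily a bug, but it makes results harder to verify.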

  5. Parallel Debugging: Parallel Debuggers
     • Most vendor debuggers have some support for parallel programs
     • Conventional debuggers such as Unix dbx/gdb/adb may be of little value
     • Debugging parallel programs is hard but possible
     • A parallel debugger is expected to:
         • Be portable across different platforms
         • Show information without requiring typed commands
         • Debug multiprocess/multithreaded programs
         • Automatically detect and attach to running processes

  6. What is TotalView?
     • Parallel debugger
     • Source-level debugging for C, C++, F77, F90, HPF
     • MPI, OpenMP, Pthreads, PVM
     • SMPs, MPPs, PVPs, clusters
     • Available on all major Unix platforms and most supercomputers
     • GUI (platform independent, except on the Cray T3E)
         • TotalView 4.x on Tcl/Tk
         • TotalView 5.x on Motif

  7. Availability of TotalView
     • Compaq Digital Alpha
     • HP-UX
     • IBM RS6000 and SP Power
     • SGI MIPS
     • Sun SPARC, SunOS 5
     • Linux Intel IA32 (Red Hat)
     • Linux Alpha (Red Hat)
     • Cray T3E, by Cray
     • Hitachi SR2201 by sofTek, SR8000
     • NEC SX-4 by sofTek, SX-5 beta

  8. How to Compile
     Just add the “-g” flag to the compiler command:
         mpxlf90 -g stf_01.f
         mpcc -g stc_01.c

  9. How to Run TotalView
     • On a single process:
         % totalview myprog -a [args]
     • To debug an IBM POE program:
         % totalview poe -a myprog [args]
     • The other way to start TotalView:
         % totalview
       This brings up the Root Window; select the File menu, then New Program.
     • To start the TotalView command-line interface:
         % totalviewcli

  10. How to Run TotalView
     • Create a “runme” script with LoadLeveler information, similar to the following:
         #!/usr/bin/csh -f
         setenv MP_RMPOOL 1
         setenv MP_TASKS_PER_NODE 2
         setenv MP_NODES 1
         setenv MP_EUILIB ip
         setenv MP_EUIDEVICE en0
         setenv MP_CPU_USAGE unique
         setenv MP_SHARED_MEMORY yes
         setenv MP_NODE_USAGE not_shared
         totalview poe -a a.out
       The setenv lines carry the LoadLeveler keywords; the last line starts TotalView.
     • Launch TotalView by running the “runme” script.
     • From the interactive command line, the corresponding poe options are:
         poe a.out -nodes 1 -tasks_per_node 2 -rmpool 1 \
             -euilib ip -euidevice en0

  11. Navigating TotalView
     Labeled windows:
     • Unattached Processes Window
     • Root Window
     • Process Window
     • Data Windows

  12. Navigating TotalView: Process Window
     Labeled elements:
     • Process and thread motion buttons
     • Stack Trace pane
     • Local variables for the selected frame
     • Source pane
     • Thread pane
     • Action Points pane

  13. Navigating TotalView: Root Window
     Labeled elements:
     • Process name
     • Process ID
     • Expand list
     • Number of threads
     • Thread list (tid/systid)
     • Process/thread status: B = Breakpoint, R = Running, T = Stopped, E = Error

  14. Navigating TotalView: Mouse Buttons
     • Left button is Select:
         • Chooses an item of interest, or
         • Starts editing an item
     • Middle button is Menu:
         • Raises a menu of actions you can perform
         • All menus have a Help (^?) entry
     • Right button is Dive:
         • Gets more information about an item
         • Shift+Dive forces open a new window
     [Figure labels: “View a menu”, “Select an object”, “Dive”]

  15. Navigating TotalView: Center Button
     • Use the center mouse button to pop up menus in all the windows
     • Select “Go Group” to start running
     [Figure label: Source Panel]

  16. Navigating TotalView: Left Button
     • A gridded box marks a possible site for a breakpoint; select it to set one
     [Figure labels: current function and source file, current point of execution, breakpoint]

  17. Navigating TotalView: Action Points
     • Breakpoints: stop execution of the processes and threads that reach them
     • Barrier Breakpoints: hold each thread and process that reaches them until all threads and processes in the group have reached them
     • Evaluation Points: cause a code fragment to execute when they are reached

  18. Navigating TotalView: Right Button
     • “Dive” (view the source for a function or subroutine) by right-clicking on the routine name
     • The source pane now displays the code for the routine “do_jacobi”
     • Left-click to return to the main routine source

  19. Navigating TotalView: Right Button
     • “Dive” (view data) by right-clicking on an array name
     • A new Data Window opens showing the array values
     • Arrays have a slice field that you can edit to specify the dimensions to display
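To make the idea of a slice concrete, here is a minimal sketch in C of what a slice selects, assuming the form lower:upper:stride with an inclusive upper bound, as in Fortran array sections. The helper `take_slice` is hypothetical and is not part of TotalView; the exact slice syntax the Data Window accepts is described in the TotalView User's Guide.

```c
#include <stddef.h>

/* Copy a[lower], a[lower + stride], ... up to and including a[upper]
 * into out, and return the number of elements copied.
 * Indices are 0-based here; a stride of 0 selects nothing. */
size_t take_slice(const double *a, size_t lower, size_t upper,
                  size_t stride, double *out)
{
    size_t count = 0;
    if (stride == 0)
        return 0;
    for (size_t i = lower; i <= upper; i += stride)
        out[count++] = a[i];
    return count;
}
```

For a ten-element array, a slice of 2:8:3 in this sketch selects elements 2, 5 and 8. Narrowing the view this way is useful when the full array is too large to read in a Data Window.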

  20. Navigating TotalView: Right Button
     • In the new Data Window, use the center mouse button to select the “Visualize” menu option, which pops up a graph of the data
     • Click the image with the center mouse button to rotate it

  21. Navigating TotalView: Customize TotalView
     • Add lines to your .Xdefaults file such as:
         totalview*searchPath: /my/src/dir1,/my/src/dir2
         totalview*parallelAttach: {yes | no | ask}
         totalview*sourcePaneTabWidth: n
         totalview*font: fontname
         Visualize*graph.height: height
         …
     • To load the X resource file:
         xrdb -load $HOME/.Xdefaults

  22. Documentation
     • NPACI Blue Horizon documentation
       http://www.npaci.edu/BlueHorizon
     • NPACI Blue Horizon Tools Page
       http://www.npaci.edu/BlueHorizon/guide_linked/bh_tools_txt.html
     • Etnus Web Page: Getting Started with TotalView
       http://www.etnus.com/Products/TotalView/started/getting_started.html
     • Etnus Web Page: TotalView User’s Guide
       http://www.etnus.com/pub/totalview/tv4.1.0/doc/User_Guide.pdf
       http://www.etnus.com/Support/docs/online_doc/user_guide/index.html

  23. Lab Session for TotalView: Environment Setup
     Setup for running X Window applications on PCs:
     1. Log in to b80login.sdsc.edu using CRT (located in Applications common).
     2. Launch Exceed (located either in Applications common or as a shortcut on your desktop called "Humming Bird").
     3. Set your environment. For csh:
            setenv DISPLAY t-wolf.sdsc.edu:0.0
        where "t-wolf" is the name of the PC you are using.
     4. Copy the files from the Tools_examples directory into your own working space:
        • Create a directory to work with TotalView and Xprofiler:
            mkdir Tools
        • Change into the new directory:
            cd Tools
        • Copy the files into the new directory:
            cp /work/Training/Tools_examples/* .
     NOTE: On a two-button mouse, the center mouse button is emulated by clicking the right and left buttons together.

  24. Lab Session for TotalView: Running TotalView
     1. Compile either the Fortran or the C example (st_01) with one of the following:
            mpxlf90 -g stf_01.f
            mpcc -g -lm stc_01.c
     2. Launch TotalView using the TotalView_runme script:
            tf004i% TotalView_runme
     3. After the script launches, in the main frame of the Process window (the largest frame of the largest window), use the center mouse button to select the "Go Group" menu item.
     4. You will be prompted with the following (NOTE: this may take 2-5 minutes while LoadLeveler searches for available CPUs):
            "Process poe has started the parallel tasks. Do you want to stop the parallel task before they enter MAIN?"
        Select "Yes".
     5. After a few minutes the Process window should show the code. Place a breakpoint, using the left mouse button, after the do_jacobi call in the main loop.
     6. Use the center mouse button in the Process window to select the "Go Group" menu item. This causes the code to run to the breakpoint. Use the right mouse button to dive into the "do_jacobi" routine and also into the "psi" array.
     7. Continue to explore TotalView. When you are done, exit TotalView by using the center mouse button in the Root Window to select "Quit Debugger".
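The source of the lab example (stc_01.c) is not reproduced in these slides. For readers without access to it, the following is a hypothetical, serial sketch in C of what a routine named `do_jacobi` operating on an array named `psi` might look like. Only the two names come from the lab text; the grid size `N`, the second array `next`, and the return value are assumptions made for illustration, and the real example is an MPI code.

```c
#include <stddef.h>

#define N 8  /* hypothetical grid size, including the boundary points */

/* One Jacobi sweep: replace every interior point of psi by the average
 * of its four neighbours, using next as scratch space. Boundary values
 * are left unchanged. Returns the largest change made to any point. */
double do_jacobi(double psi[N][N], double next[N][N])
{
    double max_change = 0.0;

    for (size_t i = 1; i < N - 1; i++) {
        for (size_t j = 1; j < N - 1; j++) {
            next[i][j] = 0.25 * (psi[i - 1][j] + psi[i + 1][j] +
                                 psi[i][j - 1] + psi[i][j + 1]);
            double change = next[i][j] - psi[i][j];
            if (change < 0.0)
                change = -change;
            if (change > max_change)
                max_change = change;
        }
    }

    /* Copy the updated interior back into psi. */
    for (size_t i = 1; i < N - 1; i++)
        for (size_t j = 1; j < N - 1; j++)
            psi[i][j] = next[i][j];

    return max_change;
}
```

In a code of this shape, a breakpoint on the line after the `do_jacobi` call in the main loop stops execution once per sweep, and diving on `psi` at that point shows the array as it stands after the update.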
