Debugging with the TotalView Source Code Debugger MIT March 6, 2008 Ed Hinkel Sales Engineer TotalView Technologies
Agenda • TotalView Technologies Intro • Source Code Debugging: Setup, Navigation, Data View and Analysis • Memory Debugging • Parallel Debugging • Debugging Large Apps • Questions / Comments
TotalView Technologies Corporate Overview • The Most Experienced Technologists in Parallel Debugging • Technology originally developed at BBN in the late 1980s • Developed from scratch specifically for debugging parallel applications • TotalView is recognized worldwide as the gold standard for debugging in multi-core, data-intensive, high-performance, distributed, and clustered computing environments • The debugging leader in the HPC, EDU, and Commercial sectors • Founded as Etnus, Inc. in 1999, renamed TotalView Technologies in 2007 • 50 employees (heavily engineering influenced) • Over 1,400 customers in 55 countries • Over 10K developers with over 2 million cores under license • Award-winning product line (Supercomputing Online's Product of the Year)
What is TotalView? • A comprehensive debugging solution for demanding multi-core applications • C, C++, Fortran 77 & 90, UPC • Wide compiler & platform support • Multi-threaded Debugging • Parallel Debugging • MPI, PVM, Others • Remote Debugging • Memory Debugging Capabilities • Integrated into the Debugger • Powerful and Easy GUI • Visualization • CLI for Scripting
Supported Compilers and Architectures • Platform Support • Linux x86, x86-64, ia64, Power • Mac Power and Intel • Solaris Sparc and AMD64 • AIX, Tru64, IRIX, HP-UX ia64 • Cray X1, XT3, XT4, IBM BGL, BGP, SiCortex • Languages / Compilers • C/C++, Fortran, UPC, Assembly • Many Commercial & Open Source Compilers • Parallel Environments • MPI (MPICH1 & 2, LAM, Open MPI, poe, MPT, Quadrics, MVAPICH, & many others ) • UPC
Architecture for Cluster Debugging • Single Front End (TotalView) • GUI • debug engine • Debugger Agents (tvdsvr) • Low overhead, 1 per node • Traces multiple rank processes • TotalView communicates directly with tvdsvrs • Not using MPI • Protocol optimization
[Diagram: compute nodes]
Provides robust, scalable, and efficient operation with minimal program impact
Starting TotalView
• Normal: totalview [ tv_args ] prog_name [ -a prog_args ]
• Attach to running program: totalview [ tv_args ] prog_name -pid PID# [ -a prog_args ]
• Attach to remote process: totalview [ tv_args ] prog_name -remote name [ -a prog_args ]
• Attach to a core file: totalview [ tv_args ] prog_name corefile_name [ -a prog_args ]
[Figure labels: Command Line, GUI]
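The four launch forms differ only in what follows prog_name. As a minimal sketch, the helper below assembles an argument list for each form; it only builds the list and does not start TotalView. The function name totalview_argv, the program name myprog, and the sample arguments are hypothetical; the only options used are the ones shown on this slide (-a, -pid, -remote).

```python
def totalview_argv(prog_name, tv_args=(), prog_args=(), pid=None,
                   remote=None, corefile=None):
    """Build an argument list following the launch forms on the slide.

    Hypothetical helper for illustration; it never runs the debugger.
    """
    argv = ["totalview", *tv_args, prog_name]
    if pid is not None:
        argv += ["-pid", str(pid)]        # attach to a running program
    elif remote is not None:
        argv += ["-remote", remote]       # attach to a remote process
    elif corefile is not None:
        argv.append(corefile)             # open a core file
    if prog_args:
        argv += ["-a", *prog_args]        # everything after -a goes to the program
    return argv


# Example values below are made up for the demonstration.
print(totalview_argv("myprog", prog_args=["input.dat"]))
print(totalview_argv("myprog", pid=12345))
print(totalview_argv("myprog", corefile="core.12345"))
```

Keeping the program's own arguments after -a is what lets TotalView tell its options apart from those of the program being debugged.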
Interface Concepts • Root Window • State of all processes being debugged • Process Window • Detailed state of a single process • Thread within a process • Point of control • Control the process and possibly other related processes
TotalView Root Window • Host name • Hierarchical/Linear toggle • Rank # (if MPI program) • TotalView thread ID # • Expand/Collapse toggle • Action point ID number • Process status
Process Window Overview • Toolbar • Stack Trace Pane • Stack Frame Pane • Source Pane • Tabbed Area
Stack Trace and Stack Frame Panes Language Name Function Pointer
Viewing Source Code • TotalView always tries to display source code • If it cannot, you will see assembly • Compiling with -g puts symbol table and source file + line number information into your application • These are references, usually a relative path from the object file to the source file • TotalView takes the basename and the path • TotalView first tries to use this information to find the source file • Then it searches a TotalView search path for the basename • Paths can be set via $tree function • CLI variables provide for setting source search paths; see the documentation for details
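The lookup order described on this slide (recorded path first, then the basename along a search path) can be modelled in a few lines. This is a sketch of the described order only, not TotalView's implementation; find_source, the directory names, and the injectable exists check are all hypothetical, and paths are treated as POSIX paths.

```python
import os
import posixpath


def find_source(recorded_path, search_dirs, exists=os.path.exists):
    """Return the first candidate that exists, or None.

    None stands for the case where no source is found and only
    assembly can be shown.
    """
    # 1. Try the path recorded in the debug information as-is.
    if exists(recorded_path):
        return recorded_path
    # 2. Fall back to the basename in each search directory, in order.
    base = posixpath.basename(recorded_path)
    for directory in search_dirs:
        candidate = posixpath.join(directory, base)
        if exists(candidate):
            return candidate
    return None


# Demonstration with a fake file system so the example runs anywhere.
fake_files = {"/home/user/project/src/solver.c"}
print(find_source("../src/solver.c",
                  ["/tmp", "/home/user/project/src"],
                  exists=fake_files.__contains__))
```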
Debugging Assembly Code Display/Debug Source, Assembly or Both
Stepping Commands Based on PC location
Basic Process Control Automatic Grouping • Control Group • All the processes created or attached together • Share Group • All the processes that share the same image • Workers Group • All the processes & threads that are not recognized as manager or service processes or threads • Lockstep Group • All threads at the same PC
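Two of these groupings are simple partitions: a share group collects processes running the same image, and a lockstep group collects threads stopped at the same PC. The sketch below models just that partitioning on made-up data; the function names and the tuple layout are hypothetical and say nothing about how TotalView stores groups internally.

```python
from collections import defaultdict


def share_groups(processes):
    """Partition (process_id, image_name) pairs by image name."""
    groups = defaultdict(list)
    for process_id, image in processes:
        groups[image].append(process_id)
    return dict(groups)


def lockstep_groups(threads):
    """Partition (thread_id, pc) pairs by program counter value."""
    groups = defaultdict(list)
    for thread_id, pc in threads:
        groups[pc].append(thread_id)
    return dict(groups)


# Made-up example: three processes, two images; three threads, two PCs.
print(share_groups([(1, "solver"), (2, "solver"), (3, "io_server")]))
print(lockstep_groups([("1.1", 0x4005D0), ("1.2", 0x4005D0), ("1.3", 0x400720)]))
```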
Finding Functions, Variables, and Source Files • Menu: View > Lookup • Accelerator Keys: f, v • “Closest Match” Search Results
Action Points • Breakpoints • Barrier Points • Conditional Breakpoints • Evaluation Points • Watchpoints
Setting Breakpoints • Setting action points • Single left-click outlined source code line numbers • Action Points Tab • Lists all action points • Dive on an action point to focus it in source pane • Action point properties • Context menu when right-clicking the action point • Deleting action points • Left-click in Source Pane • Context menu in Source Pane / Action Points Tab • Disabling action points • Context menu • Left-click in Action Points Tab • Saving all action points • Action Point > Save All
Evaluation Points • Generalization of Conditional Breakpoints • C/C++ or Fortran • Call functions • Set variables • Test conditions • Test small source code patches • Help set up program circumstances
Watchpoints • Use Tools > Watchpoint from a Variable Window • Watchpoints are set on a fixed memory region, not on a variable, so you need to be aware of the variable's scope • When the contents of watched memory change, the watchpoint is triggered and TotalView stops the program • Watchpoints can be conditional or unconditional • Use the intrinsic variables $newval and $oldval in the conditional expression
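The key point is that a watchpoint follows an address range rather than a name. The toy model below illustrates this: any write that changes the watched bytes counts, whichever variable currently occupies them, and an optional condition sees the old and new contents in the way $oldval and $newval are used in a conditional expression. The WatchedMemory class is entirely hypothetical and only simulates the behavior described on the slide.

```python
class WatchedMemory:
    """Toy memory with watchpoints on fixed byte ranges (illustration only)."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.watches = []   # list of (start, length, condition)
        self.hits = []      # list of (start, oldval, newval)

    def watch(self, start, length, condition=None):
        self.watches.append((start, length, condition))

    def write(self, addr, data):
        if addr < 0 or addr + len(data) > len(self.mem):
            raise IndexError("write outside simulated memory")
        before = bytes(self.mem)
        self.mem[addr:addr + len(data)] = data
        for start, length, condition in self.watches:
            oldval = before[start:start + length]
            newval = bytes(self.mem[start:start + length])
            # Trigger only when the watched bytes actually changed.
            if oldval != newval and (condition is None or condition(oldval, newval)):
                self.hits.append((start, oldval, newval))


mem = WatchedMemory(16)
# Conditional watchpoint on bytes 4..7: trigger only when the new value exceeds 10.
mem.watch(4, 4, lambda oldval, newval: int.from_bytes(newval, "little") > 10)
mem.write(4, (5).to_bytes(4, "little"))    # changed, but condition is false
mem.write(4, (42).to_bytes(4, "little"))   # changed and condition is true
mem.write(4, (42).to_bytes(4, "little"))   # same contents, no trigger
print(len(mem.hits))
```

Because the model keys on addresses, a later write to bytes 4..7 by unrelated code would trigger as well, which is why the variable's scope matters when the watched region lives on the stack.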
Help System • Context sensitive buttons on many dialog windows • Help menu in the main windows • Launches an html browser • Navigate or search the full content • Also available in pdf and hard copy • Check out the tip of the week archive
Diving on Variables • You can use diving to: • get more information • open a variable in a Variable Window • chase pointers in complex data structures • You can dive on: • variable names, to open a Variable Window • function names, to open the source in the Process Window • processes and threads in the Root Window • How do I dive? • Double-click the left mouse button on the selection • Single-click the middle mouse button on the selection • Select Dive from the context menu opened with the right mouse button
Undiving In a Process Window: retrace the path that has been explored with multiple dives. In a Variable Window: replace contents with the previous contents. You can also remove changes in the variable window with Edit > Reset Default.
Dive in All Dive in All displays an element in an array of structures as if it were a simple array.
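In data terms, Dive in All projects one field out of every element of an array of structures. A minimal sketch of that projection follows; the Particle record, its fields, and the dive_in_all helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Particle:
    x: float
    mass: float


def dive_in_all(records, field):
    """Return one field from every record, as a simple array of values."""
    return [getattr(record, field) for record in records]


particles = [Particle(0.0, 1.5), Particle(1.0, 2.5), Particle(2.0, 3.5)]
print(dive_in_all(particles, "mass"))
```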
The Variable Window • Window contents are updated automatically • Changed values are highlighted • “Last Value” column is available
Editing Variables • Click once on the value • Cursor switches into edit mode • Esc key cancels editing • Enter key commits a change • Editing values changes the memory of the program
Expression List Window Add to the expression list by right-clicking a variable and using the context menu, or by typing an expression directly in the window
Expression List Window • Reorder, delete, add • Sort the expressions • Edit expressions in place • Dive to get more info • Updated automatically • Expression-based • Simple values/expressions • View just the values you want to monitor
Four Ways to Look at Variables • Glance • Stack frame • Hover • Source pane • Dive to data window • Source, Stack or Variable Window • Arrays, structures, explore • Monitor via expression list • Source, Stack or Variable window • Keep an eye on scalars and expressions
Slicing Arrays Slice notation is [start:end:stride]
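To make the notation concrete, the sketch below selects elements for a given start, end, and stride. It assumes that both bounds are inclusive, as in Fortran-style array sections; that is an assumption here, so check the TotalView documentation for the exact semantics. The helper name tv_slice is hypothetical, and only positive strides are modelled.

```python
def tv_slice(values, start, end, stride=1):
    """Return elements start..end (end assumed inclusive), every stride-th one."""
    if stride <= 0:
        raise ValueError("this sketch models positive strides only")
    # Python's own slices exclude the end index, hence end + 1.
    return values[start:end + 1:stride]


data = list(range(100, 120))       # element i holds the value 100 + i
print(tv_slice(data, 2, 10, 4))    # elements 2, 6 and 10 under the assumption above
```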
Visualizing Arrays • Visualize array data using Tools > Visualize from the Variable Window • Large arrays can be sliced down to a reasonable size first • Visualize is a standalone program • Data can be piped out to other visualization tools • Visualize lets you rotate, zoom, etc. • Data is not updated along with the Variable Window; you must re-visualize • $visualize() is a directive in the expression system and can be used in evaluation point expressions
Typecasting Variables • Edit the type of a variable • Changes the way TotalView interprets the data in your program • Does not change the data in your program • Often used with pointers • Type cast to a void or code type to snoop around in memory
Give TotalView a starting memory address and TotalView will interpret and display your memory from that location.
Type Casts Read from Right to Left Examples: • int[10]*: pointer to an array of 10 int • int*[10]: array of 10 pointers to int • int*[10]*: pointer to an array of 10 pointers to int • int[5]*[10]: array of 10 pointers to arrays of 5 int
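The four shapes described above differ in whether the outermost type is a pointer or an array, which is easy to check from their sizes. The sketch below builds the same four shapes with Python's ctypes and notes the equivalent C declarator in each comment. The variable names are hypothetical, and it assumes a platform where all data pointers have the size of void *.

```python
import ctypes

c_int = ctypes.c_int

# Pointer to an array of 10 int             (C: int (*)[10])
PtrToArray10 = ctypes.POINTER(c_int * 10)
# Array of 10 pointers to int               (C: int *[10])
Array10OfPtr = ctypes.POINTER(c_int) * 10
# Pointer to an array of 10 pointers to int (C: int *(*)[10])
PtrToArray10OfPtr = ctypes.POINTER(Array10OfPtr)
# Array of 10 pointers to arrays of 5 int   (C: int (*[10])[5])
Array10OfPtrToArray5 = ctypes.POINTER(c_int * 5) * 10

ptr_size = ctypes.sizeof(ctypes.c_void_p)

# A pointer is one pointer wide no matter what it points to;
# an array of 10 pointers is ten pointers wide.
for name, ctype in [("PtrToArray10", PtrToArray10),
                    ("Array10OfPtr", Array10OfPtr),
                    ("PtrToArray10OfPtr", PtrToArray10OfPtr),
                    ("Array10OfPtrToArray5", Array10OfPtrToArray5)]:
    print(name, ctypes.sizeof(ctype) // ptr_size, "pointer(s) wide")
```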
Typecasting Examples • Cast float * to float[n]* (n being the element count) to see a dynamic array’s values • Cast to built-in types like $string to view a variable as a null-terminated string (automatic cast for char *) • Cast to $void for no type interpretation or for displaying regions of memory • Cast to $code[100] to see 100 instructions of disassembly • Cast to your own structs, objects, Fortran user-defined types, common block definitions, etc.
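A cast of this kind changes only how the same bytes are read. The sketch below shows the idea with ctypes: a buffer reached through a float pointer is viewed as an array of floats, then as raw 32-bit words, and the underlying data stays the same. The variable names and the four-element buffer are hypothetical; the word value noted in the comment assumes IEEE-754 single-precision floats.

```python
import ctypes

# A small buffer standing in for a dynamically allocated float array.
data = (ctypes.c_float * 4)(1.0, 2.0, 3.0, 4.0)
p = ctypes.cast(data, ctypes.POINTER(ctypes.c_float))   # plays the role of a float *

# Through the bare pointer only the first element is obvious.
first = p[0]

# View the same address as "pointer to an array of 4 float".
as_array = ctypes.cast(p, ctypes.POINTER(ctypes.c_float * 4)).contents
values = list(as_array)

# View the same bytes with a different type: 32-bit unsigned words.
as_words = ctypes.cast(p, ctypes.POINTER(ctypes.c_uint32 * 4)).contents
first_word = as_words[0]    # bit pattern of 1.0f (0x3F800000 under IEEE-754)

print(first, values, hex(first_word))
# The program's data is untouched by any of these views.
print(list(data))
```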
STLView STLView transforms templates into readable and understandable information • STLView supports std::vector, std::list, std::map, std::string • See the documentation for which STL implementations are supported
C++ Templates TotalView understands your C++ templates and gives you a choice ... Boxes with solid lines around line numbers indicate locations with replicated code
Managing Signals
File > Signals
• Error: Stop the process and flag as error
• Stop: Stop the process
• Resend: Pass the signal to the target and do nothing; use with signal handlers
• Ignore: Discard the signal