TDB: THE INTERACTIVE DISTRIBUTED DEBUGGING TOOL FOR PARALLEL MPI PROGRAMS


Authors:

RCMS PSI RAS, Pereslavl-Zalessky, Russia

  • A. Adamovich

  • M. Kovalenko


History of the Development

  • T-system

    RCMS PSI RAS, since the early 90s

  • The SKIF project of the Russia-Belarus Union State, 2000-2004

    T-system and its environment:

    • T-system (industrial version);

    • the TGCC compiler;

    • the TDB interactive debugging system;

    • and others.


Objectives of the Development

  • Support of software design and development using computing systems of the SKIF family

    • an element of the integrated toolkit;

    • directed towards T-system support.

  • Cost-effectiveness

    • reduced expenses for purchasing and maintaining the SKIF computing system

  • Information independence


Predecessors and Analogues

  • P2D2 (Portable Debugger for Parallel and Distributed Programs, NASA, 1994, Doreen Cheng, Robert Hood)

  • TotalView (Etnus)

  • DDT (Distributed Debugging Tool, Streamline Computing)


Basic Architecture Principles

The TDB architecture:

  • distributed and multi-component

  • open and portable

  • flexible

  • multi-user


The TDB Architecture: Distributed and Multi-component

1) The primary daemon

2) The secondary daemon

3) The central server

4) The client component

5) The debugging server


The TDB Architecture (2/2)

Flexible

  • uses free software:

    • ACE, libxml++, libpcre, libgtk2.x, scintilla, gnome-debug-tdb (based on gnome-debug)

  • can also use commercial products, for example system debuggers


TDB Features

  • Debug C, C++ and Fortran programs

  • Linux for 32-bit or 64-bit processors

  • Debug parallel MPI programs.

  • Supported MPI implementations: LAM, MPICH, SCAMPI, MP-MPICH, DMPI.

  • Advanced job launch methods

  • Monitoring of states of target nodes

  • Multi-user support


TDB Features

  • One-touch breakpoint setting/manipulating

  • Step into, over or out of functions

  • Watchpoints

  • One-touch symbolic display

  • Control processes individually or collectively

  • Color-coded process/node states

  • Log files


TDB Features

  • Groups

    • Group processes using a flexible definition language

    • Two types of groups supported:

      • static groups and

      • dynamic groups

    • Control grouped processes like a single process (step, next, stop...) with real-time visual feedback

    • Special group commands:

      • group breakpoint,

      • group display
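
The slides do not spell out how static and dynamic groups differ or what the group definition language looks like. The sketch below is one plausible reading, offered only as an assumption: a static group is a fixed list of processes, while a dynamic group is a condition re-evaluated against the current process states. The `Proc` class, its fields, and both helper functions are hypothetical and are not part of TDB.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Proc:
    node: int    # MPI node index (hypothetical field)
    index: int   # process index on that node (hypothetical field)
    state: str   # "running" or "stopped" (hypothetical values)


def static_group(members: Iterable[Tuple[int, int]]) -> Callable[[List[Proc]], List[Proc]]:
    """Assumed meaning: membership is a fixed set of (node, index) pairs."""
    fixed = frozenset(members)
    return lambda procs: [p for p in procs if (p.node, p.index) in fixed]


def dynamic_group(predicate: Callable[[Proc], bool]) -> Callable[[List[Proc]], List[Proc]]:
    """Assumed meaning: membership is recomputed from the current process states."""
    return lambda procs: [p for p in procs if predicate(p)]


procs = [Proc(0, 0, "running"), Proc(1, 0, "stopped"), Proc(2, 0, "stopped")]
first_two = static_group([(0, 0), (1, 0)])
stopped = dynamic_group(lambda p: p.state == "stopped")

# Process 1:0 resumes: the static group is unchanged, the dynamic one shrinks.
later = [Proc(0, 0, "running"), Proc(1, 0, "running"), Proc(2, 0, "stopped")]
```

Under this reading, a group command such as group display would be applied to whatever the group resolves to at the moment the command is issued.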


TDB Features

  • Two process control modes:

    • active process control mode

    • group control mode

  • Two GTDB operational modes:

    • active process / active group debugging mode

    • per process debugging mode


TDB Features

  • Special support for parallelizing systems:

    • T-system support:

      • Special commands t-break, t-print…


GTDB (TDB GUI client) windows and components features

  • Main window:

    • Active Process window

    • Source Code display with breakpoints

    • Command buttons

    • Command component

    • Active process / Active group selection component


GTDB windows and components features

  • GUI component for per process debugging:

    • GUI features for reading process and MPI-node status at a glance

    • Ability to pick any one of the processes

    • Full-featured subcomponent for debugging processes, similar to the main subcomponent for debugging the active process

  • MPI-node/process states window, also used for selecting processes to inspect


GTDB windows and components features

  • Breakpoint manipulation component window

  • Configuration / Properties component window

  • Various pop-up menus used for:

    • inspecting and manipulating the selected expression (print, display, watchpoints, value set...)

    • execution control (set, disable, delete breakpoints...)


GTDB – TDB Client Component

  • intuitive interface and ergonomic design

  • the presentation of information is handy and convenient


GTDB Node Selection Component

The user can select the exact set of computational nodes made available for debugging MPI tasks.

The list of all nodes available for MPI task debugging is obtained by a request to the TDB daemons.

The primary TDB daemon runs on the front-end, and secondary TDB daemons run on the computational nodes of the cluster. The TDB daemons act as monitor processes.

The secondary daemons collect information about the status of their computational nodes, and the primary daemon accumulates it.
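
The collect-and-accumulate arrangement described above can be sketched as follows. This is a minimal in-process illustration, not TDB's actual protocol: the class names, the `report` and `available_nodes` methods, and the status fields are all hypothetical, and the real daemons communicate over the network rather than by direct method calls.

```python
from typing import Dict, List


class PrimaryDaemon:
    """Runs on the front-end; accumulates the latest status of every node."""

    def __init__(self) -> None:
        self._nodes: Dict[str, dict] = {}

    def report(self, node: str, status: dict) -> None:
        # Keep only the most recent report for each node.
        self._nodes[node] = dict(status)

    def available_nodes(self) -> List[str]:
        # What a client would get back when asking which nodes it may use.
        return sorted(n for n, s in self._nodes.items() if s.get("available"))


class SecondaryDaemon:
    """Runs on one computational node; collects local status and sends it on."""

    def __init__(self, node: str, primary: PrimaryDaemon) -> None:
        self.node = node
        self.primary = primary

    def collect_and_send(self, available: bool, load: float) -> None:
        # "available" and "load" are illustrative status fields only.
        self.primary.report(self.node, {"available": available, "load": load})


primary = PrimaryDaemon()
SecondaryDaemon("node1", primary).collect_and_send(available=True, load=0.1)
SecondaryDaemon("node2", primary).collect_and_send(available=False, load=3.5)
SecondaryDaemon("node3", primary).collect_and_send(available=True, load=0.4)
```

Here the node selection component would call `available_nodes()` and let the user choose a subset of the result.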


GTDB Properties Component

Used to configure various TDB, GTDB, and MPI implementation settings.


GTDB Nodes Status Component

  • Shows the statuses of the processes on MPI nodes.

  • Green color marks running processes

  • Yellow color marks stopped processes

  • Red color marks processes that have been stopped or terminated by a signal

Upper bar: common MPI-node status

Green - all processes of the node are running

Yellow - at least one of the processes is stopped

Red - at least one process caught a signal

The common status bar lets the user read the overall state of the debugged processes at a glance. All status subcomponents are implemented as button widgets: when clicked, they open the corresponding process (or processes) for individual exploration in the PROCS GTDB mode.
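
The color rules for the common status bar can be written as a small function. This is a sketch under one assumption the slide does not state: when a node has both a stopped process and a process that caught a signal, red takes precedence over yellow. The state names and function names are hypothetical.

```python
from typing import Iterable

# Per-process colors, as listed on the slide.
PROCESS_COLOR = {
    "running": "green",
    "stopped": "yellow",
    "signaled": "red",   # stopped or terminated by a signal
}


def node_color(process_states: Iterable[str]) -> str:
    """Aggregate the states of all processes on one MPI node into one color."""
    states = set(process_states)
    if "signaled" in states:      # at least one process caught a signal
        return "red"              # assumption: red outranks yellow
    if "stopped" in states:       # at least one process is stopped
        return "yellow"
    return "green"                # all processes are running
```

For example, `node_color(["running", "stopped"])` gives `"yellow"`, and adding a `"signaled"` process to the same node turns the bar `"red"`.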


GTDB Breakpoints Component

The component is used to work with various types of breakpoints supported in TDB:

  • Source line breakpoints,

  • function breakpoints and

  • watchpoints;

    all of them may have conditions.

TDB also implements a special type of breakpoint, the so-called “group breakpoint”. A group breakpoint lets the user set a number of uniform breakpoints in a group of parallel processes. The user can set, delete, disable or enable a group breakpoint with one command or click.
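
The idea of a group breakpoint, one operation fanned out to every process in a group, can be illustrated with a short sketch. The classes and the `main.c:42` location are hypothetical stand-ins, not TDB's implementation; a real debugger would forward each operation to the debugging server that controls the process.

```python
from typing import Dict, List


class ProcessDebugger:
    """Hypothetical per-process debugger session."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.breakpoints: Dict[str, bool] = {}   # location -> enabled?

    def set_breakpoint(self, location: str) -> None:
        self.breakpoints[location] = True

    def enable_breakpoint(self, location: str, enabled: bool) -> None:
        self.breakpoints[location] = enabled

    def delete_breakpoint(self, location: str) -> None:
        self.breakpoints.pop(location, None)


class GroupBreakpoint:
    """The same breakpoint in every member of a group, managed as one object."""

    def __init__(self, location: str, members: List[ProcessDebugger]) -> None:
        self.location = location
        self.members = list(members)
        for m in self.members:
            m.set_breakpoint(location)

    def disable(self) -> None:
        for m in self.members:
            m.enable_breakpoint(self.location, False)

    def enable(self) -> None:
        for m in self.members:
            m.enable_breakpoint(self.location, True)

    def delete(self) -> None:
        for m in self.members:
            m.delete_breakpoint(self.location)


masters = [ProcessDebugger("0:0"), ProcessDebugger("1:0"), ProcessDebugger("2:0")]
gb = GroupBreakpoint("main.c:42", masters)   # hypothetical source location
```

Each of `gb.disable()`, `gb.enable()` and `gb.delete()` then acts on all three processes at once, which is the "one command or click" behavior the slide describes.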


The Main GTDB Window. Sample Debug Session

GTDB in the MAIN -> PROC mode. Process 2:0 is the active (selected, currently explored) process...


Example Debug Session: Debugging a Simple MPI Program

Example of defining dynamic groups using the "dgroup" command


Example Debug Session: Debugging a Simple MPI Program

We continue the execution of the processes in the masters dynamic group; they then stop at the previously set breakpoints in the loop.


Example Debug Session: Debugging a Simple MPI Program

As we can see, the ‘i’ variable equals zero in all processes of the masters group (the "print" command was applied to the masters group). To get out of the loop, we set the ‘i’ variable to 1 in all masters.


We continue execution of the masters group processes, but after the loop execution is stopped by the SIGSEGV signal.


Per Procs GTDB Debugging Mode

In the Main mode the user can work with one selected (active) process or group.

In the Procs mode the user can examine any process individually.

The component is implemented as two “notebooks”, one nested inside the other.

The first (outer, placed vertically) notebook is the MPI-nodes notebook. Its tabs show information about the corresponding processes and the common MPI-node status, colored as in the nodes status component.

The second (inner, placed horizontally) notebook is a notebook of processes...


Contacts

  • Max Kovalenko: madmax@botik.ru

  • Alexei Adamovich: lexa@botik.ru

  • Sergei Abramov: abram@botik.ru

