Program Systems Institute RAS. TDB: THE INTERACTIVE DISTRIBUTED DEBUGGING TOOL FOR PARALLEL MPI PROGRAMS

1 Program Systems Institute RAS. TDB: THE INTERACTIVE DISTRIBUTED DEBUGGING TOOL FOR PARALLEL MPI PROGRAMS

2 Authors
RCMS PSI RAS, Pereslavl-Zalessky, Russia
- A. Adamovich
- M. Kovalenko

3 History of the Development
- T-system
  - RCMS PSI RAS, since the early 1990s
  - The SKIF project of the Russia-Belarus Union State, 2000-2004
- T-system and its environment:
  - T-system (industrial version);
  - the TGCC compiler;
  - the TDB interactive debugging system;
  - and others.

4 Objectives of the Development
- Support of software design and development using computing systems of the SKIF family
  - an element of the integrated toolkit;
  - directed towards T-system support.
- Cost-effectiveness
  - reduced expenses for purchasing and maintaining the SKIF computing system
- Information independence

5 Predecessors and Analogues
- P2D2 (Portable Debugger for Parallel and Distributed Programs, NASA, 1994, Doreen Cheng, Robert Hood)
- TotalView (Etnus)
- DDT (Distributed Debugging Tool, Streamline Computing)

6 Basic Architecture Principles
The TDB architecture:
- distributed and multi-component
- open and portable
- flexible
- multi-user

7 The TDB Architecture: Distributed and Multi-component
1) The primary daemon
2) The secondary daemon
3) The central server
4) The client component
5) The debugging server

8 The TDB Architecture (2/2)
Flexible:
- uses free software: ACE, libxml++, libpcre, libgtk2.x, scintilla, gnome-debug-tdb (based on gnome-debug)
- can use commercial products, for example system debuggers

9 TDB Features
- Debug C, C++, and Fortran programs
- Linux for 32-bit or 64-bit processors
- Debug parallel MPI programs
- Supported MPI implementations: LAM, MPICH, SCAMPI, MP-MPICH, DMPI
- Advanced job launch methods
- Monitoring of the states of target nodes
- Multi-user support

10 TDB Features
- One-touch breakpoint setting/manipulating
- Step into, over, or out of functions
- Watchpoints
- One-touch symbolic display
- Control processes individually or collectively
- Color-coded process/node states
- Log files

11 TDB Features: Groups
- Group processes using a flexible definition language
- Two types of groups supported:
  - static groups
  - dynamic groups
- Control grouped processes as lone processes (step, next, stop...) with real-time visual feedback
- Special group commands:
  - group breakpoint
  - group display
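The slides do not spell out how static and dynamic groups differ. One plausible reading, assumed here, is that a static group has a fixed member list while a dynamic group's membership is recomputed from a condition each time it is used. The sketch below illustrates that distinction only; the class names, the process table, and the predicate are hypothetical and are not TDB's group definition language.

```python
class StaticGroup:
    """Membership is a fixed set of ranks chosen when the group is defined."""

    def __init__(self, name, ranks):
        self.name = name
        self.ranks = frozenset(ranks)

    def members(self, processes):
        return [p["rank"] for p in processes if p["rank"] in self.ranks]


class DynamicGroup:
    """Membership is a predicate re-evaluated on every query (an assumption)."""

    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate

    def members(self, processes):
        return [p["rank"] for p in processes if self.predicate(p)]


# Hypothetical process table: MPI rank plus current state.
processes = [
    {"rank": 0, "state": "stopped"},
    {"rank": 1, "state": "running"},
    {"rank": 2, "state": "running"},
]

first_two = StaticGroup("first_two", [0, 1])
stopped = DynamicGroup("stopped", lambda p: p["state"] == "stopped")

print(first_two.members(processes))  # [0, 1]
print(stopped.members(processes))    # [0]

# When a process changes state, only the dynamic group's membership follows.
processes[2]["state"] = "stopped"
print(stopped.members(processes))    # [0, 2]
```

Under this reading, a group command such as step or print would be applied to whatever `members` returns at the moment the command is issued.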

12 TDB Features
- Two process control modes:
  - active process control mode
  - group control mode
- Two GTDB operational modes:
  - active process / active group debugging mode
  - per process debugging mode

13 TDB Features
- Special support for parallelizing systems:
  - T-system support:
    - Special commands t-break, t-print…

14 GTDB (TDB GUI Client): Windows and Component Features
Main window:
- Active Process window
  - Source code display with breakpoints
  - Command buttons
  - Command component
- Active process / active group selection component

15 GTDB: Windows and Component Features
- GUI component for per process debugging:
  - GUI features that make process and MPI-node status easy to read
  - Ability to pick any one of the processes
  - Full-featured subcomponent for process debugging, similar to the main subcomponent for debugging the active process
- MPI-node/process states window, also used for selecting processes to inspect

16 GTDB: Windows and Component Features
- Breakpoint manipulation component window
- Configuration / Properties component window
- Various pop-up menus used for:
  - inspection and manipulation of the selected expression: print, display, watchpoints, value set...
  - execution control (breakpoints set, disable, delete...)

17 GTDB – TDB Client Component
- intuitive interface and ergonomic design
- the presentation of information is handy and convenient

18 GTDB Node Selection Component
The user can select the exact set of computational nodes that are available for debugging MPI tasks. The list of all nodes available for MPI task debugging is obtained by a request to the TDB daemons. The primary TDB daemon runs on the front end, and secondary TDB daemons run on the computational nodes of the cluster. The TDB daemons act as monitor processes: secondary daemons collect, and the primary daemon accumulates, status information about the computational nodes.

19 GTDB Properties Component
Used to configure various TDB, GTDB, and MPI implementation settings.

20 GTDB Nodes Status Component
Describes the statuses of the processes on each MPI node.
- Green marks running processes
- Yellow marks stopped processes
- Red marks processes that have been stopped or terminated by a signal
Upper bar: common MPI-node status
- Green: all processes of the node are running
- Yellow: at least one of the processes is stopped
- Red: at least one process caught a signal
The common status bar gives the user a simpler and clearer view of the state of the processes being debugged. All status subcomponents are implemented as button widgets: when clicked, they open the corresponding process (or processes) for individual exploration in the PROCS GTDB mode.
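The rule for the common status bar can be sketched as a small aggregation function. This is an illustration rather than TDB's code: the `ProcState` enum and `node_status` function are hypothetical names, and two details are assumptions the slide does not state, namely that red takes precedence over yellow when a node has both a stopped and a signaled process, and that a node with no processes is shown green.

```python
from enum import Enum


class ProcState(Enum):
    RUNNING = "green"
    STOPPED = "yellow"
    SIGNALED = "red"   # stopped or terminated by a signal


def node_status(process_states):
    """Collapse the per-process states of one MPI node into a single color."""
    states = list(process_states)
    # Assumption: a caught signal outranks a plain stop on the same node.
    if any(s is ProcState.SIGNALED for s in states):
        return "red"
    if any(s is ProcState.STOPPED for s in states):
        return "yellow"
    # All processes running (or, by assumption, no processes at all).
    return "green"


print(node_status([ProcState.RUNNING, ProcState.RUNNING]))   # green
print(node_status([ProcState.RUNNING, ProcState.STOPPED]))   # yellow
print(node_status([ProcState.STOPPED, ProcState.SIGNALED]))  # red
```

Checking the most severe state first keeps the function a single pass of short-circuiting tests and makes the precedence assumption explicit in one place.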

21 GTDB Breakpoints Component
The component is used to work with the various types of breakpoints supported in TDB: source line breakpoints, function breakpoints, and watchpoints; all of them may have conditions. TDB also implements a special type of breakpoint, the so-called “group breakpoint”. A group breakpoint lets the user set a number of uniform breakpoints in a group of parallel processes. The user can set, delete, disable, or enable a group breakpoint with one command or click.
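The idea of a group breakpoint, one user action fanned out as identical breakpoints across every process in a group, can be sketched as follows. The `Breakpoint`, `Process`, and `GroupBreakpoint` classes are hypothetical and only model the bookkeeping; they are not TDB's data structures or commands, and the location string is a made-up example.

```python
from dataclasses import dataclass, field


@dataclass
class Breakpoint:
    location: str          # e.g. a "file:line" string or a function name
    condition: str = ""    # optional condition expression
    enabled: bool = True


@dataclass
class Process:
    rank: int
    breakpoints: dict = field(default_factory=dict)  # location -> Breakpoint


class GroupBreakpoint:
    """One logical breakpoint applied uniformly to every process in a group."""

    def __init__(self, processes, location, condition=""):
        self.processes = list(processes)
        self.location = location
        for p in self.processes:
            p.breakpoints[location] = Breakpoint(location, condition)

    def set_enabled(self, enabled):
        # Enable or disable the breakpoint in all member processes at once.
        for p in self.processes:
            bp = p.breakpoints.get(self.location)
            if bp is not None:
                bp.enabled = enabled

    def delete(self):
        # Remove the breakpoint from all member processes at once.
        for p in self.processes:
            p.breakpoints.pop(self.location, None)


procs = [Process(rank) for rank in range(3)]
gbp = GroupBreakpoint(procs[:2], "main.c:42", condition="i == 0")
gbp.set_enabled(False)
print([p.breakpoints["main.c:42"].enabled for p in procs[:2]])  # [False, False]
print(procs[2].breakpoints)                                     # {}
```

Keeping a single `GroupBreakpoint` object per user action is what lets one command or click set, disable, enable, or delete the breakpoint everywhere, while processes outside the group are untouched.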

22 The Main GTDB Window. Sample Debug Session
GTDB in the MAIN -> PROC mode. Process 2:0 is the active (selected, currently explored) process...

23 Example Debug Session: Debugging a Simple MPI Program
Example of a dynamic group definition using the "dgroup" command.

24 Example Debug Session: Debugging a Simple MPI Program
We continue the execution of the processes in the masters dynamic group, and they then stop at the previously set breakpoints in the loop.

25 Example Debug Session: Debugging a Simple MPI Program
As we can see, the ‘i’ variable equals zero on all processes in the masters group (the "print" command was applied to the masters group). To get out of the loop, we set the ‘i’ variable to 1 on all masters.

26 We continue execution of the masters group processes, but after the loop execution is stopped by the SIGSEGV signal.

27 Per Procs GTDB Debugging Mode
In the Main mode the user can work with one selected (active) process or group. In the Procs mode the user can examine any process individually. The component is implemented as two “notebooks”, one nested inside the other. The first (outer, placed vertically) notebook is the MPI-nodes notebook. Its bookmarks contain information about the corresponding processes and the common MPI-node statuses, colored as in the nodes status component. The second (inner, placed horizontally) notebook is a notebook of processes...

28 Contacts
- Max Kovalenko, madmax@botik.ru
- Alexei Adamovich, lexa@botik.ru
- Sergei Abramov, abram@botik.ru

