Application Debugging
Debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, so that it behaves as expected.
Basic Steps
1. Recognize that a bug exists
2. Isolate the source of the bug
3. Identify the cause of the bug
4. Determine a fix for the bug
5. Apply the fix and test it
Common Debug Method
- Print statements
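As an illustration of print-statement debugging, here is a minimal sketch in C. The TRACE macro and the sum_values function are hypothetical names invented for this example; they are not part of the dgemm_ex1 program. The sketch assumes a C99 compiler and writes to stderr, which is unbuffered by default, so the output is not lost if the program crashes right afterwards.

```c
#include <stdio.h>

/* Hypothetical trace macro: prints the source location and a formatted
 * message to stderr. stderr is unbuffered by default, so the message is
 * visible even if the program crashes on the next statement. */
#define TRACE(...)                                          \
    do {                                                    \
        fprintf(stderr, "[%s:%d] ", __FILE__, __LINE__);    \
        fprintf(stderr, __VA_ARGS__);                       \
        fputc('\n', stderr);                                \
    } while (0)

/* Illustrative function (an assumption, not from the original program):
 * sums n values and traces the running total so that a wrong result can
 * be narrowed down to a specific iteration. */
double sum_values(const double *values, int n)
{
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        total += values[i];
        TRACE("i=%d value=%g total=%g", i, values[i], total);
    }
    return total;
}
```

Calling sum_values from main prints one trace line per iteration. The cost of this method is that every new question requires editing and recompiling the program.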
Preferable Debug Method
- Use a debugger
- Monitor an application program in situ
- Catch memory errors
- Print statements cannot be added to a program that is already running; a debugger can attach to it
- Graphical debuggers can provide visual aids
Available Debuggers
- dbx
- gdb
- PGI pgdbg
- Intel idb
- ladebug
- TotalView
- DDT
- Plus many others
Step 1 - Identify that there is a bug
- If an error is severe enough to cause the program to terminate abnormally, the existence of a bug becomes obvious.
- If the error is minor and only causes wrong results, it becomes much more difficult to detect that a bug exists.
- This is especially true if it is difficult or impossible to verify the results of the program.
- Goal: identify the symptoms of the bug. Knowing under what conditions the problem is detected will greatly help the remaining debugging steps.
Example
debug]$ ./dgemm_ex1
Segmentation fault
Steps to follow
- Recompile to enable debug support
- Often this option is '-g'; check the compiler documentation to be sure
- All modules need to be compiled with this option
- Re-run the application
Steps to follow
- We want the failure to generate a core dump
- By default, core dumps are disabled on HPC machines
- Re-enable them with the command: ulimit -c unlimited
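The ulimit command changes the limit for the current shell and the programs started from it. The sketch below shows a rough in-program counterpart using the POSIX getrlimit and setrlimit calls. The helper name enable_core_dumps is hypothetical. Unlike ulimit -c unlimited, the helper only raises the soft limit as far as the existing hard limit, so it has no effect if the hard limit is already zero.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-file size limit of the
 * current process to its hard limit. Returns 0 on success, -1 on
 * failure. This approximates, but is not identical to, running
 * "ulimit -c unlimited" in the shell before starting the program. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Such a helper would need to be called early in main, before the failure occurs. The shell command remains the simpler choice when the program itself should not be modified.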
Example
debug]$ ./dgemm_ex1
Segmentation fault (core dumped)
Core File
- Contains the memory image of a particular process, along with other information such as the values of processor registers
- A very useful debugging tool
- Name format is: core.PID
Using the Core File
- Examine its contents with a debugging tool such as gdb
- Command format is: gdb exe_file core.PID
- If the application was compiled with '-g', then odds are good you will be taken directly to the offending source line
Example
debug]$ gdb ./dgemm_ex1 core
GNU gdb Red Hat Linux ( EL4rh)
Copyright 2004 Free Software Foundation, Inc..
Core was generated by `./dgemm_ex1'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from BLAH BLAH BLAH.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
#0 0x d5534d03f in _IO_vfscanf_internal () from /lib64/tls/libc.so.6
(gdb) where
#0 0x d5534d03f in _IO_vfscanf_internal () from /lib64/tls/libc.so.6
#1 0x d in fscanf () from /lib64/tls/libc.so.6
#2 0x c25 in main (argc=1, argv=0x7fbffff648) at /home1/nucci/proj/debug/dgemm_ex1.c:41
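The backtrace above places the crash inside fscanf, called from main at line 41 of dgemm_ex1.c. The source of dgemm_ex1.c is not shown, so the actual cause is unknown. One common cause of this pattern is passing fscanf a NULL FILE pointer after fopen has failed. The sketch below assumes that scenario and shows a guarded read. The function read_matrix_size and its parameters are hypothetical.

```c
#include <stdio.h>

/* Hypothetical reader (not taken from dgemm_ex1.c): reads one integer
 * from the file at 'path' into *n. Returns 0 on success, -1 if the file
 * cannot be opened or the value cannot be parsed. */
int read_matrix_size(const char *path, int *n)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        /* Without this check, fscanf would dereference a NULL stream. */
        perror(path);
        return -1;
    }

    /* fscanf returns the number of items successfully converted. */
    int ok = (fscanf(fp, "%d", n) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

Checking the return values of both fopen and fscanf turns a segmentation fault into an error message that names the file.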
Live Example and Hands-On DDT Tutorial