Commit eeb56cc1 authored by Martin Schroschk

Review debugger pages

parent dda83bd7
3 merge requests: !322 Merge preview into main, !319 Merge preview into main, !276 Review debugger pages
# Debugging Tools
Debugging is an essential but rather time-consuming step during application development. Tools
dramatically reduce the amount of time spent detecting errors. Besides the "classical" serial
programming errors, which can usually be detected easily with a regular debugger, there are
programming errors that result from the usage of OpenMP, Pthreads, or MPI. These errors may also be
detected with debuggers (preferably debuggers with support for parallel applications). However,
specialized tools like MPI checking tools (e.g. Marmot) or thread checking tools (e.g. Intel Thread
Checker) can simplify this task. The following sections provide detailed information about the
different types of debugging tools:
- [Debuggers] **todo** Debuggers -- debuggers (with and without support for parallel applications)
- [MPI Usage Error Detection] **todo** MPI Usage Error Detection -- tools to detect MPI usage errors
- [Thread Checking] **todo** Thread Checking -- tools to detect OpenMP/Pthread usage errors
# Debuggers
# Debugging
This section describes how to start the debuggers on the ZIH systems.
Debugging is an essential but rather time-consuming step during application development. Tools
dramatically reduce the amount of time spent detecting errors. Besides the "classical" serial
programming errors, which can usually be detected easily with a regular debugger, there are
programming errors that result from the usage of OpenMP, Pthreads, or MPI. These errors may also be
detected with debuggers (preferably debuggers with support for parallel applications). However,
specialized tools like MPI checking tools (e.g. Marmot) or thread checking tools (e.g. Intel Thread
Checker) can simplify this task.
Detailed information about how to use the debuggers can be found on the
debuggers' websites (see below).
This page provides detailed information on classic debugging on ZIH systems. The more specific
topic [MPI Usage Error Detection](mpi_usage_error_detection.md) covers tools to detect MPI usage
errors.
## Overview of available Debuggers at ZIH
......@@ -17,30 +24,30 @@ website of the debuggers (see below).
## General Advice
- You need to compile your code with the flag `-g` to enable
  debugging. This tells the compiler to include information about
  variable and function names, source code lines, etc. in the
  executable.
- It is also recommended to reduce or even disable optimizations
  (`-O0` or gcc's `-Og`). At least inlining should be disabled (usually
  `-fno-inline`).
- For parallel applications: try to reproduce the problem with fewer
  processes or threads before using a parallel debugger.
- Use the compiler's check capabilities to find typical problems at
  compile time or run time. Read the manual for details (`man gcc`, `man ifort`, etc.).
    - Intel C++ example: `icpc -g -std=c++14 -w3 -check=stack,uninit -check-pointers=rw -fp-trap=all`
    - Intel Fortran example: `ifort -g -std03 -warn all -check all -fpe-all=0 -traceback`
- The flag `-traceback` of the Intel Fortran compiler prints a
  stack trace and the source code location when the program terminates
  abnormally.
- If your program crashes and you get the address of the failing
  instruction, you can get the source code line with the command
  `addr2line -e <executable> <address>` (if compiled with `-g`).
- Use [Memory Debuggers](#memory-debugging) to
  verify the proper usage of memory.
- Core dumps are useful when your program crashes after a long
  runtime.
- Slides from user training: [Introduction to Parallel Debugging](misc/debugging_intro.pdf)
- You need to compile your code with the flag `-g` to enable
  debugging. This tells the compiler to include information about
  variable and function names, source code lines, etc. in the
  executable.
- It is also recommended to reduce or even disable optimizations
  (`-O0` or gcc's `-Og`). At least inlining should be disabled (usually
  `-fno-inline`).
- For parallel applications: try to reproduce the problem with fewer
  processes or threads before using a parallel debugger.
- Use the compiler's check capabilities to find typical problems at
  compile time or run time. Read the manual for details (`man gcc`, `man ifort`, etc.).
    - Intel C++ example: `icpc -g -std=c++14 -w3 -check=stack,uninit -check-pointers=rw -fp-trap=all`
    - Intel Fortran example: `ifort -g -std03 -warn all -check all -fpe-all=0 -traceback`
- The flag `-traceback` of the Intel Fortran compiler prints a
  stack trace and the source code location when the program terminates
  abnormally.
- If your program crashes and you get the address of the failing
  instruction, you can get the source code line with the command
  `addr2line -e <executable> <address>` (if compiled with `-g`).
- Use [Memory Debuggers](#memory-debugging) to
  verify the proper usage of memory.
- Core dumps are useful when your program crashes after a long
  runtime.
- Slides from user training: [Introduction to Parallel Debugging](misc/debugging_intro.pdf)
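The build and lookup steps from the advice above could be wrapped as follows. This is only a minimal sketch, assuming GCC and binutils are available in `PATH`. The function names `build_debug` and `locate_crash` are hypothetical helpers and not part of any ZIH tooling; the flags and the `addr2line` call are the ones named in the list.

```bash
# Hypothetical helpers, assuming gcc and addr2line are in PATH.

# Build a C source file with debug information, without optimization
# and without inlining.
# $1: source file, $2: name of the resulting executable
build_debug() {
    gcc -g -O0 -fno-inline -o "$2" "$1"
}

# Translate the address of a failing instruction into a source location.
# $1: executable (compiled with -g), $2: address of the failing instruction
locate_crash() {
    addr2line -e "$1" "$2"
}
```

A possible call sequence is `build_debug myprog.c myprog` followed by `locate_crash ./myprog <address>`, where `myprog.c` is a placeholder file name.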
## GNU Debugger (GDB)
......@@ -55,34 +62,28 @@ several ways:
| Attach running program to GDB | `gdb --pid <process ID>` |
| Open a core dump | `gdb <executable> <core file>` |
This [GDB Reference
Sheet](http://users.ece.utexas.edu/~adnan/gdb-refcard.pdf) makes life
easier when you often use GDB.
This [GDB Reference Sheet](http://users.ece.utexas.edu/~adnan/gdb-refcard.pdf) makes life easier
when you often use GDB.
Fortran 90 programmers may run
`module load ddt` before their debug session. This makes the GDB
modified by DDT available, which has better support for Fortran 90 (e.g.
derived types).
Fortran 90 programmers may run `module load ddt` before their debug session. This makes the GDB
modified by DDT available, which has better support for Fortran 90 (e.g. derived types).
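For the core dump case, a non-interactive backtrace can be handy, for example at the end of a batch job. The following is a sketch, assuming a core file already exists and the executable was compiled with `-g`. `core_backtrace` is a hypothetical helper name, while `--batch` and `-ex` are standard GDB command-line options.

```bash
# Hypothetical helper: print the call stack stored in a core dump, then exit.
# $1: executable (compiled with -g), $2: core file
core_backtrace() {
    gdb --batch -ex "bt" "$1" "$2"
}
```

An example call is `core_backtrace ./myprog <core file>`. Whether a core file is written at all depends on the core size limit of the shell (`ulimit -c`) and on the system configuration.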
## Arm DDT
![DDT Main Window](misc/ddt-main-window.png)
- Intuitive graphical user interface and great support for parallel applications
- We have 1024 licences, so many users can use this tool for parallel
  debugging
- Don't expect that debugging an MPI program with hundreds of processes
  will always work without problems
    - The more processes and nodes involved, the higher the
      probability of timeouts or other problems
    - Debug with as few processes as required to reproduce the bug you
      want to find
- Module to load before using: `module load ddt`
- Start: `ddt <executable>`
- If the GUI runs too slow over your remote connection:
  Use [WebVNC](../access/graphical_applications_with_webvnc.md) to start a remote desktop
  session in a web browser.
- Slides from user training: [Parallel Debugging with DDT](misc/debugging_ddt.pdf)
- Intuitive graphical user interface and great support for parallel applications
- We have 1024 licences, so many users can use this tool for parallel debugging
- Don't expect that debugging an MPI program with hundreds of processes will always work without
  problems
    - The more processes and nodes involved, the higher the probability of timeouts or other
      problems
    - Debug with as few processes as required to reproduce the bug you want to find
- Module to load before using: `module load ddt`
- Start: `ddt <executable>`
- If the GUI runs too slow over your remote connection:
  Use [WebVNC](../access/graphical_applications_with_webvnc.md) to start a remote desktop session in
  a web browser.
- Slides from user training: [Parallel Debugging with DDT](misc/debugging_ddt.pdf)
### Serial Program Example
......@@ -95,9 +96,9 @@ srun: job 123456 has been allocated resources
marie@compute$ ddt ./myprog
```
- Run dialog window of DDT opens.
- Optionally: configure options like program arguments.
- Hit *Run*.
- Run dialog window of DDT opens.
- Optionally: configure options like program arguments.
- Hit *Run*.
### Multi-threaded Program Example
......@@ -110,10 +111,10 @@ srun: job 123457 has been allocated resources
marie@compute$ ddt ./myprog
```
- Run dialog window of DDT opens.
- Optionally: configure options like program arguments.
- If OpenMP: set number of threads.
- Hit *Run*.
- Run dialog window of DDT opens.
- Optionally: configure options like program arguments.
- If OpenMP: set number of threads.
- Hit *Run*.
### MPI-Parallel Program Example
......@@ -128,27 +129,27 @@ salloc: Granted job allocation 123458
marie@login$ ddt srun ./myprog
```
- Run dialog window of DDT opens.
- If MPI-OpenMP-hybrid: set number of threads.
- Hit *Run*
- Run dialog window of DDT opens.
- If MPI-OpenMP-hybrid: set number of threads.
- Hit *Run*
## Memory Debugging
- Memory debuggers find memory management bugs, e.g.
    - Use of uninitialized memory
    - Access to memory outside of allocated bounds
- DDT has memory debugging included (needs to be enabled in the run dialog)
- Memory debuggers find memory management bugs, e.g.
    - Use of uninitialized memory
    - Access to memory outside of allocated bounds
- DDT has memory debugging included (needs to be enabled in the run dialog)
### Valgrind (Memcheck)
- Simulation of the program run in a virtual machine which accurately observes memory operations.
- Extreme run time slow-down: use small program runs!
- Finds more memory errors than other debuggers.
- Further information:
    - [Valgrind Website](http://www.valgrind.org)
    - [Memcheck Manual](https://www.valgrind.org/docs/manual/mc-manual.html)
      (explanation of output, command-line options)
- For serial or multi-threaded programs:
- Simulation of the program run in a virtual machine which accurately observes memory operations.
- Extreme run time slow-down: use small program runs!
- Finds more memory errors than other debuggers.
- Further information:
    - [Valgrind Website](http://www.valgrind.org)
    - [Memcheck Manual](https://www.valgrind.org/docs/manual/mc-manual.html)
      (explanation of output, command-line options)
- For serial or multi-threaded programs:
```console
marie@login$ module load Valgrind
......@@ -156,12 +157,12 @@ Module Valgrind/3.14.0-foss-2018b and 12 dependencies loaded.
marie@login$ srun -n 1 valgrind ./myprog
```
- Not recommended for MPI parallel programs, since usually the MPI library will throw
a lot of errors. But you may use valgrind the following way such that every rank
writes its own valgrind logfile:
- Not recommended for MPI parallel programs, since usually the MPI library will throw
a lot of errors. But you may use Valgrind the following way such that every rank
writes its own Valgrind logfile:
```console
marie@login$ module load Valgrind
Module Valgrind/3.14.0-foss-2018b and 12 dependencies loaded.
marie@login$ srun -n <number of processes> valgrind --log-file=valgrind-%p.out ./myprog
marie@login$ srun -n <number of processes> valgrind --log-file=valgrind-%p.out ./myprog
```
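With one log file per rank, a quick overview helps to decide which files deserve a closer look. The sketch below assumes the log file names from the command above (`valgrind-<PID>.out` in the current directory) and relies on the `ERROR SUMMARY` line that Memcheck writes at the end of each log. `summarize_valgrind_logs` is a hypothetical helper name.

```bash
# Hypothetical helper: print the "ERROR SUMMARY" line of every per-rank
# Valgrind log in the current directory, prefixed with the log file name.
summarize_valgrind_logs() {
    for log in valgrind-*.out; do
        # Skip the unexpanded pattern if no log file exists.
        [ -e "$log" ] || continue
        printf '%s: ' "$log"
        grep 'ERROR SUMMARY' "$log" || echo 'no summary found'
    done
}
```

Run `summarize_valgrind_logs` in the directory that holds the logs once the job has finished. Logs without a summary line usually belong to ranks that were killed before Valgrind could finish its report.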
......@@ -62,9 +62,10 @@ nav:
- Building Software: software/building_software.md
- GPU Programming: software/gpu_programming.md
- Compilers: software/compilers.md
- Debuggers: software/debuggers.md
- Debugging:
- Overview: software/debuggers.md
- MPI Error Detection: software/mpi_usage_error_detection.md
- Libraries: software/libraries.md
- MPI Error Detection: software/mpi_usage_error_detection.md
- Score-P: software/scorep.md
- Perf Tools: software/perf_tools.md
- PIKA: software/pika.md
......@@ -113,7 +114,6 @@ nav:
- Overview: archive/overview.md
- Bio Informatics: archive/bioinformatics.md
- CXFS End of Support: archive/cxfs_end_of_support.md
- Debugging Tools: archive/debugging_tools.md
- KNL Nodes: archive/knl_nodes.md
- Load Leveler: archive/load_leveler.md
- Migrate to Atlas: archive/migrate_to_atlas.md
......