diff --git a/doc.zih.tu-dresden.de/docs/archive/debugging_tools.md b/doc.zih.tu-dresden.de/docs/archive/debugging_tools.md
deleted file mode 100644
index 0d902d2cfeb23f9ca1763df909d6746b16be81da..0000000000000000000000000000000000000000
--- a/doc.zih.tu-dresden.de/docs/archive/debugging_tools.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Debugging Tools
-
-Debugging is an essential but also rather time consuming step during application development. Tools
-dramatically reduce the amount of time spent to detect errors. Besides the "classical" serial
-programming errors, which may usually be easily detected with a regular debugger, there exist
-programming errors that result from the usage of OpenMP, Pthreads, or MPI. These errors may also be
-detected with debuggers (preferably debuggers with support for parallel applications), however,
-specialized tools like MPI checking tools (e.g. Marmot) or thread checking tools (e.g. Intel Thread
-Checker) can simplify this task. The following sections provide detailed information about the
-different types of debugging tools:
-
-- [Debuggers] **todo** Debuggers -- debuggers (with and without support for parallel applications)
-- [MPI Usage Error Detection] **todo** MPI Usage Error Detection -- tools to detect MPI usage errors
-- [Thread Checking] **todo** Thread Checking -- tools to detect OpenMP/Pthread usage errors
diff --git a/doc.zih.tu-dresden.de/docs/software/debuggers.md b/doc.zih.tu-dresden.de/docs/software/debuggers.md
index fafb8c705f30a9e4b026d549b656aa7a0516540a..d88ca5f068f0145e8acc46407feca93a14968522 100644
--- a/doc.zih.tu-dresden.de/docs/software/debuggers.md
+++ b/doc.zih.tu-dresden.de/docs/software/debuggers.md
@@ -1,9 +1,16 @@
-# Debuggers
+# Debugging

-This section describes how to start the debuggers on the ZIH systems.
+Debugging is an essential but also rather time-consuming step during application development. Tools
+dramatically reduce the amount of time spent to detect errors.
+Besides the "classical" serial
+programming errors, which may usually be easily detected with a regular debugger, there exist
+programming errors that result from the usage of OpenMP, Pthreads, or MPI. These errors may also be
+detected with debuggers (preferably debuggers with support for parallel applications); however,
+specialized tools like MPI checking tools (e.g. Marmot) or thread checking tools (e.g. Intel Thread
+Checker) can simplify this task.

-Detailed information about how to use the debuggers can be found on the
-website of the debuggers (see below).
+This page provides detailed information on classic debugging at ZIH systems. The more specific
+topic [MPI Usage Error Detection](mpi_usage_error_detection.md) covers tools to detect MPI usage
+errors.

 ## Overview of available Debuggers at ZIH

@@ -17,30 +24,30 @@ website of the debuggers (see below).

 ## General Advices

-- You need to compile your code with the flag `-g` to enable
-  debugging. This tells the compiler to include information about
-  variable and function names, source code lines etc. into the
-  executable.
-- It is also recommendable to reduce or even disable optimizations
-  (`-O0` or gcc's `-Og`). At least inlining should be disabled (usually
-  `-fno-inline`).
-- For parallel applications: try to reproduce the problem with less
-  processes or threads before using a parallel debugger.
-- Use the compiler's check capabilites to find typical problems at
-  compile time or run time, read the manual (`man gcc`, `man ifort`, etc.)
-    - Intel C++ example: `icpc -g -std=c++14 -w3 -check=stack,uninit -check-pointers=rw -fp-trap=all`
-    - Intel Fortran example: `ifort -g -std03 -warn all -check all -fpe-all=0 -traceback`
-    - The flag `-traceback` of the Intel Fortran compiler causes to print
-      stack trace and source code location when the program terminates
-      abnormally.
-- If your program crashes and you get an address of the failing
-  instruction, you can get the source code line with the command
-  `addr2line -e <executable> <address>` (if compiled with `-g`).
-- Use [Memory Debuggers](#memory-debugging) to
-  verify the proper usage of memory.
-- Core dumps are useful when your program crashes after a long
-  runtime.
-- Slides from user training: [Introduction to Parallel Debugging](misc/debugging_intro.pdf)
+- You need to compile your code with the flag `-g` to enable
+  debugging. This tells the compiler to include information about
+  variable and function names, source code lines, etc. in the
+  executable.
+- It is also recommended to reduce or even disable optimizations
+  (`-O0` or gcc's `-Og`). At least inlining should be disabled (usually
+  `-fno-inline`).
+- For parallel applications: try to reproduce the problem with fewer
+  processes or threads before using a parallel debugger.
+- Use the compiler's check capabilities to find typical problems at
+  compile time or run time; read the manual (`man gcc`, `man ifort`, etc.)
+    - Intel C++ example: `icpc -g -std=c++14 -w3 -check=stack,uninit -check-pointers=rw -fp-trap=all`
+    - Intel Fortran example: `ifort -g -std03 -warn all -check all -fpe-all=0 -traceback`
+    - The flag `-traceback` of the Intel Fortran compiler makes the program print a
+      stack trace and the source code location when it terminates
+      abnormally.
+- If your program crashes and you get an address of the failing
+  instruction, you can get the source code line with the command
+  `addr2line -e <executable> <address>` (if compiled with `-g`).
+- Use [Memory Debuggers](#memory-debugging) to
+  verify the proper usage of memory.
+- Core dumps are useful when your program crashes after a long
+  runtime.
+- Slides from user training: [Introduction to Parallel Debugging](misc/debugging_intro.pdf)

 ## GNU Debugger (GDB)

@@ -55,34 +62,28 @@ several ways:
 | Attach running program to GDB | `gdb --pid <process ID>` |
 | Open a core dump | `gdb <executable> <core file>` |

-This [GDB Reference
-Sheet](http://users.ece.utexas.edu/~adnan/gdb-refcard.pdf) makes life
-easier when you often use GDB.
+This [GDB Reference Sheet](http://users.ece.utexas.edu/~adnan/gdb-refcard.pdf) makes life easier
+when you often use GDB.

-Fortran 90 programmers may issue an
-`module load ddt` before their debug session. This makes the GDB
-modified by DDT available, which has better support for Fortran 90 (e.g.
-derived types).
+Fortran 90 programmers may issue a `module load ddt` before their debug session. This makes the GDB
+modified by DDT available, which has better support for Fortran 90 (e.g. derived types).

 ## Arm DDT

-- Intuitive graphical user interface and great support for parallel applications
-- We have 1024 licences, so many user can use this tool for parallel
-  debugging
-- Don't expect that debugging an MPI program with 100ths of process
-  will always work without problems
-    - The more processes and nodes involved, the higher is the
-      probability for timeouts or other problems
-    - Debug with as few processes as required to reproduce the bug you
-      want to find
-- Module to load before using: `module load ddt`
-- Start: `ddt <executable>`
-- If the GUI runs too slow over your remote connection:
-  Use [WebVNC](../access/graphical_applications_with_webvnc.md) to start a remote desktop
-  session in a web browser.
-- Slides from user training: [Parallel Debugging with DDT](misc/debugging_ddt.pdf)
+- Intuitive graphical user interface and great support for parallel applications
+- We have 1024 licences, so many users can use this tool for parallel debugging
+- Don't expect that debugging an MPI program with hundreds of processes will always work without
+  problems
+    - The more processes and nodes involved, the higher the probability of timeouts or other
+      problems
+    - Debug with as few processes as required to reproduce the bug you want to find
+- Module to load before using: `module load ddt`
+- Start: `ddt <executable>`
+- If the GUI runs too slow over your remote connection, start a remote desktop session in a web
+  browser with [WebVNC](../access/graphical_applications_with_webvnc.md).
+- Slides from user training: [Parallel Debugging with DDT](misc/debugging_ddt.pdf)

 ### Serial Program Example

@@ -95,9 +96,9 @@ srun: job 123456 has been allocated resources
 marie@compute$ ddt ./myprog
 ```

-- Run dialog window of DDT opens.
-- Optionally: configure options like program arguments.
-- Hit *Run*.
+- Run dialog window of DDT opens.
+- Optionally: configure options like program arguments.
+- Hit *Run*.

 ### Multi-threaded Program Example

@@ -110,10 +111,10 @@ srun: job 123457 has been allocated resources
 marie@compute$ ddt ./myprog
 ```

-- Run dialog window of DDT opens.
-- Optionally: configure options like program arguments.
-- If OpenMP: set number of threads.
-- Hit *Run*.
+- Run dialog window of DDT opens.
+- Optionally: configure options like program arguments.
+- If OpenMP: set number of threads.
+- Hit *Run*.

 ### MPI-Parallel Program Example

@@ -128,27 +129,27 @@ salloc: Granted job allocation 123458
 marie@login$ ddt srun ./myprog
 ```

-- Run dialog window of DDT opens.
-- If MPI-OpenMP-hybrid: set number of threads.
-- Hit *Run*
+- Run dialog window of DDT opens.
+- If MPI-OpenMP-hybrid: set number of threads.
+- Hit *Run*.

 ## Memory Debugging

-- Memory debuggers find memory management bugs, e.g.
-    - Use of non-initialized memory
-    - Access memory out of allocated bounds
-- DDT has memory debugging included (needs to be enabled in the run dialog)
+- Memory debuggers find memory management bugs, e.g.
+    - Use of non-initialized memory
+    - Access to memory outside of allocated bounds
+- DDT has memory debugging included (needs to be enabled in the run dialog)

 ### Valgrind (Memcheck)

-- Simulation of the program run in a virtual machine which accurately observes memory operations.
-- Extreme run time slow-down: use small program runs!
-- Finds more memory errors than other debuggers.
-- Further information:
-    - [Valgrind Website](http://www.valgrind.org)
-    - [Memcheck Manual](https://www.valgrind.org/docs/manual/mc-manual.html)
-      (explanation of output, command-line options)
-- For serial or multi-threaded programs:
+- Simulation of the program run in a virtual machine which accurately observes memory operations.
+- Extreme run time slow-down: use small program runs!
+- Finds more memory errors than other debuggers.
+- Further information:
+    - [Valgrind Website](http://www.valgrind.org)
+    - [Memcheck Manual](https://www.valgrind.org/docs/manual/mc-manual.html)
+      (explanation of output, command-line options)
+- For serial or multi-threaded programs:

 ```console
 marie@login$ module load Valgrind
@@ -156,12 +157,12 @@ Module Valgrind/3.14.0-foss-2018b and 12 dependencies loaded.
 marie@login$ srun -n 1 valgrind ./myprog
 ```

-- Not recommended for MPI parallel programs, since usually the MPI library will throw
-  a lot of errors. But you may use valgrind the following way such that every rank
-  writes its own valgrind logfile:
+- Not recommended for MPI parallel programs, since usually the MPI library will throw
+  a lot of errors.
+  But you may use Valgrind the following way such that every rank
+  writes its own Valgrind logfile:

 ```console
 marie@login$ module load Valgrind
 Module Valgrind/3.14.0-foss-2018b and 12 dependencies loaded.
-marie@login$ srun -n <number of processes> valgrind --log-file=valgrind-%p.out ./myprog
+marie@login$ srun -n <number of processes> valgrind --log-file=valgrind-%p.out ./myprog
 ```
diff --git a/doc.zih.tu-dresden.de/docs/software/software_development_overview.md b/doc.zih.tu-dresden.de/docs/software/software_development_overview.md
index 966647b4f6d7ee11f92255f3c5ceb619b2d1d647..d2dd73ed3a56bc49d31123cec65bc8694e7f0f10 100644
--- a/doc.zih.tu-dresden.de/docs/software/software_development_overview.md
+++ b/doc.zih.tu-dresden.de/docs/software/software_development_overview.md
@@ -37,9 +37,7 @@ Some questions you should ask yourself:
 Subsections:

 - [Compilers](compilers.md)
-- [Debugging Tools](../archive/debugging_tools.md)
-    - [Debuggers](debuggers.md) (GDB, Allinea DDT, Totalview)
-    - [Tools to detect MPI usage errors](mpi_usage_error_detection.md) (MUST)
+- [Debugging](debuggers.md)
 - PerformanceTools.md: [Score-P](scorep.md), [Vampir](vampir.md)
 - [Libraries](libraries.md)

diff --git a/doc.zih.tu-dresden.de/mkdocs.yml b/doc.zih.tu-dresden.de/mkdocs.yml
index 2f54a5dd57382093d6f9aff2b2cf583bc76b3806..4f6fcfc3d2eb5fb288a30837b10813e2ba9a16d3 100644
--- a/doc.zih.tu-dresden.de/mkdocs.yml
+++ b/doc.zih.tu-dresden.de/mkdocs.yml
@@ -62,9 +62,10 @@ nav:
     - Building Software: software/building_software.md
     - GPU Programming: software/gpu_programming.md
     - Compilers: software/compilers.md
-    - Debuggers: software/debuggers.md
+    - Debugging:
+      - Overview: software/debuggers.md
+      - MPI Error Detection: software/mpi_usage_error_detection.md
     - Libraries: software/libraries.md
-    - MPI Error Detection: software/mpi_usage_error_detection.md
     - Score-P: software/scorep.md
     - Perf Tools: software/perf_tools.md
     - PIKA: software/pika.md
@@ -113,7 +114,6 @@ nav:
     - Overview:
         archive/overview.md
     - Bio Informatics: archive/bioinformatics.md
     - CXFS End of Support: archive/cxfs_end_of_support.md
-    - Debugging Tools: archive/debugging_tools.md
     - KNL Nodes: archive/knl_nodes.md
     - Load Leveler: archive/load_leveler.md
     - Migrate to Atlas: archive/migrate_to_atlas.md