Commit a023f7ee authored by Martin Schroschk

Merge branch 'SoftwareDevelopment' into 'preview'

Transfer content to new wiki and fix checks

See merge request zih/hpc-compendium/hpc-compendium!161
parents d026e1af 1b6671e9
The following libraries are available on our platforms:
## The Boost Library
Boost provides free peer-reviewed portable C++ source libraries, ranging from multithreading and MPI
support to regular expressions and numeric functions. See <http://www.boost.org> for detailed
documentation.
## BLAS/LAPACK
### Example

The following Fortran program uses the LAPACK routine `dgesv` to solve a small linear system:
```Fortran
program ExampleProgram

  external dgesv
  integer:: n, m, c, d, e, Z(2)       ! parameter definition
  double precision:: A(2,2), B(2)

  n=2; m=1; c=2; d=2;

  A(1,1) = 1.0; A(1,2) = 2.0;         ! parameter setting
  A(2,1) = 3.0; A(2,2) = 4.0;

  B(1) = 14.0; B(2) = 32.0;

  Call dgesv(n,m,A,c,Z,B,d,e);        ! call the subroutine

  write(*,*) "Solution ", B(1), " ", B(2)   ! display on desktop

end program ExampleProgram
```
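A minimal sketch of building and running this example, assuming a reference BLAS/LAPACK installation
is available in the default library path (module names and linker flags differ between our systems,
so check the loaded toolchain):

```Bash
# Hypothetical build against a generic LAPACK/BLAS installation
gfortran example.f90 -llapack -lblas -o example
./example
# For A = [[1,2],[3,4]] and B = (14,32) the program should print the solution 4.0 and 5.0
```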
### Math Kernel Library (MKL)
The Intel Math Kernel Library is a collection of basic linear algebra subroutines (BLAS) and fast
Fourier transforms (FFT). It contains routines for:

- Solvers such as linear algebra package (LAPACK) and BLAS
- Eigenvector/eigenvalue solvers (BLAS, LAPACK)
- PDEs, signal processing, seismic, solid-state physics (FFTs)
- General scientific, financial - vector transcendental functions (vector math library, VML)
More specifically, it contains the following components:

- BLAS:
    - Level 1 BLAS: vector-vector operations, 48 functions
    - Level 2 BLAS: matrix-vector operations, 66 functions
    - Level 3 BLAS: matrix-matrix operations, 30 functions
- LAPACK (linear algebra package), solvers and eigensolvers, hundreds of routines, more than 1000
  user-callable routines
- FFTs (fast Fourier transform): one- and two-dimensional, with and without frequency ordering
  (bit reversal). There are wrapper functions to provide an interface to use MKL instead of FFTW.
- VML (vector math library), set of vectorized transcendental functions
- Parallel Sparse Direct Linear Solver (Pardiso)

Please note: MKL comes in an OpenMP-parallel version. If you want to use it, make sure you know how
to place your jobs. (In c't 18, 2010, Andreas Stiller proposes the usage of `GOMP_CPU_AFFINITY` to
allow the mapping of AMD cores; `KMP_AFFINITY` works only for Intel processors.)
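If you use the threaded MKL, a minimal sketch of placing the threads in a job script could look
like this; the values and the affinity variable are only examples (see the note on `KMP_AFFINITY`
vs. `GOMP_CPU_AFFINITY` above), and `./my_mkl_program` is a hypothetical binary name:

```Bash
# Illustrative thread placement for an OpenMP-parallel MKL run
export OMP_NUM_THREADS=8                        # threads the MKL may use
export MKL_NUM_THREADS=8                        # MKL-specific override (optional)
export KMP_AFFINITY=granularity=fine,compact    # pinning for Intel processors
# export GOMP_CPU_AFFINITY="0-7"                # alternative for GNU OpenMP, e.g. on AMD cores
./my_mkl_program
```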
#### Linking with the MKL
For linker flag combinations, Intel provides the MKL Link Line Advisor at
<http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/>
(please make sure that JavaScript is enabled for this page).

The example above can be compiled with MKL 11 like this:

```Bash
ifort -I$MKL_INC -L$MKL_LIB -lmkl_core -lm -lmkl_gf_ilp64 -lmkl_lapack example.f90
```
#### Linking with the MKL at VENUS
Please follow the information at
<http://hpcsoftware.ncsa.illinois.edu/Software/user/show_all.php?deploy_id=951&view=NCSA>

```Bash
icc -O1 -I/sw/global/compilers/intel/2013/mkl//include -lmpi -mkl -lmkl_scalapack_lp64 -lmkl_blacs_sgimpt_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core example.c
```
## FFTW
FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more
dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data,
i.e. the discrete cosine/sine transforms or DCT/DST). Before using this library, please check out
the functions of vendor-specific libraries ACML and/or MKL.
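If FFTW is still the right choice, a typical build might look like the following sketch; the module
name and the library variant (double vs. single precision, threaded) are assumptions and depend on
the concrete installation:

```Bash
# Hypothetical module and file names; check `module avail fftw` for the actual versions
module load fftw
gcc my_fft.c -lfftw3 -lm -o my_fft
```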
# Software Development at HPC Systems
This section provides you with the basic knowledge and tools to get you out of trouble. It will
tell you how to:

- Compile your code
- Use mathematical libraries
- Find caveats and hidden errors in application codes
- Handle debuggers
- Follow system calls and interrupts
- Understand the relationship between correct code and performance
Some helpful hints:

- Stick to standards wherever possible, e.g. use the **`-std`** flag for GNU and Intel C/C++
  compilers (see the sketch after this list). Computers are short-lived creatures, and migrating
  between platforms can be painful. In addition, running your code on different platforms greatly
  increases its reliability: you will find many bugs on one platform that will never be revealed
  on another.
- Before and during performance tuning: make sure that your code delivers the correct results.
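As a sketch of the first hint, a standard-conforming build with warnings enabled might look like
this (the file name is only an example; adjust the standard to your code):

```Bash
# GNU C compiler: enforce C11 and warn about questionable constructs
gcc -std=c11 -Wall -Wextra -pedantic mycode.c -o mycode
# Intel C compiler: same idea
icc -std=c11 -Wall mycode.c -o mycode
```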
Some questions you should ask yourself:

- Given that a code is parallel, are the results independent of the number of threads or processes?
- Have you ever run your Fortran code with array bound and subroutine argument checking (the
  **`-check all`** and **`-traceback`** flags for the Intel compilers)? See the example after
  this list.
- Have you checked that your code is not causing floating point exceptions?
- Does your code work with a different link order of objects?
- Have you made any assumptions regarding storage of data objects in memory?
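For the Fortran-related questions, a debug build with runtime checking enabled might look like the
following sketch; `-fpe0` (trap floating-point exceptions) is added here as an assumption beyond
the flags mentioned above, and the file name is only an example:

```Bash
# Intel Fortran: array bound/argument checking, tracebacks, and FP exception trapping
ifort -g -check all -traceback -fpe0 example.f90 -o example_debug
./example_debug
```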
Subsections:
- [Compilers](Compilers.md)
- [Debugging Tools](Debugging Tools.md)
- [Debuggers](Debuggers.md) (GDB, Allinea DDT, Totalview)
- [Tools to detect MPI usage errors](MPIUsageErrorDetection.md) (MUST)
- Performance tools: [Score-P](ScoreP.md), [Vampir](Vampir.md), [PAPI Library](PapiLibrary.md)
- [Libraries](Libraries.md)
Intel Tools Seminar \[Oct. 2013\]
- [TU-Dresden_Intel_Multithreading_Methodologies.pdf](%ATTACHURL%/TU-Dresden_Intel_Multithreading_Methodologies.pdf):
  Intel Multithreading Methodologies
- [TU-Dresden_Advisor_XE.pdf](%ATTACHURL%/TU-Dresden_Advisor_XE.pdf):
  Intel Advisor XE - Threading prototyping tool for software architects
- [TU-Dresden_Inspector_XE.pdf](%ATTACHURL%/TU-Dresden_Inspector_XE.pdf):
  Inspector XE - Memory-, Thread-, Pointer-Checker, Debugger
- [TU-Dresden_Intel_Composer_XE.pdf](%ATTACHURL%/TU-Dresden_Intel_Composer_XE.pdf):
  Intel Composer - Compilers, Libraries
# Vampir
Contents:

1. [Introduction](#introduction)
1. [Starting Vampir](#starting-vampir)
1. [Using VampirServer](#using-vampirserver)
1. [Advanced usage](#advanced-usage)
    1. [Manual server startup](#manual-server-startup)
    1. [Port forwarding](#port-forwarding)
    1. [Nightly builds (unstable)](#nightly-builds-unstable)
## Introduction
Vampir is a graphical analysis framework that provides a large set of different chart
representations of event-based performance data generated through program instrumentation. These
graphical displays, including state diagrams, statistics, and timelines, can be used by developers
to obtain a better understanding of their parallel program's inner workings and to subsequently
optimize it. Vampir allows users to focus on appropriate levels of detail, which allows the
detection and explanation of various performance bottlenecks such as load imbalances and
communication deficiencies. [Follow this link for further
information](http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/forschung/projekte/vampir).

A growing number of performance monitoring environments like [VampirTrace](../archive/VampirTrace.md),
Score-P, TAU or KOJAK can produce trace files that are readable by Vampir. The tool supports trace
files in the Open Trace Format (OTF, OTF2), which is developed by ZIH and its partners and is
especially designed for massively parallel programs.
![Vampir Framework](%ATTACHURLPATH%/vampir-framework.png)
## Starting Vampir
Prior to using Vampir you need to set up the correct environment on one of the HPC systems with:

```Bash
module load vampir
```
For members of TU Dresden the Vampir tool is also available as
[download](http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/forschung/projekte/vampir/vampir_download_tu)
for installation on your personal computer.
Make sure that compressed display forwarding (e.g. `ssh -XC taurus.hrsk.tu-dresden.de`) is enabled.
Start the GUI by typing

```Bash
vampir
```

on your command line or by double-clicking the Vampir icon on your personal computer.
Please consult the
[Vampir user manual](http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/forschung/projekte/vampir/dateien/Vampir-User-Manual.pdf)
for a tutorial on using the tool.
## Using VampirServer
VampirServer provides additional scalable analysis capabilities to the Vampir GUI mentioned above.
To use VampirServer on the HPC resources of TU Dresden, proceed as follows: start the Vampir GUI as
described above and use the *Open Remote* dialog with the parameters indicated in the following
figure to start and connect a VampirServer instance running on taurus.hrsk.tu-dresden.de. Make sure
to fill in your personal ZIH login name.
![Vampir Open Remote Dialog](%ATTACHURLPATH%/vampir_open_remote_dialog.png)
Click on the Connect button and wait until the connection is established. Enter your password when
requested. Depending on the available resources on the target system, this setup can take some time.
Please be patient and take a look at available resources beforehand.
## Advanced Usage
### Manual Server Startup
VampirServer is a parallel MPI program, which can also be started manually by typing:

```Bash
vampirserver start
```
This automatically allocates its resources via the respective batch system. Use

```Bash
vampirserver start mpi
```
or
```Bash
vampirserver start srun
```
if you want to start VampirServer without batch allocation or inside an interactive allocation. The
latter is needed whenever you take care of the resource allocation yourself.
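For example, inside a manually created interactive allocation the workflow might look like this
sketch; the node, task and time values are placeholders:

```Bash
# Hypothetical interactive allocation; adjust nodes, tasks and time to your needs
salloc --nodes=1 --ntasks=4 --time=00:30:00
vampirserver start srun
```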
After scheduling this job the server prints out the port number it is serving on, like
`Listen port: 30088`.
Connecting to the most recently started server can be achieved by entering `auto-detect` as
*Setup name* in the *Open Remote* dialog of Vampir.
![Vampir Open Remote Dialog](%ATTACHURLPATH%/vampir_open_remote_dialog_auto_start.png)
Please make sure you stop VampirServer after finishing your work, either via the front-end or with
```Bash
vampirserver stop
```
Type
```Bash
vampirserver help
```
for further information. The
[user manual of VampirServer](http://tu-dresden.de/die_tu_dresden/zentrale_einrichtungen/zih/forschung/projekte/vampir/dateien/VampirServer-User-Manual.pdf)
can be found at *installation directory*/doc/vampirserver-manual.pdf.
Type
```Bash
which vampirserver
```
to find the revision-dependent *installation directory*.
### Port Forwarding
VampirServer listens to a given socket port. It is possible to forward this port to your local
machine through an SSH tunnel and connect your local Vampir GUI to it. The same procedure works
on Venus.
Start VampirServer on Taurus:
```Bash
vampirserver start
```
and wait for its scheduling:
```Bash
Launching VampirServer...
Submitting slurm 30 minutes job (this might take a while)...
salloc: Granted job allocation 2753510
VampirServer 8.1.0 (r8451)
Licensed to ZIH, TU Dresden
Running 4 analysis processes... (abort with vampirserver stop 594)
VampirServer listens on: taurusi1253:30055
```
Open a second console on your local desktop and create an ssh tunnel to the compute node with:

```Bash
ssh -L 30000:taurusi1253:30055 taurus.hrsk.tu-dresden.de
```
Now, the port 30000 on your desktop is connected to the VampirServer port 30055 at the compute node
taurusi1253 of Taurus. Finally, start your local Vampir client and establish a remote connection to
`localhost`, port 30000 as described in the manual.
Remark: Please substitute the ports given in this example with appropriate numbers and available
ports.
### Nightly builds (unstable)
Expert users who subscribed to the development program can test new, unstable tool features. The
corresponding Vampir and VampirServer software releases are provided as nightly builds. Unstable
versions of VampirServer are also installed on the HPC systems. The most recent version can be
launched/connected by entering `unstable` as *Setup name* in the *Open Remote* dialog of Vampir.
![](%ATTACHURLPATH%/vampir_open_remote_dialog_unstable.png)
nav:
  - VM tools: software/VMTools.md
  - Virtual Desktops: software/VirtualDesktops.md
  - Software Development and Tools:
    - Overview: software/SoftwareDevelopment.md
    - GPU Programming: software/GPUProgramming.md
    - Compilers: software/Compilers.md
    - Debuggers: software/Debuggers.md
    - Libraries: software/Libraries.md
    - MPI Error Detection: software/MPIUsageErrorDetection.md
    - Score-P: software/ScoreP.md
    - PAPI Library: software/PapiLibrary.md
    - Perf Tools: software/PerfTools.md
    - PIKA: software/pika.md
    - Vampir: software/Vampir.md
  - Data Management:
    - Overview: data_management/DataManagement.md
    - Announcement of Quotas: data_management/AnnouncementOfQuotas.md