diff --git a/doc.zih.tu-dresden.de/docs/software/libraries.md b/doc.zih.tu-dresden.de/docs/software/libraries.md deleted file mode 100644 index f08e93626543260d2c0cf4dbbbc3edc34c804e18..0000000000000000000000000000000000000000 --- a/doc.zih.tu-dresden.de/docs/software/libraries.md +++ /dev/null @@ -1,91 +0,0 @@ -# Libraries - -The following libraries are available on our system: - -| | **Taurus** | **module** | -|-----------|-----------------------|------------| -| **Boost** | 1.49, 1.5\[4-9\], 160 | boost | -| **MKL** | 2013, 2015 | mkl | -| **FFTW** | 3.3.4 | fftw | - -## The Boost Library - -Boost provides free peer-reviewed portable C++ source libraries, ranging from multithread and MPI -support to regular expression and numeric functions. See at -[http://www.boost.org](http://www.boost.org) for detailed -documentation. - -## BLAS/LAPACK - -???+ example - ```Fortran - program ExampleProgram - - external dgesv - integer:: n, m, c, d, e, Z(2) !parameter definition - double precision:: A(2,2), B(2) - - n=2; m=1; c=2; d=2; - - A(1,1) = 1.0; A(1,2) = 2.0; !parameter setting - A(2,1) = 3.0; A(2,2) = 4.0; - - B(1) = 14.0; B(2) = 32.0; - - Call dgesv(n,m,A,c,Z,B,d,e); !call the subroutine - - write(*,*) "Solution ", B(1), " ", B(2) !display on desktop - - end program ExampleProgram - ``` - -### Math Kernel Library (MKL) - -The Intel Math Kernel Library is a collection of basic linear algebra subroutines (BLAS) and fast -fourier transformations (FFT). 
It contains routines for: - -- Solvers such as linear algebra package (LAPACK) and BLAS -- Eigenvector/eigenvalue solvers (BLAS, LAPACK) -- PDEs, signal processing, seismic, solid-state physics (FFTs) -- General scientific, financial - vector transcendental functions, - vector markup language (XML) - -More specific it contains the following components: - -- BLAS: - - Level 1 BLAS: vector-vector operations, 48 functions - - Level 2 BLAS: matrix-vector operations, 66 functions - - Level 3 BLAS: matrix-matrix operations, 30 functions -- LAPACK (linear algebra package), solvers and eigensolvers, hundreds - of routines, more than 1000 user callable routines -- FFTs (fast Fourier transform): one and two dimensional, with and - without frequency ordering (bit reversal). There are wrapper - functions to provide an interface to use MKL instead of FFTW. -- VML (vector math library), set of vectorized transcendental - functions -- Parallel Sparse Direct Linear Solver (Pardiso) - -!!! note - MKL comes in an OpenMP-parallel version. If you want to use it, make sure you know how - to place your jobs. [^1] - [^1]: In \[c't 18, 2010\], Andreas Stiller proposes the usage of - `GOMP_CPU_AFFINITY` to allow the mapping of AMD cores. KMP_AFFINITY works only for Intel processors. - -#### Linking with the MKL - -For linker flag combinations, Intel provides the -[MKL Link Line Advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/) -(please make sure that JavaScript is enabled for this page). - -Can be compiled with MKL 11 like this - -```Bash -ifort -I$MKL_INC -L$MKL_LIB -lmkl_core -lm -lmkl_gf_ilp64 -lmkl_lapack example.f90 -``` - -## FFTW - -FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more -dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, -i.e. the discrete cosine/sine transforms or DCT/DST). 
Before using this library, please check out -the functions of vendor-specific libraries ACML and/or MKL. diff --git a/doc.zih.tu-dresden.de/docs/software/math_libraries.md b/doc.zih.tu-dresden.de/docs/software/math_libraries.md new file mode 100644 index 0000000000000000000000000000000000000000..47a2ccdab432820cf59307634396b9acaee6f918 --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/software/math_libraries.md @@ -0,0 +1,127 @@ +# Math Libraries + +Much software relies heavily on math libraries, e.g., for linear algebra or FFT calculations. +Writing portable and fast math functions is a really challenging task. You can try it for fun, but you +really should avoid writing your own matrix-matrix multiplication. Thankfully, there are several +high-quality math libraries available at ZIH systems. + +In the following, a few often-used interfaces/specifications and libraries are described. All +libraries are available as [modules](modules.md). + +## BLAS, LAPACK and ScaLAPACK + +Over the last decades, the three de-facto standard specifications BLAS, LAPACK and ScaLAPACK for +basic linear algebra routines have emerged. + +The [BLAS](https://www.netlib.org/blas/) (Basic Linear Algebra Subprograms) specification contains routines +for common linear algebra operations such as vector addition, matrix-vector multiplication, and dot +product. BLAS routines can be understood as basic building blocks for advanced numerical algorithms. + +The [Linear Algebra PACKage](https://www.netlib.org/lapack/) (LAPACK) provides more +sophisticated numerical algorithms, such as solving linear systems of equations, matrix +factorization, and eigenvalue problems. 
+ +<!--With [libFlame](#amd-optimizing-cpu-libraries-aocl) and [MKL](#math-kernel-library-mkl) there are--> +<!--two highly optimised LAPACK implementations aiming for AMD and Intel architecture, respectively.--> + +The [Scalable Linear Algebra PACKage](https://www.netlib.org/scalapack) (ScaLAPACK) takes the +idea of high-performance linear algebra routines to parallel distributed memory machines. It offers +functionality to solve dense and banded linear systems, least squares problems, eigenvalue +problems, and singular value problems. + +<!--There is also an [optimized implementation](https://developer.amd.com/amd-aocl/scalapack/) addressing--> +<!--AMD architectures.--> + +Many concrete implementations, often tuned and optimized for specific hardware architectures, have +been developed over the last decades. The two hardware vendors Intel and AMD each offer their own math +library ([Intel MKL](#math-kernel-library-mkl) and [AOCL](#amd-optimizing-cpu-libraries-aocl)). +Both libraries are worth considering from a user's point of view, since they provide extensive math +functionality ranging from BLAS and LAPACK to random number generators and Fast Fourier +Transformation with consistent interfaces and the promise of being highly tuned, optimized, and +continuously developed further. + +- [BLAS reference implementation](https://www.netlib.org/blas/) in Fortran +- [LAPACK reference implementation](https://www.netlib.org/lapack/) +- [ScaLAPACK reference implementation](https://www.netlib.org/scalapack/) +- [OpenBLAS](https://www.openblas.net) +- For GPU implementations, refer to the [GPU section](#libraries-for-gpus) + +## AMD Optimizing CPU Libraries (AOCL) + +[AMD Optimizing CPU Libraries](https://developer.amd.com/amd-aocl/) (AOCL) is a set of numerical +libraries tuned specifically for the AMD EPYC processor family. 
AOCL offers linear algebra libraries +([BLIS](https://developer.amd.com/amd-cpu-libraries/blas-library/), + [libFLAME](https://developer.amd.com/amd-cpu-libraries/blas-library/#libflame), + [ScaLAPACK](https://developer.amd.com/amd-aocl/scalapack/), + [AOCL-Sparse](https://developer.amd.com/amd-aocl/aocl-sparse/)), + [FFTW routines](https://developer.amd.com/amd-aocl/fftw/), + [AMD Math Library (LibM)](https://developer.amd.com/amd-cpu-libraries/amd-math-library-libm/), + as well as + [AMD Random Number Generator Library](https://developer.amd.com/amd-cpu-libraries/rng-library/) + and + [AMD Secure RNG Library](https://developer.amd.com/amd-cpu-libraries/rng-library/#securerng). + +## Math Kernel Library (MKL) + +The +[Intel Math Kernel Library](https://software.intel.com/content/www/us/en/develop/documentation/get-started-with-mkl-for-dpcpp/top.html) +(Intel MKL) provides extensively threaded math routines which are highly optimized for Intel CPUs. +It contains routines for linear algebra, direct and iterative sparse solvers, random number +generators and Fast Fourier Transformation (FFT). + +!!! note + + MKL comes in an OpenMP-parallel version. If you want to use it, make sure you know how + to place your jobs. [^1] + [^1]: In \[c't 18, 2010\], Andreas Stiller proposes the usage of + `GOMP_CPU_AFFINITY` to allow the mapping of AMD cores. `KMP_AFFINITY` works only for Intel processors. + +The available MKL modules can be queried as follows: + +```console +marie@login$ module avail imkl +``` + +### Linking + +For linker flag combinations, we highly recommend the +[MKL Link Line Advisor](http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/) +(please make sure that JavaScript is enabled for this page). + +## Libraries for GPUs + +GPU implementations of math functions and routines are often much faster than CPU +implementations. This also holds for basic routines from BLAS and LAPACK. 
You should consider using +these GPU implementations first in order to obtain better performance. + +There are several math libraries for NVIDIA GPUs, e.g.: + +- [cuBLAS](https://docs.nvidia.com/cuda/cublas/index.html) +- [cuSOLVER](https://developer.nvidia.com/cusolver) (reduced set of LAPACK routines) +- [cuSPARSE](https://developer.nvidia.com/cusparse) (sparse matrix library) +- [cuFFT](https://developer.nvidia.com/cufft) + +[This webpage](https://developer.nvidia.com/gpu-accelerated-libraries#linear-algebra) provides a +comprehensive overview and starting point. + +### MAGMA + +The project [Matrix Algebra on GPU and Multicore Architectures](http://icl.cs.utk.edu/magma/) (MAGMA) +aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid +architectures, starting with current "Multicore+GPU" systems. `MAGMA` is available at ZIH systems in +different versions. You can list the available modules using: + +```console +marie@login$ module spider magma +[...] + magma/2.5.4-fosscuda-2019b + magma/2.5.4 +``` + +## FFTW + +FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more +dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, +i.e. the discrete cosine/sine transforms or DCT/DST). 
Before using this library, please check out +the functions of vendor-specific libraries such as [AOCL](#amd-optimizing-cpu-libraries-aocl) +or [Intel MKL](#math-kernel-library-mkl). diff --git a/doc.zih.tu-dresden.de/docs/software/software_development_overview.md b/doc.zih.tu-dresden.de/docs/software/software_development_overview.md index c87d4c93b5fe27ba82ca261aad359df48a7e741c..ee6866ea8880832fc4cd7b034fe33a83e65a9c0d 100644 --- a/doc.zih.tu-dresden.de/docs/software/software_development_overview.md +++ b/doc.zih.tu-dresden.de/docs/software/software_development_overview.md @@ -41,7 +41,7 @@ Subsections: - [Debuggers](debuggers.md) (GDB, Allinea DDT, Totalview) - [Tools to detect MPI usage errors](mpi_usage_error_detection.md) (MUST) - PerformanceTools.md: [Score-P](scorep.md), [Vampir](vampir.md) -- [Libraries](libraries.md) +- [Math Libraries](math_libraries.md) Intel Tools Seminar \[Oct. 2013\] diff --git a/doc.zih.tu-dresden.de/mkdocs.yml b/doc.zih.tu-dresden.de/mkdocs.yml index 2472effc14bc9839a4a0f01ad576a94728831886..670be727a319ac172176e68cb66ef4fc3c3c4399 100644 --- a/doc.zih.tu-dresden.de/mkdocs.yml +++ b/doc.zih.tu-dresden.de/mkdocs.yml @@ -63,7 +63,7 @@ nav: - GPU Programming: software/gpu_programming.md - Compilers: software/compilers.md - Debuggers: software/debuggers.md - - Libraries: software/libraries.md + - Math Libraries: software/math_libraries.md - MPI Error Detection: software/mpi_usage_error_detection.md - Score-P: software/scorep.md - PAPI Library: software/papi_library.md