diff --git a/doc.zih.tu-dresden.de/docs/software/compilers.md b/doc.zih.tu-dresden.de/docs/software/compilers.md
index 65438aaa15b35cbf7821408853845626781a4554..65bd2cc2139265b4c08f78ca7e2943faf0dd9c0b 100644
--- a/doc.zih.tu-dresden.de/docs/software/compilers.md
+++ b/doc.zih.tu-dresden.de/docs/software/compilers.md
@@ -67,19 +67,29 @@ Different architectures of CPUs feature different vector extensions (like SSE4.2
 to accelerate computations. The following matrix shows proper compiler flags for the
 architectures at the ZIH:
 
-| HPC System | Architecture | GCC | Intel | PGI |
+| HPC System | Architecture | GCC | Intel | Nvidia HPC |
 |------------|--------------------|----------------------|----------------------|-----|
-| [`Alpha Centauri`](../jobs_and_resources/alpha_centauri.md) | AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen` |
-| [`Barnard`](../jobs_and_resources/barnard.md) | AMD Sapphire Rapids | `-march=znver2` | `-march=core-avx2` | `-tp=zen` |
-| [`Julia`](../jobs_and_resources/julia.md) | Intel Cascade Lake | `-march=cascadelake` | `-march=cascadelake` | `-tp=skylake` |
-| [`Power9`](../jobs_and_resources/power9.md) | IBM Power9 | `-march=znver2` | `-march=core-avx2` | `-tp=zen` |
-| [`Romeo`](../jobs_and_resources/romeo.md) | AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen` |
-| All | Host's architecture | `-march=native` | `-xHost` | |
-
-To build an executable for different node types (e.g. Cascade Lake with AVX512 and
-Haswell without AVX512) the option `-march=haswell -axcascadelake` (for Intel compilers)
-uses vector extension up to AVX2 as default path and runs along a different execution
-path if AVX512 is available.
+| [`Alpha Centauri`](../jobs_and_resources/alpha_centauri.md) | AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen2` |
+| [`Barnard`](../jobs_and_resources/barnard.md) | Intel Sapphire Rapids | `-march=sapphirerapids` | `-march=core-sapphirerapids` | |
+| [`Julia`](../jobs_and_resources/julia.md) | Intel Cascade Lake | `-march=cascadelake` | `-march=cascadelake` | `-tp=cascadelake` |
+| [`Romeo`](../jobs_and_resources/romeo.md) | AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen2` |
+| All x86 | Host's architecture | `-march=native` | `-xHost` or `-march=native` | `-tp=host` |
+| [`Power9`](../jobs_and_resources/power9.md) | IBM Power9 | `-mcpu=power9` or `-mcpu=native` | | `-tp=pwr9` or `-tp=host` |
+
+
+To build an executable for different node types with the Intel compiler, use
+`-axcode`, where `code` is to be replaced with one or more target architectures.
+For Cascade Lake and Sapphire Rapids, the option `-axcascadelake,sapphirerapids`
+(for Intel compilers) instructs the compiler to generate optimized code paths for the
+specified architecture(s), if possible.
+If the application is executed on one of these architectures, the optimized code
+path will be chosen.
+A baseline code path will also be generated.
+This path is used on other architectures than the specified ones and is used
+in code sections that were not optimized by the compiler for a specific architecture.
+Other optimization flags can be used as well, e.g. `-O3`.
+However, the `-march` option cannot be used here, as this will override the
+`-axcode` option.
 This increases the size of the program code (might result in poorer L1 instruction
 cache hits) but enables to run the same program on
-different hardware types.
+different hardware types with compiler optimizations.
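
Reviewer sketch (not part of the patch): the lookup below encodes the added table rows and the `-axcode` rule from the new paragraph, which may help when checking the flags against build scripts. It is a minimal illustration only. The names `ARCH_FLAGS`, `arch_flag`, and `intel_multi_target_flags` are hypothetical helpers, the dictionary keys `gcc`, `intel`, and `nvhpc` are assumed labels for the three compiler columns, and the flag values are copied from the diff without being verified against the compilers themselves.

```python
# Flags copied from the added ("+") table rows in the diff above.
# ARCH_FLAGS, arch_flag and intel_multi_target_flags are hypothetical
# helper names for illustration; they are not part of any ZIH tooling.
ARCH_FLAGS = {
    "alpha centauri": {"gcc": "-march=znver2", "intel": "-march=core-avx2", "nvhpc": "-tp=zen2"},
    "barnard": {"gcc": "-march=sapphirerapids", "intel": "-march=core-sapphirerapids", "nvhpc": None},
    "julia": {"gcc": "-march=cascadelake", "intel": "-march=cascadelake", "nvhpc": "-tp=cascadelake"},
    "romeo": {"gcc": "-march=znver2", "intel": "-march=core-avx2", "nvhpc": "-tp=zen2"},
    "power9": {"gcc": "-mcpu=power9", "intel": None, "nvhpc": "-tp=pwr9"},
}


def arch_flag(system: str, compiler: str) -> str:
    """Return the architecture flag for a system/compiler pair from the table."""
    flag = ARCH_FLAGS[system.lower()][compiler]
    if flag is None:
        # Empty table cell: the diff lists no flag for this combination.
        raise ValueError(f"the table lists no {compiler} flag for {system}")
    return flag


def intel_multi_target_flags(targets, extra=("-O3",)):
    """Build Intel compiler flags for several targets via -ax<code>[,<code>...]."""
    # The new paragraph states that -march overrides -ax, so reject the mix.
    if any(flag.startswith("-march") for flag in extra):
        raise ValueError("-march must not be combined with -ax")
    return [*extra, "-ax" + ",".join(targets)]


if __name__ == "__main__":
    print(arch_flag("Romeo", "gcc"))
    print(" ".join(intel_multi_target_flags(["cascadelake", "sapphirerapids"])))
```

Rejecting `-march` in the helper mirrors the constraint stated in the new text; whether a given compiler version warns about or silently accepts that combination is not covered by this diff.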