diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md
index 5a0e15613665ececfaa2d0915fd7022c742a9288..b5b09281dd9ab9fdde89c7ae4ffe9ad4ec48c089 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md
@@ -12,6 +12,14 @@ The hardware specification is documented on the page
 The NVIDIA A100 GPUs may only be used with **CUDA 11** or later. Earlier versions do not recognize
 the new hardware properly. Make sure the software you are using is built with CUDA11.
 
+There is a total of 48 physical cores in each node. SMT is also active, so in total, 96 logical
+cores are available per node.
+
+!!! note
+
+    Multithreading is disabled by default in a job.
+    See the [Slurm page](slurm.md) on how to enable it.
+
 ### Modules
 
 The easiest way is using the [module system](../software/modules.md).
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md
index c4bd1c7909fda4fa27703c00c68e284be07a4cb0..538296b4ea52aee6c99f132811af8112803adcf9 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md
@@ -26,7 +26,7 @@ users and the ZIH.
 
 - 34 nodes, each with
     - 8 x NVIDIA A100-SXM4 Tensor Core-GPUs
-    - 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading disabled
+    - 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading available
     - 1 TB RAM
     - 3.5 TB local memory on NVMe device at `/tmp`
     - Hostnames: `taurusi[8001-8034]`
@@ -36,7 +36,7 @@ users and the ZIH.
 ## Island 7 - AMD Rome CPUs
 
 - 192 nodes, each with
-    - 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading enabled,
+    - 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading available
     - 512 GB RAM
     - 200 GB local memory on SSD at `/tmp`
     - Hostnames: `taurusi[7001-7192]`
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
index adaf75cdf9a356307f023a85620fbc9f482dc019..82968b9c1ff37d1f74c81bb3a07fd2f496e5e205 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
@@ -316,6 +316,32 @@ provide a comprehensive collection of job examples.
 * Submisson: `marie@login$ sbatch batch_script.sh`
 * Run with fewer MPI tasks: `marie@login$ sbatch --ntasks=14 batch_script.sh`
 
+## Using Simultaneous Multithreading (SMT)
+
+Most modern architectures offer simultaneous multithreading (SMT), where physical cores of a CPU are
+split into virtual cores (a.k.a. threads). This technique allows two instruction streams to run in
+parallel per physical core.
+
+On ZIH systems, SMT is available on the partitions `rome` and `alpha`. It is deactivated by
+default, because the environment variable `SLURM_HINT` is set to `nomultithread`.
+If you wish to make use of the SMT cores, you need to explicitly activate it.
+In principle, there are two different ways:
+
+1. Change the value of the environment variable via `export SLURM_HINT=multithread` in your current
+   shell and submit your job file, or invoke your `srun` or `salloc` command line.
+
+1. Clear the environment variable via `unset SLURM_HINT` and provide the option `--hint=multithread`
+   to the `sbatch`, `srun` or `salloc` command line.
+
+??? warning
+
+    If you want to activate SMT via the directive
+    ```
+    #SBATCH --hint=multithread
+    ```
+    within your job file, you also have to clear the environment variable `SLURM_HINT` before
+    submitting the job file. Otherwise, the environment variable `SLURM_HINT` takes precedence.
+
 ## Heterogeneous Jobs
 
 A heterogeneous job consists of several job components, all of which can have individual job