Commit ea67e57b authored by Martin Schroschk

Merge branch '476-fix-remeo-smt' into 'preview'

Resolve "Romeo SMT enabled by default?"

Closes #476

See merge request !821
parents 1050b7a4 165b164c
2 merge requests: !835 Automated merge from preview to main, !821 Resolve "Romeo SMT enabled by default?"
......@@ -12,6 +12,14 @@ The hardware specification is documented on the page
The NVIDIA A100 GPUs may only be used with **CUDA 11** or later. Earlier versions do not
recognize the new hardware properly. Make sure the software you are using is built with CUDA 11 or later.
Each node has 48 physical cores. SMT is also active, so 96 logical cores are available per node
in total.
!!! note
Multithreading is disabled by default in a job.
See the [Slurm page](slurm.md) on how to enable it.
### Modules
The easiest way is using the [module system](../software/modules.md).
......
......@@ -26,7 +26,7 @@ users and the ZIH.
- 34 nodes, each with
- 8 x NVIDIA A100-SXM4 Tensor Core-GPUs
- 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading disabled
- 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading available
- 1 TB RAM
- 3.5 TB local memory on NVMe device at `/tmp`
- Hostnames: `taurusi[8001-8034]`
......@@ -36,7 +36,7 @@ users and the ZIH.
## Island 7 - AMD Rome CPUs
- 192 nodes, each with
- 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading enabled,
- 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading available
- 512 GB RAM
- 200 GB local memory on SSD at `/tmp`
- Hostnames: `taurusi[7001-7192]`
......
......@@ -316,6 +316,32 @@ provide a comprehensive collection of job examples.
* Submission: `marie@login$ sbatch batch_script.sh`
* Run with fewer MPI tasks: `marie@login$ sbatch --ntasks=14 batch_script.sh`
## Using Simultaneous Multithreading (SMT)
Most modern architectures offer simultaneous multithreading (SMT), where each physical core of a CPU
is split into virtual cores (also known as threads). This technique allows running two instruction
streams in parallel per physical core.
On ZIH systems, SMT is available on the partitions `rome` and `alpha`. It is deactivated by
default, because the environment variable `SLURM_HINT` is set to `nomultithread`.
If you wish to make use of the SMT cores, you need to activate SMT explicitly.
In principle, there are two ways to do so:
1. Change the value of the environment variable via `export SLURM_HINT=multithread` in your current
shell, then submit your job file or invoke your `srun` or `salloc` command line.
1. Clear the environment variable via `unset SLURM_HINT` and pass the option `--hint=multithread`
on the `sbatch`, `srun`, or `salloc` command line.
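
The second way can be wrapped in a small shell function. This is only a sketch: `smt_submit` and the `DRY_RUN` switch are hypothetical names introduced for illustration and are not part of Slurm or the ZIH tooling. By default, the function only prints the `sbatch` command it would run, so it can be tried without a scheduler.

```shell
#!/bin/bash
# Hypothetical helper (illustrative name, not provided by Slurm or ZIH).
# Clears SLURM_HINT in the current shell and submits with --hint=multithread.
smt_submit() {
    # Remove the default "nomultithread" setting so that the
    # command-line option below is the one that takes effect.
    unset SLURM_HINT
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run (default): only show the command.
        echo "sbatch --hint=multithread $*"
    else
        sbatch --hint=multithread "$@"
    fi
}
```

Calling `smt_submit batch_script.sh` prints the command. Setting `DRY_RUN=0` beforehand would actually submit the job, assuming `sbatch` is available on the login node.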
??? warning
If you want to activate SMT via the directive
```
#SBATCH --hint=multithread
```
within your job file, you also have to clear the environment variable `SLURM_HINT` before
submitting the job file. Otherwise, the environment variable `SLURM_HINT` takes precedence.
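
Whether the `#SBATCH --hint=multithread` directive would be overridden can be checked before submitting. The following is a minimal sketch; `check_smt_env` is a hypothetical function name, and the check relies only on the precedence rule described in the warning.

```shell
#!/bin/bash
# Hypothetical pre-submission check (illustrative name).
# Returns 1 if SLURM_HINT is set in the current shell, 0 otherwise.
check_smt_env() {
    if [ -n "${SLURM_HINT+set}" ]; then
        echo "SLURM_HINT=${SLURM_HINT} is set and takes precedence over #SBATCH --hint"
        return 1
    fi
    echo "SLURM_HINT is unset, so #SBATCH --hint=multithread applies"
}
```

If the check reports that the variable is set, run `unset SLURM_HINT` in the same shell before calling `sbatch`.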
## Heterogeneous Jobs
A heterogeneous job consists of several job components, all of which can have individual job
......