
Spring clean: Remove partition; use cluster

Merged: Martin Schroschk requested to merge `issue-583` into `preview`. All threads resolved.
@@ -8,8 +8,8 @@ depend on the type of parallelization and architecture.
### OpenMP Jobs

An SMP-parallel job can only run within a node, so it is necessary to include the options `--nodes=1`
-and `--ntasks=1`. The maximum number of processors for an SMP-parallel program is 896 on
-partition `taurussmp8`, as described in the
+and `--ntasks=1`. The maximum number of processors for an SMP-parallel program is 896 on the cluster
+[`Julia`](julia.md), as described in the
[section on memory limits](slurm_limits.md#slurm-resource-limits-table). Using the option
`--cpus-per-task=<N>`, Slurm will start one task and you will have `N` CPUs available for your job.
An example job file would look like:
@@ -22,8 +22,7 @@ An example job file would look like:
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --time=08:00:00
-#SBATCH --job-name=Science1
-#SBATCH --mail-type=start,end
+#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
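Pieced together with the options discussed above, the complete job file after this change might read roughly as follows; the shebang, the `--nodes=1`/`--ntasks=1` lines, and the application path lie outside the hunk and are assumptions:

```Bash
#!/bin/bash
#SBATCH --nodes=1                  # SMP: everything runs on one node
#SBATCH --ntasks=1                 # a single task ...
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=8          # ... with 8 CPUs for its threads
#SBATCH --time=08:00:00
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de

# Let OpenMP spawn as many threads as CPUs were allocated to the task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

srun ./path/to/openmp_binary       # placeholder for your OpenMP application
```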
@@ -131,10 +130,6 @@ where `NUM_PER_NODE` is the number of GPUs **per node** that will be used for the job.
srun ./your/cuda/application # start your application (probably requires MPI to use both nodes)
```
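For orientation, the `NUM_PER_NODE` value mentioned above enters the job file via the generic resources request; a minimal multi-node sketch, assuming two nodes with four GPUs each and a placeholder walltime, might look like:

```Bash
#!/bin/bash
#SBATCH --nodes=2                      # two full nodes
#SBATCH --ntasks-per-node=4            # e.g. one MPI rank per GPU
#SBATCH --gres=gpu:4                   # NUM_PER_NODE = 4 GPUs on each node
#SBATCH --time=01:00:00
#SBATCH --mail-user=<your.email>@tu-dresden.de

srun ./your/cuda/application           # requires MPI to use both nodes
```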
-With the transition to the sub-clusters it is no longer required to specify the partition with `-p, --partition`.
-It can still be used and will lead to a failure when submitting the job on the wrong cluster.
-This is useful to document the cluster used or avoid accidentally using the wrong SBATCH script.

!!! note

    Due to an unresolved issue concerning the Slurm job scheduling behavior, it is currently not
@@ -209,10 +204,10 @@ three things:
1. Allocate enough resources to accommodate multiple instances of our program. This can be achieved
   with an appropriate batch script header (see below).
-1. Start job steps with srun as background processes. This is achieved by adding an ampersand at the
-   end of the `srun` command
+1. Start job steps with `srun` as background processes. This is achieved by adding an ampersand at
+   the end of the `srun` command.
1. Make sure that each background process gets its private resources. We need to set the resource
-   fraction needed for a single run in the corresponding srun command. The total aggregated
+   fraction needed for a single run in the corresponding `srun` command. The total aggregated
   resources of all job steps must fit in the allocation specified in the batch script header.

Additionally, the option `--exclusive` is needed to make sure that each job step is provided with
its private set of CPU and GPU resources. The following example shows how four independent
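A sketch of such a batch script, assuming four single-task instances of a placeholder program `./my_application` with made-up input files, could look like this:

```Bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4                 # enough resources for four concurrent job steps
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --mail-user=<your.email>@tu-dresden.de

# Each step claims its private share of the allocation (--exclusive) and is
# put into the background with '&'; 'wait' blocks until all steps have finished.
srun --exclusive --ntasks=1 ./my_application input1.dat &
srun --exclusive --ntasks=1 ./my_application input2.dat &
srun --exclusive --ntasks=1 ./my_application input3.dat &
srun --exclusive --ntasks=1 ./my_application input4.dat &

wait
```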
@@ -254,40 +249,40 @@ enough resources in total were specified in the header of the batch script.

## Exclusive Jobs for Benchmarking

-Jobs ZIH systems run, by default, in shared-mode, meaning that multiple jobs (from different users)
-can run at the same time on the same compute node. Sometimes, this behavior is not desired (e.g.
-for benchmarking purposes). Thus, the Slurm parameter `--exclusive` request for exclusive usage of
-resources.
+Jobs on ZIH systems run, by default, in shared-mode, meaning that multiple jobs (from different
+users) can run at the same time on the same compute node. Sometimes, this behavior is not desired
+(e.g. for benchmarking purposes). You can request exclusive usage of resources using the Slurm
+parameter `--exclusive`.

-Setting `--exclusive` **only** makes sure that there will be **no other jobs running on your nodes**.
-It does not, however, mean that you automatically get access to all the resources which the node
-might provide without explicitly requesting them, e.g. you still have to request a GPU via the
-generic resources parameter (`gres`) to run on the partitions with GPU, or you still have to
-request all cores of a node if you need them. CPU cores can either to be used for a task
-(`--ntasks`) or for multi-threading within the same task (`--cpus-per-task`). Since those two
-options are semantically different (e.g., the former will influence how many MPI processes will be
-spawned by `srun` whereas the latter does not), Slurm cannot determine automatically which of the
-two you might want to use. Since we use cgroups for separation of jobs, your job is not allowed to
-use more resources than requested.
-
-If you just want to use all available cores in a node, you have to specify how Slurm should organize
-them, like with `--partition=haswell --cpus-per-tasks=24` or `--partition=haswell --ntasks-per-node=24`.
+!!! note "Exclusive does not allocate all available resources"
+
+    Setting `--exclusive` **only** makes sure that there will be **no other jobs running on your
+    nodes**. It does not, however, mean that you automatically get access to all the resources
+    which the node might provide without explicitly requesting them.
+
+    For example, you still have to request a GPU via the generic resources parameter (`gres`) on
+    the GPU cluster. On the other hand, you also have to request all cores of a node if you need them.
+
+    CPU cores can either be used for a task (`--ntasks`) or for multi-threading within the same task
+    (`--cpus-per-task`). Since those two options are semantically different (e.g., the former will
+    influence how many MPI processes will be spawned by `srun` whereas the latter does not), Slurm
+    cannot determine automatically which of the two you might want to use. Since we use cgroups for
+    separation of jobs, your job is not allowed to use more resources than requested.
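To illustrate the difference between `--ntasks` and `--cpus-per-task`, here are two hypothetical header fragments (not part of this change) that both occupy 8 cores on a node:

```Bash
# Variant 1: 8 single-core tasks per node, e.g. for 8 MPI processes
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=1

# Variant 2: 1 task per node with 8 CPUs, e.g. for an 8-thread OpenMP run
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
```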
Here is a short example to ensure that a benchmark is not spoiled by other jobs, even if it doesn't
-use up all resources in the nodes:
+use up all resources of the nodes:

-!!! example "Exclusive resources"
+!!! example "Job file with exclusive resources"

    ```Bash
    #!/bin/bash
-   #SBATCH --partition=haswell
    #SBATCH --nodes=2
    #SBATCH --ntasks-per-node=2
    #SBATCH --cpus-per-task=8
    #SBATCH --exclusive # ensure that nobody spoils my measurement on 2 x 2 x 8 cores
    #SBATCH --time=00:10:00
-   #SBATCH --job-name=Benchmark
-   #SBATCH --mail-type=end
+   #SBATCH --job-name=benchmark
+   #SBATCH --mail-type=start,end
    #SBATCH --mail-user=<your.email>@tu-dresden.de

    srun ./my_benchmark