# Job Examples

## Parallel Jobs
For submitting parallel jobs, a few rules have to be understood and followed. In general, they depend on the type of parallelization and architecture.
### OpenMP Jobs
An SMP-parallel job can only run within a node, so it is necessary to include the options
`--nodes=1` and `--ntasks=1`. The maximum number of processors for an SMP-parallel program is 896
on partition `taurussmp8`, as described in the section on memory limits. Using the option
`--cpus-per-task=<N>` Slurm will start one task and you will have `N` CPUs available for your job.
An example job file would look like:
!!! example "Job file for OpenMP application"
```Bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --time=08:00:00
#SBATCH --job-name=Science1
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./path/to/binary
```
### MPI Jobs
For MPI-parallel jobs one typically allocates one core per task that has to be started.
!!! warning "MPI libraries"
There are different MPI libraries on ZIH systems for the different micro architectures. Thus,
you have to compile the binaries specifically for the target architecture and partition. Please
refer to the sections [building software](../software/building_software.md) and
[module environments](../software/modules.md#module-environments) for detailed
information.
!!! example "Job file for MPI application"
```Bash
#!/bin/bash
#SBATCH --ntasks=864
#SBATCH --time=08:00:00
#SBATCH --job-name=Science1
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
srun ./path/to/binary
```
### Multiple Programs Running Simultaneously in a Job
In this short example, our goal is to run four instances of a program concurrently in a single
batch script. Of course, we could also start a batch script four times with `sbatch`, but this is
not what we want to do here. However, you can also find an example below about how to run GPU
programs simultaneously in a single job.
!!! example " "
```Bash
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --job-name=PseudoParallelJobs
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
# The following sleep command was reported to fix warnings/errors with srun by users (feel free to uncomment).
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
#sleep 5
srun --exclusive --ntasks=1 ./path/to/binary &
echo "Waiting for parallel job steps to complete..."
wait
echo "All parallel job steps completed!"
```
### Request Resources for Parallel Make
From time to time, you want to build and compile software and applications on a compute node.
But, do you need to request tasks or CPUs from Slurm in order to provide resources for the
parallel `make` command? The answer is "CPUs".
!!! example "Interactive allocation for parallel make
command"
```console
marie@login$ srun --ntasks=1 --cpus-per-task=16 --mem=16G --time=01:00:00 --pty bash --login
[...]
marie@compute$ # prepare the source code for building using configure, cmake or so
marie@compute$ make -j 16
```
## Requesting GPUs
Slurm will allocate one or many GPUs for your job if requested. Please note that GPUs are only
available in the GPU clusters, like Alpha Centauri or Power9. The option for `sbatch`/`srun` in
this case is `--gres=gpu:[NUM_PER_NODE]`, where `NUM_PER_NODE` is the number of GPUs per node that
will be used for the job.
!!! example "Job file to request a GPU"
```Bash
#!/bin/bash
#SBATCH --nodes=2 # request 2 nodes
#SBATCH --mincpus=1 # allocate one task per node...
#SBATCH --ntasks=2 # ...which means 2 tasks in total (see note below)
#SBATCH --cpus-per-task=6 # use 6 threads per task
#SBATCH --gres=gpu:1 # use 1 GPU per node (i.e. use one GPU per task)
#SBATCH --time=01:00:00 # run for 1 hour
#SBATCH --account=p_number_crunch # account CPU time to project p_number_crunch
srun ./your/cuda/application # start your application (probably requires MPI to use both nodes)
```
With the transition to the sub-clusters it is no longer required to specify the partition with
`-p, --partition`. It can still be specified; in that case, submitting the job on the wrong
cluster will fail. This can be useful to document which cluster a script is meant for, or to avoid
accidentally submitting a job with the wrong sbatch script.
!!! note
Due to an unresolved issue concerning the Slurm job scheduling behavior, it is currently not
practical to use `--ntasks-per-node` together with GPU jobs. If you want to use multiple nodes,
please use the parameters `--ntasks` and `--mincpus` instead. The value of `mincpus`*`nodes`
has to equal `ntasks` in this case.
### Limitations of GPU Job Allocations
The number of cores per node that can currently be allocated for GPU jobs is limited depending on
how many GPUs are being requested. On Alpha Centauri, you may only request up to 6 cores per
requested GPU. This prevents GPUs from becoming unusable because a single job occupies all cores
of a node without also requesting all of its GPUs.
E.g., if you specify `--gres=gpu:2`, your total number of cores per node (meaning:
`ntasks` * `cpus-per-task`) may not exceed 12 (on Alpha Centauri).
Note that this also has implications for the use of the `--exclusive` parameter. Since this sets
the number of allocated cores to 48, you also must request all eight GPUs by specifying
`--gres=gpu:8`, otherwise your job will not start. In the case of `--exclusive`, it won't be
denied on submission, because this is evaluated in a later scheduling step.
Jobs that directly request too many cores per GPU will be denied with the error message:

```console
Batch job submission failed: Requested node configuration is not available
```
Similarly, it is not allowed to start CPU-only jobs on the GPU clusters, i.e. you must request at
least one GPU there, or you will get this error message:

```console
srun: error: QOSMinGRES
srun: error: Unable to allocate resources: Job violates accounting/QOS policy (job submit limit, user's size and/or time limits)
```
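As a rule of thumb, keep the requested number of cores per node at or below six times the
requested GPUs per node (on Alpha Centauri). The following job file is a minimal sketch
illustrating this; the time limit and the path to the binary are placeholders:

!!! example "Job file respecting the cores-per-GPU limit (sketch)"
```Bash
#!/bin/bash
#SBATCH --nodes=1          # single node
#SBATCH --ntasks=2         # two tasks...
#SBATCH --cpus-per-task=6  # ...with 6 cores each, i.e. 12 cores in total
#SBATCH --gres=gpu:2       # two GPUs allow at most 2*6=12 cores on Alpha Centauri
#SBATCH --time=01:00:00

srun ./path/to/binary      # placeholder for your GPU application
```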
### Running Multiple GPU Applications Simultaneously in a Batch Job
Our starting point is a (serial) program that needs a single GPU and four CPU cores to perform its
task (e.g. TensorFlow). The following batch script shows how to run such a job on the partition
`ml`.
!!! example
```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:1
#SBATCH --gpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1443
#SBATCH --partition=ml
srun some-gpu-application
```
When `srun` is used within a submission script, it inherits parameters from `sbatch`, including
`--ntasks=1`, `--cpus-per-task=4`, etc. So we actually implicitly run the following:

```bash
srun --ntasks=1 --cpus-per-task=4 [...] --partition=ml some-gpu-application
```
Now, our goal is to run four instances of this program concurrently in a single batch script. Of
course we could also start the above script multiple times with `sbatch`, but this is not what we
want to do here.
#### Solution
In order to run multiple programs concurrently in a single batch script/allocation we have to do
three things:

- Allocate enough resources to accommodate multiple instances of our program. This can be achieved
  with an appropriate batch script header (see below).
- Start job steps with srun as background processes. This is achieved by adding an ampersand at
  the end of the `srun` command.
- Make sure that each background process gets its private resources. We need to set the resource
  fraction needed for a single run in the corresponding `srun` command. The total aggregated
  resources of all job steps must fit in the allocation specified in the batch script header.
  Additionally, the option `--exclusive` is needed to make sure that each job step is provided
  with its private set of CPU and GPU resources.

The following example shows how four independent instances of the same program can be run
concurrently from a single batch script. Each instance (task) is equipped with 4 CPUs (cores) and
one GPU.
!!! example "Job file simultaneously executing four independent instances of the same program"
```Bash
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4
#SBATCH --gres=gpu:4
#SBATCH --gpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=1443
#SBATCH --partition=ml
srun --exclusive --gres=gpu:1 --ntasks=1 --cpus-per-task=4 --gpus-per-task=1 --mem-per-cpu=1443 some-gpu-application &
srun --exclusive --gres=gpu:1 --ntasks=1 --cpus-per-task=4 --gpus-per-task=1 --mem-per-cpu=1443 some-gpu-application &
srun --exclusive --gres=gpu:1 --ntasks=1 --cpus-per-task=4 --gpus-per-task=1 --mem-per-cpu=1443 some-gpu-application &
srun --exclusive --gres=gpu:1 --ntasks=1 --cpus-per-task=4 --gpus-per-task=1 --mem-per-cpu=1443 some-gpu-application &
echo "Waiting for all job steps to complete..."
wait
echo "All jobs completed!"
```
In practice, it is possible to leave out resource options in `srun` that do not differ from the
ones inherited from the surrounding `sbatch` context. The following line would be sufficient to do
the job in this example:

```bash
srun --exclusive --gres=gpu:1 --ntasks=1 some-gpu-application &
```
Yet, it adds some extra safety to leave them in, enabling the Slurm batch system to complain if not enough resources in total were specified in the header of the batch script.
## Exclusive Jobs for Benchmarking
Jobs on ZIH systems run, by default, in shared-mode, meaning that multiple jobs (from different
users) can run at the same time on the same compute node. Sometimes, this behavior is not desired
(e.g. for benchmarking purposes). Thus, the Slurm parameter `--exclusive` requests exclusive usage
of resources.
Setting `--exclusive` only makes sure that there will be no other jobs running on your nodes.
It does not, however, mean that you automatically get access to all the resources which the node
might provide without explicitly requesting them, e.g. you still have to request a GPU via the
generic resources parameter (`gres`) to run on the partitions with GPUs, or you still have to
request all cores of a node if you need them. CPU cores can either be used for a task
(`--ntasks`) or for multi-threading within the same task (`--cpus-per-task`). Since those two
options are semantically different (e.g., the former will influence how many MPI processes will be
spawned by `srun` whereas the latter does not), Slurm cannot determine automatically which of the
two you might want to use. Since we use cgroups for separation of jobs, your job is not allowed to
use more resources than requested.
If you just want to use all available cores in a node, you have to specify how Slurm should
organize them, like with `--partition=haswell --cpus-per-task=24` or
`--partition=haswell --ntasks-per-node=24`.
Here is a short example to ensure that a benchmark is not spoiled by other jobs, even if it doesn't use up all resources in the nodes:
!!! example "Exclusive resources"
```Bash
#!/bin/bash
#SBATCH --partition=haswell
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=8
#SBATCH --exclusive # ensure that nobody spoils my measurement on 2 x 2 x 8 cores
#SBATCH --time=00:10:00
#SBATCH --job-name=Benchmark
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
srun ./my_benchmark
```
## Array Jobs
Array jobs can be used to create a sequence of jobs that share the same executable and resource
requirements, but have different input files, to be submitted, controlled, and monitored as a single
unit. The option is `-a, --array=<indexes>` where the parameter `indexes` specifies the array
indices. The following specifications are possible:

- comma separated list, e.g., `--array=0,1,2,17`,
- range based, e.g., `--array=0-42`,
- step based, e.g., `--array=0-15:4`,
- mix of comma separated and range based, e.g., `--array=0,1,2,16-42`.
A maximum number of simultaneously running tasks from the job array may be specified using the `%`
separator. The specification `--array=0-23%8` limits the number of simultaneously running tasks
from this job array to 8.

Within the job you can read the environment variables `SLURM_ARRAY_JOB_ID` and
`SLURM_ARRAY_TASK_ID`, which are set to the first job ID of the array and individually for each
array task, respectively.
Within an array job, you can use `%a` and `%A` in addition to `%j` and `%N` to make the output
file name specific to the job:

- `%A` will be replaced by the value of `SLURM_ARRAY_JOB_ID`
- `%a` will be replaced by the value of `SLURM_ARRAY_TASK_ID`
!!! example "Job file using job arrays"
```Bash
#!/bin/bash
#SBATCH --array=0-9
#SBATCH --output=arraytest-%A_%a.out
#SBATCH --error=arraytest-%A_%a.err
#SBATCH --ntasks=864
#SBATCH --time=08:00:00
#SBATCH --job-name=Science1
#SBATCH --mail-type=end
#SBATCH --mail-user=<your.email>@tu-dresden.de
echo "Hi, I am step $SLURM_ARRAY_TASK_ID in this array job $SLURM_ARRAY_JOB_ID"
```
!!! note
If you submit a large number of jobs doing heavy I/O in the Lustre filesystems you should limit
the number of your simultaneously running jobs with a second parameter like:
```Bash
#SBATCH --array=1-100000%100
```
Please read the Slurm documentation at https://slurm.schedmd.com/sbatch.html for further details.
## Chain Jobs
You can use chain jobs to create dependencies between jobs. This is often useful if a job
relies on the result of one or more preceding jobs. Chain jobs can also be used to split a long
running job exceeding the batch queue limits into parts and chain these parts. Slurm has an option
`-d, --dependency=<dependency_list>` that allows you to specify that a job is only allowed to
start if another job finished.
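For instance, a follow-up job can be told to wait for a job that is already in the queue. A
minimal sketch (the job ID `123456` and the job file name `next_job.sh` are just placeholders):

```console
marie@login$ sbatch --dependency=afterok:123456 next_job.sh
```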
In the following we provide two examples for scripts that submit chain jobs.
??? example "Scaling experiment using chain jobs"
This script submits the very same job file `myjob.sh` four times, which will be executed one
after the other. The number of tasks is increased from job to job making this submit script a
good starting point for (strong) scaling experiments.
```Bash title="submit_scaling.sh"
#!/bin/bash
task_numbers="1 2 4 8"
dependency=""
job_file="myjob.sh"
for tasks in ${task_numbers} ; do
    job_cmd="sbatch --ntasks=${tasks}"
    if [ -n "${dependency}" ] ; then
        job_cmd="${job_cmd} --dependency=afterany:${dependency}"
    fi
    job_cmd="${job_cmd} ${job_file}"
    echo -n "Running command: ${job_cmd} "
    out="$(${job_cmd})"
    echo "Result: ${out}"
    dependency=$(echo "${out}" | awk '{print $4}')
done
```
The output looks like:
```console
marie@login$ sh submit_scaling.sh
Running command: sbatch --ntasks=1 myjob.sh Result: Submitted batch job 2963822
Running command: sbatch --ntasks=2 --dependency=afterany:2963822 myjob.sh Result: Submitted batch job 2963823
Running command: sbatch --ntasks=4 --dependency=afterany:2963823 myjob.sh Result: Submitted batch job 2963824
Running command: sbatch --ntasks=8 --dependency=afterany:2963824 myjob.sh Result: Submitted batch job 2963825
```
??? example "Example to submit job chain via script"
This script submits three different job files, which will be executed one after the other. Of
course, the dependency reasons can be adapted.
```bash title="submit_job_chain.sh"
#!/bin/bash
declare -a job_names=("jobfile_a.sh" "jobfile_b.sh" "jobfile_c.sh")
dependency=""
arraylength=${#job_names[@]}
for (( i=0; i<arraylength; i++ )) ; do
    job_nr=$((i + 1))
    echo "Job ${job_nr}/${arraylength}: ${job_names[$i]}"
    if [ -n "${dependency}" ] ; then
        echo "Dependency: after job ${dependency}"
        dependency="--dependency=afterany:${dependency}"
    fi
    job="sbatch ${dependency} ${job_names[$i]}"
    out=$(${job})
    dependency=$(echo "${out}" | awk '{print $4}')
done
```
The output looks like:
```console
marie@login$ sh submit_job_chain.sh
Job 1/3: jobfile_a.sh
Job 2/3: jobfile_b.sh
Dependency: after job 2963708
Job 3/3: jobfile_c.sh
Dependency: after job 2963709
```
## Array-Job with Afterok-Dependency and Datamover Usage
In this example scenario, imagine you need to move data before starting the main job. For this you
may use a data transfer job and tell Slurm to start the main job immediately after the data
transfer job has finished successfully.

First you have to start your data transfer job, which, for example, transfers your input data from
one workspace to another.
```console
marie@login$ export DATAMOVER_JOB=$(dtcp /scratch/ws/1/marie-source/input.txt /beegfs/ws/1/marie-target/. | awk '{print $4}')
```
Now you can refer to the job ID of the datamover job from your workload jobs.
```console
marie@login$ srun --dependency afterok:${DATAMOVER_JOB} ls /beegfs/ws/1/marie-target
srun: job 23872871 queued and waiting for resources
srun: job 23872871 has been allocated resources
input.txt
```
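The same pattern can be used to chain a batch or array job to the data transfer. As a minimal
sketch (the job file `myjob.sh` and the array range are placeholders), an array job that only
starts after the transfer completed successfully could be submitted like this:

```console
marie@login$ sbatch --dependency=afterok:${DATAMOVER_JOB} --array=0-9 myjob.sh
```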