hpc-compendium (ZIH / hpcsupport)

Commit 24c7ab6a, authored 2 years ago by Martin Schroschk

Resolve #456

Parent: 35549e69
Merge requests: !808 (Automated merge from preview to main), !784 (Resolve #456)
Changes: 1 changed file with 68 additions and 20 deletions

doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_examples.md (+68 -20)

@@ -338,36 +338,84 @@ Please read the Slurm documentation at https://slurm.schedmd.com/sbatch.html for
 ## Chain Jobs

-You can use chain jobs to create dependencies between jobs. This is often the case if a job relies
-on the result of one or more preceding jobs. Chain jobs can also be used if the runtime limit of the
-batch queues is not sufficient for your job. Slurm has an option
+You can use chain jobs to **create dependencies between jobs**. This is often useful if a job
+relies on the result of one or more preceding jobs. Chain jobs can also be used to split a long
+running job exceeding the batch queues limits into parts and chain these parts. Slurm has an option
 `-d, --dependency=<dependency_list>` that allows to specify that a job is only allowed to start if
 another job finished.
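
As a rough illustration of the `--dependency` option described above: Slurm accepts dependency types such as `afterok` (start only after the listed jobs finished successfully), `afterany` (start once they have ended in any state), and `afternotok` (start only after they failed), with the job IDs appended and separated by colons. The helper below is a minimal sketch of how such an option string can be assembled. The function name `dependency_option` is hypothetical and is neither part of Slurm nor of this commit.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): build a --dependency option
# from a dependency type and one or more job IDs.
dependency_option() {
    dep_type="$1"
    shift
    ids=""
    for id in "$@"; do
        # Slurm separates the type and each job ID with a colon.
        ids="${ids}:${id}"
    done
    echo "--dependency=${dep_type}${ids}"
}

# Example: print the option for a job that waits for two other jobs.
dependency_option afterok 2963822 2963823
```

The result could then be passed to `sbatch`, for example `sbatch $(dependency_option afterok 2963822 2963823) myjob.sh`, where the job IDs are placeholders taken from the example output in this diff.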
-Here is an example of how a chain job can look like, the example submits 4 jobs (described in a job
-file) that will be executed one after each other with different CPU numbers:
+In the following we provide two examples for scripts that submit chain jobs.

-!!! example "Script to submit jobs with dependencies"
+??? example "Scaling experiment using chain jobs"
-    ```Bash
+    This script submits the very same job file `myjob.sh` four times, which will be executed one
+    after each other. The number of tasks is increased from job to job making this submit script a
+    good starting point for (strong) scaling experiments.
+
+    ```Bash title="submit_scaling.sh"
     #!/bin/bash
-    TASK_NUMBERS="1 2 4 8"
-    DEPENDENCY=""
-    JOB_FILE="myjob.slurm"
-    for TASKS in $TASK_NUMBERS ; do
-        JOB_CMD="sbatch --ntasks=$TASKS"
-        if [ -n "$DEPENDENCY" ] ; then
-            JOB_CMD="$JOB_CMD --dependency afterany:$DEPENDENCY"
+    task_numbers="1 2 4 8"
+    dependency=""
+    job_file="myjob.sh"
+    for tasks in ${task_numbers} ; do
+        job_cmd="sbatch --ntasks=${tasks}"
+        if [ -n "${dependency}" ] ; then
+            job_cmd="${job_cmd} --dependency=afterany:${dependency}"
         fi
-        JOB_CMD="$JOB_CMD $JOB_FILE"
-        echo -n "Running command: $JOB_CMD "
-        OUT=`$JOB_CMD`
-        echo "Result: $OUT"
-        DEPENDENCY=`echo $OUT | awk '{print $4}'`
+        job_cmd="${job_cmd} ${job_file}"
+        echo -n "Running command: ${job_cmd} "
+        out=`${job_cmd}`
+        echo "Result: ${out}"
+        dependency=`echo ${out} | awk '{print $4}'`
     done
     ```
+
+    The output looks like:
+
+    ```console
+    marie@login$ sh submit_scaling.sh
+    Running command: sbatch --ntasks=1 myjob.sh Result: Submitted batch job 2963822
+    Running command: sbatch --ntasks=2 --dependency afterany:32963822 myjob.sh Result: Submitted batch job 2963823
+    Running command: sbatch --ntasks=4 --dependency afterany:32963823 myjob.sh Result: Submitted batch job 2963824
+    Running command: sbatch --ntasks=8 --dependency afterany:32963824 myjob.sh Result: Submitted batch job 2963825
+    ```
+
+??? example "Example to submit job chain via script"
+
+    This script submits three different job files, which will be executed one after each other. Of
+    course, the dependency reasons can be adapted.
+
+    ```bash title="submit_job_chain.sh"
+    #!/bin/bash
+    declare -a job_names=("jobfile_a.sh" "jobfile_b.sh" "jobfile_c.sh")
+    dependency=""
+    arraylength=${#job_names[@]}
+    for (( i=0; i<${arraylength}; i++ )) ; do
+        job_nr=`expr $i + 1`
+        echo "Job ${job_nr}/${arraylength}: ${job_names[$i]}"
+        if [ -n "${dependency}" ] ; then
+            echo "Dependency: after job ${dependency}"
+            dependency="--dependency=afterany:${dependency}"
+        fi
+        job="sbatch ${dependency} ${job_names[$i]}"
+        out=`${job}`
+        dependency=`echo ${out} | awk '{print $4}'`
+    done
+    ```
+
+    The output looks like:
+
+    ```console
+    marie@login$ sh submit_job_chain.sh
+    Job 1/3: jobfile_a.sh
+    Job 2/3: jobfile_b.sh
+    Dependency: after job 2963708
+    Job 3/3: jobfile_c.sh
+    Dependency: after job 2963709
+    ```
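
Both scripts above extract the job ID from the message `Submitted batch job <id>` with `awk`. A possible alternative, sketched here under stated assumptions, is `sbatch --parsable`, which prints only the job ID (followed by `;` and the cluster name on multi-cluster setups). The function name `submit_chain` and the `SBATCH_CMD` override are hypothetical and exist only so the logic can be tried without a Slurm installation. They are not part of this commit.

```shell
#!/bin/bash
# Sketch: submit job files as a chain, each one depending on its predecessor.
# Usage: submit_chain <dependency_type> <job_file>...
# Assumption: SBATCH_CMD is a hypothetical override; it defaults to sbatch.
submit_chain() {
    dependency_type="$1"
    shift
    previous_id=""
    for job_file in "$@"; do
        if [ -n "${previous_id}" ]; then
            job_id="$("${SBATCH_CMD:-sbatch}" --parsable \
                "--dependency=${dependency_type}:${previous_id}" "${job_file}")" || return 1
        else
            job_id="$("${SBATCH_CMD:-sbatch}" --parsable "${job_file}")" || return 1
        fi
        # --parsable prints "jobid" or "jobid;cluster"; keep only the job ID.
        previous_id="${job_id%%;*}"
        echo "${job_file}: job ${previous_id}"
    done
}
```

A call such as `submit_chain afterok jobfile_a.sh jobfile_b.sh jobfile_c.sh` would submit the three job files from the example above, with each job starting only if its predecessor finished successfully. The function stops at the first failed submission so that later jobs are not queued without their dependency.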

 ## Array-Job with Afterok-Dependency and Datamover Usage

 In this example scenario, imagine you need to move data, before starting the main job.
 ...