
Compare revisions

Changes are shown as if the source revision was being merged into the target revision.

Commits on Source (24)
@@ -14,7 +14,7 @@ RUN pip install mkdocs>=1.1.2 mkdocs-material>=7.1.0 mkdocs-htmlproofer-plugin==
 RUN apt-get update && apt-get install -y nodejs npm aspell git git-lfs
-RUN npm install -g markdownlint-cli markdown-link-check
+RUN npm install -g markdownlint-cli@0.32.2 markdown-link-check
 ###########################################
 # prepare git for automatic merging in CI #
@@ -38,6 +38,9 @@ RUN echo 'test \! -e /docs/tud_theme/javascripts/mermaid.min.js && test -x /docs
 RUN echo 'exec "$@"' >> /entrypoint.sh
 RUN chmod u+x /entrypoint.sh
+# Workaround https://gitlab.com/gitlab-org/gitlab-runner/-/issues/29022
+RUN git config --global --add safe.directory /docs
 WORKDIR /docs
 CMD ["mkdocs", "build", "--verbose", "--strict"]
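With this image, the documentation build is typically run with the repository bind-mounted at `/docs`; a minimal sketch (the tag `hpc-compendium` is an assumption, not fixed by the Dockerfile):

```console
marie@local$ docker build -t hpc-compendium .
marie@local$ docker run --rm -v "$(pwd)":/docs hpc-compendium
```

The bind mount is also why the added `safe.directory` workaround points at `/docs`: git inside the container would otherwise refuse to operate on a repository owned by a different user, which is what the linked runner issue describes.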
...
@@ -328,8 +328,8 @@ specifications for each component of the heterogeneous job should be separated w
 Running a job step on a specific component is supported by the option `--het-group`.
 ```console
-marie@login$ salloc --ntasks 1 --cpus-per-task 4 --partition <partition> --mem=200G : \
-             --ntasks 8 --cpus-per-task 1 --gres=gpu:8 --mem=80G --partition <partition>
+marie@login$ salloc --ntasks=1 --cpus-per-task=4 --partition <partition> --mem=200G : \
+             --ntasks=8 --cpus-per-task=1 --gres=gpu:8 --mem=80G --partition <partition>
 [...]
 marie@login$ srun ./my_application <args for master tasks> : ./my_application <args for worker tasks>
 ```
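A single job step can also be pinned to one component via its zero-based group index; a minimal sketch (assuming the two components allocated above):

```console
marie@login$ srun --het-group=0 ./my_application <args for master tasks>
marie@login$ srun --het-group=1 ./my_application <args for worker tasks>
```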
@@ -340,16 +340,16 @@ components by a line containing the directive `#SBATCH hetjob`.
 ```bash
 #!/bin/bash
-#SBATCH --ntasks 1
-#SBATCH --cpus-per-task 4
-#SBATCH --partition <partition>
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=4
+#SBATCH --partition=<partition>
 #SBATCH --mem=200G
 #SBATCH hetjob # required to separate groups
-#SBATCH --ntasks 8
-#SBATCH --cpus-per-task 1
+#SBATCH --ntasks=8
+#SBATCH --cpus-per-task=1
 #SBATCH --gres=gpu:8
 #SBATCH --mem=80G
-#SBATCH --partition <partition>
+#SBATCH --partition=<partition>
 srun ./my_application <args for master tasks> : ./my_application <args for worker tasks>
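Such a script is submitted like any other batch job; a minimal sketch (the filename `hetjob.sh` is only illustrative):

```console
marie@login$ sbatch hetjob.sh
```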
@@ -474,7 +474,7 @@ at no extra cost.
 ??? example "Show all jobs since the beginning of year 2021"
     ```console
-    marie@login$ sacct -S 2021-01-01 [-E now]
+    marie@login$ sacct --starttime 2021-01-01 [--endtime now]
     ```
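The long-form options accept the same date expressions as their short counterparts, and the output can be narrowed with `--format`; a sketch (field names as understood by `sacct`):

```console
marie@login$ sacct --starttime 2021-01-01 --format=JobID,JobName,State,Elapsed
```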
 ## Jobs at Reservations
...
@@ -186,7 +186,7 @@ When `srun` is used within a submission script, it inherits parameters from `sba
 `--ntasks=1`, `--cpus-per-task=4`, etc. So we actually implicitly run the following
 ```bash
-srun --ntasks=1 --cpus-per-task=4 ... --partition=ml some-gpu-application
+srun --ntasks=1 --cpus-per-task=4 [...] --partition=ml <some-gpu-application>
 ```
 Now, our goal is to run four instances of this program concurrently in a single batch script. Of
@@ -237,7 +237,7 @@ inherited from the surrounding `sbatch` context. The following line would be suf
 job in this example:
 ```bash
-srun --exclusive --gres=gpu:1 --ntasks=1 some-gpu-application &
+srun --exclusive --gres=gpu:1 --ntasks=1 <some-gpu-application> &
 ```
 Yet, it adds some extra safety to leave them in, enabling the Slurm batch system to complain if not
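Combining these pieces, the four concurrent instances are typically launched in a loop and collected with `wait` so the batch script does not exit before its job steps finish; a minimal sketch under the assumptions of this example:

```bash
for i in 1 2 3 4; do
    srun --exclusive --gres=gpu:1 --ntasks=1 <some-gpu-application> &
done
wait    # block until all four background job steps have completed
```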
@@ -278,7 +278,8 @@ use up all resources in the nodes:
 #SBATCH --exclusive # ensure that nobody spoils my measurement on 2 x 2 x 8 cores
 #SBATCH --time=00:10:00
 #SBATCH --job-name=Benchmark
-#SBATCH --mail-user=your.name@tu-dresden.de
+#SBATCH --mail-type=end
+#SBATCH --mail-user=<your.email>@tu-dresden.de
 srun ./my_benchmark
 ```
@@ -313,14 +314,14 @@ name specific to the job:
 ```Bash
 #!/bin/bash
-#SBATCH --array 0-9
+#SBATCH --array=0-9
 #SBATCH --output=arraytest-%A_%a.out
 #SBATCH --error=arraytest-%A_%a.err
 #SBATCH --ntasks=864
 #SBATCH --time=08:00:00
 #SBATCH --job-name=Science1
 #SBATCH --mail-type=end
-#SBATCH --mail-user=your.name@tu-dresden.de
+#SBATCH --mail-user=<your.email>@tu-dresden.de
 echo "Hi, I am step $SLURM_ARRAY_TASK_ID in this array job $SLURM_ARRAY_JOB_ID"
 ```
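If all ten array tasks should not start at once, Slurm's `%` separator throttles concurrency, and the index can select per-task input; a short sketch (the `%4` limit and the input file naming are illustrative assumptions):

```bash
#SBATCH --array=0-9%4    # at most 4 array tasks run simultaneously
srun ./my_application input_${SLURM_ARRAY_TASK_ID}.dat
```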
...
@@ -66,7 +66,7 @@ marie@login$ srun --ntasks 1 --partition <partition> mpicc -g -o fancy-program f
 # Allocate interactive session with 1 extra process for MUST
 marie@login$ salloc --ntasks 5 --partition <partition>
-marie@login$ mustrun --must:mpiexec srun --must:np --ntasks --ntasks 4 --must:stacktrace backward ./fancy-program
+marie@login$ mustrun --must:mpiexec srun --must:np --ntasks --must:stacktrace backward --ntasks 4 ./fancy-program
 [MUST] MUST configuration ... centralized checks with fall-back application crash handling (very slow)
 [MUST] Weaver ... success
 [MUST] Code generation ... success
...
@@ -96,7 +96,7 @@ the notebook by pre-loading a specific TensorFlow module:
 You can also define your own Jupyter kernel for more specific tasks. Please read about Jupyter
 kernels and virtual environments in our
-[JupyterHub](../access/jupyterhub.md#creating-and-using-a-custom-environment) documentation.
+[JupyterHub](../access/jupyterhub_custom_environments.md) documentation.
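In essence, such a kernel is a Python environment registered with `ipykernel`; a minimal sketch (the environment name `my-kernel` is only an example, see the linked page for the recommended procedure):

```console
marie@login$ python -m venv my-kernel-env
marie@login$ source my-kernel-env/bin/activate
(my-kernel-env) marie@login$ pip install ipykernel
(my-kernel-env) marie@login$ python -m ipykernel install --user --name my-kernel
```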
 ## TensorFlow in Containers
...
@@ -73,7 +73,17 @@ Launching VampirServer...
 Submitting slurm 30 minutes job (this might take a while)...
 ```
+This way, a job with a timelimit of 30 minutes and default resources is submitted. This might fit
+your needs. If not, please feel free to request a customized job running VampirServer, e.g.
+```console
+marie@login$ vampirserver start --ntasks=8 --time=01:00:00 --mem-per-cpu=3000M --partition=romeo
+Launching VampirServer...
+Submitting slurm 01:00:00 minutes job (this might take a while)...
+```
 Above automatically allocates its resources via the respective batch system. If you want to start
 VampirServer without a batch allocation or from inside an interactive allocation, use
 ```console
...
@@ -6,4 +6,4 @@ scriptpath=${BASH_SOURCE[0]}
 basedir=`dirname "$scriptpath"`
 basedir=`dirname "$basedir"`
 cd $basedir/tud_theme/javascripts
-wget https://unpkg.com/mermaid/dist/mermaid.min.js
+wget https://unpkg.com/mermaid@9.4.0/dist/mermaid.min.js
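Pinning the version (`mermaid@9.4.0`) makes the download reproducible, since an unversioned unpkg URL resolves to the latest release. After running the script, it is worth confirming that the file actually arrived, because the container entrypoint shown above tests for exactly this path; a sketch:

```console
marie@local$ ls -l tud_theme/javascripts/mermaid.min.js
```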