FROM python:3.8-bullseye
SHELL ["/bin/bash", "-c"]
########
# Base #
########
RUN apt update && apt install -y nodejs npm aspell git
RUN npm install -g markdownlint-cli markdown-link-check
###########################################
# prepare git for automatic merging in CI #
###########################################
RUN git config --global user.name 'Gitlab Bot'
RUN git config --global user.email 'hpcsupport@zih.tu-dresden.de'
RUN mkdir -p ~/.ssh
#see output of `ssh-keyscan gitlab.hrz.tu-chemnitz.de`
RUN echo $'# gitlab.hrz.tu-chemnitz.de:22 SSH-2.0-OpenSSH_7.4\n\
gitlab.hrz.tu-chemnitz.de ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDNixJ1syD506jOtiLPxGhAXsNnVfweFfzseh9/WrNxbTgIhi09fLb5aZI2CfOOWIi4fQz07S+qGugChBs4lJenLYAu4b0IAnEv/n/Xnf7wITf/Wlba2VSKiXdDqbSmNbOQtbdBLNu1NSt+inFgrreaUxnIqvWX4pBDEEGBAgG9e2cteXjT/dHp4+vPExKEjM6Nsxw516Cqv5H1ZU7XUTHFUYQr0DoulykDoXU1i3odJqZFZQzcJQv/RrEzya/2bwaatzKfbgoZLlb18T2LjkP74b71DeFIQWV2e6e3vsNwl1NsvlInEcsSZB1TZP+mKke7JWiI6HW2IrlSaGqM8n4h\n\
gitlab.hrz.tu-chemnitz.de ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIJ/cSNsKRPrfXCMjl+HsKrnrI3HgbCyKWiRa715S99BR\n' > ~/.ssh/known_hosts
WORKDIR /docs
CMD ["mkdocs", "build", "--verbose", "--strict"]
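As a sketch, the image defined above might be built and used locally like this (the image tag is a placeholder; it is assumed you run the commands from the repository root containing this Dockerfile and the documentation sources):

```bash
# Build the image from the repository root (tag name is illustrative)
docker build -t hpc-compendium-ci .

# Run the default command (mkdocs build --verbose --strict) against the
# working copy mounted at the WORKDIR /docs
docker run --rm -v "$PWD":/docs hpc-compendium-ci

# Alternatively, run one of the installed linters instead of the default command
docker run --rm -v "$PWD":/docs hpc-compendium-ci markdownlint docs
```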
Now, create a local clone of your fork
#### Install Dependencies
See [Installation with Docker](#preview-using-mkdocs-with-dockerfile).
<!--- All branches are protected, i.e., only ZIH staff can create branches and push to them --->
if you want to know whether your browser is supported by DCV.
**Check out our new documentation about** [Virtual Desktops](../software/virtual_desktops.md).
To start a JupyterHub session on the partition `dcv` (`taurusi210[4-8]`) with one GPU, six CPU cores
and 2583 MB memory per core, click on:
[https://taurus.hrsk.tu-dresden.de/jupyter/hub/spawn#/~(partition~'dcv~cpuspertask~'6~gres~'gpu*3a1~mempercpu~'2583~environment~'production)](https://taurus.hrsk.tu-dresden.de/jupyter/hub/spawn#/~(partition~'dcv~cpuspertask~'6~gres~'gpu*3a1~mempercpu~'2583~environment~'production))
Optionally, you can modify many different Slurm parameters. For this
```console
marie@login$ srun --pty --partition=interactive --mem-per-cpu=2500 --cpus-per-task=...
[...]
```
Of course, you can adjust the batch job parameters to your liking. Note that the default time limit
in partition `interactive` is only 30 minutes, so you should specify a longer one with `--time` (or `-t`).
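For instance, a two-hour session could be requested like this (the time and memory values are illustrative):

```console
marie@login$ srun --pty --partition=interactive --time=02:00:00 --mem-per-cpu=2500 bash -l
```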
The script will automatically generate a self-signed SSL certificate and place it in your home
# JupyterHub
With our JupyterHub service we offer you a quick and easy way to work with Jupyter notebooks on ZIH
systems. This page covers starting and stopping JupyterHub sessions, error handling and customizing
the environment.
We also provide comprehensive documentation on how to use
cannot give extensive support in every case.
!!! note

    This service is only available for users with an active HPC project.
    See [Application for Login and Resources](../application/overview.md) if you need to apply for
    an HPC project.
JupyterHub is available at
[https://taurus.hrsk.tu-dresden.de/jupyter](https://taurus.hrsk.tu-dresden.de/jupyter).
running the code. We currently offer one for Python, C++, MATLAB and R.
## Stop a Session
It is good practice to stop your session once your work is done. This releases resources for other
users, and less of your quota is consumed. If you just log out or close the window, your server continues
running and **will not stop** until the Slurm job runtime hits the limit (usually 8 hours).
Useful pages for valid batch system parameters:
If the connection to your notebook server unexpectedly breaks, you will get this error message.
Sometimes your notebook server might hit a batch system or hardware limit and gets killed. Then
usually the log file of the corresponding batch job might contain useful information. These log
files are located in your `home` directory and have the name `jupyter-session-<jobid>.log`.
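For example, to inspect such a log file (the job ID is a placeholder):

```console
marie@login$ ls ~/jupyter-session-*.log
marie@login$ less ~/jupyter-session-1234567.log
```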
## Advanced Tips
You can switch kernels of existing notebooks in the kernel menu:
You now have the option to preload modules from the [module system](../software/modules.md).
Select multiple modules that will be preloaded before your notebook server starts. The list of
available modules depends on the module environment you want to start the session in (`scs5` or
`ml`). The right module environment will be chosen by your selected partition.
# JupyterHub for Teaching
On this page, we want to introduce to you some useful features if you want to use JupyterHub for
teaching.
!!! note
Please be aware of the following notes:
- ZIH systems operate at a lower availability level than your usual Enterprise Cloud VM. There can
always be downtimes, e.g. of the filesystems or the batch system.
- Scheduled downtimes are announced by email. Please plan your courses accordingly.
- Access to HPC resources is handled through projects. See your course as a project. Projects need
to be registered beforehand (more info on the page [Access](../application/overview.md)).
- Don't forget to [add your users](../application/project_management.md#manage-project-members-dis-enable)
(e.g. students or tutors) to your project.
- It might be a good idea to [request a reservation](../jobs_and_resources/overview.md#exclusive-reservation-of-hardware)
of part of the compute resources for your project/course to avoid unnecessary waiting times in
the batch system queue.
## Clone a Repository With a Link
This feature is based on [nbgitpuller](https://github.com/jupyterhub/nbgitpuller). Further information
can be found in the [external documentation about nbgitpuller](https://jupyterhub.github.io/nbgitpuller/).
This extension for Jupyter notebooks can clone every public git repository into the user's work
directory. It offers a quick way to distribute notebooks and other material to your students.
The following parameters are available:

| Parameter | Description |
|---|---|
|`repo` | path to git repository|
|`branch` | branch in the repository to pull from; default: `master`|
|`urlpath` | URL to redirect the user to a certain file, [more info about parameter urlpath](https://jupyterhub.github.io/nbgitpuller/topic/url-options.html#urlpath)|
|`depth` | clone only a certain number of latest commits; not recommended|
This [link generator](https://jupyterhub.github.io/nbgitpuller/link?hub=https://taurus.hrsk.tu-dresden.de/jupyter/)
might help to create those links.
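Putting the parameters together, a distribution link might look like the following sketch (the repository, branch, and notebook path are hypothetical):

```
https://taurus.hrsk.tu-dresden.de/jupyter/hub/user-redirect/git-pull?repo=https://github.com/example/course-material&branch=main&urlpath=lab/tree/course-material/exercise01.ipynb
```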
## Spawn Options Pass-through with URL Parameters
The spawn form now offers a quick start mode by passing URL parameters.
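Following the scheme of the spawn link shown for the partition `dcv` above, such a quick-start URL might look like this (the parameter values are illustrative):

```
https://taurus.hrsk.tu-dresden.de/jupyter/hub/spawn#/~(partition~'interactive~cpuspertask~'2~mempercpu~'2000~environment~'production)
```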
connection to enter the campus network. While active, it allows the user to connect to the
HPC login nodes.
For more information on our VPN and how to set it up, please visit the corresponding
[ZIH service catalog page](https://tu-dresden.de/zih/dienste/service-katalog/arbeitsumgebung/zugang_datennetz/vpn).
## Connecting from Linux
Any project has:
## Third step: Hardware
![picture 4: Hardware >](misc/request_step3_machines.png "Hardware"){loading=lazy width=300 style="float:right"}
This step inquires about the required hardware. The
[hardware specifications](../jobs_and_resources/hardware_overview.md) might help you to estimate,
e.g., the compute time.
Please fill in the total computing time you expect in the project runtime. The compute time is
given in cores per hour (CPU/h); this refers to the 'virtual' cores for nodes with hyperthreading.
If you require GPUs, then this is given as GPU units per hour (GPU/h). Please add 6 CPU hours per
GPU hour in your application.
The project home is a shared storage in your project. Here you exchange data or install software
for your project group in user space. The directory is not intended for active calculations; for
this, the `scratch` filesystem is available.
# Intermediate Archive
With the "Intermediate Archive", ZIH is closing the gap between a normal disk-based filesystem and
[Long-term Archive](preservation_research_data.md). The Intermediate Archive is a hierarchical
filesystem with disks for buffering and tapes for storing research data.
Its intended use is the storage of research data for a maximal duration of 3 years. For storing the
data after exceeding this time, the user has to supply essential metadata and migrate the files to
the [Long-term Archive](preservation_research_data.md). Until then, users have to keep track of
their files.
Some more information:
# Long-term Preservation for Research Data
## Why should research data be preserved?
Below are some examples:
- ISBN
- possible meta-data for an electronically saved image would be:
- resolution of the image
- information about the color depth of the picture
- file format (jpg or tiff or ...)
- file size
- how was this image created (digital camera, scanner, ...)
- description of what the image shows
information about managing research data.
## I want to store my research data at ZIH. How can I do that?
Long-term preservation of research data is under construction at ZIH and in a testing phase.
Nevertheless, you can already use the archiving service. If you would like to become a test
user, please write an email to [Dr. Klaus Köhler](mailto:klaus.koehler@tu-dresden.de).
Storage systems differ in terms of capacity, streaming bandwidth, IOPS rate, etc. Price and
efficiency don't allow us to have it all in one. That is why fast parallel filesystems at ZIH have
restrictions with regards to **age of files** and [quota](permanent.md#quotas). The mechanism of workspaces
enables users to better manage their HPC data.
The concept of workspaces is common and used at a large number of HPC centers.
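As an illustration, a typical workspace life cycle with the common workspace tools might look like this (the command set, workspace name, and durations are assumptions; check the documentation of your system):

```console
marie@login$ ws_allocate my_workspace 30   # allocate a workspace for 30 days
marie@login$ ws_list                       # list workspaces and expiration dates
marie@login$ ws_extend my_workspace 30     # extend the lifetime before expiry
marie@login$ ws_release my_workspace       # release the workspace when done
```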
Contributions from the user side are highly welcome.
**2021-10-05** Offline-maintenance (black building test)
**2021-09-29** Introduction to HPC at ZIH ([HPC introduction slides](misc/HPC-Introduction.pdf))
### Python Virtual Environments
[Virtual environments](../software/python_virtual_environments.md) allow users to install
additional Python packages and create an isolated
runtime environment. We recommend using `virtualenv` for this purpose.
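A minimal sketch, assuming a recent Python 3 is available in your environment (the directory name is a placeholder; `python3 -m venv` is used here as a stand-in for `virtualenv`):

```shell
# Create an isolated environment and activate it
python3 -m venv my-venv
source my-venv/bin/activate

# Packages installed now end up inside my-venv, not in your home directory
pip install numpy

# Leave the environment again
deactivate
```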
and `--distribution` for different job types.
## OpenMP Strategies
The illustration below shows the default binding of a pure OpenMP job on a single node with 16 CPUs
on which 16 threads are allocated.
![OpenMP](misc/openmp.png)
{: align=center}

!!! example "Default binding and default distribution"

    ```bash
    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --tasks-per-node=1
    #SBATCH --cpus-per-task=16

    export OMP_NUM_THREADS=16

    srun --ntasks 1 --cpus-per-task $OMP_NUM_THREADS ./application
    ```
## MPI Strategies
node and odd on each second socket of each node.
### Core Bound
!!! note

    With this command the tasks will be bound to a core for the entire runtime of your
    application.
#### Distribution: block:block
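A sketch of what a `block:block` job script could look like (node and task counts are illustrative):

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --tasks-per-node=8
#SBATCH --cpus-per-task=1

# block:block fills the cores of the first node/socket before moving on
srun --ntasks 16 --cpu_bind=cores --distribution=block:block ./application
```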
For MPI-parallel jobs one typically allocates one core per task.
### Multiple Programs Running Simultaneously in a Job
In this short example, our goal is to run four instances of a program concurrently in a **single**
batch script. Of course, we could also start a batch script four times with `sbatch` but this is not
what we want to do here. However, you can also find an example about
[how to run GPU programs simultaneously in a single job](#running-multiple-gpu-applications-simultaneously-in-a-batch-job)
below.
!!! example " "
## Array-Job with Afterok-Dependency and Datamover Usage
This part is under construction.
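Until the full example is ready, here is a rough sketch of the idea, assuming a hypothetical compute script `compute.sh` and a data-moving script `stage_out.sh`:

```bash
# Submit an array job and capture its job ID
jobid=$(sbatch --parsable --array=0-3 compute.sh)

# The follow-up job starts only if all array tasks finish successfully
sbatch --dependency=afterok:${jobid} stage_out.sh
```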
# Big Data Analytics
[Apache Spark](https://spark.apache.org/), [Apache Flink](https://flink.apache.org/)
and [Apache Hadoop](https://hadoop.apache.org/) are frameworks for processing and integrating
# Building Software
While it is possible to do short compilations on the login nodes, it is generally considered good
practice to use a job for that, especially when using many parallel make processes. Since 2016,
the `/projects` filesystem is mounted read-only on all compute
nodes in order to prevent users from doing large I/O there (which is what the `/scratch` is for).
In consequence, you cannot compile in `/projects` within a job. If you wish to install
software for your project group anyway, you can use a build directory in the `/scratch` filesystem
instead.
Every sane build system should allow you to keep your source code tree and your build directory
separate; some even demand that they be different directories. Plus, you can set your installation
For instance, when using CMake and keeping your source in `/projects`, you could:
```console
# save path to your source directory:
marie@login$ export SRCDIR=/projects/p_marie/mysource

# create a build directory in /scratch:
marie@login$ mkdir /scratch/p_marie/mysoftware_build

# change to build directory within /scratch:
marie@login$ cd /scratch/p_marie/mysoftware_build

# create Makefiles:
marie@login$ cmake -DCMAKE_INSTALL_PREFIX=/projects/p_marie/mysoftware $SRCDIR

# build in a job:
marie@login$ srun --mem-per-cpu=1500 --cpus-per-task=12 --pty make -j 12
```
The OpenFOAM (Open Field Operation and Manipulation) CFD Toolbox can simulate anything from complex
fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics,
electromagnetics and the pricing of financial options. OpenFOAM is developed primarily by
[OpenCFD Ltd](https://www.openfoam.com) and is freely available and open-source,
licensed under the GNU General Public License.
The command `module spider OpenFOAM` provides the list of installed OpenFOAM versions. In order to
use OpenFOAM, it is mandatory to set the environment by sourcing the `bashrc` (for users running
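A sketch of these steps (the version string is a placeholder, and it is assumed that the loaded module provides the `$FOAM_BASH` variable pointing to the `bashrc`):

```console
marie@login$ module spider OpenFOAM     # list installed versions
marie@login$ module load OpenFOAM/2.4.0
marie@login$ source $FOAM_BASH          # set up the OpenFOAM environment (bash users)
```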
In some cases using Singularity requires a Linux machine with root privileges (e.g. using the
partition `ml`), the same architecture and a compatible kernel. For many reasons, users on ZIH
systems cannot be granted root permissions. A solution is a Virtual Machine (VM) on the partition
`ml` which allows users to gain root permissions in an isolated environment. There are two main
options on how to work with Virtual Machines on ZIH systems:
1. [VM tools](virtual_machines_tools.md): Automated algorithms for using virtual machines;
1. [Manual method](virtual_machines.md): It requires more operations but gives you more flexibility
and reliability.
execution. Follow the instructions for [locally installing Singularity](#local-installation) and
[container creation](#container-creation). Moreover, existing Docker containers can easily be
converted, see [Import a docker container](#importing-a-docker-container).
If you are already familiar with Singularity, you might be more interested in our [singularity
recipes and hints](singularity_recipe_hints.md).
### Local Installation