Commit 32e86ffb authored by Martin Schroschk

Merge, fix links and linter issues

parent 7861b3ca
4 merge requests: !392 Merge preview into contrib guide for browser users, !333 Draft: update NGC containers, !327 Merge preview into main, !317 Jobs and resources
@@ -248,7 +248,6 @@ provide a comprehensive collection of job examples.
* Submisson: `marie@login$ sbatch batch_script.sh`
* Run with fewer MPI tasks: `marie@login$ sbatch --ntasks 14 batch_script.sh`
## Manage and Control Jobs
### Job and Slurm Monitoring
@@ -321,6 +320,7 @@ We'd like to point your attention to the following options gain insight in your
```console
marie@login$ sacct -j <JOBID>
```
??? example "Show all fields for a specific job"
```console
@@ -332,8 +332,9 @@ We'd like to point your attention to the following options gain insight in your
```console
marie@login$ sacct -j <JOBID> -o JobName,MaxRSS,MaxVMSize,CPUTime,ConsumedEnergy
```
-The manual page (`man sacct`) and the [online reference](https://slurm.schedmd.com/sacct.html) provide a
-comprehensive documentation regarding available fields and formats.
+The manual page (`man sacct`) and the [online reference](https://slurm.schedmd.com/sacct.html)
+provide a comprehensive documentation regarding available fields and formats.
!!! hint "Time span"
@@ -427,6 +428,7 @@ srun --ntasks 8 --cpus-per-task $OMP_NUM_THREADS ./application
![Hybrid MPI and OpenMP](misc/hybrid.png)
{: align=center}
## Node Features for Selective Job Submission
The nodes in our HPC system are becoming more diverse in multiple aspects: hardware, mounted
......
@@ -18,7 +18,7 @@ to find out, which PyTorch modules are available on your partition.
We recommend using **Alpha** and/or **ML** partitions when working with machine learning workflows
and the PyTorch library.
You can find detailed hardware specification in our
-[hardware documentation](../jobs_and_resources/hardware_taurus.md).
+[hardware documentation](../jobs_and_resources/hardware_overview.md).
## PyTorch Console
@@ -44,7 +44,7 @@ Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies
marie@alpha$ pip install torchvision --no-deps
```
-Using the **--no-deps** option for "pip install" is necessary here as otherwise the PyTorch
+Using the **--no-deps** option for "pip install" is necessary here as otherwise the PyTorch
version might be replaced and you will run into trouble with the cuda drivers.
On the **ML** partition:
......
@@ -81,4 +81,4 @@ marie@local$ ssh -N -f -L 6006:taurusi8034.taurus.hrsk.tu-dresden.de:6006 <zih-l
Now, you can see the TensorBoard in your browser at `http://localhost:6006/`.
-Note that you can also use TensorBoard in an [sbatch file](../jobs_and_resources/batch_systems.md).
+Note that you can also use TensorBoard in an [sbatch file](../jobs_and_resources/slurm.md).
@@ -19,7 +19,7 @@ TensorFlow 2 and TensorFlow 1, see the corresponding [section below](#compatibil
We recommend using partitions **Alpha** and/or **ML** when working with machine learning workflows
and the TensorFlow library. You can find detailed hardware specification in our
-[Hardware](../jobs_and_resources/hardware_taurus.md) documentation.
+[Hardware](../jobs_and_resources/hardware_overview.md) documentation.
## TensorFlow Console
......