Commit 98218d6c authored by Christoph Lehmann

/software/pytorch.md: fixing for markdownlint

parent 237bd4ee
Related merge requests: !333 (Draft: update NGC containers), !322 (Merge preview into main), !319 (Merge preview into main), !279 (Draft: Machine Learning restructuring), !258 (Data Analytics restructuring)
@@ -14,16 +14,16 @@ marie@login$ module spider pytorch

to find out which PyTorch modules are available on your partition.

We recommend using the **Alpha** and/or **ML** partitions when working with machine learning workflows
and the PyTorch library.
You can find the detailed hardware specification [here](../jobs_and_resources/hardware_taurus.md).

## PyTorch Console

On the **Alpha** partition, load the module environment:
```console
marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash  # interactive job on alpha: 1 GPU, 1 task, 7 CPUs per task, 800 MB per CPU
marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0
The following have been reloaded with a version change:
  1) modenv/scs5 => modenv/hiera
@@ -31,17 +31,16 @@ Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies
```
??? hint "Torchvision on alpha partition"

    On the alpha partition, the torchvision module is not yet available within the module system (as of 19.08.2021).
    Torchvision can be made available by using a virtual environment:

    ```console
    marie@alpha$ virtualenv --system-site-packages python-environments/torchvision_env
    marie@alpha$ source python-environments/torchvision_env/bin/activate
    marie@alpha$ pip install torchvision --no-deps
    ```

    Using the `--no-deps` option for `pip install` is necessary here; otherwise the PyTorch version might be replaced and you will run into trouble with the CUDA drivers.
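
To confirm that installing torchvision left the module-provided PyTorch untouched, you can compare the installed package versions before and after the `pip install` step. The following is a minimal sketch, not part of the original instructions. It assumes Python 3.8 or newer (for `importlib.metadata`) and that the packages expose standard package metadata. The helper names `installed_version` and `report` are hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchvision")):
    """Map each package name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Run once before and once after "pip install torchvision --no-deps";
    # the torch version should be identical in both runs.
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

If the reported `torch` version differs from the one in the loaded module name, the virtual environment has probably shadowed the module installation.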
On the **ML** partition:

@@ -49,7 +48,7 @@ On the **ML** partition:
```console
marie@login$ srun -p ml --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash  # interactive job on ml: 1 GPU, 1 task, 7 CPUs per task, 800 MB per CPU
```

after calling

```console
marie@login$ module spider pytorch
@@ -65,13 +64,14 @@ Module torchvision/0.7.0-fosscuda-2019b-Python-3.7.4-PyTorch-1.6.0 and 55 depend
```
Now we check that we can access PyTorch:

```console
marie@{ml,alpha}$ python -c "import torch; print(torch.__version__)"
```
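
The one-line check above only prints the version. When working on a GPU partition, it can also help to see whether PyTorch can reach the allocated GPU. The following is a small sketch under that assumption. The function name `torch_summary` is hypothetical, and the script deliberately degrades to a "not installed" report when no PyTorch module is loaded.

```python
import importlib.util


def torch_summary():
    """Collect basic facts about the PyTorch installation, if any."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch  # imported lazily so the script also runs without PyTorch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in torch_summary().items():
        print(f"{key}: {value}")
```

Inside a job requested with `--gres=gpu:1`, one would expect `cuda_available` to be `True`. On a login node without GPUs it is normally `False`.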
The following example shows how to create a Python virtual environment and
import PyTorch.

```console
marie@ml$ mkdir python-environments    # create folder
marie@ml$ which python                 # check which Python you are using
/sw/installed/Python/3.7.4-GCCcore-8.3.0/bin/python
marie@ml$ virtualenv --system-site-packages python-environments/env    # create virtual environment "env" that inherits the global site packages
@@ -82,7 +82,7 @@ marie@ml$ python -c "import torch; print(torch.__version__)"
```
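
Once the environment is activated, `which python` should point into `python-environments/env`. The same check can be made from inside Python, which is useful in scripts and notebooks. This is a minimal sketch, not taken from the original page. The function names are hypothetical, and the `sys.real_prefix` fallback is an assumption that covers older `virtualenv` releases.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a virtual environment."""
    # venv and recent virtualenv releases keep the original installation in
    # sys.base_prefix; older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "real_prefix", None) or getattr(
        sys, "base_prefix", sys.prefix
    )
    return base_prefix != sys.prefix


def describe_interpreter():
    """Summarise which interpreter is running and where it lives."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_virtualenv": in_virtual_environment(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Because the environment was created with `--system-site-packages`, `import torch` should still resolve to the module-provided installation even when `in_virtualenv` is `True`.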
## PyTorch in JupyterHub

In addition to using interactive and batch jobs, it is possible to work with PyTorch using JupyterHub.
The production and test environments of JupyterHub contain Python kernels that come with PyTorch support.

![Pytorch module in JupyterHub](misc/Pytorch_jupyter_module.png)
@@ -90,6 +90,4 @@ The production and test environments of JupyterHub contain Python kernels, that

## Distributed PyTorch

For details on how to run PyTorch with multiple GPUs and/or multiple nodes, see [distributed training](distributed_training.md).