From 31f5bdb1833b7c63c9a2cc359ceda1df7162f154 Mon Sep 17 00:00:00 2001
From: Natalie Breidenbach <natalie.breidenbach@tu-dresden.de>
Date: Tue, 28 Nov 2023 13:58:04 +0100
Subject: [PATCH] Update pytorch.md

---
 .../docs/software/pytorch.md | 37 +++++++++++--------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/doc.zih.tu-dresden.de/docs/software/pytorch.md b/doc.zih.tu-dresden.de/docs/software/pytorch.md
index 4d03aec66..249aadb02 100644
--- a/doc.zih.tu-dresden.de/docs/software/pytorch.md
+++ b/doc.zih.tu-dresden.de/docs/software/pytorch.md
@@ -15,18 +15,23 @@ marie@login$ module spider pytorch
 
 to find out, which PyTorch modules are available.
 
-We recommend using partitions `alpha` and/or `ml` when working with machine learning workflows
+We recommend using the cluster `alpha` and/or `power` when working with machine learning workflows
 and the PyTorch library.
 You can find detailed hardware specification in our
 [hardware documentation](../jobs_and_resources/hardware_overview.md).
 
+
+_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are momentarily under construction. The script will be updated after completion of the redesign accordingly_
+
 ## PyTorch Console
 
-On the partition `alpha`, load the module environment:
+On the cluster `alpha`, load the module environment:
+
 
 ```console
 # Job submission on alpha nodes with 1 gpu on 1 node with 800 Mb per CPU
-marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash
+
+marie@login.alpha$ srun --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash
 marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0
 Die folgenden Module wurden in einer anderen Version erneut geladen:
   1) modenv/scs5 => modenv/hiera
@@ -34,9 +39,9 @@ Die folgenden Module wurden in einer anderen Version erneut geladen:
 Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies loaded.
 ```
 
-??? hint "Torchvision on partition `alpha`"
+??? hint "Torchvision on the cluster `alpha`"
 
-    On the partition `alpha`, the module torchvision is not yet available within the module
+    On the cluster `alpha`, the module torchvision is not yet available within the module
     system. (19.08.2021)
     Torchvision can be made available by using a virtual environment:
 
@@ -49,46 +54,46 @@ Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies
     Using the **--no-deps** option for "pip install" is necessary here as otherwise the PyTorch
     version might be replaced and you will run into trouble with the CUDA drivers.
 
-On the partition `ml`:
+On the cluster `power`:
 
 ```console
-# Job submission in ml nodes with 1 gpu on 1 node with 800 Mb per CPU
-marie@login$ srun -p ml --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash
+# Job submission in power nodes with 1 gpu on 1 node with 800 Mb per CPU
+marie@login.power$ srun --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash
 ```
 
 After calling
 
 ```console
-marie@login$ module spider pytorch
+marie@login.power$ module spider pytorch
 ```
 
 we know that we can load PyTorch (including torchvision) with
 
 ```console
-marie@ml$ module load modenv/ml torchvision/0.7.0-fossCUDA-2019b-Python-3.7.4-PyTorch-1.6.0
+marie@power$ module load modenv/ml torchvision/0.7.0-fossCUDA-2019b-Python-3.7.4-PyTorch-1.6.0
 Module torchvision/0.7.0-fossCUDA-2019b-Python-3.7.4-PyTorch-1.6.0 and 55 dependencies loaded.
 ```
 
 Now, we check that we can access PyTorch:
 
 ```console
-marie@{ml,alpha}$ python -c "import torch; print(torch.__version__)"
+marie@{power,alpha}$ python -c "import torch; print(torch.__version__)"
 ```
 
 The following example shows how to create a python virtual environment and import PyTorch.
 
 ```console
 # Create folder
-marie@ml$ mkdir python-environments
+marie@power$ mkdir python-environments
 # Check which python are you using
-marie@ml$ which python
+marie@power$ which python
 /sw/installed/Python/3.7.4-GCCcore-8.3.0/bin/python
 # Create virtual environment "env" which inheriting with global site packages
-marie@ml$ virtualenv --system-site-packages python-environments/env
+marie@power$ virtualenv --system-site-packages python-environments/env
 [...]
 # Activate virtual environment "env". Example output: (env) bash-4.2$
-marie@ml$ source python-environments/env/bin/activate
-marie@ml$ python -c "import torch; print(torch.__version__)"
+marie@power$ source python-environments/env/bin/activate
+marie@power$ python -c "import torch; print(torch.__version__)"
 ```
 
 ## PyTorch in JupyterHub
-- 
GitLab