Commit 2678aa71 authored by Martin Schroschk

Merge branch 'Python_fixes' into 'preview'

Replace Python.md

See merge request zih/hpc-compendium/hpc-compendium!140
parents 8ee7b4f0 ffa3cfac
3 merge requests: !322 Merge preview into main, !319 Merge preview into main, !140 Replace Python.md
# Python for Data Analytics

Python is a high-level interpreted language widely used in research and
science. Using HPC allows you to work with Python faster and more
effectively. Taurus provides many packages and libraries that extend
Python's functionality and help you use all features of Python while
avoiding its drawbacks.

**Prerequisites:** To work with Python, you need [access](../access/Login.md) to the
Taurus system and basic knowledge of Python, NumPy and the Slurm system.

**Aim** of this page is to introduce users to working with Python on the
[HPC-DA](../use_of_hardware/Power9.md) system, which is part of the TU Dresden HPC system.

There are three main options on how to work with Keras and TensorFlow on the HPC-DA:
1. Modules; 2. [JupyterNotebook](JupyterHub.md); 3. [Containers](containers.md).
The main way is using the [Modules system](modules.md) and a Python virtual environment.

**Note:** You could work with simple examples in your home directory, but according to
[the storage concept](../data_management/HPCStorageConcept2019.md) please use **workspaces**
for your study and work projects.

## Virtual environment

There are two methods of working with virtual environments on Taurus:

1. **Virtualenv** is a standard Python tool to create isolated Python environments.
   It is the preferred interface for managing installations and virtual environments
   on Taurus and is part of the Python modules.
2. **Conda** is an alternative method for managing installations and
   virtual environments on Taurus. Conda is an open-source package
   management system and environment management system from Anaconda. The
   conda manager is included in all versions of Anaconda and Miniconda.
@@ -47,25 +39,28 @@ conda manager is included in all versions of Anaconda and Miniconda.
with the virtual environments previously created with the conda tool and
vice versa! Prefer virtualenv whenever possible.

This example shows how to start working with **Virtualenv** and a Python
virtual environment (using the module system):

    srun -p ml -N 1 -n 1 -c 7 --mem-per-cpu=5772 --gres=gpu:1 --time=04:00:00 --pty bash #Job submission in ml nodes with 1 gpu on 1 node.

    mkdir python-environments #Optional: Create folder. Please use Workspaces!

    module load modenv/ml #Changing the environment. Example output: The following have been reloaded with a version change: 1) modenv/scs5 => modenv/ml
    ml av Python #Check the available modules with Python
    module load Python #Load default Python. Example output: Module Python/3.7.4-GCCcore-8.3.0 with 7 dependencies loaded
    which python #Check which python you are using
    virtualenv --system-site-packages python-environments/envtest #Create virtual environment
    source python-environments/envtest/bin/activate #Activate virtual environment. Example output: (envtest) bash-4.2$
    python #Start python

    from time import gmtime, strftime
    print(strftime("%Y-%m-%d %H:%M:%S", gmtime())) #Example output: 2019-11-18 13:54:16
    exit() #Leave python

    deactivate #Leave the virtual environment

The [virtualenv](https://virtualenv.pypa.io/en/latest/) Python module (Python 3) provides support
for creating virtual environments with their own site directories,
optionally isolated from system site directories. Each
virtual environment has its own Python binary (which matches the version
of the binary that was used to create this environment) and can have its
own independent set of installed Python packages in its site
@@ -76,11 +71,10 @@ installation. When you switch projects, you can simply create a new
virtual environment and not have to worry about breaking the packages
installed in other environments.

In your virtual environment, you can use packages from the [Complete
List of Modules](SoftwareModulesList). If you do not find what you
need, you can install the required packages with the command `pip install`. With the command
`pip freeze`, you can see a list of all installed packages and their versions.
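
To confirm from inside Python that the interpreter really runs in an activated environment, a small check like the following can help. This is a minimal sketch, not part of the Taurus documentation: the function name `in_virtual_environment` is made up for illustration, and the check relies only on the standard `sys` module.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True if the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix. Newer releases (and the
    # standard venv module) instead make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("environment prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtual_environment())
```

After `source python-environments/envtest/bin/activate`, the reported interpreter path should point into the `envtest` directory.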

This example shows how to start working with **Conda** and a virtual
environment (using the module system):
@@ -99,23 +93,21 @@ environment (with using module system)
    conda deactivate #Leave the virtual environment

You can control where a conda environment
lives by providing a path to a target directory when creating the
environment. For example, the following command will create a new
environment in a workspace located in `scratch`:

    conda create --prefix /scratch/ws/<name_of_your_workspace>/conda-virtual-environment/<name_of_your_environment>

**Please pay attention:** using `srun` directly on the shell blocks the
shell and launches an interactive job. Apart from short test runs, it is
**recommended to launch your jobs into the background by using batch jobs**.
For that, you can conveniently put the parameters directly into the job
file, which you can submit using `sbatch [options] <job file>`.
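
As an illustration of what such a job file can contain, the sketch below builds one from Python. It reuses the resource values of the interactive `srun` example on this page, written as long-form `#SBATCH` options. The function name `render_job_script`, the script name `my_script.py` and the file name `job.sh` are hypothetical, and the module and environment lines assume the virtualenv setup shown earlier.

```python
def render_job_script(command, partition="ml", nodes=1, ntasks=1,
                      cpus_per_task=7, mem_per_cpu=5772, gres="gpu:1",
                      time_limit="04:00:00"):
    """Return the text of a Slurm batch script that runs `command`."""
    lines = [
        "#!/bin/bash",
        "#SBATCH --partition={}".format(partition),
        "#SBATCH --nodes={}".format(nodes),
        "#SBATCH --ntasks={}".format(ntasks),
        "#SBATCH --cpus-per-task={}".format(cpus_per_task),
        "#SBATCH --mem-per-cpu={}".format(mem_per_cpu),
        "#SBATCH --gres={}".format(gres),
        "#SBATCH --time={}".format(time_limit),
        "",
        # Assumed environment setup, mirroring the interactive example.
        "module load modenv/ml",
        "module load Python",
        "source python-environments/envtest/bin/activate",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script so it can be inspected or redirected into a file.
    print(render_job_script("python my_script.py"), end="")
```

Redirecting the output into a file, for example `job.sh`, gives a script that can be submitted with `sbatch job.sh`.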

## Jupyter Notebooks

Jupyter notebooks are a great way for interactive computing in your web
browser. Jupyter allows working with data cleaning and transformation,
@@ -125,20 +117,17 @@ course with machine learning.
There are two general options on how to work with Jupyter notebooks using
HPC.

On Taurus, there is [JupyterHub](JupyterHub.md), where you can simply run your Jupyter notebook
on HPC nodes. Alternatively, you can run a remote Jupyter server within an sbatch
GPU job with the modules and packages you need. You can find the manual server
setup [here](DeepLearning.md).

With JupyterHub you can work with general
data analytics tools. This is the recommended way to start working with
Taurus. However, some special tools might not be available on
JupyterHub.

**Keep in mind that the remote Jupyter server can offer more freedom with settings and approaches.**

## MPI for Python
@@ -149,12 +138,11 @@ a library specification that allows HPC to pass information between its
various nodes and clusters. MPI is designed to provide access to advanced
parallel hardware for end-users, library writers and tool developers.

### Why use MPI?

MPI provides a powerful, efficient and portable way to express parallel
programs.
Among many parallel computational models, message-passing has proven to be an effective one.

### Parallel Python with mpi4py
@@ -173,8 +161,9 @@ optimized communication of NumPy arrays.
Mpi4py is included as an extension of the SciPy-bundle modules on
Taurus.

Please check the SoftwareModulesList for the availability of modules. You can check
whether a module includes mpi4py with the `module whatis <name_of_the_module>` command.
The `module whatis` command displays short information about the module and its
included extensions.
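
Once a module that includes mpi4py is loaded, a parallel program typically splits the work by process rank and combines the partial results with a collective operation. The following is a minimal sketch of that pattern, not an official Taurus example: the function names `block_range` and `parallel_sum_of_squares` are made up for illustration. `mpi4py` is imported inside the function so that the pure-Python helper can be tried without an MPI installation.

```python
def block_range(n_items, rank, size):
    """Return the half-open index range [start, stop) owned by `rank`."""
    # Spread the remainder over the first `extra` ranks so that block sizes
    # differ by at most one item.
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def parallel_sum_of_squares(n_items):
    """Sum i*i for i in range(n_items) across all MPI processes."""
    from mpi4py import MPI  # requires a loaded module that provides mpi4py

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    # Each process computes the partial sum over its own block of indices.
    start, stop = block_range(n_items, rank, size)
    local = sum(i * i for i in range(start, stop))

    # allreduce combines the partial sums and returns the total on every rank.
    total = comm.allreduce(local, op=MPI.SUM)
    return rank, size, total
```

To try it, save the code to a file, add a call such as `print(parallel_sum_of_squares(1000))` at the end, and start it with several tasks inside a Slurm allocation, for example with `srun python <your_file>.py`.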
@@ -198,8 +187,7 @@ environment:
### Horovod

[Horovod](https://github.com/horovod/horovod) is an open source distributed training
framework for TensorFlow, Keras and PyTorch. It is supposed to make it easy
to develop distributed deep learning projects and speed them up with
TensorFlow.
@@ -207,8 +195,7 @@ TensorFlow.
#### Why use Horovod?

Horovod allows you to easily take a single-GPU TensorFlow or PyTorch
program and successfully train it on many GPUs! In
some cases, the MPI model is much more straightforward and requires far
fewer code changes than the distributed code from TensorFlow, for
instance with parameter servers. Horovod uses MPI and NCCL which gives
@@ -218,7 +205,7 @@ in some cases better results than pure TensorFlow and PyTorch.
Horovod is available as a module with **TensorFlow** or **PyTorch** for
**all** module environments. Please check the [software module
list](modules.md) for the current version of the software.
Horovod can be loaded like other software on Taurus:

    ml av Horovod #Check available modules with Horovod
@@ -232,23 +219,41 @@ install Horovod you need to create a virtual environment and load the
dependencies (e.g. MPI). Installing PyTorch can take a few hours and is
not recommended.

**Note:** You could work with simple examples in your home directory but **please use workspaces
for your study and work projects** (see the Storage concept).

Setup:

    srun -N 1 --ntasks-per-node=6 -p ml --time=08:00:00 --pty bash #allocate a Slurm job allocation, which is a set of resources (nodes)
    module load modenv/ml #Load dependencies by using modules
    module load OpenMPI/3.1.4-gcccuda-2018b
    module load Python/3.6.6-fosscuda-2018b
    module load cuDNN/7.1.4.18-fosscuda-2018b
    module load CMake/3.11.4-GCCcore-7.3.0
    virtualenv --system-site-packages <location_for_your_environment> #create virtual environment
    source <location_for_your_environment>/bin/activate #activate virtual environment

Or when you need to use conda:

    srun -N 1 --ntasks-per-node=6 -p ml --time=08:00:00 --pty bash #allocate a Slurm job allocation, which is a set of resources (nodes)
    module load modenv/ml #Load dependencies by using modules
    module load OpenMPI/3.1.4-gcccuda-2018b
    module load PythonAnaconda/3.6
    module load cuDNN/7.1.4.18-fosscuda-2018b
    module load CMake/3.11.4-GCCcore-7.3.0
    conda create --prefix=<location_for_your_environment> python=3.6 anaconda #create virtual environment
    conda activate <location_for_your_environment> #activate virtual environment

Install PyTorch (not recommended):

    cd /tmp
    git clone https://github.com/pytorch/pytorch #clone PyTorch from the source
    cd pytorch #go to folder
    git checkout v1.7.1 #Checkout version (example: 1.7.1)
    git submodule update --init #Update dependencies
    python setup.py install #install it with python

##### Install Horovod for PyTorch with python and pip
@@ -260,10 +265,10 @@ details.
##### Verify that Horovod works

    python #start python
    import torch #import pytorch
    import horovod.torch as hvd #import horovod
    hvd.init() #initialize horovod
    hvd.size()
    hvd.rank()
    print('Hello from:', hvd.rank())
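
Beyond this quick check, a training script usually wraps its optimizer with Horovod and synchronizes the initial state across workers. The sketch below shows that setup for PyTorch under stated assumptions: the function names `scaled_learning_rate` and `build_distributed_optimizer` are hypothetical, the choice of SGD is arbitrary, and scaling the learning rate linearly with the number of workers is a common convention rather than a requirement.

```python
def scaled_learning_rate(base_lr, world_size):
    """Scale the single-GPU learning rate linearly with the number of workers."""
    return base_lr * world_size


def build_distributed_optimizer(model, base_lr=0.01):
    """Wrap an SGD optimizer for `model` so that gradients are averaged by Horovod."""
    # Imported here so the helper above can be used without PyTorch or Horovod.
    import torch
    import horovod.torch as hvd

    hvd.init()
    if torch.cuda.is_available():
        # Pin each process to one GPU, selected by its local rank on the node.
        torch.cuda.set_device(hvd.local_rank())
        model.cuda()

    optimizer = torch.optim.SGD(
        model.parameters(), lr=scaled_learning_rate(base_lr, hvd.size())
    )
    # Average gradients across all workers during optimizer.step().
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters()
    )
    # Start all workers from the same weights and optimizer state.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)
    return optimizer
```

The returned optimizer is used like a regular PyTorch optimizer inside the training loop. The script itself has to be started with one process per GPU.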
@@ -273,7 +278,5 @@ details.
If you want to use NCCL instead of MPI you can specify that in the
install command after loading the NCCL module:

    module load NCCL/2.3.7-fosscuda-2018b
    HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL HOROVOD_WITHOUT_TENSORFLOW=1 HOROVOD_WITH_PYTORCH=1 HOROVOD_WITHOUT_MXNET=1 pip install --no-cache-dir horovod