From 0b688b5f21dd220ae5e02d2e8ca67e4729b5b563 Mon Sep 17 00:00:00 2001
From: Elias Werner <eliwerner3@googlemail.com>
Date: Thu, 26 Aug 2021 13:39:41 +0200
Subject: [PATCH] reviewed machine learning overview

---
 .../docs/software/machine_learning.md         | 64 ++++++-------------
 1 file changed, 20 insertions(+), 44 deletions(-)

diff --git a/doc.zih.tu-dresden.de/docs/software/machine_learning.md b/doc.zih.tu-dresden.de/docs/software/machine_learning.md
index e3ca23e17..3af102cb9 100644
--- a/doc.zih.tu-dresden.de/docs/software/machine_learning.md
+++ b/doc.zih.tu-dresden.de/docs/software/machine_learning.md
@@ -6,15 +6,15 @@ For machine learning purposes, we recommend to use the **Alpha** and/or **ML** p
 ## ML partition
 
 The compute nodes of the ML partition are built on the base of [Power9](https://www.ibm.com/it-infrastructure/power/power9)
-architecture from IBM. The system was created for AI challenges, analytics and working with,
-Machine learning, data-intensive workloads, deep-learning frameworks and accelerated databases.
+architecture from IBM. The system was created for AI challenges, analytics and working with
+data-intensive workloads and accelerated databases.
 
 The main feature of the nodes is the ability to work with the
 [NVIDIA Tesla V100](https://www.nvidia.com/en-gb/data-center/tesla-v100/) GPU with **NV-Link**
 support that allows a total bandwidth with up to 300 gigabytes per second (GB/sec). Each node on the
 ml partition has 6x Tesla V-100 GPUs. You can find a detailed specification of the partition [here](../jobs_and_resources/power9.md).
 
-**Note:** The ML partition is based on the PowerPC Architecture, which means that the software built
+**Note:** The ML partition is based on the Power9 architecture, which means that the software built
 for x86_64 will not work on this partition. Also, users need to use the modules which are
 specially made for the ml partition (from modenv/ml).
 
@@ -29,7 +29,9 @@ marie@ml$ module load modenv/ml    #example output: The following have been relo
 
 ## Alpha partition
 
-- describe alpha partition
+Another partition for machine learning tasks is Alpha. It is mainly dedicated to [ScaDS.AI](https://scads.ai/)
+topics. Each node on Alpha has 2x AMD EPYC CPUs, 8x NVIDIA A100-SXM4 GPUs, 1 TB RAM and 3.5 TB of
+local storage (`/tmp`) on an NVMe device. You can find more details of the partition [here](../jobs_and_resources/alpha_centauri.md).
 
 ### Modules
 
@@ -40,51 +42,22 @@ marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=8000 bash
 marie@romeo$ module load modenv/scs5
 ```
 
-## Machine Learning Console and Virtual Environment
+## Machine Learning via Console
 
-A virtual environment is a cooperatively isolated runtime environment that allows Python users and
-applications to install and update Python distribution packages without interfering with the
-behaviour of other Python applications running on the same system. At its core, the main purpose of
-Python virtual environments is to create an isolated environment for Python projects.
+### Python and Virtual Environments
 
-### Conda virtual environment
+Python users should work within a virtual environment when conducting machine learning tasks via
+console. When submitting your job via [sbatch files](../jobs_and_resources/batch_systems.md), a
+virtual environment is usually not needed.
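+A minimal sketch of creating and activating such a virtual environment on the ML partition could
+look as follows (the folder name `python-environments` and the environment name `env` are only
+examples):
+
+```console
+marie@ml$ module load modenv/ml
+marie@ml$ mkdir python-environments                                      #folder for your environments
+marie@ml$ python3 -m venv --system-site-packages python-environments/env #create virtual environment "env" inheriting the global site packages
+marie@ml$ source python-environments/env/bin/activate                    #activate it; the prompt then starts with (env)
+```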
 
-[Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)
-is an open-source package management system and environment management system from the Anaconda.
+For more details on machine learning or data science with Python see [here](data_analytics_with_python.md).
 
-```console
-marie@login$ srun -p ml -N 1 -n 1 -c 2 --gres=gpu:1 --time=01:00:00 --pty --mem-per-cpu=8000 bash   #job submission in ml nodes with allocating: 1 node, 1 task per node, 2 CPUs per task, 1 gpu per node, with 8000 mb on 1 hour.
-marie@ml$ module load modenv/ml                    #example output: The following have been reloaded with a version change:  1) modenv/scs5 =&gt; modenv/ml
-marie@ml$ mkdir python-virtual-environments        #create folder for your environments
-marie@ml$ cd python-virtual-environments           #go to folder
-marie@ml$ which python                             #check which python are you using
-marie@ml$ python3 -m venv --system-site-packages env                         #create virtual environment "env" which inheriting with global site packages
-marie@ml$ source env/bin/activate                                            #activate virtual environment "env". Example output: (env) bash-4.2$
-```
-
-The inscription (env) at the beginning of each line represents that now you are in the virtual
-environment.
+### R
 
-### Python virtual environment
+R also supports machine learning via console. Due to its different package management, it does not
+require a virtual environment.
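+For illustration, R packages can be installed directly into your user library from the console
+(the module name `R` and the package `randomForest` are only examples and may differ on the
+partition you use):
+
+```console
+marie@ml$ module load modenv/ml R
+marie@ml$ R -e 'install.packages("randomForest")'
+```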
 
-**Virtualenv (venv)** is a standard Python tool to create isolated Python environments.
-It has been integrated into the standard library under the [venv module](https://docs.python.org/3/library/venv.html).
-
-```console
-marie@login$ srun -p ml -N 1 -n 1 -c 2 --gres=gpu:1 --time=01:00:00 --pty --mem-per-cpu=8000 bash   #job submission in ml nodes with allocating: 1 node, 1 task per node, 2 CPUs per task, 1 gpu per node, with 8000 mb on 1 hour.
-marie@ml$ module load modenv/ml                    #example output: The following have been reloaded with a version change:  1) modenv/scs5 =&gt; modenv/ml
-marie@ml$ mkdir python-virtual-environments        #create folder for your environments
-marie@ml$ cd python-virtual-environments           #go to folder
-marie@ml$ which python                             #check which python are you using
-marie@ml$ python3 -m venv --system-site-packages env                         #create virtual environment "env" which inheriting with global site packages
-marie@ml$ source env/bin/activate                                            #activate virtual environment "env". Example output: (env) bash-4.2$
-```
-
-The inscription (env) at the beginning of each line represents that now you are in the virtual
-environment.
-
-Note: However in case of using [sbatch files](link) to send your job you usually don't need a
-virtual environment.
+For more details on machine learning or data science with R see [here](../data_analytics_with_r/#r-console).
 
 ## Machine Learning with Jupyter
 
@@ -96,6 +69,9 @@ your Jupyter notebooks on HPC nodes.
 After accessing JupyterHub, you can start a new session and configure it. For machine learning
 purposes, select either **Alpha** or **ML** partition and the resources, your application requires.
 
+In your session you can use [Python](../data_analytics_with_python/#jupyter-notebooks), [R](../data_analytics_with_r/#r-in-jupyterhub)
+or [RStudio](data_analytics_with_rstudio) for your machine learning and data science tasks.
+
 ## Machine Learning with Containers
 
 Some machine learning tasks require using containers. In the HPC domain, the [Singularity](https://singularity.hpcng.org/)
@@ -140,7 +116,7 @@ different values but 4 should be a pretty good starting point.
 marie@compute$ export NCCL_MIN_NRINGS=4
 ```
 
-### HPC
+### HPC-Related Software
 
 The following HPC related software is installed on all nodes:
 
-- 
GitLab