Commit caefd310 authored by Elias Werner

Merge branch 'ML_neustrukturierung' into DA_neustrukturierung

parents 902ff715 b57d1d2e
@@ -6,15 +6,15 @@ For machine learning purposes, we recommend to use the **Alpha** and/or **ML** p

## ML partition

The compute nodes of the ML partition are built on the base of the
[Power9](https://www.ibm.com/it-infrastructure/power/power9) architecture from IBM. The system was
created for AI challenges, analytics and working with data-intensive workloads and accelerated
databases.

The main feature of the nodes is the ability to work with the
[NVIDIA Tesla V100](https://www.nvidia.com/en-gb/data-center/tesla-v100/) GPU with **NV-Link**
support that allows a total bandwidth of up to 300 gigabytes per second (GB/s). Each node on the
ML partition has 6x Tesla V100 GPUs. You can find a detailed specification of the partition
[here](../jobs_and_resources/power9.md).

**Note:** The ML partition is based on the Power9 architecture, which means that software built
for x86_64 will not work on this partition. Also, users need to use the modules which are
specially made for the ML partition (from `modenv/ml`).
@@ -29,7 +29,9 @@ marie@ml$ module load modenv/ml #example output: The following have been relo

## Alpha partition

Another partition for machine learning tasks is Alpha. It is mainly dedicated to
[ScaDS.AI](https://scads.ai/) topics. Each node on Alpha has 2x AMD EPYC CPUs, 8x NVIDIA A100-SXM4
GPUs, 1 TB RAM and 3.5 TB local space (`/tmp`) on an NVMe device. You can find more details of the
partition [here](../jobs_and_resources/alpha_centauri.md).

### Modules
@@ -40,51 +42,24 @@ marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=8000 bash
marie@alpha$ module load modenv/scs5
```
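
Afterwards you can check which TensorFlow (or other machine learning) modules the partition offers, e.g.:

```console
marie@alpha$ module spider TensorFlow
```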
## Machine Learning via Console

### Python and Virtual Environments

Python users should use a [virtual environment](python_virtual_environments.md) when conducting
machine learning tasks via console. In case of using
[sbatch files](../jobs_and_resources/batch_systems.md) to send your job, you usually don't need a
virtual environment.

For more details on machine learning or data science with Python see
[here](data_analytics_with_python.md).
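
A minimal sketch of creating and activating such an environment in an interactive job on the ML
partition (folder name and resource values are just examples):

```console
marie@login$ srun -p ml -N 1 -n 1 -c 2 --gres=gpu:1 --time=01:00:00 --pty --mem-per-cpu=8000 bash   #interactive job with 1 GPU
marie@ml$ module load modenv/ml                        #switch to the module environment of the ML partition
marie@ml$ mkdir python-virtual-environments            #create folder for your environments
marie@ml$ cd python-virtual-environments               #go to folder
marie@ml$ python3 -m venv --system-site-packages env   #create virtual environment "env" inheriting the global site packages
marie@ml$ source env/bin/activate                      #activate virtual environment "env". Example output: (env) bash-4.2$
```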
### R

R also supports machine learning via console. It does not require a virtual environment due to a
different package management.

For more details on machine learning or data science with R see
[here](../data_analytics_with_r/#r-console).
## Machine Learning with Jupyter

@@ -96,6 +71,9 @@ your Jupyter notebooks on HPC nodes.

After accessing JupyterHub, you can start a new session and configure it. For machine learning
purposes, select either the **Alpha** or the **ML** partition and the resources your application
requires.

In your session you can use [Python](../data_analytics_with_python/#jupyter-notebooks),
[R](../data_analytics_with_r/#r-in-jupyterhub) or [RStudio](data_analytics_with_rstudio) for your
machine learning and data science topics.

## Machine Learning with Containers

Some machine learning tasks require using containers. In the HPC domain, the
[Singularity](https://singularity.hpcng.org/)
@@ -139,7 +117,7 @@ different values but 4 should be a pretty good starting point.
marie@compute$ export NCCL_MIN_NRINGS=4
```

### HPC related Software

The following HPC related software is installed on all nodes:
# TensorBoard
TensorBoard is a visualization toolkit for TensorFlow and offers a variety of functionalities such
as presentation of loss and accuracy, visualization of the model graph or profiling of the
application.
On ZIH systems, TensorBoard is only available as an extension of the TensorFlow module. To check
whether a specific TensorFlow module provides TensorBoard, use the following command:
```console
marie@compute$ module spider TensorFlow/2.3.1
```
If TensorBoard occurs in the `Included extensions` section of the output, TensorBoard is available.
## Using TensorBoard
To use TensorBoard, you have to connect to Taurus via SSH as usual, schedule an interactive job
and load a TensorFlow module:

```console
marie@login$ srun -p alpha -n 1 -c 1 --pty --mem-per-cpu=8000 bash   #Job submission on alpha node
marie@alpha$ module load TensorFlow/2.3.1
```
Then, create a workspace for the event data that should be visualized in TensorBoard. If you
already have an event data directory, you can skip this step.
```console
marie@alpha$ ws_allocate -F scratch tensorboard_logdata 1
```
Now you can run your TensorFlow application. Note that you might have to adapt your code to make
it accessible for TensorBoard. Please find further information on the official
[TensorBoard website](https://www.tensorflow.org/tensorboard/get_started).
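
Adapting the code usually means writing event data via a callback or summary writer. The
following Python snippet is only a minimal sketch (it is not part of the official instructions,
and it downloads MNIST, so it needs internet access); it uses the standard
`tf.keras.callbacks.TensorBoard` callback to write logs into the workspace created above:

```python
import tensorflow as tf

# Workspace directory created above; adapt to your own workspace path.
log_dir = "/scratch/ws/1/marie-tensorboard_logdata"

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0   # scale pixel values to [0, 1]

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# The TensorBoard callback writes loss/accuracy and the model graph to log_dir.
model.fit(x_train, y_train, epochs=1,
          callbacks=[tf.keras.callbacks.TensorBoard(log_dir=log_dir)])
```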
Then you can start TensorBoard and pass the directory of the event data:
```console
marie@alpha$ tensorboard --logdir /scratch/ws/1/marie-tensorboard_logdata --bind_all
```
TensorBoard will then return a server address on Taurus, e.g.
`taurusi8034.taurus.hrsk.tu-dresden.de:6006`.

To access TensorBoard, you have to set up port forwarding via SSH to your local machine:
```console
marie@local$ ssh -N -f -L 6006:taurusi8034.taurus.hrsk.tu-dresden.de:6006 <zih-login>@taurus.hrsk.tu-dresden.de
```
Now you can see TensorBoard in your browser at `http://localhost:6006/`.

Note that you can also use TensorBoard in an [sbatch file](../jobs_and_resources/batch_systems.md).
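
A minimal sketch of such a job file (partition, resource values and the training script name
`my_training_script.py` are assumptions to adapt to your use case):

```bash
#!/bin/bash
#SBATCH --partition=alpha
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=8000
#SBATCH --time=01:00:00

module load TensorFlow/2.3.1

# Run the training; TensorBoard can later visualize the event data from this directory.
python my_training_script.py --logdir /scratch/ws/1/marie-tensorboard_logdata
```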
@@ -8,7 +8,7 @@ resources.

Please check the software modules list via

```console
marie@compute$ module spider TensorFlow
```

to find out which TensorFlow modules are available on your partition.
@@ -26,7 +26,7 @@ On the **Alpha** partition load the module environment:

```console
marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=8000 bash   #Job submission on alpha nodes with 1 GPU on 1 node with 8000 MB per CPU
marie@alpha$ module load modenv/scs5
```

On the **ML** partition load the module environment:
@@ -50,27 +50,34 @@ marie@ml$ tensorflow-test
Basic test of tensorflow - A Hello World!!!...
```

??? example
    The following example shows how to create a Python virtual environment and import TensorFlow.

    ```console
    marie@ml$ mkdir python-environments        #create folder
    marie@ml$ which python                     #check which python you are using
    /sw/installed/Python/3.7.4-GCCcore-8.3.0/bin/python
    marie@ml$ virtualenv --system-site-packages python-environments/env   #create virtual environment "env" inheriting the global site packages
    [...]
    marie@ml$ source python-environments/env/bin/activate   #activate virtual environment "env". Example output: (env) bash-4.2$
    marie@ml$ python -c "import tensorflow as tf; print(tf.__version__)"
    ```
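
To verify that TensorFlow actually sees the allocated GPU, a quick check is possible (a sketch;
`tf.config.list_physical_devices` is available in TensorFlow 2.x):

```console
marie@ml$ python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```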
## TensorFlow in JupyterHub

In addition to using interactive and batch jobs, it is possible to work with TensorFlow using
JupyterHub. The production and test environments of JupyterHub contain Python and R kernels that
both come with TensorFlow support. However, you can specify the TensorFlow version when spawning
the notebook by pre-loading a specific TensorFlow module:

![TensorFlow module in JupyterHub](misc/tensorflow_jupyter_module.png)
{: align="center"}

??? hint
    You can also define your own Jupyter kernel for more specific tasks. Please read the
    documentation about JupyterHub, Jupyter kernels and virtual environments
    [here](../../access/jupyterhub/#creating-and-using-your-own-environment).
## TensorFlow in Containers

Another option for using TensorFlow is containers. In the HPC domain, the
# Container on HPC-DA (TensorFlow, PyTorch)
<span class="twiki-macro RED"></span> **Note: This page is under
construction** <span class="twiki-macro ENDCOLOR"></span>
\<span style="font-size: 1em;">A container is a standard unit of
software that packages up code and all its dependencies so the
application runs quickly and reliably from one computing environment to
another.\</span>
**Prerequisites:** To work with TensorFlow, you need access to the Taurus system and basic
knowledge about containers and Linux systems.

**Aim** of this page is to introduce users on how to use machine learning frameworks such as
TensorFlow or PyTorch on the [HPC-DA](../jobs_and_resources/hpcda.md) system - part of the
TU Dresden HPC system.
Using a container is one of the options to run machine learning workflows on Taurus. Containers
give you more flexibility in working with modules and software, but at the same time require more
effort.

On Taurus, [Singularity](https://sylabs.io/) is used as the standard container solution.
Singularity enables users to have full control of their environment. Singularity containers can
be used to package entire scientific workflows, software and libraries, and even data. This means
that **you don't have to ask HPC support to install anything for you - you can put it in a
Singularity container and run!** As opposed to Docker (the most famous container solution),
Singularity is much more suited to being used in an HPC environment and more efficient in many
cases. Docker containers can also easily be used in Singularity.
The following information is relevant for the HPC-DA system (ML partition) based on the Power9
architecture.
In some cases, using Singularity requires a Linux machine with root privileges, the same
architecture and a compatible kernel. For many reasons, users on Taurus cannot be granted root
permissions. A solution is a virtual machine (VM) on the ML partition, which allows users to gain
root permissions in an isolated environment. There are two main options on how to work with VMs
on Taurus:

1. [VM tools](vm_tools.md): automated tools for using virtual machines;
2. [Manual method](virtual_machines.md): requires more operations but gives you more flexibility
   and reliability.
Short algorithm to run the virtual machine manually:

```console
srun -p ml -N 1 -c 4 --hint=nomultithread --cloud=kvm --pty /bin/bash
cat ~/.cloud_$SLURM_JOB_ID    #Example output: ssh root@192.168.0.1
ssh root@192.168.0.1          #Copy and paste the output from the previous command
./mount_host_data.sh
```
With VM tools: VM tools contain two main programs, `buildSingularityImage` and `startInVM`.
Main options on how to create a container on ML nodes:

1. Create a container from a definition
    1. Create a Singularity definition from a Dockerfile.
2. Import a container from [DockerHub](https://hub.docker.com/search?q=ppc64le&type=image&page=1)
   or [SingularityHub](https://singularity-hub.org/)
Two main sources for TensorFlow containers for the Power9 architecture:

- <https://hub.docker.com/r/ibmcom/tensorflow-ppc64le>
- <https://hub.docker.com/r/ibmcom/powerai>

PyTorch:

- <https://hub.docker.com/r/ibmcom/powerai>
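
For instance, importing one of these images with Singularity might look as follows (a sketch; the
`:latest` tag is an assumption, check Docker Hub for the available tags):

```console
marie@ml$ singularity pull docker://ibmcom/tensorflow-ppc64le:latest
marie@ml$ singularity exec tensorflow-ppc64le_latest.sif python -c "import tensorflow as tf; print(tf.__version__)"
```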
# TensorFlow on Jupyter Notebook

**Note: This page is under construction.**
Disclaimer: This page addresses a specific question. For more general questions please check the
[JupyterHub](../access/jupyterhub.md) page.
The Jupyter Notebook is an open-source web application that allows you to create documents that
contain live code, equations, visualizations, and narrative text. Jupyter Notebook allows working
with TensorFlow on Taurus with a GUI (graphical user interface) and the opportunity to see
intermediate results of your work step by step. This can be useful for users who don't have much
experience with HPC or Linux.
**Prerequisites:** To work with TensorFlow and Jupyter Notebook you need access to the Taurus
system and basic knowledge about Python, the Slurm system and Jupyter Notebook.

**This page aims** to introduce users on how to start working with TensorFlow on the
[HPCDA](../jobs_and_resources/hpcda.md) system - part of the TU Dresden HPC system - with a
graphical interface.
## Get started with Jupyter Notebook

Jupyter notebooks are a great way for interactive computing in your web browser. Jupyter allows
working with data cleaning and transformation, numerical simulation, statistical modelling, data
visualization and, of course, machine learning.
\<span style="font-size: 1em;">There are two general options on how to
work Jupyter notebooks using HPC. \</span>
- \<span style="font-size: 1em;">There is \</span>**\<a
href="JupyterHub" target="\_self">jupyterhub\</a>** on Taurus, where
you can simply run your Jupyter notebook on HPC nodes. JupyterHub is
available [here](https://taurus.hrsk.tu-dresden.de/jupyter)
- For more specific cases you can run a manually created **remote
jupyter server.** \<span style="font-size: 1em;"> You can find the
manual server setup [here](deep_learning.md).
\<span style="font-size: 13px;">Keep in mind that with Jupyterhub you
can't work with some special instruments. However general data analytics
tools are available. Still and all, the simplest option for beginners is
using JupyterHub.\</span>
## Virtual environment

For working with TensorFlow and Python packages, using virtual environments (kernels) is
necessary. Interactive code interpreters that are used by Jupyter notebooks are called kernels.
Creating and using your own kernel (environment) has the benefit that you can install your
preferred Python packages and use them in your notebooks.

A virtual environment is a cooperatively isolated runtime environment that allows Python users
and applications to install and upgrade Python distribution packages without interfering with the
behaviour of other Python applications running on the same system. So the
[virtual environment](https://docs.python.org/3/glossary.html#term-virtual-environment) is a
self-contained directory tree that contains a Python installation for a particular version of
Python, plus several additional packages. At its core, the main purpose of Python virtual
environments is to create an isolated environment for Python projects. A Python virtual
environment is the main method to work with deep learning software such as TensorFlow on the
[HPCDA](../jobs_and_resources/hpcda.md) system.
### Conda and Virtualenv

There are two methods of how to work with virtual environments on Taurus. **Virtualenv (venv)**
is a standard Python tool to create isolated Python environments. We recommend using venv to work
with TensorFlow and PyTorch on Taurus. It has been integrated into the standard library under the
[venv module](https://docs.python.org/3/library/venv.html).

However, if you have reasons (previously created environments etc.) you can easily use conda.
Conda is the second way to use a virtual environment on Taurus.
[Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)
is an open-source package management system and environment management system from Anaconda.

**Note:** Keep in mind that you **cannot** use conda for working with virtual environments
previously created with the virtualenv tool and vice versa!
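
For completeness, a minimal sketch of the conda route (assuming a module providing conda, e.g. an
Anaconda-based Python module, is loaded; the environment name is just an example):

```console
marie@ml$ conda create --name conda-env python=3.6   #create conda environment "conda-env"
marie@ml$ conda activate conda-env                   #activate it; the prompt changes to (conda-env)
```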
This example shows how to start working with a venv environment and prepare the environment
(kernel) for working with a Jupyter server:

```console
marie@login$ srun -p ml --gres=gpu:1 -n 1 --pty --mem-per-cpu=8000 bash   #Job submission on ml nodes with 1 GPU on 1 node with 8000 MB
marie@ml$ module load modenv/ml               #example output: The following have been reloaded with a version change: 1) modenv/scs5 => modenv/ml
marie@ml$ mkdir python-virtual-environments   #create folder for your environments
marie@ml$ cd python-virtual-environments      #go to folder
marie@ml$ module load TensorFlow              #load TensorFlow module. Example output: Module TensorFlow/1.10.0-PythonAnaconda-3.6 and 1 dependency loaded.
marie@ml$ which python                        #check which python you are using
marie@ml$ python3 -m venv --system-site-packages env   #create virtual environment "env" inheriting the global site packages
marie@ml$ source env/bin/activate             #activate virtual environment "env". Example output: (env) bash-4.2$
marie@ml$ module load TensorFlow              #load TensorFlow module in the virtual environment
```
The prefix `(env)` at the beginning of each line shows that you are now in the virtual
environment.
Now you can check that the current environment works:

```console
python                   #start python
import tensorflow as tf
print(tf.VERSION)        #example output: 1.14.0
```
### Install Ipykernel
The IPython kernel (ipykernel) is the Python execution backend for Jupyter, which lets you work
with Python code in Jupyter notebooks. The Jupyter Notebook automatically ensures that the
IPython kernel is available.
```
(env) bash-4.2$ pip install ipykernel #example output: Collecting ipykernel
...
#example output: Successfully installed ... ipykernel-5.1.0 ipython-7.5.0 ...
(env) bash-4.2$ python -m ipykernel install --user --name env --display-name="env"
#example output: Installed kernelspec env in .../.local/share/jupyter/kernels/env
[install now additional packages for your notebooks]
```
Deactivate the virtual environment:

```console
(env) bash-4.2$ deactivate
```
So now you have a virtual environment with the TensorFlow module included. You can use this
workflow for your purposes, particularly for simply running your Jupyter notebook with TensorFlow
code.
## Examples and running the model

Below are brief explanations and examples of Jupyter notebooks with TensorFlow models which you
can run on the ML nodes of HPC-DA. The prepared examples of TensorFlow models give you an
understanding of how to work with JupyterHub and TensorFlow models. It can be useful and
instructive to start your acquaintance with TensorFlow and the HPC-DA system with these simple
examples.

You can use a [remote Jupyter server](../access/jupyterhub.md). For simplicity, we recommend
using JupyterHub for our examples. JupyterHub is available
[here](https://taurus.hrsk.tu-dresden.de/jupyter). Please check updates and details on the
[JupyterHub](../access/jupyterhub.md) page. However, the general pipeline can be briefly
explained as follows.
After logging in, you can start a new session and configure it. There are simple and advanced
forms to set up your session. In the simple form, you have to choose the "IBM Power (ppc64le)"
architecture. You can select the required number of CPUs and GPUs. To get acquainted with the
system through the examples below, the recommended number of CPUs and 1 GPU will be enough. With
the advanced form, you can use the configuration with 1 GPU and 7 CPUs. To access all your
workspaces use "/" in the workspace scope.
You need to download the file with a Jupyter notebook that already contains all you need to start
working. Please put the file into your previously created virtual environment in your working
directory or use the kernel for your notebook.

Note: You could work with simple examples in your home directory, but according to the
[new storage concept](../data_lifecycle/hpc_storage_concept2019.md) please use
[workspaces](../data_lifecycle/workspaces.md) for your study and work projects. For this reason,
you have to use advanced options and put "/" in the "Workspace scope" field.
To download the first example (from the list below) into your previously created virtual
environment you could use the following commands:

```console
ws_list
cd <name_of_your_workspace>    #go to workspace
wget https://doc.zih.tu-dresden.de/hpc-wiki/pub/Compendium/TensorFlowOnJupyterNotebook/Mnistmodel.zip
unzip Mnistmodel.zip
```
Also, you could use kernels for all notebooks, not only for those placed in your virtual
environment. See the [JupyterHub](../access/jupyterhub.md) page.
### Examples

1. Simple MNIST model. The MNIST database is a large database of handwritten digits that is
   commonly used for [training](https://en.wikipedia.org/wiki/Training_set) various image
   processing systems. This model illustrates using the TF-Keras API.
   [Keras](https://www.tensorflow.org/guide/keras) is TensorFlow's high-level API. TensorFlow and
   Keras allow us to import and download the MNIST dataset directly from their API. Recommended
   parameters for running this model are 1 GPU and 7 cores (28 threads).
[doc.zih.tu-dresden.de/hpc-wiki/pub/Compendium/TensorFlowOnJupyterNotebook/Mnistmodel.zip]**todo**(Mnistmodel.zip)
### Running the model

Documents are organized with tabs and a very versatile split-screen feature. On the left side of
the screen, you can open your file. Use 'File - Open from Path' to go to your workspace (e.g.
`/scratch/ws/<username-name_of_your_ws>`). You can run each cell separately step by step and
analyze the result of each step. The default command for running one cell is 'Shift+Enter'. You
can also run all cells with the command 'Run All Cells', as presented in the picture below.
**todo** Screenshot_from_2019-09-03_15-20-16.png
#### Additional advanced models

1. A simple regression model uses the
   [Automobile dataset](https://archive.ics.uci.edu/ml/datasets/Automobile). In a regression
   problem, we aim to predict the output of a continuous value; in this case, we try to predict
   fuel efficiency. This is a simple model created to present how to work with a Jupyter notebook
   for TensorFlow models. Recommended parameters for running this model are 1 GPU and 7 cores
   (28 threads).
[doc.zih.tu-dresden.de/hpc-wiki/pub/Compendium/TensorFlowOnJupyterNotebook/Example_TensorFlow_Automobileset.zip]**todo**(Example_TensorFlow_Automobileset.zip)
2. The regression model uses the
   [dataset](https://archive.ics.uci.edu/ml/datasets/Beijing+PM2.5+Data) with meteorological data
   from the Beijing airport and the US embassy. The dataset contains almost 50 thousand instances
   and therefore needs more computational effort. Recommended parameters for running this model
   are 1 GPU and 7 cores (28 threads).
[doc.zih.tu-dresden.de/hpc-wiki/pub/Compendium/TensorFlowOnJupyterNotebook/Example_TensorFlow_Meteo_airport.zip]**todo**(Example_TensorFlow_Meteo_airport.zip)
**Note**: All examples were created for study purposes only. The main aim is to introduce users
of the HPC-DA system of TU Dresden to TensorFlow and Jupyter notebooks. The examples do not claim
completeness or scientific significance. Feel free to improve the models and use them for your
study.
- [Mnistmodel.zip]**todo**(Mnistmodel.zip)
- [Example_TensorFlow_Automobileset.zip]**todo**(Example_TensorFlow_Automobileset.zip)
- [Example_TensorFlow_Meteo_airport.zip]**todo**(Example_TensorFlow_Meteo_airport.zip)
- [Example_TensorFlow_3D_road_network.zip]**todo**(Example_TensorFlow_3D_road_network.zip)