Skip to content
GitLab
Explore
Sign in
Primary navigation
Search or go to…
Project
hpc-compendium
Manage
Activity
Members
Labels
Plan
Issues
Issue boards
Milestones
Wiki
Code
Merge requests
Repository
Branches
Commits
Tags
Repository graph
Compare revisions
Snippets
Deploy
Releases
Package Registry
Container Registry
Model registry
Operate
Terraform modules
Monitor
Incidents
Service Desk
Analyze
Value stream analytics
Contributor analytics
Repository analytics
Model experiments
Help
Help
Support
GitLab documentation
Compare GitLab plans
Community forum
Contribute to GitLab
Provide feedback
Terms and privacy
Keyboard shortcuts
?
Snippets
Groups
Projects
Show more breadcrumbs
ZIH
hpcsupport
hpc-compendium
Commits
c5a34409
Commit
c5a34409
authored
3 years ago
by
Elias Werner
Browse files
Options
Downloads
Patches
Plain Diff
removed keras.md
parent
8cb2c031
No related branches found
No related tags found
5 merge requests
!333
Draft: update NGC containers
,
!322
Merge preview into main
,
!319
Merge preview into main
,
!279
Draft: Machine Learning restructuring
,
!258
Data Analytics restructuring
Changes
1
Hide whitespace changes
Inline
Side-by-side
Showing
1 changed file
doc.zih.tu-dresden.de/docs/software/keras.md
+0
-237
0 additions, 237 deletions
doc.zih.tu-dresden.de/docs/software/keras.md
with
0 additions
and
237 deletions
doc.zih.tu-dresden.de/docs/software/keras.md
deleted
100644 → 0
+
0
−
237
View file @
8cb2c031
# Keras
This is an introduction on how to run a
Keras machine learning application on the new machine learning partition
of Taurus.
Keras is a high-level neural network API,
written in Python and capable of running on top of
[
TensorFlow
](
https://github.com/tensorflow/tensorflow
)
.
In this page,
[
Keras
](
https://www.tensorflow.org/guide/keras
)
will be
considered as a TensorFlow's high-level API for building and training
deep learning models. Keras includes support for TensorFlow-specific
functionality, such as
[
eager execution
](
https://www.tensorflow.org/guide/keras#eager_execution
)
,
[
tf.data
](
https://www.tensorflow.org/api_docs/python/tf/data
)
pipelines
and
[
estimators
](
https://www.tensorflow.org/guide/estimator
)
.
On the machine learning nodes (machine learning partition), you can use
the tools from
[
IBM Power AI
](
./power_ai.md
)
. PowerAI is an enterprise
software distribution that combines popular open-source deep learning
frameworks, efficient AI development tools (Tensorflow, Caffe, etc).
In machine learning partition (modenv/ml) Keras is available as part of
the Tensorflow library at Taurus and also as a separate module named
"Keras". For using Keras in machine learning partition you have two
options:
-
use Keras as part of the TensorFlow module;
-
use Keras separately and use Tensorflow as an interface between
Keras and GPUs.
**Prerequisites**
: To work with Keras you, first of all, need
[
access
](
../access/ssh_login.md
)
for the Taurus system, loaded
Tensorflow module on ml partition, activated Python virtual environment.
Basic knowledge about Python, SLURM system also required.
**Aim**
of this page is to introduce users on how to start working with
Keras and TensorFlow on the
[
HPC-DA
](
../jobs_and_resources/hpcda.md
)
system - part of the TU Dresden HPC system.
There are three main options on how to work with Keras and Tensorflow on
the HPC-DA: 1. Modules; 2. JupyterNotebook; 3. Containers. One of the
main ways is using the
**TODO LINK MISSING**
(Modules
system)(RuntimeEnvironment#Module_Environments) and Python virtual
environment. Please see the
[
Python page
](
./python.md
)
for the HPC-DA
system.
The information about the Jupyter notebook and the
**JupyterHub**
could
be found
[
here
](
../access/jupyterhub.md
)
. The use of
Containers is described
[
here
](
tensorflow_container_on_hpcda.md
)
.
Keras contains numerous implementations of commonly used neural-network
building blocks such as layers,
[
objectives
](
https://en.wikipedia.org/wiki/Objective_function
)
,
[
activation functions
](
https://en.wikipedia.org/wiki/Activation_function
)
[
optimizers
](
https://en.wikipedia.org/wiki/Mathematical_optimization
)
,
and a host of tools
to make working with image and text data easier. Keras, for example, has
a library for preprocessing the image data.
The core data structure of Keras is a
**model**
, a way to organize layers. The Keras functional API is the way
to go for defining as simple (sequential) as complex models, such as
multi-output models, directed acyclic graphs, or models with shared
layers.
## Getting started with Keras
This example shows how to install and start working with TensorFlow and
Keras (using the module system). To get started, import
[
tf.keras
](
https://www.tensorflow.org/api_docs/python/tf/keras
)
as part of your TensorFlow program setup.
tf.keras is TensorFlow's implementation of the
[
Keras API
specification
](
https://keras.io/
)
. This is a modified example that we
used for the
[
Tensorflow page
](
./tensorflow.md
)
.
```
bash
srun
-p
ml
--gres
=
gpu:1
-n
1
--pty
--mem-per-cpu
=
8000 bash
module load modenv/ml
#example output: The following have been reloaded with a version change: 1) modenv/scs5 => modenv/ml
mkdir
python-virtual-environments
cd
python-virtual-environments
module load TensorFlow
#example output: Module TensorFlow/1.10.0-PythonAnaconda-3.6 and 1 dependency loaded.
which python
python3
-m
venv
--system-site-packages
env
#create virtual environment "env" which inheriting with global site packages
source env
/bin/activate
#example output: (env) bash-4.2$
module load TensorFlow
python
import tensorflow as tf
from tensorflow.keras import layers
print
(
tf.VERSION
)
#example output: 1.10.0
print
(
tf.keras.__version__
)
#example output: 2.1.6-tf
```
As was said the core data structure of Keras is a
**model**
, a way to
organize layers. In Keras, you assemble
*layers*
to build
*models*
. A
model is (usually) a graph of layers. For our example we use the most
common type of model is a stack of layers. The below
[
example
](
https://www.tensorflow.org/guide/keras#model_subclassing
)
of using the advanced model with model
subclassing and custom layers illustrate using TF-Keras API.
```
python
import
tensorflow
as
tf
from
tensorflow.keras
import
layers
import
numpy
as
np
# Numpy arrays to train and evaluate a model
data
=
np
.
random
.
random
((
50000
,
32
))
labels
=
np
.
random
.
random
((
50000
,
10
))
# Create a custom layer by subclassing
class
MyLayer
(
layers
.
Layer
):
def
__init__
(
self
,
output_dim
,
**
kwargs
):
self
.
output_dim
=
output_dim
super
(
MyLayer
,
self
).
__init__
(
**
kwargs
)
# Create the weights of the layer
def
build
(
self
,
input_shape
):
shape
=
tf
.
TensorShape
((
input_shape
[
1
],
self
.
output_dim
))
# Create a trainable weight variable for this layer
self
.
kernel
=
self
.
add_weight
(
name
=
'
kernel
'
,
shape
=
shape
,
initializer
=
'
uniform
'
,
trainable
=
True
)
super
(
MyLayer
,
self
).
build
(
input_shape
)
# Define the forward pass
def
call
(
self
,
inputs
):
return
tf
.
matmul
(
inputs
,
self
.
kernel
)
# Specify how to compute the output shape of the layer given the input shape.
def
compute_output_shape
(
self
,
input_shape
):
shape
=
tf
.
TensorShape
(
input_shape
).
as_list
()
shape
[
-
1
]
=
self
.
output_dim
return
tf
.
TensorShape
(
shape
)
# Serializing the layer
def
get_config
(
self
):
base_config
=
super
(
MyLayer
,
self
).
get_config
()
base_config
[
'
output_dim
'
]
=
self
.
output_dim
return
base_config
@classmethod
def
from_config
(
cls
,
config
):
return
cls
(
**
config
)
# Create a model using your custom layer
model
=
tf
.
keras
.
Sequential
([
MyLayer
(
10
),
layers
.
Activation
(
'
softmax
'
)])
# The compile step specifies the training configuration
model
.
compile
(
optimizer
=
tf
.
compat
.
v1
.
train
.
RMSPropOptimizer
(
0.001
),
loss
=
'
categorical_crossentropy
'
,
metrics
=
[
'
accuracy
'
])
# Trains for 10 epochs(steps).
model
.
fit
(
data
,
labels
,
batch_size
=
32
,
epochs
=
10
)
```
## Running the sbatch script on ML modules (modenv/ml)
Generally, for machine learning purposes ml partition is used but for
some special issues, SCS5 partition can be useful. The following sbatch
script will automatically execute the above Python script on ml
partition. If you have a question about the sbatch script see the
article about
[
SLURM
](
./../jobs_and_resources/binding_and_distribution_of_tasks.md
)
.
Keep in mind that you need to put the executable file (Keras_example) with
python code to the same folder as bash script or specify the path.
```
bash
#!/bin/bash
#SBATCH --mem=4GB # specify the needed memory
#SBATCH -p ml # specify ml partition
#SBATCH --gres=gpu:1 # use 1 GPU per node (i.e. use one GPU per task)
#SBATCH --nodes=1 # request 1 node
#SBATCH --time=00:05:00 # runs for 5 minutes
#SBATCH -c 16 # how many cores per task allocated
#SBATCH -o HLR_Keras_example.out # save output message under HLR_${SLURMJOBID}.out
#SBATCH -e HLR_Keras_example.err # save error messages under HLR_${SLURMJOBID}.err
module load modenv/ml
module load TensorFlow
python Keras_example.py
## when finished writing, submit with: sbatch <script_name>
```
Output results and errors file you can see in the same folder in the
corresponding files after the end of the job. Part of the example
output:
```
......
Epoch 9/10
50000/50000 [==============================] - 2s 37us/sample - loss: 11.5159 - acc: 0.1000
Epoch 10/10
50000/50000 [==============================] - 2s 37us/sample - loss: 11.5159 - acc: 0.1020
```
## Tensorflow 2
[
TensorFlow 2.0
](
https://blog.tensorflow.org/2019/09/tensorflow-20-is-now-available.html
)
is a significant milestone for the
TensorFlow and the community. There are multiple important changes for
users.
Tere are a number of TensorFlow 2 modules for both ml and scs5
partitions in Taurus (2.0 (anaconda), 2.0 (python), 2.1 (python))
(11.04.20). Please check
**TODO MISSING DOC**
(the software modules list)(./SoftwareModulesList.md
for the information about available
modules.
<span
style=
"color:red"
>
**NOTE**
</span>
: Tensorflow 2 of the
current version is loading by default as a Tensorflow module.
TensorFlow 2.0 includes many API changes, such as reordering arguments,
renaming symbols, and changing default values for parameters. Thus in
some cases, it makes code written for the TensorFlow 1 not compatible
with TensorFlow 2. However, If you are using the high-level APIs
**(tf.keras)**
there may be little or no action you need to take to make
your code fully TensorFlow 2.0
[
compatible
](
https://www.tensorflow.org/guide/migrate
)
.
It is still possible to run 1.X code,
unmodified (
[
except for contrib
](
https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
)
), in TensorFlow 2.0:
```
python
import
tensorflow.compat.v1
as
tf
tf
.
disable_v2_behavior
()
#instead of "import tensorflow as tf"
```
To make the transition to TF 2.0 as seamless as possible, the TensorFlow
team has created the
[
tf_upgrade_v2
](
https://www.tensorflow.org/guide/upgrade
)
utility to help transition legacy code to the new API.
## F.A.Q:
This diff is collapsed.
Click to expand it.
Preview
0%
Loading
Try again
or
attach a new file
.
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Save comment
Cancel
Please
register
or
sign in
to comment