Commit d779ca8f, authored 3 years ago by Elias Werner
add first draft for tensorboard
fixes in tensorflow
Parent: 1fb4fc87
5 merge requests: !333 Draft: update NGC containers, !322 Merge preview into main, !319 Merge preview into main, !279 Draft: Machine Learning restructuring, !258 Data Analytics restructuring
Showing 2 changed files with 54 additions and 2 deletions:
doc.zih.tu-dresden.de/docs/software/tensorboard.md (+52 −0)
doc.zih.tu-dresden.de/docs/software/tensorflow.md (+2 −2)
doc.zih.tu-dresden.de/docs/software/tensorboard.md (+52 −0) @ d779ca8f
# TensorBoard
TensorBoard is a visualization toolkit for TensorFlow and offers a variety of functionalities such
as presentation of loss and accuracy, visualization of the model graph or profiling of the
application.
On ZIH systems, TensorBoard is only available as an extension of the TensorFlow module. To check
whether a specific TensorFlow module provides TensorBoard, use the following command:
```console
marie@compute$ module spider TensorFlow/2.3.1
```
If TensorBoard occurs in the `Included extensions` section of the output, TensorBoard is available.
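The check above can also be scripted. The following is a minimal sketch, not part of the commit: `has_tensorboard` is a hypothetical helper name, and the match is deliberately loose (any case-insensitive occurrence of `tensorboard`), so it does not verify that the hit sits in the `Included extensions` section. It also assumes that the module system (Lmod) writes the `module spider` output to stderr, which is why the suggested call redirects it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the documentation): succeeds if the text
# read from stdin mentions TensorBoard, ignoring case.
has_tensorboard() {
    grep -qi 'tensorboard'
}

# Intended use on a ZIH node (assumption: module output goes to stderr):
#   module spider TensorFlow/2.3.1 2>&1 | has_tensorboard && echo "TensorBoard available"
```

The function only inspects text, so it can be tried out with any saved `module spider` output before relying on it.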
## Using TensorBoard
To use TensorBoard, you have to connect via ssh to taurus as usual, schedule an interactive job and
load a TensorFlow module:
```console
marie@login$ srun -p alpha -n 1 -c 1 --pty --mem-per-cpu=8000 bash   #Job submission on alpha node
marie@alpha$ module load TensorFlow/2.3.1
marie@alpha$ tensorboard --logdir /scratch/gpfs/<YourNetID>/myproj/log --bind_all
```
Then create a workspace for the event data that should be visualized in TensorBoard. If you already
have an event data directory, you can skip this step.
```console
marie@alpha$ ws_allocate -F scratch tensorboard_logdata 1
```
Now you can run your TensorFlow application. Note that you might have to adapt your code so that it
writes event data that TensorBoard can read. Please find further information on the official
[TensorBoard website](https://www.tensorflow.org/tensorboard/get_started).
Then you can start TensorBoard and pass the directory of the event data:
```console
marie@alpha$ tensorboard --logdir /scratch/ws/1/marie-tensorboard_logdata --bind_all
```
TensorBoard will then return a server address on taurus, e.g.
`taurusi8034.taurus.hrsk.tu-dresden.de:6006`.
To access TensorBoard, set up port forwarding via ssh to your local machine:
```console
marie@local$ ssh -N -f -L 6006:taurusi8034.taurus.hrsk.tu-dresden.de:6006 <zih-login>@taurus.hrsk.tu-dresden.de
```
Now you can open TensorBoard in your browser at `http://localhost:6006/`.
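Because the compute node name changes from job to job, the port-forwarding command has to be rebuilt each time. The sketch below shows one way to derive it from the address TensorBoard reports. `forward_cmd` is a hypothetical helper, not something provided on ZIH systems, and it assumes the local port should match the remote one.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the ssh command that forwards a local port
# to the address reported by TensorBoard.
#   $1: address as printed by TensorBoard (host:port)
#   $2: ZIH login name
forward_cmd() {
    local address="$1" login="$2"
    local port="${address##*:}"   # everything after the last colon
    printf 'ssh -N -f -L %s:%s %s@taurus.hrsk.tu-dresden.de\n' \
        "$port" "$address" "$login"
}

# Example with the address from the text and the example user marie:
forward_cmd taurusi8034.taurus.hrsk.tu-dresden.de:6006 marie
```

The function only prints the command, so it can be checked before it is run on the local machine.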
Note that you can also use TensorBoard in an
[sbatch file](../jobs_and_resources/batch_systems.md).
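As an illustration of the sbatch variant, the following sketch writes a batch script that mirrors the interactive job above. The file name `tensorboard_job.sh`, the one-hour time limit, and the `echo` line that reports the node name are assumptions made for this example. The resource values and the log directory are copied from the interactive commands.

```bash
#!/usr/bin/env bash
# Write a hypothetical batch script; resources mirror the interactive job.
cat > tensorboard_job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=alpha
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=8000
#SBATCH --time=01:00:00        # assumed time limit, adjust as needed

module load TensorFlow/2.3.1

# Report the node name so the ssh port forwarding can target it.
echo "TensorBoard runs on $(hostname -f), port 6006"
tensorboard --logdir /scratch/ws/1/marie-tensorboard_logdata --bind_all
EOF

# Submit on a login node with: sbatch tensorboard_job.sh
```

Since the node is not known before the job starts, the node name has to be read from the job's output file before setting up the port forwarding.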
doc.zih.tu-dresden.de/docs/software/tensorflow.md (+2 −2) @ d779ca8f
@@ -8,7 +8,7 @@ resources.
 Please check the software modules list via
 
 ```console
-marie@login$ module spider TensorFlow
+marie@compute$ module spider TensorFlow
 ```
 
 to find out, which TensorFlow modules are available on your partition.
@@ -26,7 +26,7 @@ On the **Alpha** partition load the module environment:
 
 ```console
 marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=8000 bash   #Job submission on alpha nodes with 1 gpu on 1 node with 8000 Mb per CPU
-marie@romeo$ module load modenv/scs5
+marie@alpha$ module load modenv/scs5
 ```
 
 On the **ML** partition load the module environment: