diff --git a/doc.zih.tu-dresden.de/docs/software/tensorboard.md b/doc.zih.tu-dresden.de/docs/software/tensorboard.md
index e69de29bb2d1d6434b8b29ae775ad8c2e48c5391..7272c7f2bbaa5da37f9a2e390812b26b51a17e34 100644
--- a/doc.zih.tu-dresden.de/docs/software/tensorboard.md
+++ b/doc.zih.tu-dresden.de/docs/software/tensorboard.md
@@ -0,0 +1,52 @@
+# TensorBoard
+
+TensorBoard is a visualization toolkit for TensorFlow and offers a variety of functionalities such
+as presentation of loss and accuracy, visualization of the model graph or profiling of the
+application.
+On ZIH systems, TensorBoard is only available as an extension of the TensorFlow module. To check
+whether a specific TensorFlow module provides TensorBoard, use the following command:
+
+```console
+marie@compute$ module spider TensorFlow/2.3.1
+```
+
+If TensorBoard is listed in the `Included extensions` section of the output, it is available.
+
+## Using TensorBoard
+
+To use TensorBoard, you have to connect via ssh to taurus as usual, schedule an interactive job and
+load a TensorFlow module:
+
+```console
+marie@login$ srun -p alpha -n 1 -c 1 --pty --mem-per-cpu=8000 bash #Job submission on alpha node
+marie@alpha$ module load TensorFlow/2.3.1
+marie@alpha$ tensorboard --logdir /scratch/gpfs/<YourNetID>/myproj/log --bind_all
+```
+
+Then create a workspace for the event data that should be visualized in TensorBoard. If you already
+have an event data directory, you can skip this step.
+
+```console
+marie@alpha$ ws_allocate -F scratch tensorboard_logdata 1
+```
+
+Now you can run your TensorFlow application. Note that you might have to adapt your code to make it
+accessible for TensorBoard. Please find further information on the official [TensorBoard website](https://www.tensorflow.org/tensorboard/get_started).
+Then you can start TensorBoard and pass the directory of the event data:
+
+```console
+marie@alpha$ tensorboard --logdir /scratch/ws/1/marie-tensorboard_logdata --bind_all
+```
+
+TensorBoard will then return a server address on taurus, e.g. `taurusi8034.taurus.hrsk.tu-dresden.de:6006`.
+
+To access TensorBoard, you have to set up port forwarding via ssh to your local
+machine:
+
+```console
+marie@local$ ssh -N -f -L 6006:taurusi8034.taurus.hrsk.tu-dresden.de:6006 <zih-login>@taurus.hrsk.tu-dresden.de
+```
+
+Now you can see TensorBoard in your browser at `http://localhost:6006/`.
+
+Note that you can also use TensorBoard in an [sbatch file](../jobs_and_resources/batch_systems.md).
diff --git a/doc.zih.tu-dresden.de/docs/software/tensorflow.md b/doc.zih.tu-dresden.de/docs/software/tensorflow.md
index c4101a5693d1b3a6a631f3d35439502f055c280e..f8a815c8be3d8cb4aed02e4f6ea1bb75ceb3fd80 100644
--- a/doc.zih.tu-dresden.de/docs/software/tensorflow.md
+++ b/doc.zih.tu-dresden.de/docs/software/tensorflow.md
@@ -8,7 +8,7 @@ resources. Please check the software modules list via
 
 ```console
-marie@login$ module spider TensorFlow
+marie@compute$ module spider TensorFlow
 ```
 
 to find out, which TensorFlow modules are available on your partition.
@@ -26,7 +26,7 @@ On the **Alpha** partition load the module environment:
 
 ```console
 marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=8000 bash #Job submission on alpha nodes with 1 gpu on 1 node with 8000 Mb per CPU
-marie@romeo$ module load modenv/scs5
+marie@alpha$ module load modenv/scs5
 ```
 
 On the **ML** partition load the module environment: