diff --git a/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md b/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md
index 588a129968bf63612da5e6c295ee918b489719f7..3e2ad38e1bfcb71e7420aeb7a1a2e56922ae3c7e 100644
--- a/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md
+++ b/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md
@@ -62,9 +62,9 @@ There are the following script preparation steps for OmniOpt:
 # coding: utf-8
 #
 # Example for using OmniOpt
- #
+ #
 # source code taken from: https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html
- # parameters under consideration:#
+ # parameters under consideration:#
 # 1. batch size
 # 2. epochs
 # 3. size output layer 1
@@ -181,7 +181,7 @@ There are the following script preparation steps for OmniOpt:
 correct /= size
 print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
-
+
 #print statement esp. for OmniOpt (single line!!)
 print(f"RESULT: {test_loss:>8f} \n")
@@ -195,23 +195,24 @@ There are the following script preparation steps for OmniOpt:
 1. Testing script functionality and determine software requirements for the chosen [partition](../jobs_and_resources/system_taurus.md#partitions). In the following the alpha partition is used. Please note the parameters `--out-layer1`, `--batchsize`, `--epochs` when calling the python script.
- Additionally, note the `RESULT` string with the output for OmniOpt.
-
- ??? hint "Hint for installing Python modules"
- Note that for this example the module `torchvision` is not available on the alpha partition and it is installed by creating a [virtual environment](python_virtual_environments.md).
- It is recommended to install such a virtual environment into a [workspace](../data_lifecycle/workspaces.md).
- ``` console
- marie@login$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0
- marie@login$ mkdir </path/to/workspace/python-environments> #create folder
- marie@login$ virtualenv --system-site-packages </path/to/workspace/python-environments/torchvision_env>
- marie@login$ source </path/to/workspace/python-environments/torchvision_env>/bin/activate #activate virtual environment
- marie@login$ pip install torchvision #install torchvision module
- ```
-
- ```console
+Additionally, note the `RESULT` string with the output for OmniOpt.
+
+??? hint "Hint for installing Python modules"
+Note that for this example the module `torchvision` is not available on the alpha partition and it is installed by creating a [virtual environment](python_virtual_environments.md).
+It is recommended to install such a virtual environment into a [workspace](../data_lifecycle/workspaces.md).
+
+ ``` console
+ marie@login$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0
+ marie@login$ mkdir </path/to/workspace/python-environments> #create folder
+ marie@login$ virtualenv --system-site-packages </path/to/workspace/python-environments/torchvision_env>
+ marie@login$ source </path/to/workspace/python-environments/torchvision_env>/bin/activate #activate virtual environment
+ marie@login$ pip install torchvision #install torchvision module
+ ```
+
+ ``` console
 marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash #Job submission on alpha nodes with 1 gpu on 1 node with 800 Mb per CPU
 marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0
- marie@alpha$ source </path/to/workspace/python-environments/torchvision_env>/bin/activate #activate virtual environment
+marie@alpha$ source </path/to/workspace/python-environments/torchvision_env>/bin/activate #activate virtual environment
 Die folgenden Module wurden in einer anderen Version erneut geladen:
 1) modenv/scs5 => modenv/hiera
@@ 
-226,70 +227,122 @@ There are the following script preparation steps for OmniOpt:
 loss: 0.572221 [30000/60000]
 loss: 1.516888 [40000/60000]
 loss: 0.445737 [50000/60000]
- Test Error:
- Accuracy: 69.5%, Avg loss: 0.878329
+ Test Error:
+ Accuracy: 69.5%, Avg loss: 0.878329
- RESULT: 0.878329
+ RESULT: 0.878329
 Done!
 ```
- Using the modified script within OmniOpt requires configuring and loading of the software environment.
- The recommended way is to wrap the necessary calls in a shell script.
+Using the modified script within OmniOpt requires configuring and loading of the software environment.
+The recommended way is to wrap the necessary calls in a shell script.
- ??? example "Example for wrapping with shell script"
- ``` shell
- #!/bin/bash -l
- # ^ Shebang-Line, so that it is known that this is a bash file
- # -l means 'load this as login shell', so that /etc/profile gets loaded and you can use 'module load' or 'ml' as usual
+??? example "Example for wrapping with shell script"
+ ``` shell
+ #!/bin/bash -l
+ # ^ Shebang-Line, so that it is known that this is a bash file
+ # -l means 'load this as login shell', so that /etc/profile gets loaded and you can use 'module load' or 'ml' as usual
- # If you use this script not via `./run.sh' or just `srun run.sh', but like `srun bash run.sh', please add the '-l' there too.
- # Like this:
- # srun bash -l run.sh
+ # If you use this script not via `./run.sh' or just `srun run.sh', but like `srun bash run.sh', please add the '-l' there too.
+ # Like this:
+ # srun bash -l run.sh
- # Load modules your program needs, always specify versions!
- module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.7.1
- source </path/to/workspace/python-environments/torchvision_env>/bin/activate #activate virtual environment
+ # Load modules your program needs, always specify versions!
+ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.7.1
+ source </path/to/workspace/python-environments/torchvision_env>/bin/activate #activate virtual environment
- # Load your script. $@ is all the parameters that are given to this shell file.
- python </path/to/your/script/mnistFashion.py> $@
- ```
+ # Load your script. "$@" is all the parameters that are given to this shell file.
+ python </path/to/your/script/mnistFashion.py> "$@"
+ ```
- When the wrapped shell script is running properly the preparations are finished and the next step is configuring OmniOpt.
+When the wrapped shell script runs properly, the preparations are finished and the next step is configuring OmniOpt.
 ### Configure and Run OmniOpt
 Configuring OmniOpt is done via the GUI at [https://imageseg.scads.ai/omnioptgui/](https://imageseg.scads.ai/omnioptgui/){:target="_blank"}..
-This GUI guides through the configuration process and as result a configuration file is created automatically according to the GUI input.
+This GUI guides you through the configuration process; as a result, a configuration file is created automatically according to the GUI input.
 If you are more familiar with using OmniOpt later on, this configuration file can be modified directly without using the GUI. A screenshot of the GUI, including a properly configuration for the MNIST fashion example is shown below.
-The GUI, in which the displayed values are already entered, can be reached [here](https://imageseg.scads.ai/omnioptgui/?maxevalserror=5&mem_per_worker=1000&number_of_parameters=3&param_0_values=10%2C50%2C100&param_1_values=8%2C16%2C32&param_2_values=10%2C15%2C30&account=&partition=alpha&searchtype=tpe.suggest&objective_program=bash%20%3C%2Fpath%2Fto%2Fyour%2Fwrapper-script%2Frun-mnist-fashion.sh%3E%20--out-layer1%3D(%24x_0)%20--batchsize%3D(%24x_1)%20--epochs%3D(%24x_2)&param_0_type=hp.choice&param_1_type=hp.choice&param_2_type=hp.choice&param_0_name=out-layer1&param_1_name=batchsize&param_2_name=batchsize&projectname=mnist_fashion_optimization_set_1){:target="_blank"}.
+The GUI, in which the values displayed below are already entered, can be reached [here](https://imageseg.scads.ai/omnioptgui/?maxevalserror=5&mem_per_worker=1000&number_of_parameters=3&param_0_values=10%2C50%2C100&param_1_values=8%2C16%2C32&param_2_values=10%2C15%2C30&param_0_name=out-layer1&param_1_name=batchsize&param_2_name=batchsize&account=&projectname=mnist_fashion_optimization_set_1&partition=alpha&searchtype=tpe.suggest&param_0_type=hp.choice&param_1_type=hp.choice&param_2_type=hp.choice&max_evals=1000&objective_program=bash%20%3C%2Fpath%2Fto%2Fwrapper-script%2Frun-mnist-fashion.sh%3E%20--out-layer1%3D(%24x_0)%20--batchsize%3D(%24x_1)%20--epochs%3D(%24x_2)&workdir=%3C%2Fscratch%2Fws%2Fomniopt-workdir%2F%3E){:target="_blank"}.
+Please modify the paths for "objective program" and "workdir" according to your needs.
 {: align="center"}
-After all parameters are entered into the GUI, the call for OmniOpt is generated and displayed on the right. This command contains all necessary instructions (including requesting resources with Slurm). Thus, this command can be executed directly on a login node on the ZIH system.
+When using OmniOpt for a first trial example, it is often sufficient to concentrate on the following configuration parameters:
+
+1. **Optimization run name:**
+A name for an OmniOpt run with a given configuration.
+1. 
**Partition:**
+Choose the partition on the ZIH system that fits the program's needs.
+1. **Enable GPU:**
+Decide whether a program could benefit from GPU usage or not.
+1. **Workdir:**
+The directory where OmniOpt saves its necessary files and all results.
+A separate directory, derived from the optimization run name, is created for every such configuration.
+Make sure that this working directory is writable from the compute nodes.
+It is recommended to use a [workspace](../data_lifecycle/workspaces.md).
+1. **Objective program:**
+Provide all information for program execution. Typically, this will contain the command for executing a wrapper script.
+1. **Parameters:**
+The hyperparameters to be optimized, with the names OmniOpt should use.
+For the example here, the variable names are identical to the input parameters of the Python script.
+However, these names can be chosen differently, since the connection to OmniOpt is realized via the variables ($x_0), ($x_1), etc. from the GUI section "Objective program".
+Please note that the parameters do not need to be named explicitly in your script, only within the OmniOpt configuration.
+
+After all parameters are entered into the GUI, the call for OmniOpt is generated automatically and displayed on the right. This command contains all necessary instructions (including requesting resources with Slurm).
+**Thus, this command can be executed directly on a login node on the ZIH system.**
 {: align="center"}
+After this command is executed, OmniOpt handles everything else in the background and no further action is necessary.
+
+??? hint "Hints on the working directory"
+ 1. Starting OmniOpt without providing a working directory will store OmniOpt in the current directory.
+ 1. Within the given working directory, a new folder, named "omniopt" by default, is created.
+ 1. Within one OmniOpt working directory there can be multiple optimization projects.
+ 1. 
It is possible to have as many working directories as you want (with multiple optimization runs).
+ 1. It is recommended to use a [workspace](../data_lifecycle/workspaces.md) as working directory, but not the home directory.
+
 ### Check and Evaluate OmniOpt Results
-TODO
+To check the current status of OmniOpt or to look into results, use the evaluation tool of OmniOpt.
+Switch to the OmniOpt folder and run ```evaluate-run.sh```.
+
+ ``` console
+ marie@login$ bash </scratch/ws/omniopt-workdir/>evaluate-run.sh
+ ```
-## Details on OmniOpt
+After initializing and checking for updates in the background, OmniOpt asks you to select the optimization run of interest.
+After you select the optimization run, a menu with the items shown below appears.
+If OmniOpt still has running jobs, additional menu items that refer to these running jobs appear (image shown below to the right).
-TODO
+Evaluation options (all jobs finished) | Evaluation options (still running jobs)
+:--------------------------------------------------------------:|:-------------------------:
+ | 
-### Configuration
+For now, we assume that OmniOpt has already finished.
+There are the following basic approaches to look into the results.
-TODO
+1. **Graphical approach:**
+ There are basically two graphical approaches: two-dimensional scatter plots and parallel plots.
-### Monitoring
+ A parallel plot from the MNIST fashion example is shown below.
+ {: align="center"}
-TODO
+ ??? hint "Hints on parallel plots"
+ Parallel plots are especially suitable for dealing with multiple dimensions.
+ The parallel plot created by OmniOpt is an interactive HTML file that is stored in the OmniOpt working directory under ```projects/<name_of_optimization_run>/parallel-plot```.
+ The interactivity of this plot is intended to make optimal combinations of the hyperparameters visible more easily.
+ Get more information about this interactivity by clicking the "Help" button at the top of the graphic (see red arrow on the image above).
-### Evaluation of Results
+ After creating the plot, OmniOpt suggests opening the HTML file directly in a web browser on the ZIH system.
+ To do so, it is necessary to log in via SSH with the option -X (X11 forwarding), e.g. ```ssh -X taurus.hrsk.tu-dresden.de```.
+ Nevertheless, because of the latency of X11 forwarding, it is recommended to download the HTML file and explore the parallel plot on the local machine.
-TODO
+1. **Getting the raw data:**
+ As a second approach, the raw data of the optimization process can be exported as a CSV file.
+ The created output files are stored in the folder ```projects/<name_of_optimization_run>/csv```.
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-evaluate-menu.png b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-evaluate-menu.png
new file mode 100644
index 0000000000000000000000000000000000000000..6d425818925017b52e455ddfb92b00904a0f302d
Binary files /dev/null and b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-evaluate-menu.png differ
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-graph-result.png b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-graph-result.png
new file mode 100644
index 0000000000000000000000000000000000000000..8dbbec668465134bbd35a78d63052b7c7d253d0e
Binary files /dev/null and b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-graph-result.png differ
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-parallel-plot.png b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-parallel-plot.png
new file mode 100644
index 0000000000000000000000000000000000000000..3702d69383fe4cb248456102f97e8a7fc8127ca0
Binary files /dev/null and b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-parallel-plot.png differ
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-still-running-jobs.png 
b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-still-running-jobs.png
new file mode 100644
index 0000000000000000000000000000000000000000..d4cd05138805d13e6eedd61b3ad8b0c5c9416afe
Binary files /dev/null and b/doc.zih.tu-dresden.de/docs/software/misc/OmniOpt-still-running-jobs.png differ
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-GUI.png b/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-GUI.png
index f57b78d8e26ebf8babfe369f5334f1faa5fd0dc4..c292e7cefb46224585894acc8623e1bfa9878052 100644
Binary files a/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-GUI.png and b/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-GUI.png differ
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-final-command.png b/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-final-command.png
index d584bfb235e5aac46a8dd99860c4b75bd1e59dbb..b0b714462939f9acbd2e25e0d0eb39b431dba5de 100644
Binary files a/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-final-command.png and b/doc.zih.tu-dresden.de/docs/software/misc/hyperparameter_optimization-OmniOpt-final-command.png differ