From a559b6bacb9111cbb1ad44ed1a89276281dc8ec5 Mon Sep 17 00:00:00 2001 From: Jan Frenzel <jan.frenzel@tu-dresden.de> Date: Tue, 23 Nov 2021 12:06:17 +0100 Subject: [PATCH] Cleaned custom_easy_build_environment.md. This resolves #97. --- .../software/custom_easy_build_environment.md | 211 ++++++++++-------- 1 file changed, 119 insertions(+), 92 deletions(-) diff --git a/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md b/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md index d482d89a4..4dbd33a72 100644 --- a/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md +++ b/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md @@ -1,133 +1,160 @@ # EasyBuild -Sometimes the \<a href="SoftwareModulesList" target="\_blank" -title="List of Modules">modules installed in the cluster\</a> are not -enough for your purposes and you need some other software or a different -version of a software. - -\<br />For most commonly used software, chances are high that there is -already a *recipe* that EasyBuild provides, which you can use. But what -is Easybuild? - -\<a href="<https://easybuilders.github.io/easybuild/>" -target="\_blank">EasyBuild\</a>\<span style="font-size: 1em;"> is the -software used to build and install software on, and create modules for, -Taurus.\</span> - -\<span style="font-size: 1em;">The aim of this page is to introduce -users to working with EasyBuild and to utilizing it to create -modules**.**\</span> - -**Prerequisites:** \<a href="Login" target="\_blank">access\</a> to the -Taurus system and basic knowledge about Linux, \<a href="SystemTaurus" -target="\_blank" title="SystemTaurus">Taurus\</a> and the \<a -href="RuntimeEnvironment" target="\_blank" -title="RuntimeEnvironment">modules system \</a>on Taurus. - -\<span style="font-size: 1em;">EasyBuild uses a configuration file -called recipe or "EasyConfig", which contains all the information about -how to obtain and build the software:\</span> +Sometimes the [modules installed in the cluster](modules.md) are not enough for your purposes and +you need some other software or a different version of a software. + +For most commonly used software, chances are high that there is already a *recipe* that EasyBuild +provides, which you can use. But what is Easybuild? + +[EasyBuild](https://easybuilders.github.io/easybuild/) is the software used to build and install +software on ZIH systems. + +The aim of this page is to introduce users to working with EasyBuild and to utilizing it to create +modules + +## Prerequisites + +1. [Shell access](../access/ssh_login.md) to ZIH systems +1. basic knowledge about: + - [the ZIH system](../jobs_and_resources/hardware_overview.md) + - [the module system](modules.md) on ZIH systems + +EasyBuild uses a configuration file called recipe or "EasyConfig", which contains all the +information about how to obtain and build the software: - Name - Version - Toolchain (think: Compiler + some more) - Download URL -- Buildsystem (e.g. configure && make or cmake && make) +- Buildsystem (e.g. `configure && make` or `cmake && make`) - Config parameters - Tests to ensure a successful build -The "Buildsystem" part is implemented in so-called "EasyBlocks" and -contains the common workflow. Sometimes those are specialized to -encapsulate behaviour specific to multiple/all versions of the software. -\<span style="font-size: 1em;">Everything is written in Python, which -gives authors a great deal of flexibility.\</span> +The build system part is implemented in so-called "EasyBlocks" and contains the common workflow. +Sometimes, those are specialized to encapsulate behaviour specific to multiple/all versions of the +software. Everything is written in Python, which gives authors a great deal of flexibility. ## Set up a custom module environment and build your own modules -Installation of the new software (or version) does not require any -specific credentials. +Installation of the new software (or version) does not require any specific credentials. + +### Prerequisites -\<br />Prerequisites: 1 An existing EasyConfig 1 a place to put your -modules. \<span style="font-size: 1em;">Step by step guide:\</span> +1. An existing EasyConfig +1. a place to put your modules. -1\. Create a \<a href="WorkSpaces" target="\_blank">workspace\</a> where -you'll install your modules. You need a place where your modules will be -placed. This needs to be done only once : +### Step by step guide - ws_allocate -F scratch EasyBuild 50 # +**Step 1:** Create a [workspace](../data_lifecycle/workspaces.md#allocate-a-workspace) where you +install your modules. You need a place where your modules are placed. This needs to be done only +once: -2\. Allocate nodes. You can do this with interactive jobs (see the -example below) and/or put commands in a batch file and source it. The -latter is recommended for non-interactive jobs, using the command sbatch -in place of srun. For the sake of illustration, we use an interactive -job as an example. The node parameters depend, to some extent, on the -architecture you want to use. ML nodes for the Power9 and others for the -x86. We will use Haswell nodes. +```console +marie@login$ ws_allocate -F scratch EasyBuild 50 +marie@login$ ws_list | grep 'directory.*EasyBuild' + workspace directory : /scratch/ws/1/marie-EasyBuild +``` - srun -p haswell -N 1 -c 4 --time=08:00:00 --pty /bin/bash +**Step 2:** Allocate nodes. You can do this with interactive jobs (see the example below) and/or +put commands in a batch file and source it. The latter is recommended for non-interactive jobs, +using the command `sbatch` instead of `srun`. For the sake of illustration, we use an +interactive job as an example. Depending on the partitions that you want the module to be usable on +later, you need to select nodes with the same archtitecture. Thus, use nodes from partition ml +later, if you want to use the module on nodes of that partition. In this example, we assume that we +want to use the module on nodes with x86 architecture und thus, use Haswell nodes. -\*Using EasyBuild on the login nodes is not allowed\* +```console +marie@login$ srun --partition=haswell --nodes=1 --cpus-per-task=4 --time=08:00:00 --pty /bin/bash -l +``` -3\. Load EasyBuild module. +!!! warning - module load EasyBuild + Using EasyBuild on the login nodes is not allowed. -\<br />4. Specify Workspace. The rest of the guide is based on it. -Please create an environment variable called \`WORKSPACE\` with the -location of your Workspace: +**Step 3:** Load EasyBuild module. - WORKSPACE=<location_of_your_workspace> # For example: WORKSPACE=/scratch/ws/anpo879a-EasyBuild +```console +module load EasyBuild +``` -5\. Load the correct modenv according to your current or target -architecture: \`ml modenv/scs5\` for x86 (default) or \`modenv/ml\` for -Power9 (ml partition). Load EasyBuild module +**Step 4:** Specify the workspace. The rest of the guide is based on it. Please create an +environment variable called `WORKSPACE` with the location of your workspace: - ml modenv/scs5 - module load EasyBuild +```console +marie@compute$ WORKSPACE=/scratch/ws/1/marie-EasyBuild #see output of ws_list above +``` -6\. Set up your environment: +**Step 5:** Load the correct modenv according to your current or target architecture: - export EASYBUILD_ALLOW_LOADED_MODULES=EasyBuild,modenv/scs5 - export EASYBUILD_DETECT_LOADED_MODULES=unload - export EASYBUILD_BUILDPATH="/tmp/${USER}-EasyBuild${SLURM_JOB_ID:-}" - export EASYBUILD_SOURCEPATH="${WORKSPACE}/sources" - export EASYBUILD_INSTALLPATH="${WORKSPACE}/easybuild-$(basename $(readlink -f /sw/installed))" - export EASYBUILD_INSTALLPATH_MODULES="${EASYBUILD_INSTALLPATH}/modules" - module use "${EASYBUILD_INSTALLPATH_MODULES}/all" - export LMOD_IGNORE_CACHE=1 +=== "x86 (default, e. g. partition haswell)" + ```console + marie@compute$ module load modenv/scs5 + ``` +=== "Power9 (partition ml)" + ```console + marie@compute$ module load modenv/ml + ``` -7\. \<span style="font-size: 13px;">Now search for an existing -EasyConfig: \</span> +**Step 6:** Load module `EasyBuild` - eb --search TensorFlow +```console +marie@compute$ module load EasyBuild +``` -\<span style="font-size: 13px;">8. Build the EasyConfig and its -dependencies\</span> +**Step 7:** Set up your environment: - eb TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4.eb -r +```console +marie@compute$ export EASYBUILD_ALLOW_LOADED_MODULES=EasyBuild,modenv/scs5 +marie@compute$ export EASYBUILD_DETECT_LOADED_MODULES=unload +marie@compute$ export EASYBUILD_BUILDPATH="/tmp/${USER}-EasyBuild${SLURM_JOB_ID:-}" +marie@compute$ export EASYBUILD_SOURCEPATH="${WORKSPACE}/sources" +marie@compute$ export EASYBUILD_INSTALLPATH="${WORKSPACE}/easybuild-$(basename $(readlink -f /sw/installed))" +marie@compute$ export EASYBUILD_INSTALLPATH_MODULES="${EASYBUILD_INSTALLPATH}/modules" +marie@compute$ module use "${EASYBUILD_INSTALLPATH_MODULES}/all" +marie@compute$ export LMOD_IGNORE_CACHE=1 +``` -\<span style="font-size: 13px;">After this is done (may take A LONG -time), you can load it just like any other module.\</span> +**Step 8:** Now search for an existing EasyConfig: -9\. To use your custom build modules you only need to rerun step 4, 5, 6 -and execute the usual: +```console +marie@compute$ eb --search TensorFlow +``` - module load <name_of_your_module> # For example module load TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4 +**Step 9:** Build the EasyConfig and its dependencies -The key is the \`module use\` command which brings your modules into -scope so \`module load\` can find them and the LMOD_IGNORE_CACHE line -which makes LMod pick up the custom modules instead of searching the +```console +marie@compute$ eb TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4.eb -r +``` + +This may take a long time. After this is done, you can load it just like any other module. + +**Step 10:** To use your custom build modules you only need to rerun steps 4, 5, 6, 7 and execute +the usual: + +```console +marie@compute$ module load TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4 #replace with name of your module +``` + +The key is the `module use` command which brings your modules into scope so `module load` can find +them. The `LMOD_IGNORE_CACHE` line makes `LMod` pick up the custom modules instead of searching the system cache. ## Troubleshooting -When building your EasyConfig fails, you can first check the log -mentioned and scroll to the bottom to see what went wrong. +When building your EasyConfig fails, you can first check the log mentioned and scroll to the bottom +to see what went wrong. + +It might also be helpful to inspect the build environment EasyBuild uses. For that you can run: + +```console +marie@compute$ eb myEC.eb --dump-env-script` +``` + +This command creates a sourceable .env file with `module load` and `export` commands that show what +EB does before running, e.g., the configure step. -It might also be helpful to inspect the build environment EB uses. For -that you can run \`eb myEC.eb --dump-env-script\` which creates a -sourceable .env file with \`module load\` and \`export\` commands that -show what EB does before running, e.g., the configure step. +It might also be helpful to use -It might also be helpful to use '\<span style="font-size: 1em;">export -LMOD_IGNORE_CACHE=0'\</span> +```console +marie@compute$ export LMOD_IGNORE_CACHE=0 +``` -- GitLab