diff --git a/Dockerfile b/Dockerfile index d8d77ca7c6dd279a5194a27fa854be94a9e5ff70..67ffffaa2e29c4effe35714e3dca8128872252f6 100644 --- a/Dockerfile +++ b/Dockerfile @@ -12,7 +12,7 @@ RUN pip install -r /src/doc.zih.tu-dresden.de/requirements.txt # Linter # ########## -RUN apt update && apt install -y nodejs npm +RUN apt update && apt install -y nodejs npm aspell RUN npm install -g markdownlint-cli markdown-link-check diff --git a/doc.zih.tu-dresden.de/README.md b/doc.zih.tu-dresden.de/README.md index b009dd5edc2b3f30bfd954e913176bf3a741a016..5f43c1e8b60c34dfb7d50dae341ec473b633759c 100644 --- a/doc.zih.tu-dresden.de/README.md +++ b/doc.zih.tu-dresden.de/README.md @@ -40,7 +40,7 @@ Now, create a local clone of your fork #### Install Dependencies -**TODO:** Describtion +**TODO:** Description ```Shell Session ~ cd hpc-compendium/doc.zih.tu-dresden.de @@ -61,7 +61,7 @@ editor are invoked: Do your changes, add a meaningful commit message and commit The more sophisticated integrated Web IDE is reached from the top level menu of the repository or by selecting any source file. -Other git services might have an aquivivalent web interface to interact with the repository. Please +Other git services might have an equivalent web interface to interact with the repository. Please refer to the corresponding documentation for further information. <!--This option of contributing is only available for users of--> @@ -157,6 +157,22 @@ To check a single file, e. g. `doc.zih.tu-dresden.de/docs/software/big_data_fram docker run --name=hpc-compendium --rm -it -w /docs --mount src="$(pwd)"/doc.zih.tu-dresden.de,target=/docs,type=bind hpc-compendium markdown-link-check docs/software/big_data_frameworks.md ``` +For spell-checking a single file, use: + +```Bash +docker run --name=hpc-compendium --rm -it -w /docs --mount src="$(pwd)"/doc.zih.tu-dresden.de,target=/docs,type=bind hpc-compendium ./util/check-spelling.sh <file> +``` + +For spell-checking all files, use: + +```Bash +docker run --name=hpc-compendium --rm -it -w /docs --mount src="$(pwd)"/doc.zih.tu-dresden.de,target=/docs,type=bind hpc-compendium ./util/check-spelling.sh +``` + +This outputs all words of all files that are unknown to the spell checker. +To let the spell checker "know" a word, append it to +`doc.zih.tu-dresden.de/wordlist.aspell`. + #### Build Static Documentation To build the documentation, invoke `mkdocs build`. This will create a new directory named `public` @@ -220,7 +236,7 @@ new branch (a so-called feature branch) basing on the `main` branch and commit y ``` The last command pushes the changes to your remote at branch `FEATUREBRANCH`. Now, it is time to -incoporate the changes and improvements into the HPC Compendium. For this, create a +incorporate the changes and improvements into the HPC Compendium. For this, create a [merge request](https://gitlab.hrz.tu-chemnitz.de/zih/hpc-compendium/hpc-compendium/-/merge_requests/new) to the `main` branch. diff --git a/doc.zih.tu-dresden.de/docs/application/access.md b/doc.zih.tu-dresden.de/docs/application/access.md index b396ad42a0946c22647ff7240ee156f20f2376ec..54aa7c531aaf8eb774a573a4323e4c526af0e331 100644 --- a/doc.zih.tu-dresden.de/docs/application/access.md +++ b/doc.zih.tu-dresden.de/docs/application/access.md @@ -13,7 +13,7 @@ project manager is called to inform the ZIH about any changes according the staf also trial accounts have to fill in the application form.)\<br />** It is invariably possible to apply for more/different resources. 
Whether additional resources are -granted or not depends on the current allocations and on the availablility of the installed systems. +granted or not depends on the current allocations and on the availability of the installed systems. The terms of use of the HPC systems are only [available in German](terms_of_use.md) - at the moment. @@ -39,13 +39,13 @@ For obtaining access to the machines, the following forms have to be filled in: ### Subsequent applications / view for project leader -Subsequent applications will be neccessary, +Subsequent applications will be necessary, - if the end of project is reached - if the applied resources won't be sufficient The project leader and one person instructed by him, the project administrator, should use -[this website](https://hpcprojekte.zih.tu-dresden.de/managers/) (ZIH-login neccessary). At this +[this website](https://hpcprojekte.zih.tu-dresden.de/managers/) (ZIH-login necessary). At this website you have an overview of your projects, the usage of resources, you can submit subsequent applications, and you are able to add staff members to your project. @@ -77,8 +77,8 @@ LaTeX-template( If you plan to publish a paper with results based on the used CPU hours of our machines, please insert in the acknowledgement an small part with thank for the support by the machines of the -ZIH/TUD. (see example below) Please send us a link/reference to the paper if it was puplished. It -will be very helpfull for the next acquirement of compute power. Thank you very much. +ZIH/TUD. (see example below) Please send us a link/reference to the paper if it was published. It +will be very helpful for the next acquirement of compute power. Thank you very much. Two examples: diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md index 7d131092c93795efa1820a9d3fa41b34a9fb1c8d..5365ac4f3cfed5c4bc6a7051802bfbfe1eb7b17d 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md @@ -44,7 +44,7 @@ To work as efficient as possible, consider the following points - Store checkpoints and other temporary data in `/scratch/ws/...` - Compilation in `/dev/shm` or `/tmp` -Getting high I/O-bandwitdh +Getting high I/O-bandwidth - Use many clients - Use many processes (writing in the same file at the same time is possible) @@ -52,7 +52,7 @@ Getting high I/O-bandwitdh ## Cheat Sheet for Debugging File System Issues -Every Taurus-User should normaly be able to perform the following commands to get some intel about +Every Taurus-User should normally be able to perform the following commands to get some intel about their data. ### General @@ -63,7 +63,7 @@ For the first view, you can easily use the "df-command". df ``` -Alternativly you can use the "findmnt"-command, which is also able to perform an `df` by adding the +Alternatively, you can use the "findmnt"-command, which is also able to perform an `df` by adding the "-D"-parameter. ```Bash @@ -122,7 +122,7 @@ This will set the stripe pattern for `/beegfs/global0/path/to/mydir/` to a chunk distributed over 16 storage targets. Find files located on certain server or targets. The following command searches all files that are -stored on the storage targets with id 4 or 30 und my-workspace directory. +stored on the storage targets with id 4 or 30 and my-workspace directory. 
```Bash beegfs-ctl --find /beegfs/global0/my-workspace/ --targetid=4 --targetid=30 --mount=/beegfs/global0 diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md index ac4c81a15051ef0bb58cebd6a3f93dcd68fc7067..e1b5fca65e562a243590c8fb55f92242b2265b4a 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md @@ -44,7 +44,7 @@ range from days to a few--> <!--years.--> !!! hint "Recommendations to choose of storage system" - * For data that seldomly changes but consumes a lot of space, the + * For data that seldom changes but consumes a lot of space, the [warm_archive](file_systems.md#warm_archive) can be used. (Note that this is mounted **read-only** on the compute nodes). * For a series of calculations that works on the same data please use a `scratch` based [workspace](workspaces.md). diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md b/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md index 56b3e3daa0457b1427312b4062de8aca8985f81a..856af9f3080969f29ac71c7bc8bf6b8c79c45a60 100644 --- a/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md +++ b/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md @@ -54,7 +54,7 @@ Options for dtrsync: -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) - -r, --recursive recurse into directorie + -r, --recursive recurse into directories -l, --links copy symlinks as symlinks -p, --perms preserve permissions -t, --times preserve modification times diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md index 217a429dc40f67d0f0e7f5bb03d1b12e4f14debd..14266272720761b66b817d9805c28f1079397e73 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md @@ -24,12 +24,12 @@ ZIH uses the batch system Slurm for resource management and job scheduling. * who gets an email on which occasion, The runtime environment (see [here](../software/overview.md)) as well as the executable and - certain command-line agruments have to be specified to run the computational work. + certain command-line arguments have to be specified to run the computational work. ??? note "Batch System" The batch system is the central organ of every HPC system users interact with its compute - resources. The batchsystem finds an adequate compute system (partition/island) for your compute + resources. The batch system finds an adequate compute system (partition/island) for your compute jobs. It organizes the queueing and messaging, if all resources are in use. If resources are available for your job, the batch system allocates and connects to these resources, transfers run-time environment, and starts the job. @@ -49,8 +49,8 @@ a single GPU's core can handle is small), GPUs are not as versatile as CPUs. ### Available Hardware ZIH provides a broad variety of compute resources ranging from normal server CPUs of different -manufactures, to large shared memory nodes, GPU-assisted nodes up to highly specialised resources for -[Machine Learing](../software/machine_learning.md) and AI. +manufactures, to large shared memory nodes, GPU-assisted nodes up to highly specialized resources for +[Machine Learning](../software/machine_learning.md) and AI. The page [Hardware Taurus](hardware_taurus.md) holds a comprehensive overview. 
The desired hardware can be specified by the partition `-p, --partition` flag in Slurm. @@ -81,7 +81,7 @@ with the `--x11` option. To use an interactive job you have to specify `-X` flag However, using `srun` directly on the Shell will lead to blocking and launch an interactive job. Apart from short test runs, it is recommended to encapsulate your experiments and computational -tasks into batchjobs and submit them to the batch system. For that, you can conveniently put the +tasks into batch jobs and submit them to the batch system. For that, you can conveniently put the parameters directly into the job file which you can submit using `sbatch [options] <job file>`. ## Processing of Data for Input and Output diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md index 8f536f9cd58d5bbe3fae80565e3af515284a84ad..04624da4e55fe3a32e3d41842622b38b3e176315 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md @@ -2,7 +2,7 @@ - Hostname: taurussmp8 - Access to all shared file systems -- SLURM partition `julia` +- Slurm partition `julia` - 32 x Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20GHz (28 cores) - 48 TB RAM (usable: 47 TB - one TB is used for cache coherence protocols) @@ -14,7 +14,7 @@ There are 370 TB of NVMe devices installed. For immediate access for all project of fast NVMe storage is available at `/nvme/1/<projectname>`. For testing, we have set a quota of 100 GB per project on this NVMe storage.This is -With a more detailled proposal on how this unique system (large shared memory + NVMe storage) can +With a more detailed proposal on how this unique system (large shared memory + NVMe storage) can speed up their computations, a project's quota can be increased or dedicated volumes of up to the full capacity can be set up. @@ -26,7 +26,7 @@ full capacity can be set up. variables, so that OpenMPI uses shared memory instead of Infiniband for message transport. `export OMPI_MCA_pml=ob1;   export OMPI_MCA_mtl=^mxm` - Use `I_MPI_FABRICS=shm` so that Intel MPI doesn't even consider - using InfiniBand devices itself, but only shared-memory instead + using Infiniband devices itself, but only shared-memory instead ## Open for Testing diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md index 2241599fb1c739061a0b50cbc8b8a6e44aae107e..0c4d3d92a25de40aa7ec887feeb08086081a5af3 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md @@ -19,7 +19,7 @@ Some options of `srun/sbatch` are: | -n \<N> or --ntasks \<N> | set a number of tasks to N(default=1). This determines how many processes will be spawned by srun (for MPI jobs). | | -N \<N> or --nodes \<N> | set number of nodes that will be part of a job, on each node there will be --ntasks-per-node processes started, if the option --ntasks-per-node is not given, 1 process per node will be started | | --ntasks-per-node \<N> | how many tasks per allocated node to start, as stated in the line before | -| -c \<N> or --cpus-per-task \<N> | this option is needed for multithreaded (e.g. OpenMP) jobs, it tells SLURM to allocate N cores per task allocated; typically N should be equal to the number of threads you program spawns, e.g. it should be set to the same number as OMP_NUM_THREADS | +| -c \<N> or --cpus-per-task \<N> | this option is needed for multithreaded (e.g. 
OpenMP) jobs, it tells Slurm to allocate N cores per task allocated; typically N should be equal to the number of threads you program spawns, e.g. it should be set to the same number as OMP_NUM_THREADS | | -p \<name> or --partition \<name> | select the type of nodes where you want to execute your job, on Taurus we currently have haswell, `smp`, `sandy`, `west`, ml and `gpu` available | | --mem-per-cpu \<name> | specify the memory need per allocated CPU in MB | | --time \<HH:MM:SS> | specify the maximum runtime of your job, if you just put a single number in, it will be interpreted as minutes | @@ -27,7 +27,7 @@ Some options of `srun/sbatch` are: | --mail-type ALL | specify for what type of events you want to get a mail; valid options beside ALL are: BEGIN, END, FAIL, REQUEUE | | -J \<name> or --job-name \<name> | give your job a name which is shown in the queue, the name will also be included in job emails (but cut after 24 chars within emails) | | --no-requeue | At node failure, jobs are requeued automatically per default. Use this flag to disable requeueing. | -| --exclusive | tell SLURM that only your job is allowed on the nodes allocated to this job; please be aware that you will be charged for all CPUs/cores on the node | +| --exclusive | tell Slurm that only your job is allowed on the nodes allocated to this job; please be aware that you will be charged for all CPUs/cores on the node | | -A \<project> | Charge resources used by this job to the specified project, useful if a user belongs to multiple projects. | | -o \<filename> or --output \<filename> | \<p>specify a file name that will be used to store all normal output (stdout), you can use %j (job id) and %N (name of first node) to automatically adopt the file name to the job, per default stdout goes to "slurm-%j.out"\</p> \<p>%RED%NOTE:<span class="twiki-macro ENDCOLOR"></span> the target path of this parameter must be writeable on the compute nodes, i.e. it may not point to a read-only mounted file system like /projects.\</p> | | -e \<filename> or --error \<filename> | \<p>specify a file name that will be used to store all error output (stderr), you can use %j (job id) and %N (name of first node) to automatically adopt the file name to the job, per default stderr goes to "slurm-%j.out" as well\</p> \<p>%RED%NOTE:<span class="twiki-macro ENDCOLOR"></span> the target path of this parameter must be writeable on the compute nodes, i.e. it may not point to a read-only mounted file system like /projects.\</p> | @@ -51,7 +51,7 @@ echo Starting Program During runtime, the environment variable SLURM_JOB_ID will be set to the id of your job. You can also use our [Slurm Batch File Generator]**todo** Slurmgenerator, which could help you create -basic SLURM job scripts. +basic Slurm job scripts. Detailed information on [memory limits on Taurus]**todo** @@ -78,9 +78,9 @@ one job per user. Please check the availability of nodes there with `sinfo -p in ### Interactive X11/GUI Jobs -SLURM will forward your X11 credentials to the first (or even all) node +Slurm will forward your X11 credentials to the first (or even all) node for a job with the (undocumented) --x11 option. 
For example, an -interactive session for 1 hour with Matlab using eigth cores can be +interactive session for 1 hour with Matlab using eight cores can be started with: ```Shell Session @@ -100,7 +100,7 @@ by simply deleting the known_hosts file altogether if you don't have important o ### Requesting an Nvidia K20X / K80 / A100 -SLURM will allocate one or many GPUs for your job if requested. Please note that GPUs are only +Slurm will allocate one or many GPUs for your job if requested. Please note that GPUs are only available in certain partitions, like `gpu2`, `gpu3` or `gpu2-interactive`. The option for sbatch/srun in this case is `--gres=gpu:[NUM_PER_NODE]` (where `NUM_PER_NODE` can be `1`, 2 or 4, meaning that one, two or four of the GPUs per node will be used for the job). A sample job file @@ -118,10 +118,10 @@ srun ./your/cuda/application # start you application (probably requires MPI to Please be aware that the partitions `gpu`, `gpu1` and `gpu2` can only be used for non-interactive jobs which are submitted by `sbatch`. Interactive jobs (`salloc`, `srun`) will have to use the -partition `gpu-interactive`. SLURM will automatically select the right partition if the partition +partition `gpu-interactive`. Slurm will automatically select the right partition if the partition parameter (-p) is omitted. -**Note:** Due to an unresolved issue concering the SLURM job scheduling behavior, it is currently +**Note:** Due to an unresolved issue concerning the Slurm job scheduling behavior, it is currently not practical to use `--ntasks-per-node` together with GPU jobs. If you want to use multiple nodes, please use the parameters `--ntasks` and `--mincpus` instead. The values of mincpus \* nodes has to equal ntasks in this case. @@ -156,7 +156,7 @@ depend on the type of parallelization and architecture. An SMP-parallel job can only run within a node, so it is necessary to include the options `-N 1` and `-n 1`. The maximum number of processors for an SMP-parallel program is 488 on Venus and 56 on -taurus (smp island). Using --cpus-per-task N SLURM will start one task and you will have N CPUs +taurus (smp island). Using --cpus-per-task N Slurm will start one task and you will have N CPUs available for your job. An example job file would look like: ```Bash @@ -190,7 +190,7 @@ specifically for their target. srun ./path/to/binary ``` -#### Multiple Programms Running Simultaneously in a Job +#### Multiple Programs Running Simultaneously in a Job In this short example, our goal is to run four instances of a program concurrently in a **single** batch script. Of course we could also start a batch script four times with sbatch but this is not @@ -229,7 +229,7 @@ echo "All parallel job steps completed!" Jobs on taurus run, by default, in shared-mode, meaning that multiple jobs can run on the same compute nodes. Sometimes, this behaviour is not desired (e.g. for benchmarking purposes), in which -case it can be turned off by specifying the SLURM parameter: `--exclusive` . +case it can be turned off by specifying the Slurm parameter: `--exclusive` . Setting `--exclusive` **only** makes sure that there will be **no other jobs running on your nodes**. It does not, however, mean that you automatically get access to all the resources which the node @@ -238,7 +238,7 @@ generic resources parameter (gres) to run on the GPU partitions, or you still ha cores of a node if you need them. CPU cores can either to be used for a task (`--ntasks`) or for multi-threading within the same task (--cpus-per-task). 
Since those two options are semantically different (e.g., the former will influence how many MPI processes will be spawned by 'srun' whereas -the latter does not), SLURM cannot determine automatically which of the two you might want to use. +the latter does not), Slurm cannot determine automatically which of the two you might want to use. Since we use cgroups for separation of jobs, your job is not allowed to use more resources than requested.* @@ -251,8 +251,13 @@ other jobs, even if it doesn't use up all resources in the nodes: ```Bash #!/bin/bash -#SBATCH -J Benchmark<br />#SBATCH -p haswell<br />#SBATCH --nodes=2<br />#SBATCH --ntasks-per-node=2 -#SBATCH --cpus-per-task=8<br />#SBATCH --exclusive # ensure that nobody spoils my measurement on 2 x 2 x 8 cores<br />#SBATCH --mail-user=your.name@tu-dresden.de +#SBATCH -J Benchmark +#SBATCH -p haswell +#SBATCH --nodes=2 +#SBATCH --ntasks-per-node=2 +#SBATCH --cpus-per-task=8 +#SBATCH --exclusive # ensure that nobody spoils my measurement on 2 x 2 x 8 cores +#SBATCH --mail-user=your.name@tu-dresden.de #SBATCH --time=00:10:00 srun ./my_benchmark @@ -301,7 +306,7 @@ For further details please read the Slurm documentation at You can use chain jobs to create dependencies between jobs. This is often the case if a job relies on the result of one or more preceding jobs. Chain jobs can also be used if the runtime limit of the -batch queues is not sufficient for your job. SLURM has an option `-d` or "--dependency" that allows +batch queues is not sufficient for your job. Slurm has an option `-d` or "--dependency" that allows to specify that a job is only allowed to start if another job finished. Here is an example of how a chain job can look like, the example submits 4 jobs (described in a job @@ -328,7 +333,7 @@ done ### Binding and Distribution of Tasks -The SLURM provides several binding strategies to place and bind the tasks and/or threads of your job +The Slurm provides several binding strategies to place and bind the tasks and/or threads of your job to cores, sockets and nodes. Note: Keep in mind that the distribution method has a direct impact on the execution time of your application. The manipulation of the distribution can either speed up or slow down your application. More detailed information about the binding can be found @@ -400,7 +405,7 @@ src="data:;base64,iVBORw0KGgoAAAANSUhEUgAAAvoAAADyCAIAAACzsfbGAAAABmJLR0QA/wD/AP ### Node Features for Selective Job Submission The nodes in our HPC system are becoming more diverse in multiple aspects: hardware, mounted -storage, software. The system administrators can decribe the set of properties and it is up to the +storage, software. The system administrators can describe the set of properties and it is up to the user to specify her/his requirements. These features should be thought of as changing over time (e.g. a file system get stuck on a certain node). @@ -457,7 +462,7 @@ of the possible job status: | Resources | The job is waiting for resources to become available. | | NodeDown | A node required by the job is down. | | BadConstraints | The jobs constraints can not be satisfied. | -| SystemFailure | Failure of the SLURM system, a file system, the network, etc. | +| SystemFailure | Failure of the Slurm system, a file system, the network, etc. | | JobLaunchFailure | The job could not be launched. This may be due to a file system problem, invalid program name, etc. | | NonZeroExitCode | The job terminated with a non-zero exit code. | | TimeLimit | The job exhausted its time limit. 
| @@ -470,7 +475,7 @@ For detailed information on why your submitted job has not started yet, you can ## Accounting -The SLRUM command `sacct` provides job statistics like memory usage, CPU +The Slurm command `sacct` provides job statistics like memory usage, CPU time, energy usage etc. Examples: ```Shell Session @@ -513,7 +518,7 @@ nodes that will work for you. src="%ATTACHURL%/hdfview_memory.png" style="float: right; margin-left: 10px;" title="hdfview" width="324" /> \</a> -SLURM offers the option to gather profiling data from every task/node of the job. Following data can +Slurm offers the option to gather profiling data from every task/node of the job. Following data can be gathered: - Task data, such as CPU frequency, CPU utilization, memory @@ -548,7 +553,7 @@ module load HDFView hdfview.sh profile.h5 ``` -More information about profiling with SLURM: +More information about profiling with Slurm: - [Slurm Profiling](http://slurm.schedmd.com/hdf5_profile_user_guide.html) - [sh5util](http://slurm.schedmd.com/sh5util.html) diff --git a/doc.zih.tu-dresden.de/docs/software/compilers.md b/doc.zih.tu-dresden.de/docs/software/compilers.md index c9b3cc95eef88d0de303d8cb46830dfe16015d6d..19a70e4638aa126176c8d705d472176e4bbbb915 100644 --- a/doc.zih.tu-dresden.de/docs/software/compilers.md +++ b/doc.zih.tu-dresden.de/docs/software/compilers.md @@ -37,7 +37,7 @@ optimization, data alignment and so on. You can list all available compiler opti `-help`. Reading the man-pages is a good idea, too. The user benefits from the (nearly) same set of compiler flags for optimization for the C,C++, and -Fortran-compilers. In the following table, only a couple of important compiler-dependant options are +Fortran-compilers. In the following table, only a couple of important compiler-dependent options are listed. For more detailed information, the user should refer to the man pages or use the option -help to list all options of the compiler. @@ -55,7 +55,7 @@ Description\* \| | | `-ipa` | `-ipo` | `-Mipa` | `-ipa` | inter procedure optimization (across files) | | | | `-ip` | `-Mipa` | | inter procedure optimization (within files) | | | `-apo` | `-parallel` | `-Mconcur` | `-apo` | Auto-parallelizer | -| `-fprofile-generate` | | `-prof-gen` | `-Mpfi` | `-fb-create` | Create intrumented code to generate profile in file \<FN> | +| `-fprofile-generate` | | `-prof-gen` | `-Mpfi` | `-fb-create` | Create instrumented code to generate profile in file \<FN> | | `-fprofile-use` | | `-prof-use` | `-Mpfo` | `-fb-opt` | Use profile data for optimization. - Leave all other optimization options | *We can not generally give advice as to which option should be used - even -O0 sometimes leads to a @@ -96,17 +96,17 @@ parallelism in the code. Therefore it is sometimes necessary to provide the compiler with some hints. 
Some possible directives are (Fortran style): -| | | -|--------------------------|-----------------------------------| -| `CDEC$ ivdep` | ignore assumed vector dependences | -| `CDEC$ swp` | try to software-pipeline | -| `CDEC$ noswp` | disable softeware-pipeling | -| `CDEC$ loop count (n)` | hint for optimzation | -| `CDEC$ distribute point` | split this large loop | -| `CDEC$ unroll (n)` | unroll (n) times | -| `CDEC$ nounroll` | do not unroll | -| `CDEC$ prefetch a` | prefetch array a | -| `CDEC$ noprefetch a` | do not prefetch array a | +| | | +|--------------------------|------------------------------------| +| `CDEC$ ivdep` | ignore assumed vector dependencies | +| `CDEC$ swp` | try to software-pipeline | +| `CDEC$ noswp` | disable software-pipeline | +| `CDEC$ loop count (n)` | hint for optimization | +| `CDEC$ distribute point` | split this large loop | +| `CDEC$ unroll (n)` | unroll (n) times | +| `CDEC$ nounroll` | do not unroll | +| `CDEC$ prefetch a` | prefetch array a | +| `CDEC$ noprefetch a` | do not prefetch array a | The compiler directives are the same for `ifort` and `icc` . The syntax for C/C++ is like `#pragma ivdep`, `#pragma swp`, and so on. diff --git a/doc.zih.tu-dresden.de/docs/software/containers.md b/doc.zih.tu-dresden.de/docs/software/containers.md index 638b2c73bfd103d5ce8fe7cbb3cbe065874b932b..a67a4a986881ffe09a16582adfeda719e6f90ccd 100644 --- a/doc.zih.tu-dresden.de/docs/software/containers.md +++ b/doc.zih.tu-dresden.de/docs/software/containers.md @@ -182,7 +182,7 @@ Dockerfile in the current folder into a singularity definition file: `spython recipe Dockerfile myDefinition.def<br />` -Now please **verify** your generated defintion and adjust where +Now please **verify** your generated definition and adjust where required! There are some notable changes between singularity definitions and diff --git a/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md b/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md index 254bced046f1edff75bc0fb83ffca76f7724027e..9c1e092a72d6294a9c5b91f0cd3459bc8e215ebb 100644 --- a/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md +++ b/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md @@ -24,7 +24,7 @@ tauruslogin$ srun --partition=haswell --ntasks=1 --nodes=1 --cpus-per-task=4 --m # Ensure that you are using the scs5 environment module load modenv/scs5 -# Check all availble modules for R with version 3.6 +# Check all available modules for R with version 3.6 module available R/3.6 # Load default R module module load R diff --git a/doc.zih.tu-dresden.de/docs/software/deep_learning.md b/doc.zih.tu-dresden.de/docs/software/deep_learning.md index 8d6f62ab45e1dcaa167d2615c6eac21ef2141743..da8c9c461fddc3c870ef418bb7db2b1ed493abe8 100644 --- a/doc.zih.tu-dresden.de/docs/software/deep_learning.md +++ b/doc.zih.tu-dresden.de/docs/software/deep_learning.md @@ -1,7 +1,7 @@ # Deep learning **Prerequisites**: To work with Deep Learning tools you obviously need [Login](../access/ssh_login.md) -for the Taurus system and basic knowledge about Python, SLURM manager. +for the Taurus system and basic knowledge about Python, Slurm manager. **Aim** of this page is to introduce users on how to start working with Deep learning software on both the ml environment and the scs5 environment of the Taurus system. @@ -26,12 +26,12 @@ There are numerous different possibilities on how to work with [TensorFlow](tens Taurus. On this page, for all examples default, scs5 partition is used. 
Generally, the easiest way is using the [modules system](modules.md) and Python virtual environment (test case). However, in some cases, you may need directly installed -Tensorflow stable or night releases. For this purpose use the +TensorFlow stable or night releases. For this purpose use the [EasyBuild](custom_easy_build_environment.md), [Containers](tensorflow_container_on_hpcda.md) and see [the example](https://www.tensorflow.org/install/pip). For examples of using TensorFlow for ml partition with module system see [TensorFlow page for HPC-DA](tensorflow.md). -Note: If you are going used manually installed Tensorflow release we recommend use only stable +Note: If you are going used manually installed TensorFlow release we recommend use only stable versions. ## Keras @@ -44,7 +44,7 @@ name "Keras". On this page for all examples default scs5 partition used. There are numerous different possibilities on how to work with [TensorFlow](tensorflow.md) and Keras on Taurus. Generally, the easiest way is using the [module system](modules.md) and Python -virtual environment (test case) to see Tensorflow part above. +virtual environment (test case) to see TensorFlow part above. For examples of using Keras for ml partition with the module system see the [Keras page for HPC-DA](keras.md). @@ -71,7 +71,7 @@ Job-file (schedule job with sbatch, check the status with 'squeue -u \<Username> #!/bin/bash #SBATCH --gres=gpu:1 # 1 - using one gpu, 2 - for using 2 gpus #SBATCH --mem=8000 -#SBATCH -p gpu2 # select the type of nodes (opitions: haswell, smp, sandy, west,gpu, ml) K80 GPUs on Haswell node +#SBATCH -p gpu2 # select the type of nodes (options: haswell, smp, sandy, west, gpu, ml) K80 GPUs on Haswell node #SBATCH --time=00:30:00 #SBATCH -o HLR_<name_of_your_script>.out # save output under HLR_${SLURMJOBID}.out #SBATCH -e HLR_<name_of_your_script>.err # save error messages under HLR_${SLURMJOBID}.err @@ -128,7 +128,7 @@ The [ImageNet](http://www.image-net.org/) project is a large visual database des visual object recognition software research. In order to save space in the file system by avoiding to have multiple duplicates of this lying around, we have put a copy of the ImageNet database (ILSVRC2012 and ILSVR2017) under `/scratch/imagenet` which you can use without having to download it -again. For the future, the Imagenet dataset will be available in `/warm_archive`. ILSVR2017 also +again. For the future, the ImageNet dataset will be available in `/warm_archive`. ILSVR2017 also includes a dataset for recognition objects from a video. Please respect the corresponding [Terms of Use](https://image-net.org/download.php). @@ -138,21 +138,19 @@ Jupyter notebooks are a great way for interactive computing in your web browser. working with data cleaning and transformation, numerical simulation, statistical modelling, data visualization and of course with machine learning. -There are two general options on how to work Jupyter notebooks using HPC: remote jupyter server and -jupyterhub. +There are two general options on how to work Jupyter notebooks using HPC: remote Jupyter server and +JupyterHub. -These sections show how to run and set up a remote jupyter server within a sbatch GPU job and which +These sections show how to run and set up a remote Jupyter server within a sbatch GPU job and which modules and packages you need for that. 
**Note:** On Taurus, there is a [JupyterHub](../access/jupyterhub.md), where you do not need the manual server setup described below and can simply run your Jupyter notebook on HPC nodes. Keep in -mind that with Jupyterhub you can't work with some special instruments. However general data +mind, that, with JupyterHub, you can't work with some special instruments. However, general data analytics tools are available. The remote Jupyter server is able to offer more freedom with settings and approaches. -Note: Jupyterhub is could be under construction - ### Preparation phase (optional) On Taurus, start an interactive session for setting up the @@ -184,7 +182,7 @@ executable script and run the installation script: wget https://repo.continuum.io/archive/Anaconda3-2019.03-Linux-x86_64.sh chmod 744 Anaconda3-2019.03-Linux-x86_64.sh ./Anaconda3-2019.03-Linux-x86_64.sh -(during installation you have to confirm the licence agreement) +(during installation you have to confirm the license agreement) ``` Next step will install the anaconda environment into the home @@ -197,14 +195,14 @@ conda create --name jnb ### Set environmental variables on Taurus In shell activate previously created python environment (you can -deactivate it also manually) and Install jupyter packages for this python environment: +deactivate it also manually) and install Jupyter packages for this python environment: ```Bash source activate jnb conda install jupyter ``` -If you need to adjust the config, you should create the template. Generate config files for jupyter -notebook server: +If you need to adjust the configuration, you should create the template. Generate config files for +Jupyter notebook server: ```Bash jupyter notebook --generate-config @@ -220,7 +218,7 @@ in browser session: jupyter notebook password Enter password: Verify password: ``` -you will get a message like that: +You get a message like that: ```Bash [NotebookPasswordApp] Wrote *hashed password* to @@ -234,9 +232,9 @@ certificate: openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mykey.key -out mycert.pem ``` -fill in the form with decent values. +Fill in the form with decent values. -Possible entries for your jupyter config (`.jupyter/jupyter_notebook*config.py*`). Uncomment below +Possible entries for your Jupyter config (`.jupyter/jupyter_notebook*config.py*`). Uncomment below lines: ```Bash @@ -253,11 +251,11 @@ hashed password here>' c.NotebookApp.port = 9999 c.NotebookApp.allow_remote_acce Note: `<path-to-cert>` - path to key and certificate files, for example: (`/home/\<username>/mycert.pem`) -### SLURM job file to run the jupyter server on Taurus with GPU (1x K80) (also works on K20) +### Slurm job file to run the Jupyter server on Taurus with GPU (1x K80) (also works on K20) ```Bash #!/bin/bash -l #SBATCH --gres=gpu:1 # request GPU #SBATCH --partition=gpu2 # use GPU partition -SBATCH --output=notebok_output.txt #SBATCH --nodes=1 #SBATCH --ntasks=1 #SBATCH --time=02:30:00 +SBATCH --output=notebook_output.txt #SBATCH --nodes=1 #SBATCH --ntasks=1 #SBATCH --time=02:30:00 SBATCH --mem=4000M #SBATCH -J "jupyter-notebook" # job-name #SBATCH -A <name_of_your_project> unset XDG_RUNTIME_DIR # might be required when interactive instead of sbatch to avoid @@ -287,7 +285,7 @@ There are two options on how to connect to the server: 1. You can create an ssh tunnel if you have problems with the solution above. Open the other terminal and configure ssh -tunnel: (look up connection values in the output file of slurm job, e.g.) 
(recommended): +tunnel: (look up connection values in the output file of Slurm job, e.g.) (recommended): ```Bash node=taurusi2092 #see the name of the node with squeue -u <your_login> @@ -310,11 +308,11 @@ IP to your browser or call on local terminal e.g. local$> firefox https://<IP>: important to use SSL cert ``` -To login into the jupyter notebook site, you have to enter the **token**. +To login into the Jupyter notebook site, you have to enter the **token**. (`https://localhost:8887`). Now you can create and execute notebooks on Taurus with GPU support. -If you would like to use [JupyterHub](../access/jupyterhub.md) after using a remote manually configurated -jupyter server (example above) you need to change the name of the configuration file +If you would like to use [JupyterHub](../access/jupyterhub.md) after using a remote manually configured +Jupyter server (example above) you need to change the name of the configuration file (`/home//.jupyter/jupyter_notebook_config.py`) to any other. ### F.A.Q @@ -322,7 +320,7 @@ jupyter server (example above) you need to change the name of the configuration **Q:** - I have an error to connect to the Jupyter server (e.g. "open failed: administratively prohibited: open failed") -**A:** - Check the settings of your jupyter config file. Is it all necessary lines uncommented, the +**A:** - Check the settings of your Jupyter config file. Is it all necessary lines uncommented, the right path to cert and key files, right hashed password from .json file? Check is the used local port [available](https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers) Check local settings e.g. (`/etc/ssh/sshd_config`, `/etc/hosts`). diff --git a/doc.zih.tu-dresden.de/docs/software/fem_software.md b/doc.zih.tu-dresden.de/docs/software/fem_software.md index 5c576c143288e6260d067eaa6ba58f9069cb2a73..bd65ea9832462bae475841f2e3ed2fa8193e3355 100644 --- a/doc.zih.tu-dresden.de/docs/software/fem_software.md +++ b/doc.zih.tu-dresden.de/docs/software/fem_software.md @@ -149,7 +149,7 @@ runwb2 The ANSYS workbench (runwb2) can also be used in a batch script to start calculations (the solver, not GUI) from a workbench project into the background. To do so, you have to specify the -B parameter (for batch -mode), -F for your project file, and can then either add differerent +mode), -F for your project file, and can then either add different commands via -E parameters directly, or specify a workbench script file containing commands via -R. diff --git a/doc.zih.tu-dresden.de/docs/software/get_started_with_hpcda.md b/doc.zih.tu-dresden.de/docs/software/get_started_with_hpcda.md index 8a15ce2c78a17cea2ae1084dcaba5cd9ff93594c..ac90455f91a13a74023d9e767aa9f7bce538cf69 100644 --- a/doc.zih.tu-dresden.de/docs/software/get_started_with_hpcda.md +++ b/doc.zih.tu-dresden.de/docs/software/get_started_with_hpcda.md @@ -107,17 +107,17 @@ command was used. #### Copy data from lm to hm ```Bash -scp <file> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> #Copy file from your local machine. For example: scp helloworld.txt mustermann@taurusexport.hrsk.tu-dresden.de:/scratch/ws/mastermann-Macine_learning_project/ +scp <file> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> #Copy file from your local machine. For example: scp helloworld.txt mustermann@taurusexport.hrsk.tu-dresden.de:/scratch/ws/mastermann-Macine_learning_project/ -scp -r <directory> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> #Copy directory from your local machine. 
+scp -r <directory> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> #Copy directory from your local machine. ``` #### Copy data from hm to lm ```Bash -scp <zih-user>@taurusexport.hrsk.tu-dresden.de:<file> <target-location> #Copy file. For example: scp mustermann@taurusexport.hrsk.tu-dresden.de:/scratch/ws/mastermann-Macine_learning_project/helloworld.txt /home/mustermann/Downloads +scp <zih-user>@taurusexport.hrsk.tu-dresden.de:<file> <target-location> #Copy file. For example: scp mustermann@taurusexport.hrsk.tu-dresden.de:/scratch/ws/mastermann-Macine_learning_project/helloworld.txt /home/mustermann/Downloads -scp -r <zih-user>@taurusexport.hrsk.tu-dresden.de:<directory> <target-location> #Copy directory +scp -r <zih-user>@taurusexport.hrsk.tu-dresden.de:<directory> <target-location> #Copy directory ``` #### Moving data inside the HPC machines. Datamover @@ -133,7 +133,8 @@ These commands submit a job to the data transfer machines that execute the selec for the `dt` prefix, their syntax is the same as the shell command without the `dt`. ```Bash -dtcp -r /scratch/ws/<name_of_your_workspace>/results /luste/ssd/ws/<name_of_your_workspace> #Copy from workspace in scratch to ssd.<br />dtwget https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz #Download archive CIFAR-100. +dtcp -r /scratch/ws/<name_of_your_workspace>/results /lustre/ssd/ws/<name_of_your_workspace>; #Copy from workspace in scratch to ssd. +dtwget https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz #Download archive CIFAR-100. ``` ## BatchSystems. SLURM @@ -178,7 +179,7 @@ module load TensorFlow python machine_learning_example.py -## when finished writing, submit with: sbatch <script_name> For example: sbatch machine_learning_script.slurm +## when finished writing, submit with: sbatch <script_name> For example: sbatch machine_learning_script.slurm ``` The `machine_learning_example.py` contains a simple ml application based on the mnist model to test @@ -224,7 +225,7 @@ modules) and to run the job exist two main options: ```Bash srun -p ml -N 1 -n 1 -c 2 --gres=gpu:1 --time=01:00:00 --pty --mem-per-cpu=8000 bash #job submission in ml nodes with allocating: 1 node, 1 task per node, 2 CPUs per task, 1 gpu per node, with 8000 mb on 1 hour. -module load modenv/ml #example output: The following have been reloaded with a version change: 1) modenv/scs5 => modenv/ml +module load modenv/ml #example output: The following have been reloaded with a version change: 1) modenv/scs5 => modenv/ml mkdir python-virtual-environments #create folder for your environments cd python-virtual-environments #go to folder @@ -310,7 +311,9 @@ SingularityHub container with TensorFlow. It does **not require root privileges* Taurus directly: ```Bash -srun -p ml -N 1 --gres=gpu:1 --time=02:00:00 --pty --mem-per-cpu=8000 bash #allocating resourses from ml nodes to start the job to create a container.<br />singularity build my-ML-container.sif docker://ibmcom/tensorflow-ppc64le #create a container from the DockerHub with the last TensorFlow version<br />singularity run --nv my-ML-container.sif #run my-ML-container.sif container with support of the Nvidia's GPU. You could also entertain with your container by commands: singularity shell, singularity exec +srun -p ml -N 1 --gres=gpu:1 --time=02:00:00 --pty --mem-per-cpu=8000 bash #allocating resourses from ml nodes to start the job to create a container. 
+singularity build my-ML-container.sif docker://ibmcom/tensorflow-ppc64le #create a container from the DockerHub with the last TensorFlow version +singularity run --nv my-ML-container.sif #run my-ML-container.sif container with support of the Nvidia's GPU. You could also entertain with your container by commands: singularity shell, singularity exec ``` There are two sources for containers for Power9 architecture with diff --git a/doc.zih.tu-dresden.de/docs/software/libraries.md b/doc.zih.tu-dresden.de/docs/software/libraries.md index 3da400e5dfe9eefbd95489ceb20601d75dcd5ca6..32fc99ccce0f11b9de54a45683b1abd7ad5cf5a3 100644 --- a/doc.zih.tu-dresden.de/docs/software/libraries.md +++ b/doc.zih.tu-dresden.de/docs/software/libraries.md @@ -12,7 +12,7 @@ The following libraries are available on our platforms: ## The Boost Library Boost provides free peer-reviewed portable C++ source libraries, ranging from multithread and MPI -support to regular expression and numeric funtions. See at http://www.boost.org for detailed +support to regular expression and numeric functions. See at http://www.boost.org for detailed documentation. ## BLAS/LAPACK @@ -51,7 +51,7 @@ fourier transformations (FFT). It contains routines for: - General scientific, financial - vector transcendental functions, vector markup language (XML) -More speciï¬cally it contains the following components: +More specifically it contains the following components: - BLAS: - Level 1 BLAS: vector-vector operations, 48 functions @@ -95,4 +95,4 @@ icc -O1 -I/sw/global/compilers/intel/2013/mkl//include -lmpi -mkl -lmkl_scalapac FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data (as well as of even/odd data, i.e. the discrete cosine/sine transforms or DCT/DST). Before using this library, please check out -the functions of vendor speciï¬c libraries ACML and/or MKL. +the functions of vendor specific libraries ACML and/or MKL. diff --git a/doc.zih.tu-dresden.de/docs/software/mathematics.md b/doc.zih.tu-dresden.de/docs/software/mathematics.md index fc5d7e8942240c61790b1ff8671b9fa63f1eecab..9edba02881bb0cf154ed3828b6a7d84b77fdb257 100644 --- a/doc.zih.tu-dresden.de/docs/software/mathematics.md +++ b/doc.zih.tu-dresden.de/docs/software/mathematics.md @@ -16,7 +16,7 @@ interface capabilities within a document-like user interface paradigm. ### Fonts -To remotely use the graphical frontend you have to add the Mathematica fonts to the local +To remotely use the graphical frontend, you have to add the Mathematica fonts to the local fontmanager. #### Linux Workstation @@ -36,7 +36,7 @@ You have to add additional Mathematica fonts at your local PC If you use **Xming** as X-server at your PC (refer to [remote access from Windows](../access/ssh_mit_putty.md), follow these steps: -1. Create a new folder `Mathematica` in the diretory `fonts` of the installation directory of Xming +1. Create a new folder `Mathematica` in the directory `fonts` of the installation directory of Xming (mostly: `C:\\Programme\\Xming\\fonts\\`) 1. Extract the fonts archive into this new directory `Mathematica`. In result you should have the two directories `DBF` and `Type1`. @@ -56,7 +56,7 @@ C:\WINDOWS\Fonts ### Mathematica and Slurm -Please use the batchsystem Slurm for running calculations. This is a small example for a batch +Please use the batch system Slurm for running calculations. 
This is a small example for a batch script, that you should prepare and start with the command `sbatch <scriptname>`. The File `mathtest.m` is your input script that includes the calculation statements for Mathematica. The file `mathtest.output` will hold the results of your calculations. @@ -154,7 +154,7 @@ srun --pty matlab -nodisplay -r basename_of_your_matlab_script #NOTE: you must o many instances of your calculation as you'd like, since it does not need a license during runtime when compiled to a binary. -You can find detailled documentation on the Matlab compiler at +You can find detailed documentation on the Matlab compiler at [Mathworks' help pages](https://de.mathworks.com/help/compiler/). ### Using the MATLAB Compiler (mcc) diff --git a/doc.zih.tu-dresden.de/docs/software/modules.md b/doc.zih.tu-dresden.de/docs/software/modules.md index af769f96ec9513e563ada5b6d5003b2ded58cdea..8f5a0ae2c4792fd92c458dc89033b2058a1e22de 100644 --- a/doc.zih.tu-dresden.de/docs/software/modules.md +++ b/doc.zih.tu-dresden.de/docs/software/modules.md @@ -103,7 +103,7 @@ modules). Private module files allow you to load your own installed software packages into your environment and to handle different versions without getting into conflicts. Private modules can be setup for a single user as well as all users of project group. The workflow and settings for user private module -files is described in the follwing. The [settings for project private +files is described in the following. The [settings for project private modules](#project-private-modules) differ only in details. The command diff --git a/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md b/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md index 5fabe6f54bef480c504bce8bbc1ed55a22e0d4cb..8d1d7e17a02c3dd2ab572216899cd37f7a9aee3a 100644 --- a/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md +++ b/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md @@ -1,4 +1,4 @@ -# Correctness Checking and Usage Error Detection for MPI Parallel Applications +# Correctness Checking and Usage Error Detection for MPI Parallel Applications MPI as the de-facto standard for parallel applications of the message passing paradigm offers more than one hundred different API calls with complex restrictions. As a result, developing @@ -72,13 +72,13 @@ task**. Finally, MUST assumes that your application may crash at any time. To still gather correctness results under this assumption is extremely expensive in terms of performance overheads. Thus, if -your application does not crashs, you should add an "--must:nocrash" to the mustrun command to make +your application does not crash, you should add an "--must:nocrash" to the mustrun command to make MUST aware of this knowledge. Overhead is drastically reduced with this switch. ### Result Files After running your application with MUST you will have its output in the working directory of your -application. The output is named `MUST_Output.html`. Open this files in a browser to anlyze the +application. The output is named `MUST_Output.html`. Open this files in a browser to analyze the results. The HTML file is color coded: Entries in green represent notes and useful information. Entries in yellow represent warnings, and entries in red represent errors. 
diff --git a/doc.zih.tu-dresden.de/docs/software/power_ai.md b/doc.zih.tu-dresden.de/docs/software/power_ai.md index 7a03aa31e24615c86c8188e1f165247e1032ab6a..dc0fa59b3fc53e180bd620dde71df5597c33298f 100644 --- a/doc.zih.tu-dresden.de/docs/software/power_ai.md +++ b/doc.zih.tu-dresden.de/docs/software/power_ai.md @@ -9,7 +9,7 @@ are valid for PowerAI version 1.5.4 - \<a href="<https://www.ibm.com/support/knowledgecenter/en/SS5SF7_1.5.3/welcome/welcome.htm>" target="\_blank" title="Landing Page">Landing Page\</a> (note that - you can select differnet PowerAI versions with the drop down menu + you can select different PowerAI versions with the drop down menu "Change Product or version") - \<a href="<https://developer.ibm.com/linuxonpower/deep-learning-powerai/>" diff --git a/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md b/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md index e2831fc2ead270710f9e8d192d8fc51c31a33927..5e4388fcf95ed06370d7d633544ee685113df1a7 100644 --- a/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md +++ b/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md @@ -20,7 +20,7 @@ singularity exec xeyes.sif xeyes. ``` This works because all the magic is done by singularity already like setting $DISPLAY to the outside -display and mounting $HOME so $HOME/.Xauthority (X11 authentification cookie) is found. When you are +display and mounting $HOME so $HOME/.Xauthority (X11 authentication cookie) is found. When you are using \`--contain\` or \`--no-home\` you have to set that cookie yourself or mount/copy it inside the container. Similar for \`--cleanenv\` you have to set $DISPLAY e.g. via diff --git a/doc.zih.tu-dresden.de/util/check-spelling.sh b/doc.zih.tu-dresden.de/util/check-spelling.sh new file mode 100755 index 0000000000000000000000000000000000000000..327b29ec1a80d1a361b8be4bdde2e1a93bf0e981 --- /dev/null +++ b/doc.zih.tu-dresden.de/util/check-spelling.sh @@ -0,0 +1,42 @@ +#!/bin/bash + +scriptpath=${BASH_SOURCE[0]} +basedir=`dirname "$scriptpath"` +basedir=`dirname "$basedir"` +wordlistfile=$basedir/wordlist.aspell +acmd="aspell -p $wordlistfile --ignore 2 -l en_US list" + +function spell_check () { + file_to_check=$1 + ret=$(cat "$file_to_check" | $acmd) + if [ ! -z "$ret" ]; then + echo "-- File $file_to_check" + echo "$ret" | sort -u + fi +} + +function usage() { + cat <<-EOF +usage: $0 [file] +Outputs all words of the file (or, if no argument given, all files in the current directory, recursively), that the spell checker cannot recognize. +If you are sure a word is correct, you can put it in $wordlistfile. +EOF +} + +if [ $# -eq 1 ]; then + case $1 in + help | -help | --help) + usage + exit + ;; + *) + spell_check $1 + ;; + esac +elif [ $# -eq 0 ]; then + for i in `find -name \*.md`; do + spell_check $i + done +else + usage +fi diff --git a/doc.zih.tu-dresden.de/wordlist.aspell b/doc.zih.tu-dresden.de/wordlist.aspell new file mode 100644 index 0000000000000000000000000000000000000000..6d23d29110d57c85ecb248e0ac012652935c8022 --- /dev/null +++ b/doc.zih.tu-dresden.de/wordlist.aspell @@ -0,0 +1,42 @@ +personal_ws-1.1 en 1805 +analytics +benchmarking +citable +CPU +CUDA +EasyBuild +Flink +GPU +hadoop +Haswell +HDFS +Horovod +HPC +ImageNet +Infiniband +Jupyter +Keras +MPI +OPARI +OpenACC +OpenCL +OpenMP +PAPI +rome +romeo +salloc +sbatch +ScaDS +Scalasca +scancel +scontrol +scp +SHMEM +Slurm +squeue +srun +SSD +TensorFlow +Theano +Vampir +ZIH
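
A minimal usage sketch for the `check-spelling.sh` helper added by this patch, run directly on a checkout rather than through the Docker image described in the README. It assumes `aspell` with the `en_US` dictionary is installed locally and that you start from the repository root; `NewTerm` is a placeholder for whichever word you want the checker to accept.

```Bash
cd hpc-compendium/doc.zih.tu-dresden.de

# Check a single page; unknown words are printed below a "-- File <name>" header.
./util/check-spelling.sh docs/software/big_data_frameworks.md

# With no argument, check every Markdown file below the current directory.
./util/check-spelling.sh

# Accept a flagged word by appending it to the custom dictionary,
# as the README note in this patch describes.
echo "NewTerm" >> wordlist.aspell
```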