diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md
index 717bd4ddc4909e67fe81ca99c94d4a113f770ae0..a6cdfba8bd47659bc3a14473cad74c10b73089d0 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md
@@ -1,44 +1,54 @@
-# AMD EPYC Nodes (Zen 2, Codename "Rome")
+# Island 7 - AMD Rome Nodes
-The nodes **taurusi\[7001-7192\]** are each equipped 2x AMD EPYC 7702
-64-Core processors, so there is a total of 128 physical cores in each
+## Hardware
+
+- Slurm partition: romeo
+- Module architecture: rome
+- 192 nodes taurusi[7001-7192], each:
+    - 2x AMD EPYC CPU 7702 (64 cores) @ 2.0GHz, MultiThreading
+    - 512 GB RAM
+    - 200 GB SSD disk mounted on /tmp
+
+## Usage
+
+There is a total of 128 physical cores in each
 node. SMT is also active, so in total, 256 logical cores are available
 per node.
+!!! note
+    Multithreading is disabled by default in a job. To make use of it,
+    include the Slurm parameter `--hint=multithread` in your job script
+    or command line, or set
+    the environment variable `SLURM_HINT=multithread` before job submission.
+
 Each node brings 512 GB of main memory, so you can request roughly
 1972 MB per logical core (using `--mem-per-cpu`). Note that you will always
 get the memory for the logical core sibling too, even if you do not
-intend to use SMT (SLURM_HINT=nomultithread which is the default).
-
-You can use them by specifying partition romeo: **-p romeo**
+intend to use SMT.
-**Note:** If you are running a job here with only ONE process (maybe
-multiple cores), please explicitly set the option `-n 1` !
+!!! note
+    If you are running a job here with only ONE process (maybe
+    multiple cores), please explicitly set the option `-n 1`!

 Be aware that software built with Intel compilers and `-x*` optimization
 flags will not run on those AMD processors!
That's why most older modules built with Intel toolchains are not
available on **romeo**.

-We provide the script: **ml_arch_avail** that you can use to check if a
+We provide the script `ml_arch_avail` that you can use to check if a
 certain module is available on the rome architecture.

 ## Example: Running CP2K on Rome

 First, check what CP2K modules are available in general:
-
-```bash
- $ ml spider CP2K
- #or:
- $ ml avail CP2K/
-```
+`module spider CP2K` or `module avail CP2K`.

 You will see that there are several different CP2K versions available,
 built with different toolchains. Now let's assume you have decided you
 want to run at least CP2K version 6, so to check if those modules are
 built for rome, use:

-```bash
-$ ml_arch_avail CP2K/6
+```console
+marie@login$ ml_arch_avail CP2K/6
 CP2K/6.1-foss-2019a: haswell, rome
 CP2K/6.1-foss-2019a-spglib: haswell, rome
 CP2K/6.1-intel-2018a: sandy, haswell
@@ -73,28 +83,29 @@
 you should set the following environment variable to make sure that AVX2
 is used:

 ```bash
- export MKL_DEBUG_CPU_TYPE=5
+export MKL_DEBUG_CPU_TYPE=5
 ```

 Without it, the MKL does a CPUID check and disables AVX2/FMA on
-non-Intel CPUs, leading to much worse performance. **NOTE:** in version
-2020, Intel has removed this environment variable and added separate Zen
-codepaths to the library. However, they are still incomplete and do not
-cover every BLAS function. Also, the Intel AVX2 codepaths still seem to
-provide somewhat better performance, so a new workaround would be to
-overwrite the `mkl_serv_intel_cpu_true` symbol with a custom function:
+non-Intel CPUs, leading to much worse performance.
+!!! note
+    In version 2020, Intel has removed this environment variable and added separate Zen
+    codepaths to the library. However, they are still incomplete and do not
+    cover every BLAS function.
Also, the Intel AVX2 codepaths still seem to
+    provide somewhat better performance, so a new workaround would be to
+    overwrite the `mkl_serv_intel_cpu_true` symbol with a custom function:

 ```c
- int mkl_serv_intel_cpu_true() {
- return 1;
- }
+int mkl_serv_intel_cpu_true() {
+  return 1;
+}
 ```

 and preloading this in a library:

-```bash
- gcc -shared -fPIC -o libfakeintel.so fakeintel.c
- export LD_PRELOAD=libfakeintel.so
+```console
+marie@login$ gcc -shared -fPIC -o libfakeintel.so fakeintel.c
+marie@login$ export LD_PRELOAD=libfakeintel.so
 ```

 As for compiler optimization flags, `-xHOST` does not seem to produce
diff --git a/doc.zih.tu-dresden.de/docs/software/misc/parallel_debugging_must.pdf b/doc.zih.tu-dresden.de/docs/software/misc/parallel_debugging_must.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..ee250cd658c90396f54a82648823800293093401
Binary files /dev/null and b/doc.zih.tu-dresden.de/docs/software/misc/parallel_debugging_must.pdf differ
diff --git a/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md b/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md
index 201f909b26ddce09ee537f1818bcf3aba1128368..8d1d7e17a02c3dd2ab572216899cd37f7a9aee3a 100644
--- a/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md
+++ b/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md
@@ -1,4 +1,4 @@
-# Introduction
+# Correctness Checking and Usage Error Detection for MPI Parallel Applications

 MPI as the de-facto standard for parallel applications of the message passing
 paradigm offers more than one hundred different API calls with complex
 restrictions. As a result, developing
@@ -15,7 +15,7 @@
 MUST checks if your application conforms to the MPI standard and will issue
 warnings for errors or non-portable constructs. You can apply MUST without
 modifying your source code, though we suggest adding the debugging flag `-g`
 during compilation.
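For example, compiling with debug information could look like this (a sketch only: `fancy-program.c` is an illustrative file name, and `mpicc` is assumed to be the MPI compiler wrapper of the loaded toolchain):

```console
marie@login$ mpicc -g -o fancy-program fancy-program.c
```

With `-g`, MUST can point to source files and line numbers in its reports instead of only naming the affected MPI call.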
-- [MUST introduction slides]**todo** %ATTACHURL%/parallel_debugging_must.pdf
+See also [MUST Introduction Slides](misc/parallel_debugging_must.pdf).

 ### Setup and Modules

@@ -24,29 +24,50 @@
 combination of a compiler and an MPI library, make sure to use a combination that we provide.
 Right now we only provide a single combination on each system; contact us if you need further
 combinations. You can query for the available modules with:

-```Bash
-module avail must
+```console
+marie@login$ module avail must
+   MUST/1.6.0-rc3-intel-2018a (L)
 ```

 You can load a MUST module as follows:

-```Bash
-module load must
+```console
+marie@login$ module load MUST
+Module MUST/1.6.0-rc3-intel-2018a and 16 dependencies loaded.
 ```

 Besides loading a MUST module, no further changes are needed during compilation and linking.

-### Running with MUST
+### Running your Application with MUST

-In order to run with MUST you need to replace the mpirun/mpiexec command with mustrun:
+In order to run your application with MUST, you need to replace the `srun` command with `mustrun`:

-```Bash
-mustrun -np <NPROC> ./a.out
+```console
+marie@login$ mustrun -np <number of MPI processes> ./<your binary>
 ```

-Besides replacing the mpiexec command you need to be aware that **MUST always allocates an extra
-process**. I.e. if you issue a `mustrun -np 4 ./a.out` then MUST will start 5 processes instead.
-This is usually not critical, however in batch jobs **make sure to allocate space for this extra
+Suppose your application is called `fancy-program` and is normally run with 4 processes.
+The invocation should then be:
+
+```console
+marie@login$ mustrun -np 4 ./fancy-program
+[MUST] MUST configuration ... centralized checks with fall-back application crash handling (very slow)
+[MUST] Weaver ... success
+[MUST] Code generation ... success
+[MUST] Build file generation ... success
+[MUST] Configuring intermediate build ... success
+[MUST] Building intermediate sources ... success
+[MUST] Installing intermediate modules ... success
+[MUST] Generating P^nMPI configuration ... success
+[MUST] Search for linked P^nMPI ... not found ... using LD_PRELOAD to load P^nMPI ... success
+[MUST] Executing application:
+{...}
+[MUST] Execution finished, inspect "/home/marie/MUST_Output.html"!
+```
+
+Besides replacing the `srun` command, you need to be aware that **MUST always allocates an extra
+process**, i.e. if you issue a `mustrun -np 4 ./a.out` then MUST will start 5 processes instead.
+This is usually not critical; however, in batch jobs **make sure to allocate an extra
 CPU for this task**.

 Finally, MUST assumes that your application may crash at any time. To still gather correctness
@@ -61,7 +82,7 @@
 application. The output is named `MUST_Output.html`. Open this file in a browser to see the
 results. The HTML file is color coded: Entries in green represent notes and useful information.
 Entries in yellow represent warnings, and entries in red represent errors.

-## Other MPI Correctness Tools
+## Further MPI Correctness Tools

 Besides MUST, there exist further MPI correctness tools; these are:
diff --git a/howto.md b/howto.md
index bc48fed969a008ed6ae2e13c08a48a0461ab4ac3..51f73d786d327be1c19589bcc62a437fe305c60e 100644
--- a/howto.md
+++ b/howto.md
@@ -1,6 +1,6 @@
 # How to work with Git

-Pre-requisites: see Readme.md
+Pre-requisites: see [Readme.md](doc.zih.tu-dresden.de/README.md)

 I want to change something in the RomeNodes.md documentation!

@@ -16,15 +16,19 @@
 cd doc.zih.tu-dresden.de
 ```

 ```Bash
 git checkout -b RomeNodes
 ```
+
 ## 2. Edit the file using your preferred editor

 ## 3. Run the linter:
+
 ```Bash
 markdownlint ./docs/use_of_hardware/RomeNodes.md
 ```
+
 If there are still errors: go to step 2

 ## 4. Run the link checker:
+
 ```Bash
 markdown-link-check ./docs/use_of_hardware/RomeNodes.md
 ```

 If there are still errors: go to step 2

 ## 5. Commit and merge request
+
 ```Bash
 git commit ./docs/use_of_hardware/RomeNodes.md -m "typo fixed"
 git push origin RomeNodes #the branch name
 ```

-You will get a link you have to follow to create the merge request.
-
-
-
-
-
-
+You will get a link you have to follow to create the merge request.
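The branch-and-commit mechanics of steps 1 and 5 can be tried out safely in a throwaway repository first. A minimal sketch (file name, branch name and identity are placeholders taken from this howto; the push to `origin` is left out):

```Bash
set -e
tmp=$(mktemp -d)                            # throwaway playground
cd "$tmp"
git init -q .
git config user.email "marie@example.com"   # placeholder identity for the sketch
git config user.name "Marie"
echo "# Rome Nodes" > RomeNodes.md
git add RomeNodes.md
git commit -q -m "initial"
git checkout -q -b RomeNodes                # step 1: one branch per change
echo "fixed a typo" >> RomeNodes.md
git commit -q RomeNodes.md -m "typo fixed"  # step 5: commit only the edited file
git log --oneline -1                        # shows the new commit on branch RomeNodes
```

In the real repository, `git push origin RomeNodes` then prints the link you follow to create the merge request.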