diff --git a/doc.zih.tu-dresden.de/docs/index.md b/doc.zih.tu-dresden.de/docs/index.md
index 7a619e8a91c41b62a1d4f2fe8e8a1b5e16481b5c..64802daa9ae83761fb147961185fec3322880f0b 100644
--- a/doc.zih.tu-dresden.de/docs/index.md
+++ b/doc.zih.tu-dresden.de/docs/index.md
@@ -31,6 +31,7 @@ Please also find out the other ways you could contribute in our
 
 ## News
 
+* **2023-11-16** [OpenMPI 4.1.x - Workaround for MPI-IO Performance Loss](jobs_and_resources/mpi_issues/#openmpi-v41x-performance-loss-with-mpi-io-module-ompio)
 * **2023-10-04** [User tests on Barnard](jobs_and_resources/barnard_test.md)
 * **2023-06-01** [New hardware and complete re-design](jobs_and_resources/architecture_2023.md)
 * **2023-01-04** [New hardware: NVIDIA Arm HPC Developer Kit](jobs_and_resources/arm_hpc_devkit.md)
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md
index f58a290ca19eabef7221ac245f8ec0d35946668d..b0d23e2e789719ed0ff95a84f8f1056753cbb60c 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md
@@ -21,7 +21,7 @@ computations, please use interactive jobs.
 
 ## Storage Systems
 
-### Permananent Filesystems
+### Permanent Filesystems
 
 We now have `/home`, `/projects` and `/software` in a Lustre filesystem. Snapshots and tape backup
 are configured. For convenience, we will make the old home available
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md
index ccb34da378591594991ab746915fd90e9847920b..6fac180b08d19e24ba28e658539f9664e16c0c93 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md
@@ -2,6 +2,32 @@
 
 This pages holds known issues observed with MPI and concrete MPI implementations.
 
+## OpenMPI v4.1.x - Performance Loss with MPI-IO-Module OMPIO
+
+OpenMPI v4.1.x introduced a couple of major enhancements, e.g., the `OMPIO` module is now the
+default module for MPI-IO on **all** filesystems, including Lustre (cf.
+[NEWS file in OpenMPI source code](https://raw.githubusercontent.com/open-mpi/ompi/v4.1.x/NEWS)).
+Prior to this, `ROMIO` was the default MPI-IO module for Lustre.
+
+Colleagues at ZIH have found that some MPI-IO access patterns suffer a significant performance loss
+when `OMPIO` is used as the MPI-IO module with the OpenMPI/4.1.x modules on ZIH systems. At the
+moment, the root cause is unclear and needs further investigation.
+
+**A workaround** for this performance loss is to use the "old" MPI-IO module, i.e., `ROMIO`. This
+is achieved by setting the environment variable `OMPI_MCA_io` before executing the application
+(the leading `^` excludes the `ompio` component from selection):
+
+```console
+export OMPI_MCA_io=^ompio
+srun ...
+```
+
+or by passing the same option as an argument, in case you invoke `mpirun` directly:
+
+```console
+mpirun --mca io ^ompio ...
+```
+
 ## Mpirun on partition `alpha`and `ml`
 
 Using `mpirun` on partitions `alpha` and `ml` leads to wrong resource distribution when more than