Document problems with Rmpi and UCX
Running R's parallel library over MPI (via Rmpi) on clusters has shown problems once a job uses more than a few nodes. The error messages point to a faulty interaction between R/Rmpi, Open MPI, and UCX. Disabling UCX resolved these problems in our experiments, so documenting this workaround could help other R users.
We invoked the R script successfully with the following command:
mpirun -mca btl_openib_allow_ib true --mca pml ^ucx --mca osc ^ucx -np 1 Rscript --vanilla the-script.R
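where the arguments --mca pml ^ucx and --mca osc ^ucx exclude the UCX components for point-to-point messaging (pml) and one-sided communication (osc), and -mca btl_openib_allow_ib true allows the openib BTL to use InfiniBand devices, which Open MPI 4.x by default leaves to UCX.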
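As an untested sketch of the same workaround: Open MPI also reads MCA parameters from environment variables named OMPI_MCA_<param>, so the three settings could be exported once (for example in a job script) instead of being repeated on every mpirun command line. The guard around the launch is an illustrative addition, and the-script.R is the script name from the command above.

```shell
#!/bin/sh
# Sketch only: the same MCA settings as the mpirun command above,
# expressed through Open MPI's OMPI_MCA_<param> environment variables.
export OMPI_MCA_btl_openib_allow_ib=true   # allow the openib BTL on InfiniBand
export OMPI_MCA_pml='^ucx'                 # exclude the UCX pml component
export OMPI_MCA_osc='^ucx'                 # exclude the UCX osc component

# Launch only if mpirun and the R script are actually available.
if command -v mpirun >/dev/null 2>&1 && [ -f the-script.R ]; then
    mpirun -np 1 Rscript --vanilla the-script.R
else
    echo "mpirun or the-script.R not found; nothing launched" >&2
fi
```

Parameters given on the mpirun command line take precedence over these environment variables, so the explicit command above continues to work unchanged when both are present.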
Edited by Rico Bergmann