Module TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4 and 15 dependencies loaded.
```
Please allocate the job according to the
[hardware specification](../jobs_and_resources/hardware_taurus.md)! Note that the nodes in the `ml`
partition have 4-way SMT, so for every physical core allocated, you will always get
4\*1443 MB = 5772 MB of memory.
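For illustration, an interactive allocation on `ml` might look as follows (a minimal sketch; the
thread count of 8, i.e. two physical cores, is only an example):

```Bash
# 8 hardware threads = 2 physical cores on the 4-way SMT nodes;
# memory: 8 * 1443 MB = 11544 MB in total
srun --partition=ml --ntasks=1 --cpus-per-task=8 --mem-per-cpu=1443M --pty bash
```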
!!! warning
    Be aware that for compatibility reasons it is important to choose modules with
    the same toolchain version (in this case `fosscuda/2019b`). For reference, see the
    [modules page](modules.md).
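For example, picking R and TensorFlow from the same toolchain could look like this (the exact R
module version is an assumption; check `module avail R` on the system):

```Bash
# both modules come from fosscuda/2019b, so compiler, MPI and CUDA libraries match
module load R/3.6.2-fosscuda-2019b
module load TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4
```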
In order to interact with Python-based frameworks (like TensorFlow), the `reticulate` R library is
used. To configure it to point to the correct Python executable in your virtual environment, create
a file named `.Rprofile` in your project directory (e.g. `R-TensorFlow`) with the following
contents:
```R
Sys.setenv(RETICULATE_PYTHON="/sw/installed/Anaconda3/2019.03/bin/python")#assign the output of the 'which python' from above to RETICULATE_PYTHON
Sys.setenv(RETICULATE_PYTHON="/sw/installed/Python/3.7.4-GCCcore-8.3.0/bin/python")#assign the output of the 'which python' from above to RETICULATE_PYTHON
```
Let's start R, install the required libraries, and evaluate the result:
```R
install.packages(c("reticulate", "tensorflow"))
library(reticulate)
reticulate::py_config()
library(tensorflow)
tf$constant("Hello TensorFlow")  # in the output, 'Tesla V100-SXM2-32GB' should be mentioned
```
## Parallel Computing with R
Generally, R code is serial. However, many computations in R can be made faster by the use of
parallel computations. Large amounts of data and/or the use of complex models are indications
for parallelization. This section concentrates on the most general methods and examples.
The [parallel](https://www.rdocumentation.org/packages/parallel/versions/3.6.2) library
will be used below.
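As a first taste, here is a minimal shared-memory sketch for a single node (the fallback worker
count of 4 is an arbitrary assumption, and the squaring function is a stand-in for real work):

```R
library(parallel)

# use the number of CPUs granted by Slurm; fall back to 4 (arbitrary) otherwise
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", unset = "4"))

# fork-based parallel map: apply the (illustrative) function to 1..100
result <- mclapply(1:100, function(x) x^2, mc.cores = n_cores)
```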
...
...
This approach to R parallelism uses
[MPI](https://en.wikipedia.org/wiki/Message_Passing_Interface) (Message Passing Interface) as a
"back-end" for its parallel operations. Submitting an MPI-based job in R is very similar to
submitting an [MPI job](../jobs_and_resources/slurm.md#binding-and-distribution-of-tasks), since
both run multicore jobs on multiple nodes. Below is an example of running an R script with Rmpi
on the ZIH system:
```Bash
#!/bin/bash
...
...
#SBATCH --ntasks=32        # determines how many processes will be spawned; please use >= 8
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --output=test_Rmpi.out
#SBATCH --error=test_Rmpi.err
module purge
module load modenv/scs5
...
...
However, in some specific cases, you can specify the number of nodes and the number of
tasks per node explicitly:
```Bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=1
module purge
module load modenv/scs5
module load R
...
...
Another example:

```R
...
#snow::stopCluster(cl)  # usually it hangs here with Open MPI > 2.0; in that case this command may be omitted, since Slurm will clean up after the job finishes
```
To use Rmpi and MPI, please use one of these partitions: `haswell`, `broadwell` or `rome`.
Use the `mpirun` command to start the R script. It is a wrapper that enables communication
between processes running on different nodes. It is important to use `-np 1` (the number of
processes spawned by `mpirun`), since Rmpi takes care of spawning the worker processes itself.
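For illustration, a minimal sketch of the R side is shown below; the file name `test_Rmpi.R` is a
hypothetical choice matching the log names above, and the script only demonstrates the
spawn/report/teardown cycle rather than a real workload:

```R
library(Rmpi)

# spawn one worker per remaining MPI slot (the master already occupies one)
mpi.spawn.Rslaves(nslaves = mpi.universe.size() - 1)

# each worker reports its rank, as a simple liveness check
mpi.remote.exec(paste("Worker", mpi.comm.rank(), "of", mpi.comm.size()))

# shut the workers down cleanly and exit
mpi.close.Rslaves()
mpi.quit()
```

It would then be started from the batch script with, e.g., `mpirun -np 1 R CMD BATCH test_Rmpi.R`.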