Commit faed07d7 authored by Veronika Scholz

Fix spelling

parent f65f2791
@@ -13,7 +13,7 @@ each device has a replica of the model and computes over different parts of the
2. model parallelism:
models are distributed over multiple devices.
-In the folowing we will stick to the concept of data parallelism because it is a widely-used
+In the following we will stick to the concept of data parallelism because it is a widely-used
technique.
There are basically two strategies to train the scattered data throughout the devices:
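The synchronous variant of data parallelism can be illustrated with a small sketch (not from the documentation itself): each "device" holds a replica of the weights, computes the gradient on its own shard of the batch, and the shard gradients are averaged, which is mathematically equivalent to one large-batch step. The toy linear model and all names below are illustrative assumptions.

```python
# Illustrative sketch of synchronous data parallelism on a toy linear model.
# Each "device" computes a gradient on its shard; the averaged shard gradients
# equal the gradient over the whole batch.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # full batch: 8 samples, 3 features
y = rng.normal(size=8)
w = np.zeros(3)               # model replica (identical on every device)

def grad(Xs, ys, w):
    """Gradient of mean squared error on one data shard."""
    return 2 * Xs.T @ (Xs @ w - ys) / len(ys)

# split the batch across 2 "devices" and average their gradients
shard_grads = [grad(Xs, ys, w)
               for Xs, ys in zip(np.split(X, 2), np.split(y, 2))]
g_sync = np.mean(shard_grads, axis=0)

# the averaged gradient equals the full-batch gradient
assert np.allclose(g_sync, grad(X, y, w))
```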
@@ -183,7 +183,7 @@ synchronize gradients and buffers.
The tutorial can be found [here](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html).
To use distributed data parallelism on ZIH systems please make sure the `--ntasks-per-node`
-parameter is equal to the number of GPUs you useper node.
+parameter is equal to the number of GPUs you use per node.
Also, it can be useful to increase `memory/cpu` parameters if you run larger models.
Memory can be set up to:
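As a hedged sketch of the rule above, a batch script might pair `--ntasks-per-node` with the GPU count like this; the node count, GPU count, memory values, and script name are placeholders, not ZIH-specific settings:

```bash
#!/bin/bash
# Hypothetical Slurm batch sketch: one task per GPU on each node.
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4    # must equal the number of GPUs per node
#SBATCH --gres=gpu:4
#SBATCH --cpus-per-task=6      # raise cpu/memory for larger models
#SBATCH --mem=64G

srun python train_ddp.py       # placeholder training script
```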
@@ -277,13 +277,13 @@ In the example presented installation for TensorFlow.
Adapt as required and refer to the Horovod documentation for details.
```bash
HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITH_TENSORFLOW=1 pip install --no-cache-dir horovod\[tensorflow\]
horovodrun --check-build
```
If you want to use OpenMPI then specify `HOROVOD_GPU_ALLREDUCE=MPI`.
To have better performance it is recommended to use NCCL instead of OpenMPI.
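The allreduce operation that NCCL or MPI performs for Horovod can be sketched in plain Python (this is not Horovod's API, just an illustration of the semantics): every worker contributes its local gradient, and after the allreduce all workers hold the same averaged result.

```python
# Illustrative sketch of an averaging allreduce, the collective that
# NCCL or MPI provides to Horovod. Not Horovod's actual API.
import numpy as np

def allreduce_average(worker_grads):
    """Return each worker's buffer after an averaging allreduce."""
    avg = np.mean(worker_grads, axis=0)
    # every worker receives an identical copy of the average
    return [avg.copy() for _ in worker_grads]

local = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
reduced = allreduce_average(local)

# every worker now holds the elementwise mean [3.0, 4.0]
assert all(np.allclose(g, [3.0, 4.0]) for g in reduced)
```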
##### Verify that Horovod works
personal_ws-1.1 en 203
ALLREDUCE
Altix
Amdahl's
analytics
anonymized
APIs
awk
BeeGFS
benchmarking
BLAS
@@ -13,8 +15,13 @@ ccNUMA
centauri
citable
conda
config
CONFIG
cpu
CPU
cpus
CPUs
crossentropy
CSV
CUDA
cuDNN
@@ -24,9 +31,13 @@ dataframes
DataFrames
datamover
DataParallel
dataset
ddl
DDP
DDR
DFG
dir
distr
DistributedDataParallel
DockerHub
EasyBuild
@@ -47,22 +58,30 @@ GFLOPS
gfortran
GiB
gnuplot
gpu
GPU
GPUs
gres
hadoop
haswell
HDFS
hiera
horovod
Horovod
horovodrun
hostname
HPC
HPL
hvd
hyperparameter
hyperparameters
icc
icpc
ifort
ImageNet
img
Infiniband
init
inode
Itanium
jobqueue
@@ -80,11 +99,13 @@ lsf
lustre
Mathematica
MEGWARE
mem
MiB
MIMD
Miniconda
MKL
MNIST
modenv
Montecito
mountpoint
mpi
@@ -99,7 +120,11 @@ multithreaded
NCCL
Neptun
NFS
nodelist
NODELIST
NRINGS
ntasks
NUM
NUMA
NUMAlink
NumPy
@@ -134,10 +159,15 @@ PowerAI
ppc
PSOCK
Pthreads
pty
PythonAnaconda
pytorch
PyTorch
queue
randint
reachability
README
resnet
Rmpi
rome
romeo
@@ -175,6 +205,7 @@ SUSE
TBB
TCP
TensorBoard
tensorflow
TensorFlow
TFLOPS
Theano