diff --git a/doc.zih.tu-dresden.de/docs/software/distributed_training.md b/doc.zih.tu-dresden.de/docs/software/distributed_training.md
index 15ef5b2b0002a7ed5d2d8096f90c75983ce54cce..b1ae9cc316774f8139379ed0639e4ada5cb443a1 100644
--- a/doc.zih.tu-dresden.de/docs/software/distributed_training.md
+++ b/doc.zih.tu-dresden.de/docs/software/distributed_training.md
@@ -158,7 +158,7 @@ achieve true parallelism due to the well known issue of Global Interpreter Lock
 Python. To work around this issue and gain the performance benefits of parallelism, the use of
 `torch.nn.DistributedDataParallel` is recommended. This requires a few more code changes to set up,
 but it further increases the performance of model training. The first step is to initialize the
-process group by calling the `torch.distributed.init_process_group()` using the appropriate backend
+process group by calling the `torch.distributed.init_process_group()` using the appropriate back end
 such as NCCL, MPI, or Gloo. Using NCCL as the back end is recommended, as it is currently the fastest
 back end when using GPUs.
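
For illustration, a minimal sketch of the setup described in the patched section is shown below. It assumes the script is launched with `torchrun`, which populates `RANK`, `LOCAL_RANK`, `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT` in the environment; the toy model, tensor shapes, and script structure are illustrative assumptions, not part of the documentation being patched.

```python
# Minimal DistributedDataParallel sketch using the NCCL back end.
# Assumes launch via `torchrun`, e.g. `torchrun --nproc_per_node=4 train.py`,
# which sets RANK, LOCAL_RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # Initialize the process group with NCCL, currently the fastest back end for GPUs.
    dist.init_process_group(backend="nccl")

    # Bind this process to its local GPU.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and data, purely for illustration.
    model = torch.nn.Linear(10, 1).to(local_rank)
    ddp_model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    inputs = torch.randn(20, 10, device=local_rank)
    targets = torch.randn(20, 1, device=local_rank)

    # One training step; gradients are averaged across all processes during backward().
    optimizer.zero_grad()
    loss = loss_fn(ddp_model(inputs), targets)
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```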