Commit 154c1383 authored by Alexander Grund

Check and correct uses of `partition` in changed places

parent fc8c2da6
2 merge requests: !1008 "Automated merge from preview to main", !998 "Update references to old GPU clusters"
@@ -196,7 +196,7 @@ When `srun` is used within a submission script, it inherits parameters from `sba
`--ntasks=1`, `--cpus-per-task=4`, etc. So we actually implicitly run the following
```bash
-srun --ntasks=1 --cpus-per-task=4 [...] --partition=power9 some-gpu-application
+srun --ntasks=1 --cpus-per-task=4 [...] some-gpu-application
```
Now, our goal is to run four instances of this program concurrently in a single batch script. Of
@@ -208,8 +208,8 @@ parameter `--ntasks-per-node=<N>` equals the number of GPUs you use per node.
Also, it can be useful to increase `memory/cpu` parameters if you run larger models.
Memory can be set up to:
-- `--mem=250G` and `--cpus-per-task=7` for the partition `power9`.
-- `--mem=900G` and `--cpus-per-task=6` for the partition `alpha`.
+- `--mem=250G` and `--cpus-per-task=7` for the `Power9` cluster.
+- `--mem=900G` and `--cpus-per-task=6` for the `Alpha` cluster.
Keep in mind that only one memory parameter (`--mem-per-cpu=<MB>` or `--mem=<MB>`) can be specified.
@@ -338,10 +338,11 @@ So the concept if this hierarchical toolchains is already built into this module
## Per-Architecture Builds
Since we have a heterogeneous cluster, we do individual builds of the software for each
-architecture present. This ensures that, no matter what partition the software runs on, a build
+architecture present.
+This ensures that, no matter what partition/cluster the software runs on, a build
optimized for the host architecture is used automatically.
-However, not every module will be available for each node type or partition.
+However, not every module will be available on all clusters.
Use `ml av` or `ml spider` to search for modules available on the sub-cluster you are on.
## Advanced Usage
@@ -5,7 +5,7 @@ the PowerAI Framework for Machine Learning. In the following the links
are valid for PowerAI version 1.5.4.
!!! warning
-The information provided here is available from IBM and can be used on partition `power9` only!
+The information provided here is available from IBM and can be used on the `Power9` cluster only!
## General Overview
@@ -47,7 +47,7 @@ are valid for PowerAI version 1.5.4.
(Open Neural Network Exchange) provides support for moving models
between those frameworks.
- [Distributed Deep Learning](https://www.ibm.com/support/knowledgecenter/SS5SF7_1.5.4/navigation/pai_getstarted_ddl.html?view=kc)
-Distributed Deep Learning (DDL). Works on up to 4 nodes on partition `power9`.
+Distributed Deep Learning (DDL). Works on up to 4 nodes on cluster `Power9`.
## PowerAI Container
@@ -47,7 +47,7 @@ times till it succeeds.
bash-4.2$ cat /tmp/marie_2759627/activate
#!/bin/bash
-if ! grep -q -- "Key for the VM on the partition power9" "/home/marie/.ssh/authorized_keys" > /dev/null; then
+if ! grep -q -- "Key for the VM on the cluster power" "/home/marie/.ssh/authorized_keys" > /dev/null; then
cat "/tmp/marie_2759627/kvm.pub" >> "/home/marie/.ssh/authorized_keys"
else
sed -i "s|.*Key for the VM on the cluster power.*|ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3siZfQ6vQ6PtXPG0RPZwtJXYYFY73TwGYgM6mhKoWHvg+ZzclbBWVU0OoU42B3Ddofld7TFE8sqkHM6M+9jh8u+pYH4rPZte0irw5/27yM73M93q1FyQLQ8Rbi2hurYl5gihCEqomda7NQVQUjdUNVc6fDAvF72giaoOxNYfvqAkw8lFyStpqTHSpcOIL7pm6f76Jx+DJg98sXAXkuf9QK8MurezYVj1qFMho570tY+83ukA04qQSMEY5QeZ+MJDhF0gh8NXjX/6+YQrdh8TklPgOCmcIOI8lwnPTUUieK109ndLsUFB5H0vKL27dA2LZ3ZK+XRCENdUbpdoG2Czz Key for the VM on the cluster power|" "/home/marie/.ssh/authorized_keys"
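Reviewer note: the `activate` script in this hunk follows an append-or-replace pattern keyed on the comment text at the end of the public key, which is why the `grep` string and the `sed` pattern have to use the same wording. A minimal sketch of that pattern, for trying it out on a scratch file rather than a real `authorized_keys`, might look like the following. The function name `upsert_key` and its arguments are hypothetical and not part of the documented script; the sketch assumes GNU `sed` and a tag without regex metacharacters.

```shell
#!/bin/bash
# Hypothetical helper (not from the documented script): add a tagged public key
# to an authorized_keys-style file, or replace the line that already carries the tag.
upsert_key() {
    local keys_file="$1"   # file to update, e.g. a scratch copy of authorized_keys
    local pub_file="$2"    # file holding the one-line public key, ending in the tag
    local tag="$3"         # comment text identifying the key (assumed regex-free)
    local new_line
    new_line="$(cat "$pub_file")"

    # Make sure the target exists so grep does not fail on a missing file.
    touch "$keys_file"

    if ! grep -q -F -- "$tag" "$keys_file"; then
        # Tag not present yet: append the key as a new line.
        printf '%s\n' "$new_line" >> "$keys_file"
    else
        # Tag present: replace the whole matching line with the new key.
        # '|' is the delimiter because base64 key data can contain '/'.
        sed -i "s|.*${tag}.*|${new_line}|" "$keys_file"
    fi
}
```

Calling `upsert_key` twice with the same tag should leave exactly one line for that tag in the file. If the tag strings in the two branches differ, the append branch runs every time and duplicate keys accumulate.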