Commit 1b49c43e, authored 3 years ago by Martin Schroschk
Brief review
parent c35e08b6
4 merge requests:
!392 Merge preview into contrib guide for browser users
!333 Draft: update NGC containers
!327 Merge preview into main
!317 Jobs and resources
Showing 2 changed files with 19 additions and 12 deletions:
doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md (+12 −12)
doc.zih.tu-dresden.de/wordlist.aspell (+7 −0)
doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md (+12 −12) @ 1b49c43e
-# Alpha Centauri - Multi-GPU sub-cluster
+# Alpha Centauri - Multi-GPU Sub-Cluster
-The sub-cluster "AlphaCentauri" had been installed for AI-related computations (ScaDS.AI).
+The sub-cluster "Alpha Centauri" had been installed for AI-related computations (ScaDS.AI).
 It has 34 nodes, each with:
-- 8 x NVIDIA A100-SXM4 (40 GB RAM)
-- 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz with multithreading enabled
-- 1 TB RAM 3.5 TB `/tmp` local NVMe device
-- Hostnames: `taurusi[8001-8034]`
-- Slurm partition `alpha` for batch jobs and `alpha-interactive` for interactive jobs
+* 8 x NVIDIA A100-SXM4 (40 GB RAM)
+* 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz with multi-threading enabled
+* 1 TB RAM 3.5 TB `/tmp` local NVMe device
+* Hostnames: `taurusi[8001-8034]`
+* Slurm partition `alpha` for batch jobs and `alpha-interactive` for interactive jobs
 !!! note
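The hostname entry in the node list above uses a bracketed range, `taurusi[8001-8034]`. As a minimal sketch of how such a range maps onto the stated 34 nodes, the following expands it in plain Python. The helper name `expand_hostlist` is hypothetical and only handles a single `prefix[start-end]` range; on a Slurm system, `scontrol show hostnames` performs the general expansion.

```python
import re


def expand_hostlist(expr):
    """Expand one 'prefix[start-end]' range into individual hostnames.

    Hypothetical helper for illustration; it does not cover comma lists
    or multiple bracket groups.
    """
    match = re.fullmatch(r"([^\[\]]+)\[(\d+)-(\d+)\]", expr)
    if match is None:
        raise ValueError(f"unsupported host expression: {expr!r}")
    prefix, start, end = match.groups()
    width = len(start)  # preserve zero padding if the range uses it
    return [f"{prefix}{n:0{width}d}" for n in range(int(start), int(end) + 1)]


hosts = expand_hostlist("taurusi[8001-8034]")
print(len(hosts), hosts[0], hosts[-1])  # 34 taurusi8001 taurusi8034
```

The count of 34 matches the "It has 34 nodes" line in the changed file.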
@@ -23,8 +23,8 @@ The software for the `alpha` partition is available in `modenv/hiera` module env
 To check the available modules for `modenv/hiera`, use the command
-```bash
-module spider <module_name>
+```console
+marie@alpha$ module spider <module_name>
 ```
 For example, to check whether PyTorch is available in version 1.7.1:
@@ -95,11 +95,11 @@ Successfully installed torchvision-0.10.0
 ### JupyterHub
-[JupyterHub](../access/jupyterhub.md) can be used to run Jupyter notebooks on AlphaCentauri
+[JupyterHub](../access/jupyterhub.md) can be used to run Jupyter notebooks on Alpha Centauri
 sub-cluster. As a starting configuration, a "GPU (NVIDIA Ampere A100)" preset can be used
 in the advanced form. In order to use latest software, it is recommended to choose
 `fosscuda-2020b` as a standard environment. Already installed modules from `modenv/hiera`
-can be pre-loaded in "Preload modules (modules load):" field.
+can be preloaded in "Preload modules (modules load):" field.
 ### Containers
@@ -109,6 +109,6 @@ Detailed information about containers can be found [here](../software/containers
 Nvidia [NGC](https://developer.nvidia.com/blog/how-to-run-ngc-deep-learning-containers-with-singularity/)
 containers can be used as an effective solution for machine learning related tasks. (Downloading
-containers requires registration).  Nvidia-prepared containers with software solutions for specific
+containers requires registration). Nvidia-prepared containers with software solutions for specific
 scientific problems can simplify the deployment of deep learning workloads on HPC. NGC containers
 have shown consistent performance compared to directly run code.
doc.zih.tu-dresden.de/wordlist.aspell (+7 −0) @ 1b49c43e
@@ -47,6 +47,7 @@ ecryptfs
 engl
 english
 env
+EPYC
 Espresso
 ESSL
 fastfs
@@ -78,6 +79,7 @@ HDFS
 HDFView
 Horovod
 hostname
+Hostnames
 HPC
 HPL
 html
@@ -133,11 +135,13 @@ natively
 NCCL
 Neptun
 NFS
+NGC
 NRINGS
 NUMA
 NUMAlink
 NumPy
 Nutzungsbedingungen
+Nvidia
 NVMe
 NWChem
 OME
@@ -169,6 +173,8 @@ PMI
 png
 PowerAI
 ppc
+Preload
+preloaded
 PSOCK
 Pthreads
 pymdownx
@@ -220,6 +226,7 @@ stdout
 subdirectories
 subdirectory
 SUSE
+SXM
 TBB
 TCP
 TensorBoard
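In the visible `wordlist.aspell` hunks, each of the seven added words appears to slot in at its case-insensitive alphabetical position. That ordering is an observation from this diff, not a rule stated in the commit. A minimal sketch that checks it against the post-change lines of each hunk, with the hypothetical helper name `is_sorted_casefold`:

```python
def is_sorted_casefold(words):
    """Return True if words are in case-insensitive alphabetical order."""
    keys = [w.casefold() for w in words]
    return keys == sorted(keys)


# Post-change lines of each wordlist.aspell hunk shown in the diff.
hunks = [
    ["engl", "english", "env", "EPYC", "Espresso", "ESSL", "fastfs"],
    ["HDFView", "Horovod", "hostname", "Hostnames", "HPC", "HPL", "html"],
    ["NCCL", "Neptun", "NFS", "NGC", "NRINGS", "NUMA", "NUMAlink", "NumPy",
     "Nutzungsbedingungen", "Nvidia", "NVMe", "NWChem", "OME"],
    ["png", "PowerAI", "ppc", "Preload", "preloaded", "PSOCK", "Pthreads",
     "pymdownx"],
    ["subdirectories", "subdirectory", "SUSE", "SXM", "TBB", "TCP",
     "TensorBoard"],
]

results = [is_sorted_casefold(hunk) for hunk in hunks]
print(all(results))  # True
```

A check like this only covers the lines shown here; the rest of the word list is not visible in the diff.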