diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md
index cc45e3809e4b249194a1e5c7464195a837721e75..50dcefc3c47102683ec4dbf6e3439d4cc8466934 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md
@@ -27,3 +27,67 @@ All clusters have access to these shared parallel file systems:
 | `Weka` | `/weka` | 232 TB |
 | `Home` | `/home` | 40 TB |
 
+## Barnard - Intel Sapphire Rapids CPUs
+
+- 630 nodes, each with
+    - 2 x Intel(R) Xeon(R) Platinum 8470 (52 cores) @ 2.00 GHz, Multithreading available
+    - 512 GB RAM
+- Hostnames: `n[1001-1630].barnard.hpc.tu-dresden.de`
+- Login nodes: `login[1-4].barnard.hpc.tu-dresden.de`
+
+## AMD Rome CPUs + NVIDIA A100
+
+- 34 nodes, each with
+    - 8 x NVIDIA A100-SXM4 Tensor Core GPUs
+    - 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading available
+    - 1 TB RAM
+    - 3.5 TB local storage on NVMe device at `/tmp`
+- Hostnames: `taurusi[8001-8034]`
+- Slurm partition: `alpha`
+- Further information on the usage is documented on the page [Alpha Centauri Nodes](alpha_centauri.md)
+
+## Island 7 - AMD Rome CPUs
+
+- 192 nodes, each with
+    - 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading available
+    - 512 GB RAM
+    - 200 GB local storage on SSD at `/tmp`
+- Hostnames: `taurusi[7001-7192]`
+- Slurm partition: `romeo`
+- Further information on the usage is documented on the page [AMD Rome Nodes](rome_nodes.md)
+
+## Large SMP System HPE Superdome Flex
+
+- 1 node, with
+    - 32 x Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20 GHz (28 cores)
+    - 48 TB RAM (usable: 47 TB; 1 TB is used for cache coherence protocols)
+- Configured as one single node
+- 370 TB of fast NVMe storage available at `/nvme/<projectname>`
+- Hostname: `taurussmp8`
+- Slurm partition: `julia`
+- Further information on the usage is documented on the page [HPE Superdome Flex](sd_flex.md)
+
+## IBM Power9 Nodes for Machine Learning
+
+For machine learning, we have IBM AC922 nodes installed with this configuration:
+
+- 32 nodes, each with
+    - 2 x IBM Power9 CPU (2.80 GHz, 3.10 GHz boost, 22 cores)
+    - 256 GB RAM DDR4 2666 MHz
+    - 6 x NVIDIA Volta V100 with 32 GB HBM2
+    - NVLink bandwidth 150 GB/s between GPUs and host
+- Hostnames: `taurusml[1-32]`
+- Slurm partition: `ml`
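+
+As a quick orientation for how these nodes are addressed via Slurm, below is a minimal sketch of
+a batch script requesting a single V100 GPU on the `ml` partition. The project name
+`p_number_crunch` and the script name `train.py` are placeholders only; please refer to the
+linked pages above for authoritative usage instructions.
+
+```bash
+#!/bin/bash
+#SBATCH --partition=ml             # IBM Power9 nodes with NVIDIA V100 GPUs
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=6          # a few of the 44 Power9 cores per node
+#SBATCH --gres=gpu:1               # one of the six V100 GPUs per node
+#SBATCH --mem=32G
+#SBATCH --time=01:00:00
+#SBATCH --account=p_number_crunch  # placeholder project name
+
+srun python train.py
+```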
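+
+Similarly, a large shared-memory job on the HPE Superdome Flex described above can be sketched as
+follows. The memory request, run time, and application name `my_smp_app` are placeholders chosen
+for illustration; memory requests are limited by the 47 TB of usable RAM on this node.
+
+```bash
+#!/bin/bash
+#SBATCH --partition=julia          # single large SMP node (taurussmp8)
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=28         # one socket's worth of cores
+#SBATCH --mem=2T                   # Slurm accepts K/M/G/T suffixes
+#SBATCH --time=08:00:00
+#SBATCH --account=p_number_crunch  # placeholder project name
+
+srun ./my_smp_app
+```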