From 730913f42d81cb809f49cc62b92c66abec2a4f3b Mon Sep 17 00:00:00 2001 From: Martin Schroschk <martin.schroschk@tu-dresden.de> Date: Fri, 8 Dec 2023 15:25:49 +0100 Subject: [PATCH] Update w.r.t. new Romeo system --- doc.zih.tu-dresden.de/docs/index.md | 1 + .../jobs_and_resources/hardware_overview.md | 29 +++---------------- .../docs/jobs_and_resources/romeo.md | 25 ++++++++++++++++ 3 files changed, 30 insertions(+), 25 deletions(-) diff --git a/doc.zih.tu-dresden.de/docs/index.md b/doc.zih.tu-dresden.de/docs/index.md index 270c1a3f2..addbc83cb 100644 --- a/doc.zih.tu-dresden.de/docs/index.md +++ b/doc.zih.tu-dresden.de/docs/index.md @@ -31,6 +31,7 @@ Please also find out the other ways you could contribute in our ## News +* **2023-12-07** [Maintenance finished: CPU cluster `Romeo` is now available](jobs_and_resources/romeo.md) * **2023-12-01** [Maintenance finished: GPU cluster `Alpha Centauri` is now available](jobs_and_resources/alpha_centauri.md) * **2023-11-25** [Data transfer available for Barnard via Dataport Nodes](data_transfer/dataport_nodes.md) * **2023-11-14** [End of life of `scratch` and `ssd` filesystems is January 3 2024](data_lifecycle/file_systems.md) diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md index 5eb9fd13a..10fe69e2e 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md @@ -9,7 +9,7 @@ analytics, and artificial intelligence methods with extensive capabilities for e and performance monitoring provides ideal conditions to achieve the ambitious research goals of the users and the ZIH. -!!! Warning "HPC Systems Migration Phase" +!!! danger "HPC Systems Migration Phase" **On December 11 2023 Taurus will be decommissioned for good**. 
@@ -28,7 +28,7 @@ perspective, there will be **five separate clusters**: | [`Alpha Centauri`](#alpha-centauri) | GPU cluster | 2021 | `i[8001-8037].alpha.hpc.tu-dresden.de` | | [`Julia`](#julia) | Single SMP system | 2021 | `smp8.julia.hpc.tu-dresden.de` | | [`Romeo`](#romeo) | CPU cluster | 2020 | `i[8001-8190].romeo.hpc.tu-dresden.de` | -| [`Power`](#power9) | IBM Power/GPU cluster | 2018 | `ml[1-29].power9.hpc.tu-dresden.de` | +| [`Power9`](#power9) | IBM Power/GPU cluster | 2018 | `ml[1-29].power9.hpc.tu-dresden.de` | All clusters will run with their own [Slurm batch system](slurm.md) and job submission is possible only from their respective login nodes. @@ -111,7 +111,7 @@ required): ## Login and Dataport Nodes -!!! Note " **On December 11 2023 Taurus will be decommissioned for good**." +!!! danger "**On December 11 2023 Taurus will be decommissioned for good**." Do not use Taurus for production anymore. @@ -168,30 +168,9 @@ The cluster `Romeo` is a general purpose cluster by NEC based on AMD Rome CPUs. - 200 GB local memory on SSD at `/tmp` - Login nodes: `login[1-2].romeo.hpc.tu-dresden.de` - Hostnames: `i[7001-7190].romeo.hpc.tu-dresden.de` +- Operating system: Rocky Linux 8.7 - Further information on the usage is documented on the site [CPU Cluster Romeo](romeo.md) -??? note "Maintenance from November 27 to December 12" - - The recabling will take place from November 27 to December 12. These works are planned: - - * update the software stack (OS, firmware, software), - * change the ethernet access (new VLANs), - * complete integration of Romeo and Julia into the Barnard Infiniband network to get full - bandwidth access to all Barnard filesystems, - * configure and deploy stand-alone Slurm batch systems. - - After the maintenance, the Rome nodes reappear as a stand-alone cluster that can be reached via - `login[1,2].romeo.hpc.tu-dresden.de`. - - **Changes w.r.t. 
filesystems:** - Your new `/home` directory (from Barnard) will become your `/home` on *Romeo*, Julia, Alpha - Centauri and the Power9 system. Thus, please [migrate your `/home` from Taurus to your **new** - `/home` on Barnard](barnard.md#data-management-and-data-transfer). - - The old work filesystems `/lustre/scratch` and `/lustre/ssd will` be turned off on January 1 - 2024 for good (no data access afterwards!). The new work filesystem available on Romeo will be - `/horse`. Please [migrate your working data to `/horse`](barnard.md#data-migration-to-new-filesystems). - ## Julia The cluster `Julia` is a large SMP (shared memory parallel) system by HPE based on Superdome Flex diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md index fc675722d..ba63480e7 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md @@ -1,5 +1,30 @@ # CPU Cluster Romeo +## Overview + +The HPC system `Romeo` is a general purpose cluster based on AMD Rome CPUs. From 2019 till the end +of 2023, it was available as partition `romeo` within `Taurus`. With the decommissioning of `Taurus`, +`Romeo` has been re-engineered and is now a homogeneous, standalone cluster with its own +[Slurm batch system](slurm.md) and its own login nodes. This maintenance also comprised: + + * change the ethernet access (new VLANs), + * complete integration of `Romeo` into the `Barnard` InfiniBand network to get full + bandwidth access to all new filesystems, + * configure and deploy a stand-alone Slurm batch system, + * newly built software within a separate software and module system. + +!!! note "Changes w.r.t. filesystems" + + Your new `/home` directory (from `Barnard`) is now your `/home` on *Romeo*, too. + Thus, please + [migrate your `/home` from Taurus to your **new** `/home` on Barnard](barnard.md#data-management-and-data-transfer). 
+ + The old work filesystems `/lustre/scratch` and `/lustre/ssd` will be turned off on January 1 + 2024 for good (no data access afterwards!). The new work filesystem available on `Romeo` is + `/horse`. Please [migrate your working data to `/horse`](barnard.md#data-migration-to-new-filesystems). + +## Hardware Resources + The hardware specification is documented on the page [HPC Resources](hardware_overview.md#romeo). -- GitLab