diff --git a/doc.zih.tu-dresden.de/docs/index.md b/doc.zih.tu-dresden.de/docs/index.md
index f76dddd4ff8cb73162890b39f9f43e4301208bcb..da67d905d731c8b11a731174875e2dc65f0b3ad9 100644
--- a/doc.zih.tu-dresden.de/docs/index.md
+++ b/doc.zih.tu-dresden.de/docs/index.md
@@ -31,8 +31,8 @@ Please also find out the other ways you could contribute in our
 ## News
 
+* **2023-11-06** [Substantial update on "How-To: Migration to Barnard"](jobs_and_resources/migration_to_barnard.md)
 * **2023-10-16** [Open MPI 4.1.x - Workaround for MPI-IO Performance Loss](jobs_and_resources/mpi_issues/#performance-loss-with-mpi-io-module-ompio)
-* **2023-10-04** [User tests on Barnard](jobs_and_resources/barnard_test.md)
 * **2023-06-01** [New hardware and complete re-design](jobs_and_resources/architecture_2023.md)
 * **2023-01-04** [New hardware: NVIDIA Arm HPC Developer Kit](jobs_and_resources/arm_hpc_devkit.md)
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md
index edcf705997db07e2358f5eff08145889104df519..8c2c5078ecc1a5bdf8ee3d8596436f635dd33f10 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md
@@ -1,8 +1,15 @@
 # Architectural Re-Design 2023
 
-With the replacement of the Taurus system by the cluster [Barnard](hardware_overview_2023.md#barnard-intel-sapphire-rapids-cpus)
-in 2023, the rest of the installed hardware had to be re-connected, both with
-InfiniBand and with Ethernet.
+Over the last decade, we have been running a highly heterogeneous HPC system with a single
+Slurm batch system. This made things very complicated, especially for inexperienced users.
+With the replacement of the Taurus system by the cluster
+[Barnard](hardware_overview_2023.md#barnard-intel-sapphire-rapids-cpus)
+we **now create homogeneous clusters with their own Slurm instances and with cluster-specific login
+nodes** running on the same CPU. Job submission will be possible only from within the cluster
+(compute or login node).
+
+All clusters will be integrated into the new InfiniBand fabric and will then have the same access
+to the shared filesystems. This recabling will require a brief downtime of a few days.
 
 {: align=center}
@@ -54,5 +61,11 @@ storages.
 ## Migration Phase
 
 For about one month, the new cluster Barnard, and the old cluster Taurus
-will run side-by-side - both with their respective filesystems. You can find a comprehensive
-[description of the migration phase here](migration_2023.md).
+will run side-by-side - both with their respective filesystems. We provide a comprehensive
+[description of the migration to Barnard](migration_to_barnard.md).
+
+The following figure provides a graphical overview of the overall process (red: user action
+required):
+
+
+{: align=center}
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_2023.md
deleted file mode 100644
index 54637c407476297d30de46a4bf233359a6053b2b..0000000000000000000000000000000000000000
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_2023.md
+++ /dev/null
@@ -1,85 +0,0 @@
-# Migration 2023
-
-## Brief Overview over Coming Changes
-
-All components of Taurus will be dismantled step by step.
-
-### New Hardware
-
-The new HPC system [Barnard](hardware_overview_2023.md#barnard-intel-sapphire-rapids-cpus) from Bull
-comes with these main properties:
-
-* 630 compute nodes based on Intel Sapphire Rapids
-* new Lustre-based storage systems
-* HDR InfiniBand network large enough to integrate existing and near-future non-Bull hardware
-* To help our users to find the best location for their data we now use the name of
-animals (size, speed) as mnemonics.
-
-### New Architecture
-
-Over the last decade we have been running our HPC system of high heterogeneity with a single
-Slurm batch system. This made things very complicated, especially to inexperienced users.
-To lower this hurdle we **now create homogeneous clusters with their own Slurm instances and with
-cluster specific login nodes** running on the same CPU. Job submission is possible only
-from within the cluster (compute or login node).
-
-All clusters will be integrated to the new InfiniBand fabric and have then the same access to
-the shared filesystems. This recabling requires a brief downtime of a few days.
-
-Please refer to the overview page [Architectural Re-Design 2023](architecture_2023.md)
-for details on the new architecture.
-
-### New Software
-
-The new nodes run on Linux RHEL 8.7. For a seamless integration of other compute hardware,
-all operating system will be updated to the same versions of operating system, Mellanox and Lustre
-drivers. With this all application software was re-built consequently using Git and CI/CD pipelines
-for handling the multitude of versions.
-
-We start with `release/23.10` which is based on software requests from user feedbacks of our
-HPC users. Most major software versions exist on all hardware platforms.
-
-## Migration Path
-
-Please make sure to have read the details on the [Architectural Re-Design 2023](architecture_2023.md)
-before further reading.
-
-!!! note
-
-    The migration can only be successful as a joint effort of HPC team and users.
-
-Here is a description of the action items.
-
-|When?|TODO ZIH |TODO users |Remark |
-|---|---|---|---|
-| done (May 2023) |first sync `/scratch` to `/data/horse/old_scratch2`| |copied 4 PB in about 3 weeks|
-| done (June 2023) |enable access to Barnard| |initialized LDAP tree with Taurus users|
-| done (July 2023) | |install new software stack|tedious work |
-| ASAP | |adapt scripts|new Slurm version, new resources, no partitions|
-| August 2023 | |test new software stack on Barnard|new versions sometimes require different prerequisites|
-| August 2023| |test new software stack on other clusters|a few nodes will be made available with the new software stack, but with the old filesystems|
-| ASAP | |prepare data migration|The small filesystems `/beegfs` and `/lustre/ssd`, and `/home` are mounted on the old systems "until the end". They will *not* be migrated to the new system.|
-| July 2023 | sync `/warm_archive` to new hardware| |using datamover nodes with Slurm jobs |
-| September 2023 |prepare re-cabling of older hardware (Bull)| |integrate other clusters in the IB infrastructure |
-| Autumn 2023 |finalize integration of other clusters (Bull)| |**~2 days downtime**, final rsync and migration of `/projects`, `/warm_archive`|
-| Autumn 2023 ||transfer last data from old filesystems | `/beegfs`, `/lustre/scratch`, `/lustre/ssd` are no longer available on the new systems|
-
-### Data Migration
-
-Why do users need to copy their data? Why only some? How to do it best?
-
-* The sync of hundreds of terabytes can only be done planned and carefully.
-(`/scratch`, `/warm_archive`, `/projects`). The HPC team will use multiple syncs
-to not forget the last bytes. During the downtime, `/projects` will be migrated.
-* User homes (`/home`) are relatively small and can be copied by the scientists.
-Keeping in mind that maybe deleting and archiving is a better choice.
-* For this, datamover nodes are available to run transfer jobs under Slurm. Please refer to the
-section [Transfer Data to New Home Directory](../barnard_test#transfer-data-to-new-home-directory)
-for more detailed instructions.
-
-### A Graphical Overview
-
-(red: user action required):
-
-
-{: align=center}
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_to_barnard.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_to_barnard.md
new file mode 100644
index 0000000000000000000000000000000000000000..a17d8104a46e2575fcd813ae490ce2bfbea564c0
--- /dev/null
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_to_barnard.md
@@ -0,0 +1,349 @@
+# How-To: Migration to Barnard
+
+All HPC users are cordially invited to migrate to our new HPC system **Barnard** and to prepare
+their software and workflows for production there.
+
+!!! note "Migration Phase"
+
+    Please make sure to have read the details on the overall
+    [Architectural Re-Design 2023](architecture_2023.md#migration-phase) before further reading.
+
+The migration from Taurus to Barnard comprises the following steps:
+
+* [Prepare login to Barnard](#login-to-barnard)
+* [Data management and data transfer to new filesystems](#data-management-and-data-transfer)
+* [Update job scripts and workflow to new software](#software)
+* [Update job scripts and workflow w.r.t. Slurm](#slurm)
+
+!!! note
+
+    We highly recommend that you first read this entire page carefully and then execute the steps.
+
+The migration can only be successful as a joint effort of the HPC team and users.
+We value your feedback. Please provide it directly via our ticket system. For better processing,
+please add "Barnard:" as a prefix to the subject of the [support ticket](../support/support.md).
+
+## Login to Barnard
+
+!!! hint
+
+    All users and projects from Taurus can now work on Barnard.
+
+Use `login[1-4].barnard.hpc.tu-dresden.de` to access the system
+from campus (or via VPN). In order to verify the SSH fingerprints of the login nodes, please refer
+to the page [Fingerprints](/access/key_fingerprints/#barnard).
+
+All users have a **new, empty HOME** filesystem. This means you first have to ...
+
+??? "... install your public SSH key on Barnard"
+
+    - Please create a new SSH keypair with ed25519 encryption, secured with
+      a passphrase. Please refer to this
+      [page for instructions](../../access/ssh_login#before-your-first-connection).
+    - After login, add the public key to your `.ssh/authorized_keys` file on Barnard.
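+
+If you do not have a suitable key pair yet, the following sketch shows one possible way to create
+and install it, using the generic login `marie` as a placeholder. It assumes that you can
+authenticate with your ZIH password on the very first connection; the key file name is only an
+example, and instead of `ssh-copy-id` you can also append the public key to
+`~/.ssh/authorized_keys` on Barnard by hand.
+
+```console
+marie@local$ ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_barnard
+marie@local$ ssh-copy-id -i ~/.ssh/id_ed25519_barnard.pub marie@login2.barnard.hpc.tu-dresden.de
+marie@local$ ssh -i ~/.ssh/id_ed25519_barnard marie@login2.barnard.hpc.tu-dresden.de
+```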
+
+## Data Management and Data Transfer
+
+!!! warning
+
+    All data in the `/home` directory or in workspaces on the BeeGFS or Lustre SSD file
+    systems will be deleted by the end of 2023, since these filesystems will be decommissioned.
+
+Existing Taurus users who would like to keep some of their data need to copy it to the new system
+manually, using the [steps described below](#data-management-and-data-transfer).
+
+### Filesystems on Barnard
+
+Our new HPC system Barnard also comes with **two new Lustre filesystems**, namely `/data/horse` and
+`/data/walrus`. Both have a capacity of 20 PB, but differ in performance and intended usage, see
+below. In order to support data life cycle management, the well-known
+[workspace concept](#workspaces-on-barnard) is applied.
+
+* The `/project` filesystem is the same on Taurus and Barnard
+(mounted read-only on the compute nodes).
+* The new work filesystem is `/data/horse`.
+* The slower `/data/walrus` can be considered a substitute for the old
+  `/warm_archive`. It is mounted **read-only** on the compute nodes
+  and can be used to store, e.g., results.
+
+!!! Warning
+
+    All old filesystems, i.e., `ssd`, `beegfs`, and `scratch`, will be shut down by the end of 2023.
+    To work with your data from Taurus, you might have to move/copy it to the new storages.
+
+    Please carefully read the following documentation and instructions.
+
+### Workspaces on Barnard
+
+The filesystems `/data/horse` and `/data/walrus` can only be accessed via workspaces. Please refer
+to the [workspace page](../../data_lifecycle/workspaces/), if you are not familiar with the
+workspace concept and the corresponding commands. The following table provides the settings for
+workspaces on these two filesystems.
+
+| Filesystem (use with parameter `--filesystem=<filesystem>`) | Max. Duration in Days | Extensions | Keeptime in Days |
+|:-------------------------------------|---------------:|-----------:|--------:|
+| `/data/horse` (default)              |            100 |         10 |      30 |
+| `/data/walrus`                       |            365 |          2 |      60 |
+{: summary="Settings for Workspace Filesystems `/data/horse` and `/data/walrus`."}
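+
+For illustration, allocating a workspace on each of the two filesystems within the limits above
+might look as follows. The login `marie`, the workspace name `numbercrunch`, and the durations are
+placeholders, please adapt them to your needs.
+
+```console
+marie@barnard$ ws_allocate numbercrunch 90
+marie@barnard$ ws_allocate --filesystem=walrus numbercrunch 180
+```
+
+The first command uses the default filesystem `/data/horse`, while the second one explicitly
+selects `/data/walrus` via the `--filesystem` option.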
+``` + +In the following, we will provide instructions with comprehensive examples for the data transfer of +your data to the new `/home` filesystem, as well as the working filesystems `/data/horse` and +`/data/walrus`. + +??? "Migration of Your Home Directory" + + Your personal (old) home directory at Taurus will not be automatically transferred to the new + Barnard system. Please do not copy your entire home, but clean up your data. E.g., it might + make sense to delete outdated scripts, old log files, etc., and move other files to an archive + filesystem. Thus, please transfer only selected directories and files that you need on the new + system. + + The steps are as follows: + + 1. Login to Barnard, i.e., + + ``` + ssh login[1-4].barnard.tu-dresden.de + ``` + + 1. The command `dtinfo` will provide you the mountpoint + + ```console + marie@barnard$ dtinfo + [...] + directory on datamover mounting clusters directory on cluster + + /data/old/home Taurus /home + [...] + ``` + + 1. Use the `dtls` command to list your files on the old home directory + + ``` + marie@barnard$ dtls /data/old/home/marie + [...] + ``` + + 1. Use the `dtcp` command to invoke a transfer job, e.g., + + ```console + marie@barnard$ dtcp --recursive /data/old/home/marie/<useful data> /home/marie/ + ``` + + **Note**, please adopt the source and target paths to your needs. All available options can be + queried via `dtinfo --help`. + + !!! warning + + Please be aware that there is **no synchronisation process** between your home directories + at Taurus and Barnard. Thus, after the very first transfer, they will become divergent. + +Please follow this instructions for transferring you data from `ssd`, `beegfs` and `scratch` to the +new filesystems. The instructions and examples are divided by the target not the source filesystem. + +This migration task requires a preliminary step: You need to allocate workspaces on the +target filesystems. + +??? Note "Preliminary Step: Allocate a workspace" + + Both `/data/horse/` and `/data/walrus` can only be used with + [workspaces](../data_lifecycle/workspaces.md). Before you invoke any data transer from the old + working filesystems to the new ones, you need to allocate a workspace first. + + The command `ws_list -l` lists the available and the default filesystem for workspaces. + + ``` + marie@barnard$ ws_list --list + available filesystems: + horse (default) + walrus + ``` + + As you can see, `/data/horse` is the default workspace filesystem at Barnard. I.e., if you + want to allocate, extend or release a workspace on `/data/walrus`, you need to pass the + option `--filesystem=walrus` explicitly to the corresponding workspace commands. Please + refer to our [workspace documentation](../data_lifecycle/workspaces.md), if you need refresh + your knowledge. + + The most simple command to allocate a workspace is as follows + + ``` + marie@barnard$ ws_allocate numbercrunch 90 + ``` + + Please refer to the table holding the settings + (cf. [subection workspaces on Barnard](#workspaces-on-barnard)) for the max. duration and + `ws_allocate --help` for all available options. + +??? "Migration to work filesystem `/data/horse`" + + === "Source: old `/scratch`" + + If you transfer data from the old `/scratch` to `/data/horse`, it is sufficient to use + `dtmv` instead of `dtcp` since this data has already been copied to a special directory on + the new `horse` filesystem. Thus, you just need to move it to the right place (the Lustre + metadata system will update the correspoding entries). 
+
+??? "Migration to work filesystem `/data/horse`"
+
+    === "Source: old `/scratch`"
+
+        If you transfer data from the old `/scratch` to `/data/horse`, it is sufficient to use
+        `dtmv` instead of `dtcp` since this data has already been copied to a special directory on
+        the new `horse` filesystem. Thus, you just need to move it to the right place (the Lustre
+        metadata system will update the corresponding entries).
+
+        ```console
+        marie@barnard$ dtmv /data/horse/lustre/scratch2/ws/0/marie-numbercrunch /data/horse/ws/marie-numbercrunch
+        ```
+
+    === "Source: old `/ssd`"
+
+        The old `ssd` filesystem is mounted at `/data/old/lustre/ssd` on the datamover nodes and
+        the workspaces are within the subdirectory `ws/`. A corresponding data transfer using
+        `dtcp` looks like
+
+        ```console
+        marie@barnard$ dtcp --recursive /data/old/lustre/ssd/ws/marie-numbercrunch /data/horse/ws/marie-numbercrunch
+        ```
+
+    === "Source: old `/beegfs`"
+
+        The old `beegfs` filesystem is mounted at `/data/old/beegfs` on the datamover nodes and
+        the workspaces are within the subdirectories `ws/0` and `ws/1`, respectively. A
+        corresponding data transfer using `dtcp` looks like
+
+        ```console
+        marie@barnard$ dtcp --recursive /data/old/beegfs/ws/0/marie-numbercrunch /data/horse/ws/marie-numbercrunch
+        ```
+
+??? "Migration to `/data/walrus`"
+
+    === "Source: old `/scratch`"
+
+        The old `scratch` filesystem is mounted at `/data/old/lustre/scratch2` on the datamover
+        nodes and the workspaces are within the subdirectories `ws/0` and `ws/1`, respectively. A
+        corresponding data transfer using `dtcp` looks like
+
+        ```console
+        marie@barnard$ dtcp --recursive /data/old/lustre/scratch2/ws/0/marie-numbercrunch /data/walrus/ws/marie-numbercrunch
+        ```
+
+    === "Source: old `/ssd`"
+
+        The old `ssd` filesystem is mounted at `/data/old/lustre/ssd` on the datamover nodes and
+        the workspaces are within the subdirectory `ws/`. A corresponding data transfer using
+        `dtcp` looks like
+
+        ```console
+        marie@barnard$ dtcp --recursive /data/old/lustre/ssd/ws/marie-numbercrunch /data/walrus/ws/marie-numbercrunch
+        ```
+
+    === "Source: old `/beegfs`"
+
+        The old `beegfs` filesystem is mounted at `/data/old/beegfs` on the datamover nodes and
+        the workspaces are within the subdirectories `ws/0` and `ws/1`, respectively. A
+        corresponding data transfer using `dtcp` looks like
+
+        ```console
+        marie@barnard$ dtcp --recursive /data/old/beegfs/ws/0/marie-numbercrunch /data/walrus/ws/marie-numbercrunch
+        ```
+
+??? "Migration from `/lustre/ssd` or `/beegfs`"
+
+    **You** are entirely responsible for the transfer of this data to the new location.
+    Start the `dtrsync` process as soon as possible (and maybe repeat it at a later time).
+
+??? "Migration from `/lustre/scratch2` aka `/scratch`"
+
+    We are synchronizing this (**last: October 18**) to `/data/horse/lustre/scratch2/`.
+
+    Please do **NOT** copy this data yourself. Instead, check if it has already been synchronized
+    to `/data/horse/lustre/scratch2/ws`.
+
+    In case you need to update this (Gigabytes, not Terabytes!), please run `dtrsync` as in
+    `dtrsync -a /data/old/lustre/scratch2/ws/0/my-workspace/newest/ /data/horse/lustre/scratch2/ws/0/my-workspace/newest/`
+
+??? "Migration from `/warm_archive`"
+
+    The process of syncing data from `/warm_archive` to `/data/walrus/warm_archive` is still
+    ongoing.
+
+    Please do **NOT** copy this data yourself. Instead, check if it has already been synchronized
+    to `/data/walrus/warm_archive/ws`.
+
+    In case you need to update this (Gigabytes, not Terabytes!), please run `dtrsync` as in
+    `dtrsync -a /data/old/warm_archive/ws/my-workspace/newest/ /data/walrus/warm_archive/ws/my-workspace/newest/`
+
+When the last compute system has been migrated, the old filesystems will be set to write-protected
+and we will start a final synchronization (scratch + walrus).
+The target directories for this synchronization, `/data/horse/lustre/scratch2/ws` and
+`/data/walrus/warm_archive/ws/`, will not be deleted automatically in the meantime.
+
+## Software
+
+Please use `module spider` to identify the software modules you need to load.
+
+The default release version is 23.10.
+
+The new nodes run Linux RHEL 8.7. For a seamless integration of other compute hardware, all
+systems will be updated to the same versions of the operating system, Mellanox and Lustre drivers.
+Consequently, all application software was re-built using Git and CI/CD pipelines to handle the
+multitude of versions.
+
+We start with `release/23.10`, which is based on software requests and feedback from our
+HPC users. Most major software versions exist on all hardware platforms.
+
+## Slurm
+
+* We are running the most recent Slurm version.
+* You must not use the old partition names; see the sketch below.
+* Not everything has been tested yet.
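+
+As an illustration, a minimal Barnard job script might look like the following sketch. The
+resource values, module name, and application are placeholders; the point is that no `--partition`
+option is needed anymore, since each cluster now runs its own Slurm instance and jobs are submitted
+from within the cluster.
+
+```bash
+#!/bin/bash
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=4
+#SBATCH --time=01:00:00
+
+# Note: no '--partition' option anymore - you submit from a Barnard login or
+# compute node directly into the Barnard cluster.
+
+module load <module_name>    # placeholder, use 'module spider' to find available modules
+srun ./my_application        # placeholder for your application
+```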
+
+## Updates After Your Feedback (State: October 19)
+
+* A **second synchronization** from `/scratch` has started on **October 18** and is
+  now nearly done.
+* A first, incomplete synchronization from `/warm_archive` has been done (see above).
+  With support from NEC, we are transferring the rest over the next weeks.
+* The **data transfer tools** now work fine.
+* After fixing too tight security restrictions, **all users can log in** now.
+* **ANSYS** now starts: please check if your specific use case works.
+* **login1** is under construction; do not use it at the moment. Workspace creation does
+  not work there.
diff --git a/doc.zih.tu-dresden.de/mkdocs.yml b/doc.zih.tu-dresden.de/mkdocs.yml
index 2f46fed3da8ef98c096e2b64c9d8fa0611749912..6f106d108cfbdef1594d1c85b72e8715f7fb15cf 100644
--- a/doc.zih.tu-dresden.de/mkdocs.yml
+++ b/doc.zih.tu-dresden.de/mkdocs.yml
@@ -103,9 +103,8 @@ nav:
       - Overview: jobs_and_resources/hardware_overview.md
       - New Systems 2023:
         - Architectural Re-Design 2023: jobs_and_resources/architecture_2023.md
-        - Overview 2023: jobs_and_resources/hardware_overview_2023.md
-        - Migration 2023: jobs_and_resources/migration_2023.md
-        - "How-To: Migration to Barnard": jobs_and_resources/barnard_test.md
+        - HPC Resources Overview 2023: jobs_and_resources/hardware_overview_2023.md
+        - "How-To: Migration to Barnard": jobs_and_resources/migration_to_barnard.md
       - AMD Rome Nodes: jobs_and_resources/rome_nodes.md
       - NVMe Storage: jobs_and_resources/nvme_storage.md
       - Alpha Centauri: jobs_and_resources/alpha_centauri.md