diff --git a/doc.zih.tu-dresden.de/docs/archive/filesystems.md b/doc.zih.tu-dresden.de/docs/archive/filesystems.md
new file mode 100644
index 0000000000000000000000000000000000000000..6555aecfc897486bb4677d9f0a0ec7b1e5e549d6
--- /dev/null
+++ b/doc.zih.tu-dresden.de/docs/archive/filesystems.md
@@ -0,0 +1,78 @@
+---
+search:
+  boost: 0.00001
+---
+
+# Switched-Off Filesystems (Outdated)
+
+!!! warning
+
+    **This page is deprecated! All documented filesystems on this page are decommissioned.**
+
+## Workspaces
+
+### Workspace Lifetimes
+
+The filesystems `warm_archive`, `ssd` and `scratch` were switched off at the end of 2023. Do not
+use them anymore!
+
+| Filesystem (use with parameter `--filesystem <filesystem>`) | Duration, days | Extensions | [Filesystem Feature](#filesystem-features) | Remarks |
+|:-------------------------------------|---------------:|-----------:|:-------------------------------------------------------------------------|:--------|
+| `scratch` (default) | 100 | 10 | `fs_lustre_scratch2` | Scratch filesystem (`/lustre/scratch2`, symbolic link: `/scratch`) with high streaming bandwidth, based on spinning disks |
+| `ssd` | 30 | 2 | `fs_lustre_ssd` | High-IOPS filesystem (`/lustre/ssd`, symbolic link: `/ssd`) on SSDs. |
+| `warm_archive` | 365 | 2 | `fs_warm_archive_ws` | Capacity filesystem based on spinning disks |
+
+## Node Features for Selective Job Submission
+
+The nodes in our HPC system are becoming more diverse in multiple aspects, e.g., hardware, mounted
+storage, software. The system administrators describe the set of properties of each node as
+features, and it is up to you as a user to specify the requirements of your job. These features
+should be thought of as changing over time (e.g., a filesystem may get stuck on a certain node).
+
+A feature can be used with the Slurm option `-C, --constraint=<ARG>` like
+`srun --constraint="fs_lustre_scratch2" [...]` with `srun` or `sbatch`.
+
+Multiple features can also be combined using AND, OR, matching OR, resource count, etc.
+E.g., `--constraint="fs_beegfs|fs_lustre_ssd"` requests nodes with at least one of the
+features `fs_beegfs` and `fs_lustre_ssd`. For a detailed description of the possible
+constraints, please refer to the [Slurm documentation](https://slurm.schedmd.com/srun.html#OPT_constraint).
+
+!!! hint
+
+    A feature is checked only for scheduling. Running jobs are not affected by changing features.
+
+## Filesystem Features
+
+A feature `fs_*` is active if a certain (global) filesystem is mounted and available on a node.
+Access to these filesystems is tested every few minutes on each node and the Slurm features are
+set accordingly.
+
+| Feature | Description | [Workspace Name](../data_lifecycle/workspaces.md#extension-of-a-workspace) |
+|:---------------------|:-------------------------------------------------------------------|:---------------------------------------------------------------------------|
+| `fs_lustre_scratch2` | `/scratch` mounted read-write (mount point is `/lustre/scratch2`) | `scratch` |
+| `fs_lustre_ssd` | `/ssd` mounted read-write (mount point is `/lustre/ssd`) | `ssd` |
+| `fs_warm_archive_ws` | `/warm_archive/ws` mounted read-only | `warm_archive` |
+| `fs_beegfs_global0` | `/beegfs/global0` mounted read-write | `beegfs_global0` |
+| `fs_beegfs` | `/beegfs` mounted read-write | `beegfs` |
+
+!!! hint
+
+    For certain projects, specific filesystems are provided. For those,
+    additional features are available, like `fs_beegfs_<projectname>`.
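+
+### Example: Requesting Filesystem Features (Illustrative)
+
+Since all filesystems documented on this page are decommissioned, the following job script is
+purely illustrative (the application name is a placeholder). It sketches how the filesystem
+features listed above were combined with the `--constraint` option:
+
+```bash
+#!/bin/bash
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --time=01:00:00
+# Request nodes that mount both the scratch and the ssd filesystem (AND combination)
+#SBATCH --constraint="fs_lustre_scratch2&fs_lustre_ssd"
+
+srun ./my_application
+```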
diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md
index e810605f17da8371f78b62dfd41679f9e877b5eb..fab57270a92385e3b32a0dc10dd7ed7edd275c38 100644
--- a/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md
+++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md
@@ -31,12 +31,11 @@ Since the workspace filesystems are intended for different use cases and thus di
 performance, their granted timespans differ accordingly. The maximum lifetime and number of
 renewals are provided in the following table.
 
-| Filesystem (use with parameter `--filesystem <filesystem>`) | Max. Duration in Days | Extensions | Keeptime | [Filesystem Feature](../jobs_and_resources/slurm.md#filesystem-features) |
-|:------------------------------------------------------------|---------------:|-----------:|---------:|:-------------------------------------------------------------------------|
-| ` horse` | 100 | 10 | 30 | |
-| ` walrus` | 100 | 10 | 60 | |
-| `beegfs_global0` (deprecated) | 30 | 2 | 30 | `fs_beegfs_global0` |
-| `beegfs` | 30 | 2 | 30 | `fs_beegfs` |
+| Filesystem (use with parameter `--filesystem <filesystem>`) | Max. Duration in Days | Extensions | Keeptime |
+|:------------------------------------------------------------|---------------:|-----------:|---------:|
+| `horse` | 100 | 10 | 30 |
+| `walrus` | 100 | 10 | 60 |
+| `beegfs` | 30 | 2 | 30 |
 {: summary="Settings for Workspace Filesystems."}
 
 !!! note
@@ -44,17 +43,6 @@ renewals are provided in the following table.
     Currently, not all filesystems are available on all of our five clusters. The page
     [Working Filesystems](working.md) provides the necessary information.
 
-??? warning "End-of-life filesystems"
-
-    The filesystems `warm_archive`, `ssd` and `scratch` will be switched off end of 2023. Do not use
-    them anymore!
-
-    | Filesystem (use with parameter `--filesystem <filesystem>`) | Duration, days | Extensions | [Filesystem Feature](../jobs_and_resources/slurm.md#filesystem-features) | Remarks |
-    |:-------------------------------------|---------------:|-----------:|:-------------------------------------------------------------------------|:--------|
-    | `scratch` (default) | 100 | 10 | `fs_lustre_scratch2` | Scratch filesystem (`/lustre/scratch2`, symbolic link: `/scratch`) with high streaming bandwidth, based on spinning disks |
-    | `ssd` | 30 | 2 | `fs_lustre_ssd` | High-IOPS filesystem (`/lustre/ssd`, symbolic link: `/ssd`) on SSDs. |
-    | `warm_archive` | 365 | 2 | 30 | `fs_warm_archive_ws` | Capacity filesystem based on spinning disks |
-
 ### List Available Filesystems
 
 To list all available filesystems for using workspaces, you can either invoke `ws_list -l` or
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md
index 7a14bc360e6324c6f62ccffde7417fdff2513e19..95976c5f7f337f4f97ee97541bef4a8d91285598 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md
@@ -326,4 +326,20 @@ Please use `module spider` to identify the software modules you need to load.
 
 Note that most nodes on Barnard don't have a local disk and space in `/tmp` is **very** limited.
 If you need a local disk request this with the
-[Slurm feature](slurm.md#node-features-for-selective-job-submission) `--constraint=local_disk`.
+[Slurm feature](slurm.md#node-local-storage-in-jobs)
+`--constraint=local_disk` passed to `sbatch`, `salloc`, or `srun`.
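+
+For example, a minimal job script requesting such a node could look like the following sketch;
+the resource numbers and the application name are placeholders only:
+
+```bash
+#!/bin/bash
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --time=01:00:00
+# Run only on one of the nodes that provide a local disk
+#SBATCH --constraint=local_disk
+
+# The node-local disk is mounted at /tmp and can be used for temporary data
+srun ./my_application
+```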
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
index 9323bdc80b7ac849b8858f67eee67892d526aefa..a089490d78e7c122a2ed33a7be2dc12bfe00128b 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
@@ -534,49 +534,21 @@ marie@login$ scontrol show res=<reservation name>
 If you want to use your reservation, you have to add the parameter
 `--reservation=<reservation name>` either in your job script or to your `srun` or `salloc` command.
 
-## Node Features for Selective Job Submission
+## Node-Local Storage in Jobs
 
-The nodes in our HPC system are becoming more diverse in multiple aspects, e.g, hardware, mounted
-storage, software. The system administrators can describe the set of properties and it is up to you
-as user to specify the requirements. These features should be thought of as changing over time
-(e.g., a filesystem get stuck on a certain node).
+For some workloads and applications, it is valuable to use node-local storage in order to reduce or
+even completely avoid using the [parallel filesystems](../data_lifecycle/working.md).
 
-A feature can be used with the Slurm option `-C, --constraint=<ARG>` like
-`srun --constraint="fs_lustre_scratch2" [...]` with `srun` or `sbatch`.
+The availability and capacity of local storage differ between our clusters, as shown in the
+following table.
 
-Multiple features can also be combined using AND, OR, matching OR, resource count etc.
-E.g., `--constraint="fs_beegfs|fs_lustre_ssd"` requests for nodes with at least one of the
-features `fs_beegfs` and `fs_lustre_ssd`. For a detailed description of the possible
-constraints, please refer to the [Slurm documentation](https://slurm.schedmd.com/srun.html#OPT_constraint).
+| Cluster | Number of Nodes | Local Storage | Mountpoint | Request |
+|------------------|-------------------|-----------------------|------------|--------------------------------------------------------------------|
+| `Alpha Centauri` | All compute nodes | 3.5 TB on NVMe device | `/tmp` | Always present, no action needed |
+| `Barnard` | 12 nodes | 1.8 TB on NVMe device | `/tmp` | `--constraint=local_disk` option to `sbatch`, `salloc`, and `srun` |
+| `Romeo` | All compute nodes | 200 GB | `/tmp` | Always present, no action needed |
 
-!!! hint
+!!! hint "Clusters `Power9` and `Julia`"
 
-    A feature is checked only for scheduling. Running jobs are not affected by changing features.
-
-### Filesystem Features
-
-!!! danger "Not functional at the moment"
-
-    The filesystem features are currently not functional after the shutdown of the Taurus system. We
-    are actively working on integrating this feature mimic in the new HPC clusters.
-
-    We decided to not temporally remove this documentation, although it is not functional at the
-    moment. Please understand it as kind of an announcement to an upcoming feature addition.
-
-If you need a local disk (i.e. `/tmp`) on a diskless cluster (e.g. [Barnard](barnard.md))
-use the feature `local_disk`.`
-
-A feature `fs_*` is active if a certain (global) filesystem is mounted and available on a node.
-Access to these filesystems is tested every few minutes on each node and the Slurm features are
-set accordingly.
-
-| Feature | Description | [Workspace Name](../data_lifecycle/workspaces.md#extension-of-a-workspace) |
-|:---------------------|:-------------------------------------------------------------------|:---------------------------------------------------------------------------|
-| `fs_lustre_scratch2` | `/scratch` mounted read-write (mount point is `/lustre/scratch2`) | `scratch` |
-| `fs_lustre_ssd` | `/ssd` mounted read-write (mount point is `/lustre/ssd`) | `ssd` |
-| `fs_warm_archive_ws` | `/warm_archive/ws` mounted read-only | `warm_archive` |
-| `fs_beegfs_global0` | `/beegfs/global0` mounted read-write | `beegfs_global0` |
-| `fs_beegfs` | `/beegfs` mounted read-write | `beegfs` |
-
-For certain projects, specific filesystems are provided. For those,
-additional features are available, like `fs_beegfs_<projectname>`.
+    Node-local storage is not available on the two clusters [`Power9`](power9.md) and
+    [`Julia`](julia.md).
diff --git a/doc.zih.tu-dresden.de/mkdocs.yml b/doc.zih.tu-dresden.de/mkdocs.yml
index e2da720a688c1b28d4f77303bfdfcbd85c90c065..d2b2bc9feb444046b96ee93b84c0826418e0985a 100644
--- a/doc.zih.tu-dresden.de/mkdocs.yml
+++ b/doc.zih.tu-dresden.de/mkdocs.yml
@@ -132,6 +132,7 @@ nav:
   - Switched-Off Systems:
     - Overview: archive/systems_switched_off.md
     - System Taurus: archive/system_taurus.md
+    - Filesystems: archive/filesystems.md
    - Migration From Deimos to Atlas: archive/migrate_to_atlas.md
    - System Altix: archive/system_altix.md
    - System Atlas: archive/system_atlas.md