From a574f0224c26ef7d219769ae33ed130b1d19dc4a Mon Sep 17 00:00:00 2001
From: Ulf Markwardt <ulf.markwardt@tu-dresden.de>
Date: Wed, 4 Aug 2021 15:51:27 +0200
Subject: [PATCH] split home + projekt

---
 .../docs/data_lifecycle/home.md     | 227 ++++++++++++++++++
 .../docs/data_lifecycle/projects.md | 227 ++++++++++++++++++
 2 files changed, 454 insertions(+)
 create mode 100644 doc.zih.tu-dresden.de/docs/data_lifecycle/home.md
 create mode 100644 doc.zih.tu-dresden.de/docs/data_lifecycle/projects.md

diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/home.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/home.md
new file mode 100644
index 000000000..decdb1fb3
--- /dev/null
+++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/home.md
@@ -0,0 +1,227 @@
+# File Systems
+
+As soon as you have access to ZIH systems, you have to manage your data. Several file systems are
+available. Each file system serves a special purpose according to its capacity, performance, and
+permanence.
+
+## Permanent File Systems
+
+### Global /home File System
+
+Each user has 50 GB in a `/home` directory, independent of the granted capacity for the project.
+Hints for the usage of the global home directory:
+
+- If you need distinct `.bashrc` files for each machine, you should
+  create separate files for them, named `.bashrc_<machine_name>`.
+- If you use various machines frequently, it might be useful to set
+  the environment variable `HISTFILE` in `.bashrc_deimos` and
+  `.bashrc_mars` to `$HOME/.bash_history_<machine_name>`. Setting
+  `HISTSIZE` and `HISTFILESIZE` to 10000 helps as well.
+- Further, you may use private module files to simplify the process of
+  loading the right installation directories, see
+  **todo link: private modules - AnchorPrivateModule**.
+
+### Global /projects File System
+
+For project data, we have a global project directory that allows better collaboration between the
+members of an HPC project. However, `/projects` is mounted read-only on the compute nodes, because
+it is not a file system for parallel I/O. See below and also check the
+**todo link: HPC introduction - %PUBURL%/Compendium/WebHome/HPC-Introduction.pdf** for more details.
+
+### Backup and Snapshots of the File System
+
+- Backup is **only** available in the `/home` and the `/projects` file systems!
+- Files are backed up using snapshots of the NFS server and can be restored by the user.
+- A changed file can always be recovered as it was at the time of the snapshot.
+- Snapshots are taken:
+  - From Monday through Saturday between 06:00 and 18:00 every two hours and kept for one day
+    (7 snapshots)
+  - From Monday through Saturday at 23:30 and kept for two weeks (12 snapshots)
+  - Every Sunday at 23:45 and kept for 26 weeks
+- To restore a previous version of a file:
+  - Go into the directory of the file you want to restore.
+  - Run `cd .snapshot` (this subdirectory exists in every directory on the `/home` file system
+    although it is not visible with `ls -a`).
+  - All available snapshots are listed in the `.snapshot` directory.
+  - Just `cd` into the directory of the point in time you wish to restore and copy the file you
+    wish to restore to where you want it.
+  - **Attention:** The `.snapshot` directory is not only hidden from normal view (`ls -a`), it is
+    also embedded in a different directory structure. An `ls ../..` will not list the directory
+    where you came from.
+    Thus, we recommend copying the file from the location where it originally resided, e.g. while
+    in `/home/username/directory_a`: `cp .snapshot/timestamp/lostfile lostfile.backup`
+- `/home` and `/projects/` are definitely NOT meant as work directories:
+  since all files are kept in the snapshots and on the backup tapes over a long time, they
+  - needlessly fill the disks and
+  - prevent the backup process from working efficiently by their sheer number and volume.
+
+### Group Quotas for the File System
+
+The quotas of the home file system are meant to help the users to keep track of their data.
+Especially in HPC, it happens that millions of temporary files are created within hours. This is
+the main reason for performance degradation of the file system. If a project exceeds its quota
+(total size OR total number of files), it cannot submit jobs into the batch system. The following
+commands can be used for monitoring:
+
+- `showquota` shows your project's usage of the file system.
+- `quota -s -f /home` shows the user's usage of the file system.
+
+In case a project is above its limits, please ...
+
+- Remove core dumps and temporary data,
+- Talk with your colleagues to identify the hotspots,
+- Check your workflow and use `/tmp` or the scratch file systems for temporary files,
+- *Systematically* handle your important data:
+  - For later use (weeks...months) at the HPC systems, build tar
+    archives with meaningful names or IDs and store them, e.g., in an
+    [archive](intermediate_archive.md).
+  - Refer to the hints for [long term preservation for research data](preservation_research_data.md).
+
+## Work Directories
+
+| File system | Usable directory  | Capacity | Availability | Backup | Remarks |
+|:------------|:------------------|:---------|:-------------|:-------|:--------|
+| `Lustre`    | `/scratch/`       | 4 PB     | global       | No     | Only accessible via **todo link: workspaces - WorkSpaces**. Not made for billions of files! |
+| `Lustre`    | `/lustre/ssd`     | 40 TB    | global       | No     | Only accessible via **todo link: workspaces - WorkSpaces**. For small I/O operations |
+| `BeeGFS`    | `/beegfs/global0` | 232 TB   | global       | No     | Only accessible via **todo link: workspaces - WorkSpaces**. Fastest available file system, only for large parallel applications running with millions of small I/O operations |
+| `ext4`      | `/tmp`            | 95.0 GB  | local        | No     | Cleaned up automatically after the job |
+
+## Warm Archive
+
+!!! warning
+    This is under construction. The functionality is not there, yet.
+
+The warm archive is intended as a storage space for the duration of a running HPC-DA project. It
+can NOT substitute a long-term archive. It consists of 20 storage nodes with a net capacity of
+10 PB. Within Taurus (including the HPC-DA nodes), the management software "Quobyte" enables
+access via
+
+- native Quobyte client - read-only from compute nodes, read-write
+  from login and NVMe nodes,
+- S3 - read-write from all nodes,
+- Cinder (from OpenStack cluster).
+
+For external access, you can use:
+
+- S3 to `<bucket>.s3.taurusexport.hrsk.tu-dresden.de`
+- or normal file transfer via our taurusexport nodes (see [Data Management](overview.md)).
+
+An HPC-DA project can apply for storage space in the warm archive. This is limited in capacity and
+duration.
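+
+For illustration, external S3 access could look like the following minimal `rclone` sketch. The
+remote name `warmarchive`, the bucket name `mybucket`, the credentials, and the base endpoint
+`s3.taurusexport.hrsk.tu-dresden.de` are placeholders and assumptions; use the values issued for
+your project.
+
+```Bash
+# Hypothetical values: replace the remote name, bucket, endpoint and keys with those of your project.
+rclone config create warmarchive s3 provider Other \
+    access_key_id YOUR_ACCESS_KEY secret_access_key YOUR_SECRET_KEY \
+    endpoint https://s3.taurusexport.hrsk.tu-dresden.de
+
+# List the bucket content and upload a tar archive into it.
+rclone ls warmarchive:mybucket
+rclone copy results.tar warmarchive:mybucket/
+```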
+
+TODO
+
+## Recommendations for File System Usage
+
+To work as efficiently as possible, consider the following points:
+
+- Save source code etc. in `/home` or `/projects/...`
+- Store checkpoints and other temporary data in `/scratch/ws/...`
+- Compile in `/dev/shm` or `/tmp`
+
+Getting high I/O bandwidth:
+
+- Use many clients
+- Use many processes (writing into the same file at the same time is possible)
+- Use large I/O transfer blocks
+
+## Cheat Sheet for Debugging File System Issues
+
+Every Taurus user should normally be able to perform the following commands to get some insight
+into their data.
+
+### General
+
+For a first overview, you can simply use the `df` command.
+
+```Bash
+df
+```
+
+Alternatively, you can use the `findmnt` command, which can also perform a `df` by adding the
+`-D` parameter.
+
+```Bash
+findmnt -D
+```
+
+Optionally, you can use the `-t` parameter to specify the file system type or the `-o` parameter
+to alter the output.
+
+We do **not recommend** the usage of the `du` command for this purpose, as it can cause issues
+for other users while reading data from the file system.
+
+### BeeGFS
+
+Commands to work with the BeeGFS file system.
+
+#### Capacity and File System Health
+
+View storage and inode capacity and utilization for metadata and storage targets.
+
+```Bash
+beegfs-df -p /beegfs/global0
+```
+
+The `-p` parameter needs to be the mountpoint of the file system and is mandatory.
+
+List storage and inode capacity, reachability and consistency information of each storage target.
+
+```Bash
+beegfs-ctl --listtargets --nodetype=storage --spaceinfo --longnodes --state --mount=/beegfs/global0
+```
+
+To check the capacity of the metadata servers, just toggle the `--nodetype` argument.
+
+```Bash
+beegfs-ctl --listtargets --nodetype=meta --spaceinfo --longnodes --state --mount=/beegfs/global0
+```
+
+#### Striping
+
+View the stripe information of a given file on the file system and show on which storage targets
+the file is stored.
+
+```Bash
+beegfs-ctl --getentryinfo /beegfs/global0/my-workspace/myfile --mount=/beegfs/global0
+```
+
+Set the stripe pattern for a directory. In BeeGFS, the stripe pattern is inherited from a
+directory by its children.
+
+```Bash
+beegfs-ctl --setpattern --chunksize=1m --numtargets=16 /beegfs/global0/my-workspace/ --mount=/beegfs/global0
+```
+
+This will set the stripe pattern for `/beegfs/global0/my-workspace/` to a chunk size of 1M
+distributed over 16 storage targets.
+
+Find files located on certain servers or targets. The following command searches for all files in
+the my-workspace directory that are stored on the storage targets with id 4 or 30.
+
+```Bash
+beegfs-ctl --find /beegfs/global0/my-workspace/ --targetid=4 --targetid=30 --mount=/beegfs/global0
+```
+
+#### Network
+
+View the network addresses of the file system servers.
+
+```Bash
+beegfs-ctl --listnodes --nodetype=meta --nicdetails --mount=/beegfs/global0
+beegfs-ctl --listnodes --nodetype=storage --nicdetails --mount=/beegfs/global0
+beegfs-ctl --listnodes --nodetype=client --nicdetails --mount=/beegfs/global0
+```
+
+Display the connections the client is actually using.
+
+```Bash
+beegfs-net
+```
+
+Display possible connectivity of the services.
+
+```Bash
+beegfs-check-servers -p /beegfs/global0
+```
diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/projects.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/projects.md
new file mode 100644
index 000000000..decdb1fb3
--- /dev/null
+++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/projects.md
@@ -0,0 +1,227 @@
+# File Systems
+
+As soon as you have access to ZIH systems, you have to manage your data. Several file systems are
+available. Each file system serves a special purpose according to its capacity, performance, and
+permanence.
+
+## Permanent File Systems
+
+### Global /home File System
+
+Each user has 50 GB in a `/home` directory, independent of the granted capacity for the project.
+Hints for the usage of the global home directory:
+
+- If you need distinct `.bashrc` files for each machine, you should
+  create separate files for them, named `.bashrc_<machine_name>`.
+- If you use various machines frequently, it might be useful to set
+  the environment variable `HISTFILE` in `.bashrc_deimos` and
+  `.bashrc_mars` to `$HOME/.bash_history_<machine_name>`. Setting
+  `HISTSIZE` and `HISTFILESIZE` to 10000 helps as well.
+- Further, you may use private module files to simplify the process of
+  loading the right installation directories, see
+  **todo link: private modules - AnchorPrivateModule**.
+
+### Global /projects File System
+
+For project data, we have a global project directory that allows better collaboration between the
+members of an HPC project. However, `/projects` is mounted read-only on the compute nodes, because
+it is not a file system for parallel I/O. See below and also check the
+**todo link: HPC introduction - %PUBURL%/Compendium/WebHome/HPC-Introduction.pdf** for more details.
+
+### Backup and Snapshots of the File System
+
+- Backup is **only** available in the `/home` and the `/projects` file systems!
+- Files are backed up using snapshots of the NFS server and can be restored by the user.
+- A changed file can always be recovered as it was at the time of the snapshot.
+- Snapshots are taken:
+  - From Monday through Saturday between 06:00 and 18:00 every two hours and kept for one day
+    (7 snapshots)
+  - From Monday through Saturday at 23:30 and kept for two weeks (12 snapshots)
+  - Every Sunday at 23:45 and kept for 26 weeks
+- To restore a previous version of a file:
+  - Go into the directory of the file you want to restore.
+  - Run `cd .snapshot` (this subdirectory exists in every directory on the `/home` file system
+    although it is not visible with `ls -a`).
+  - All available snapshots are listed in the `.snapshot` directory.
+  - Just `cd` into the directory of the point in time you wish to restore and copy the file you
+    wish to restore to where you want it.
+  - **Attention:** The `.snapshot` directory is not only hidden from normal view (`ls -a`), it is
+    also embedded in a different directory structure. An `ls ../..` will not list the directory
+    where you came from.
+    Thus, we recommend copying the file from the location where it originally resided, e.g. while
+    in `/home/username/directory_a`: `cp .snapshot/timestamp/lostfile lostfile.backup`
+- `/home` and `/projects/` are definitely NOT meant as work directories:
+  since all files are kept in the snapshots and on the backup tapes over a long time, they
+  - needlessly fill the disks and
+  - prevent the backup process from working efficiently by their sheer number and volume.
+
+### Group Quotas for the File System
+
+The quotas of the home file system are meant to help the users to keep track of their data.
+Especially in HPC, it happens that millions of temporary files are created within hours. This is
+the main reason for performance degradation of the file system. If a project exceeds its quota
+(total size OR total number of files), it cannot submit jobs into the batch system. The following
+commands can be used for monitoring:
+
+- `showquota` shows your project's usage of the file system.
+- `quota -s -f /home` shows the user's usage of the file system.
+
+In case a project is above its limits, please ...
+
+- Remove core dumps and temporary data,
+- Talk with your colleagues to identify the hotspots,
+- Check your workflow and use `/tmp` or the scratch file systems for temporary files,
+- *Systematically* handle your important data:
+  - For later use (weeks...months) at the HPC systems, build tar
+    archives with meaningful names or IDs and store them, e.g., in an
+    [archive](intermediate_archive.md).
+  - Refer to the hints for [long term preservation for research data](preservation_research_data.md).
+
+## Work Directories
+
+| File system | Usable directory  | Capacity | Availability | Backup | Remarks |
+|:------------|:------------------|:---------|:-------------|:-------|:--------|
+| `Lustre`    | `/scratch/`       | 4 PB     | global       | No     | Only accessible via **todo link: workspaces - WorkSpaces**. Not made for billions of files! |
+| `Lustre`    | `/lustre/ssd`     | 40 TB    | global       | No     | Only accessible via **todo link: workspaces - WorkSpaces**. For small I/O operations |
+| `BeeGFS`    | `/beegfs/global0` | 232 TB   | global       | No     | Only accessible via **todo link: workspaces - WorkSpaces**. Fastest available file system, only for large parallel applications running with millions of small I/O operations |
+| `ext4`      | `/tmp`            | 95.0 GB  | local        | No     | Cleaned up automatically after the job |
+
+## Warm Archive
+
+!!! warning
+    This is under construction. The functionality is not there, yet.
+
+The warm archive is intended as a storage space for the duration of a running HPC-DA project. It
+can NOT substitute a long-term archive. It consists of 20 storage nodes with a net capacity of
+10 PB. Within Taurus (including the HPC-DA nodes), the management software "Quobyte" enables
+access via
+
+- native Quobyte client - read-only from compute nodes, read-write
+  from login and NVMe nodes,
+- S3 - read-write from all nodes,
+- Cinder (from OpenStack cluster).
+
+For external access, you can use:
+
+- S3 to `<bucket>.s3.taurusexport.hrsk.tu-dresden.de`
+- or normal file transfer via our taurusexport nodes (see [Data Management](overview.md)).
+
+An HPC-DA project can apply for storage space in the warm archive. This is limited in capacity and
+duration.
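+
+For illustration, external S3 access could look like the following minimal `rclone` sketch. The
+remote name `warmarchive`, the bucket name `mybucket`, the credentials, and the base endpoint
+`s3.taurusexport.hrsk.tu-dresden.de` are placeholders and assumptions; use the values issued for
+your project.
+
+```Bash
+# Hypothetical values: replace the remote name, bucket, endpoint and keys with those of your project.
+rclone config create warmarchive s3 provider Other \
+    access_key_id YOUR_ACCESS_KEY secret_access_key YOUR_SECRET_KEY \
+    endpoint https://s3.taurusexport.hrsk.tu-dresden.de
+
+# List the bucket content and upload a tar archive into it.
+rclone ls warmarchive:mybucket
+rclone copy results.tar warmarchive:mybucket/
+```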
+
+TODO
+
+## Recommendations for File System Usage
+
+To work as efficiently as possible, consider the following points:
+
+- Save source code etc. in `/home` or `/projects/...`
+- Store checkpoints and other temporary data in `/scratch/ws/...`
+- Compile in `/dev/shm` or `/tmp`
+
+Getting high I/O bandwidth:
+
+- Use many clients
+- Use many processes (writing into the same file at the same time is possible)
+- Use large I/O transfer blocks
+
+## Cheat Sheet for Debugging File System Issues
+
+Every Taurus user should normally be able to perform the following commands to get some insight
+into their data.
+
+### General
+
+For a first overview, you can simply use the `df` command.
+
+```Bash
+df
+```
+
+Alternatively, you can use the `findmnt` command, which can also perform a `df` by adding the
+`-D` parameter.
+
+```Bash
+findmnt -D
+```
+
+Optionally, you can use the `-t` parameter to specify the file system type or the `-o` parameter
+to alter the output.
+
+We do **not recommend** the usage of the `du` command for this purpose, as it can cause issues
+for other users while reading data from the file system.
+
+### BeeGFS
+
+Commands to work with the BeeGFS file system.
+
+#### Capacity and File System Health
+
+View storage and inode capacity and utilization for metadata and storage targets.
+
+```Bash
+beegfs-df -p /beegfs/global0
+```
+
+The `-p` parameter needs to be the mountpoint of the file system and is mandatory.
+
+List storage and inode capacity, reachability and consistency information of each storage target.
+
+```Bash
+beegfs-ctl --listtargets --nodetype=storage --spaceinfo --longnodes --state --mount=/beegfs/global0
+```
+
+To check the capacity of the metadata servers, just toggle the `--nodetype` argument.
+
+```Bash
+beegfs-ctl --listtargets --nodetype=meta --spaceinfo --longnodes --state --mount=/beegfs/global0
+```
+
+#### Striping
+
+View the stripe information of a given file on the file system and show on which storage targets
+the file is stored.
+
+```Bash
+beegfs-ctl --getentryinfo /beegfs/global0/my-workspace/myfile --mount=/beegfs/global0
+```
+
+Set the stripe pattern for a directory. In BeeGFS, the stripe pattern is inherited from a
+directory by its children.
+
+```Bash
+beegfs-ctl --setpattern --chunksize=1m --numtargets=16 /beegfs/global0/my-workspace/ --mount=/beegfs/global0
+```
+
+This will set the stripe pattern for `/beegfs/global0/my-workspace/` to a chunk size of 1M
+distributed over 16 storage targets.
+
+Find files located on certain servers or targets. The following command searches for all files in
+the my-workspace directory that are stored on the storage targets with id 4 or 30.
+
+```Bash
+beegfs-ctl --find /beegfs/global0/my-workspace/ --targetid=4 --targetid=30 --mount=/beegfs/global0
+```
+
+#### Network
+
+View the network addresses of the file system servers.
+
+```Bash
+beegfs-ctl --listnodes --nodetype=meta --nicdetails --mount=/beegfs/global0
+beegfs-ctl --listnodes --nodetype=storage --nicdetails --mount=/beegfs/global0
+beegfs-ctl --listnodes --nodetype=client --nicdetails --mount=/beegfs/global0
+```
+
+Display the connections the client is actually using.
+
+```Bash
+beegfs-net
+```
+
+Display possible connectivity of the services.
+
+```Bash
+beegfs-check-servers -p /beegfs/global0
+```
--
GitLab