diff --git a/doc.zih.tu-dresden.de/docs/data_management/FileSystems.md b/doc.zih.tu-dresden.de/docs/data_management/FileSystems.md index 5c54797a48eae3f74c8d7435318ba6f16a5d6eef..ecd1919b5014b1dcbbf5d54cbd29a98c6485c211 100644 --- a/doc.zih.tu-dresden.de/docs/data_management/FileSystems.md +++ b/doc.zih.tu-dresden.de/docs/data_management/FileSystems.md @@ -1,95 +1,79 @@ -# File systems - -## Permanent file systems - -### Global /home file system - -Each user has 50 GB in his /home directory independent of the granted -capacity for the project. Hints for the usage of the global home -directory: - -- If you need distinct `.bashrc` files for each machine, you should - create separate files for them, named `.bashrc_<machine_name>` -- If you use various machines frequently, it might be useful to set - the environment variable HISTFILE in `.bashrc_deimos` and - `.bashrc_mars` to `$HOME/.bash_history_<machine_name>`. Setting - HISTSIZE and HISTFILESIZE to 10000 helps as well. -- Further, you may use private module files to simplify the process of - loading the right installation directories, see - **todo link: private modules - AnchorPrivateModule**. - -### Global /projects file system - -For project data, we have a global project directory, that allows better -collaboration between the members of an HPC project. However, for -compute nodes /projects is mounted as read-only, because it is not a -filesystem for parallel I/O. See below and also check the -**todo link: HPC introduction - %PUBURL%/Compendium/WebHome/HPC-Introduction.pdf** for more -details. - -### Backup and snapshots of the file system - -- Backup is **only** available in the `/home` and the `/projects` file - systems! -- Files are backed up using snapshots of the NFS server and can be - restored by the user -- A changed file can always be recovered as it was at the time of the - snapshot -- Snapshots are taken: - - from Monday through Saturday between 06:00 and 18:00 every two - hours and kept for one day (7 snapshots) - - from Monday through Saturday at 23:30 and kept for two weeks (12 - snapshots) - - every Sunday st 23:45 and kept for 26 weeks -- to restore a previous version of a file: - - go into the directory of the file you want to restore - - run `cd .snapshot` (this subdirectory exists in every directory - on the /home file system although it is not visible with - `ls -a`) - - in the .snapshot-directory are all available snapshots listed - - just `cd` into the directory of the point in time you wish to - restore and copy the file you wish to restore to where you want - it - - \*Attention\* The .snapshot directory is not only hidden from - normal view (`ls -a`), it is also embedded in a different - directory structure. An \<span class="WYSIWYG_TT">ls - ../..\</span>will not list the directory where you came from. - Thus, we recommend to copy the file from the location where it - originally resided: \<pre>% pwd /home/username/directory_a % cp - .snapshot/timestamp/lostfile lostfile.backup \</pre> -- /home and /projects/ are definitely NOT made as a work directory: - since all files are kept in the snapshots and in the backup tapes - over a long time, they - - senseless fill the disks and - - prevent the backup process by their sheer number and volume from - working efficiently. - -### Group quotas for the file system - -The quotas of the home file system are meant to help the users to keep -in touch with their data. Especially in HPC, it happens that millions of -temporary files are created within hours. 
This is the main reason for
-performance degradation of the file system. If a project exceeds its
-quota (total size OR total number of files) it cannot submit jobs into
-the batch system. The following commands can be used for monitoring:
-
-- `showquota` shows your projects' usage of the file system.
-- `quota -s -f /home` shows the user's usage of the file system.
-
-In case a project is above it's limits please...
-
-- remove core dumps, temporary data
-- talk with your colleagues to identify the hotspots,
-- check your workflow and use /tmp or the scratch file systems for
-  temporary files
-- *systematically*handle your important data:
-  - For later use (weeks...months) at the HPC systems, build tar
-    archives with meaningful names or IDs and store e.g. them in an
-    [archive](IntermediateArchive.md).
-  - refer to the hints for [long term preservation for research
-    data](PreservationResearchData.md).
-
-## Work directories
+# File Systems
+
+## Permanent File Systems
+
+### Global /home File System
+
+Each user has 50 GB in their `/home` directory, independent of the granted capacity for the project.
+Hints for the usage of the global home directory:
+
+- If you need distinct `.bashrc` files for each machine, you should create separate files for
+  them, named `.bashrc_<machine_name>`.
+- If you use various machines frequently, it might be useful to set the environment variable
+  `HISTFILE` in `.bashrc_deimos` and `.bashrc_mars` to `$HOME/.bash_history_<machine_name>`.
+  Setting `HISTSIZE` and `HISTFILESIZE` to 10000 helps as well.
+- Further, you may use private module files to simplify the process of loading the right
+  installation directories, see **todo link: private modules - AnchorPrivateModule**.
+
+### Global /projects File System
+
+For project data, we have a global project directory that allows better collaboration between the
+members of an HPC project. However, on the compute nodes `/projects` is mounted read-only, because
+it is not a file system for parallel I/O. See below and also check the
+**todo link: HPC introduction - %PUBURL%/Compendium/WebHome/HPC-Introduction.pdf** for more details.
+
+### Backup and Snapshots of the File System
+
+- Backup is **only** available in the `/home` and the `/projects` file systems!
+- Files are backed up using snapshots of the NFS server and can be restored by the user
+- A changed file can always be recovered as it was at the time of the snapshot
+- Snapshots are taken:
+    - From Monday through Saturday between 06:00 and 18:00 every two hours and kept for one day
+      (7 snapshots)
+    - From Monday through Saturday at 23:30 and kept for two weeks (12 snapshots)
+    - Every Sunday at 23:45 and kept for 26 weeks
+- To restore a previous version of a file:
+    - Go into the directory of the file you want to restore
+    - Run `cd .snapshot` (this subdirectory exists in every directory on the `/home` file system
+      although it is not visible with `ls -a`)
+    - All available snapshots are listed in the `.snapshot` directory
+    - Just `cd` into the directory of the point in time you wish to restore from and copy the file
+      you wish to restore to wherever you want it
+    - **Attention:** The `.snapshot` directory is not only hidden from normal view (`ls -a`), it is
+      also embedded in a different directory structure. An `ls ../..` will not list the directory
+      where you came from. Thus, we recommend copying the file from the location where it
+      originally resided, e.g. from within `/home/username/directory_a` run
+      `cp .snapshot/<timestamp>/lostfile lostfile.backup` (see the sketch after this list)
+- `/home` and `/projects` are definitely NOT meant as work directories:
+  since all files are kept in the snapshots and on the backup tapes over a long time, they
+    - Senselessly fill the disks and
+    - Prevent the backup process from working efficiently through their sheer number and volume.
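+
+A minimal sketch of the restore procedure described above; `directory_a`, `lostfile`, and the
+snapshot timestamp are placeholders you have to adapt:
+
+```Bash
+# go to the directory that held the lost file and list the available snapshots
+cd /home/username/directory_a
+ls .snapshot
+# copy the file back from the snapshot of the desired point in time
+cp .snapshot/<timestamp>/lostfile lostfile.backup
+```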
+
+### Group Quotas for the File System
+
+The quotas of the home file system are meant to help the users keep track of their data.
+Especially in HPC, it happens that millions of temporary files are created within hours. This is the
+main reason for performance degradation of the file system. If a project exceeds its quota (total
+size OR total number of files) it cannot submit jobs into the batch system. The following commands
+can be used for monitoring:
+
+- `showquota` shows your projects' usage of the file system.
+- `quota -s -f /home` shows the user's usage of the file system.
+
+In case a project is above its limits, please ...
+
+- Remove core dumps and temporary data
+- Talk with your colleagues to identify the hotspots
+- Check your workflow and use `/tmp` or the scratch file systems for temporary files
+- *Systematically* handle your important data:
+    - For later use (weeks...months) at the HPC systems, build tar archives with meaningful names
+      or IDs and store them, e.g., in an [archive](IntermediateArchive.md).
+    - Refer to the hints for [long term preservation for research data](PreservationResearchData.md).
+
+## Work Directories
 
 | File system | Usable directory | Capacity | Availability | Backup | Remarks |
 |:------------|:------------------|:---------|:-------------|:-------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -98,98 +82,106 @@ In case a project is above it's limits please...
 | `BeeGFS` | `/beegfs/global0` | 232 TB | global | No | Only accessible via **todo link: workspaces - WorkSpaces**. Fastest available file system, only for large parallel applications running with millions of small I/O operations |
 | `ext4` | `/tmp` | 95.0 GB | local | No | is cleaned up after the job automatically |
 
-### Large files in /scratch
+### Large Files in /scratch
 
-The data containers in Lustre are called object storage targets (OST).
-The capacity of one OST is about 21 TB. All files are striped over a
-certain number of these OSTs. For small and medium files, the default
-number is 2. As soon as a file grows above \~1 TB it makes sense to
-spread it over a higher number of OSTs, eg. 16. Once the file system is
-used \> 75%, the average space per OST is only 5 GB. So, it is essential
-to split your larger files so that the chunks can be saved!
+The data containers in Lustre are called object storage targets (OST). The capacity of one OST is
+about 21 TB. All files are striped over a certain number of these OSTs. For small and medium files,
+the default number is 2. As soon as a file grows above \~1 TB it makes sense to spread it over a
+higher number of OSTs, e.g. 16. Once the file system is used \> 75%, the average space per OST is
+only 5 GB. So, it is essential to split your larger files so that the chunks can be saved!
 
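+To check how full the file system and its OSTs currently are, you can use `lfs df` (described in
+more detail in the cheat sheet below). A minimal sketch, assuming `/scratch` as the mount point:
+
+```Bash
+lfs df -h /scratch
+```
+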
-Lets assume you have a dierctory where you tar your results, eg.
-`/scratch/mark/tar` . Now, simply set the stripe count to a higher
-number in this directory with:
-
-    lfs setstripe -c 20 /scratch/ws/mark-stripe20/tar
+Let's assume you have a directory where you tar your results, e.g. `/scratch/mark/tar`. Now, simply
+set the stripe count to a higher number in this directory with:
+```Bash
+lfs setstripe -c 20 /scratch/ws/mark-stripe20/tar
+```
 
-%RED%Note:<span class="twiki-macro ENDCOLOR"></span> This does not
-affect existing files. But all files that **will be created** in this
+**Note:** This does not affect existing files. But all files that **will be created** in this
 directory will be distributed over 20 OSTs.
 
-## Warm archive
+## Warm Archive
 
 TODO
 
-## Recommendations for file system usage
+## Recommendations for File System Usage
 
-To work as efficient as possible, consider the following points
+To work as efficiently as possible, consider the following points:
 
-- Save source code etc. in `/home` or /projects/...
-- Store checkpoints and other temporary data in `/scratch/ws/...`
-- Compilation in `/dev/shm` or `/tmp`
+- Save source code etc. in `/home` or `/projects/...`
+- Store checkpoints and other temporary data in `/scratch/ws/...`
+- Compilation in `/dev/shm` or `/tmp`
 
-Getting high I/O-bandwitdh
+To get high I/O bandwidth:
 
-- Use many clients
-- Use many processes (writing in the same file at the same time is
-  possible)
-- Use large I/O transfer blocks
+- Use many clients
+- Use many processes (writing in the same file at the same time is possible)
+- Use large I/O transfer blocks
 
-## Cheat Sheet for debugging file system issues
+## Cheat Sheet for Debugging File System Issues
 
-Every Taurus-User should normaly be able to perform the following
-commands to get some intel about theire data.
+Every Taurus user should normally be able to perform the following commands to get some
+information about their data.
 
 ### General
 
-For the first view, you can easily use the "df-command".
+For a first overview, you can simply use the `df` command.
-    df
+```Bash
+df
+```
 
-Alternativly you can use the "findmnt"-command, which is also able to
-perform an "df" by adding the "-D"-parameter.
+Alternatively, you can use the `findmnt` command, which can also imitate the output of `df` when
+you add the `-D` parameter.
-    findmnt -D
+```Bash
+findmnt -D
+```
 
-Optional you can use the "-t"-parameter to specify the fs-type or the
-"-o"-parameter to alter the output.
+Optionally, you can use the `-t` parameter to specify the file system type or the `-o` parameter
+to alter the output.
 
-We do **not recommend** the usage of the "du"-command for this purpose.
-It is able to cause issues for other users, while reading data from the
-filesystem.
+We do **not recommend** the usage of the `du` command for this purpose. It can cause issues for
+other users while it reads data from the file system.
 
-### Lustre file system
+### Lustre File System
 
-These commands work for /scratch and /ssd.
+These commands work for `/scratch` and `/ssd`.
 
-#### Listing disk usages per OST and MDT
+#### Listing Disk Usage per OST and MDT
 
-    lfs quota -h -u username /path/to/my/data
+```Bash
+lfs quota -h -u username /path/to/my/data
+```
 
-It is possible to display the usage on each OST by adding the
-"-v"-parameter.
+It is possible to display the usage on each OST by adding the `-v` parameter.
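+
+For example, a minimal sketch that shows your own usage broken down per OST (assuming `/scratch`
+as the file system of interest; replace `username` accordingly):
+
+```Bash
+lfs quota -h -v -u username /scratch
+```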
 
-#### Listing space usage per OST and MDT
+#### Listing Space Usage per OST and MDT
 
-    lfs df -h /path/to/my/data
+```Bash
+lfs df -h /path/to/my/data
+```
 
-#### Listing inode usage for an specific path
+#### Listing Inode Usage for a Specific Path
 
-    lfs df -i /path/to/my/data
+```Bash
+lfs df -i /path/to/my/data
+```
 
 #### Listing OSTs
 
-    lfs osts /path/to/my/data
+```Bash
+lfs osts /path/to/my/data
+```
 
-#### View striping information
+#### View Striping Information
 
-    lfs getstripe myfile
-    lfs getstripe -d mydirectory
+```Bash
+lfs getstripe myfile
+lfs getstripe -d mydirectory
+```
 
-The "-d"-parameter will also display striping for all files in the
-directory
+The `-d` parameter will also display striping for all files in the directory.
 
 ### BeeGFS
 
 Commands to work with the BeeGFS file system.
 
-#### Capacity and file system health
+#### Capacity and File System Health
 
-View storage and inode capacity and utilization for metadata and storage
-targets.
+View storage and inode capacity and utilization for metadata and storage targets.
-    beegfs-df -p /beegfs/global0
+```Bash
+beegfs-df -p /beegfs/global0
+```
 
-The "-p" parameter needs to be the mountpoint of the file system and is
-mandatory.
+The `-p` parameter needs to be the mount point of the file system and is mandatory.
 
-List storage and inode capacity, reachability and consistency
-information of each storage target.
+List storage and inode capacity, reachability and consistency information of each storage target.
-    beegfs-ctl --listtargets --nodetype=storage --spaceinfo --longnodes --state --mount=/beegfs/global0
+```Bash
+beegfs-ctl --listtargets --nodetype=storage --spaceinfo --longnodes --state --mount=/beegfs/global0
+```
 
-To check the capacity of the metadata server just toggle the
-"--nodetype" argument.
+To check the capacity of the metadata server, just set the `--nodetype` argument to `meta`.
-    beegfs-ctl --listtargets --nodetype=meta --spaceinfo --longnodes --state --mount=/beegfs/global0
+```Bash
+beegfs-ctl --listtargets --nodetype=meta --spaceinfo --longnodes --state --mount=/beegfs/global0
+```
 
 #### Striping
 
-View the stripe information of a given file on the file system and shows
-on which storage target the file is stored.
+View the stripe information of a given file on the file system; the output shows on which storage
+targets the file is stored.
-    beegfs-ctl --getentryinfo /beegfs/global0/my-workspace/myfile --mount=/beegfs/global0
+```Bash
+beegfs-ctl --getentryinfo /beegfs/global0/my-workspace/myfile --mount=/beegfs/global0
+```
 
-Set the stripe pattern for an directory. In BeeGFS the stripe pattern
-will be inherited form a directory to its children.
+Set the stripe pattern for a directory. In BeeGFS, the stripe pattern of a directory is inherited
+by its children.
-    beegfs-ctl --setpattern --chunksize=1m --numtargets=16 /beegfs/global0/my-workspace/ --mount=/beegfs/global0
+```Bash
+beegfs-ctl --setpattern --chunksize=1m --numtargets=16 /beegfs/global0/my-workspace/ --mount=/beegfs/global0
+```
 
-This will set the stripe pattern for "/beegfs/global0/path/to/mydir/" to
-a chunksize of 1M distributed over 16 storage targets.
+This will set the stripe pattern for `/beegfs/global0/my-workspace/` to a chunk size of 1 MB,
+distributed over 16 storage targets.
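+
+To verify that the new pattern is active, you can reuse the `--getentryinfo` command from above on
+the directory itself (a sketch, assuming the same workspace path):
+
+```Bash
+beegfs-ctl --getentryinfo /beegfs/global0/my-workspace/ --mount=/beegfs/global0
+```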
 
-Find files located on certain server or targets. The following command
-searches all files that are stored on the storage targets with id 4 or
-30 und my-workspace directory.
+Find files located on certain servers or targets. The following command searches all files that are
+stored on the storage targets with ID 4 or 30 in the my-workspace directory.
-    beegfs-ctl --find /beegfs/global0/my-workspace/ --targetid=4 --targetid=30 --mount=/beegfs/global0
+```Bash
+beegfs-ctl --find /beegfs/global0/my-workspace/ --targetid=4 --targetid=30 --mount=/beegfs/global0
+```
 
 #### Network
 
 View the network addresses of the file system servers.
 
-    beegfs-ctl --listnodes --nodetype=meta --nicdetails --mount=/beegfs/global0
-    beegfs-ctl --listnodes --nodetype=storage --nicdetails --mount=/beegfs/global0
-    beegfs-ctl --listnodes --nodetype=client --nicdetails --mount=/beegfs/global0
+```Bash
+beegfs-ctl --listnodes --nodetype=meta --nicdetails --mount=/beegfs/global0
+beegfs-ctl --listnodes --nodetype=storage --nicdetails --mount=/beegfs/global0
+beegfs-ctl --listnodes --nodetype=client --nicdetails --mount=/beegfs/global0
+```
 
-Display connections the client is actually using
+Display the connections the client is currently using:
-    beegfs-net
+```Bash
+beegfs-net
+```
 
-Display possible connectivity of the services
+Display the possible connectivity of the services:
-    beegfs-check-servers -p /beegfs/global0
+```Bash
+beegfs-check-servers -p /beegfs/global0
+```