Commit 403ded6f authored by Michael Müller

Merge branch 'Quota' into 'Filesystems'

Quota into Filesystems

See merge request !236
parents 1796aafe b460af38
# Permanent File Systems

Do not use permanent file systems as work directories:

- Even temporary files are kept in the snapshots and on the backup tapes for a long time,
  senselessly filling the disks,
- By their sheer number and volume, work files may keep the backup from working efficiently.
## Global /home File System

Each user has 50 GB in a `/home` directory, independent of the granted capacity for the project.
This file system is mounted with read-write permissions on all HPC systems.

Hints for the usage of the global home directory:

- Do not use your `/home` as work directory: Frequent changes (like temporary output from a
  running job) would fill snapshots and backups (see below).
- If you need distinct `.bashrc` files for each machine, you should create separate files for
  them, named `.bashrc_<machine_name>` (see the sketch below).
- Further, you may use private module files to simplify the process of loading the right
  installation directories, see **todo link: private modules - AnchorPrivateModule**.

If a user exceeds her/his quota (total size OR total number of files), she/he cannot
submit jobs into the batch system. Running jobs are not affected.

!!! note

    We have no feasible way to get the contribution of
    a single user to a project's disk usage.
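To keep per-machine settings separate, as suggested in the hint above, a minimal sketch of a
shared `~/.bashrc` could look like this. Using the short hostname to pick the file is an
assumption; adapt it to the actual machine names.

```bash
# Sketch of a shared ~/.bashrc following the `.bashrc_<machine_name>` convention.

# Short hostname of the system you are logged in to (login node names may carry a
# node number, so you might need to trim it to match your chosen file names).
machine="$(hostname -s)"

# Source machine-specific settings (modules, paths, aliases) if such a file exists,
# e.g. ~/.bashrc_mymachine (placeholder name).
if [ -f "${HOME}/.bashrc_${machine}" ]; then
    . "${HOME}/.bashrc_${machine}"
fi
```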
## Global /projects File System

For project data, we have a global project directory that allows better collaboration between
the members of an HPC project.
Typically, all members of the project have read/write access to that directory.
It can only be written to on the login and export nodes.

On compute nodes, `/projects` is mounted read-only, because it must not be used as a work
directory or for heavy I/O; it is not a file system for parallel I/O.

## Snapshots

Files in `/home` and `/projects` are backed up using snapshots of the NFS server and can be
restored by the user.
A changed file can always be recovered as it was at the time of the snapshot.
These snapshots are taken (subject to change):

- from Monday through Saturday between 06:00 and 18:00 every two hours and kept for one day
  (7 snapshots),
- from Monday through Saturday at 23:30 and kept for two weeks (12 snapshots),
- every Sunday at 23:45 and kept for 26 weeks.

To restore a previous version of a file:

- Go into the directory of the file you want to restore.
- Run `cd .snapshot` (this subdirectory exists in every directory on the `/home` file system,
  although it is not visible with `ls -a`).
- In the `.snapshot` directory, all available snapshots are listed.
- Just `cd` into the directory of the point in time you wish to restore and copy the file you
  wish to restore to where you want it.

!!! note

    The `.snapshot` directory is embedded in a different directory structure. An `ls ../..` will
    not show the directory where you came from. Thus, for your `cp`, you should *use an absolute
    path* as destination, as in the example below.
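A sketch of the restore procedure; the directory, timestamp, and file names below are
placeholders, and the snapshot names that actually exist are shown by listing `.snapshot`.

```bash
# Restore a previous version of a file from a snapshot (placeholder names).
cd /home/username/directory_a   # directory where the file originally resided
ls .snapshot                    # list all available snapshots

# Copy from the chosen snapshot; note the absolute destination path (see note above).
cp .snapshot/2021-06-01_2330/lostfile "${PWD}/lostfile.backup"
```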
## Backup

Just for the eventuality of a major file system crash, we keep tape-based backups of our
permanent file systems (`/home` and `/projects`) for 180 days.

## Quotas

The quotas of the permanent file systems are meant to help the users to keep track of their data.
Especially in HPC, it happens that millions of temporary files are created within hours. This is
the main reason for performance degradation of the file system.

!!! note

    If a quota - project or home - is exceeded (total size OR total number of files),
    job submission is forbidden. Running jobs are not affected.

The following commands can be used for monitoring:

- `showquota` shows your projects' usage of the file system.
- `quota -s -f /home` shows the user's usage of the file system.
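For example, from a login node (a sketch; output formats differ between systems and are not
shown here):

```bash
# Per-project usage of the permanent file systems
showquota

# Your personal usage and limits on /home
quota -s -f /home
```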
In case a quota is above its limits:

- Remove core dumps and temporary data.
- Talk with your colleagues to identify the hotspots.
- Check your workflow and use `/tmp` or the scratch file systems for temporary files.
- *Systematically* handle your important data:
    - For later use (weeks...months) at the HPC systems, build and zip tar archives with
      meaningful names or IDs and store them, e.g., in a workspace in the
      [warm archive](warm_archive.md) or an [archive](intermediate_archive.md)
      (see the sketch below).
    - Refer to the hints for
      [long term preservation for research data](preservation_research_data.md).
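A minimal sketch of the archiving step mentioned in the list above; the archive name, the input
directory, and the workspace path are placeholders, and the actual target depends on how your
workspace was allocated.

```bash
# Pack a finished run into one compressed, meaningfully named archive instead of
# keeping millions of small files (placeholder names throughout).
tar -czf myproject_run042_2021-06.tar.gz results_run042/

# Move the archive into a workspace in the warm archive (placeholder path).
mv myproject_run042_2021-06.tar.gz /warm_archive/ws/myproject-archive/
```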
# Quotas for the home file system

The quotas of the home file system are meant to help the users to keep track of their data.
Especially in HPC, millions of temporary files can be created within hours. We have identified
this as a main reason for performance degradation of the HOME file system. To stay in operation
with our HPC systems, we regrettably have to fall back to this unpopular technique.

Based on a balance between the allotted disk space and the usage over time, reasonable quotas
(mostly above the currently used space) have been defined for the projects. They will be
activated by the end of April 2012.

If a project exceeds its quota (total size OR total number of files), it cannot submit jobs into
the batch system. Running jobs are not affected. The following commands can be used for
monitoring:
- `quota -s -g` shows the file system usage of all groups the user is
a member of.
- `showquota` displays a more convenient output. Use `showquota -h` to
read about its usage. It is not yet available on all machines but we
are working on it.
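For instance (a sketch; on machines where `showquota` is not yet available, fall back to the
plain `quota` command):

```bash
# File system usage of all groups (projects) you are a member of, human-readable
quota -s -g

# More convenient overview, where installed; -h prints brief usage information
showquota -h
```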
**Please note:** We have no quotas for single accounts, but for the project as a whole. There is
no feasible way to get the contribution of a single user to a project's disk usage.
## Alternatives
In case a project is above its limits, please
- remove core dumps, temporary data,
- talk with your colleagues to identify the hotspots,
- check your workflow and use /fastfs for temporary files,
- *systematically* handle your important data:
    - for later use (weeks...months) at the HPC systems, build tar archives with meaningful
      names or IDs and store them in the [DMF system](#AnchorDataMigration). Avoid using this
      system (`/hpc_fastfs`) for files < 1 MB!
    - refer to the hints for
      [long term preservation for research data](../data_lifecycle/preservation_research_data.md).
## No Alternatives
The current situation is this:
- `/home` provides about 50 TB of disk space for all systems. Rapidly
changing files (temporary data) decrease the size of usable disk
space since we keep all files in multiple snapshots for 26 weeks. If
the *number* of files comes into the range of a million, the backup
has problems handling them.
- The work file system for the clusters is `/fastfs`. Here, we have 60
TB disk space (without backup). This is the file system of choice
for temporary data.
- About 180 projects have to share our resources, so it makes no sense
at all to simply move the data from `/home` to `/fastfs` or to
`/hpc_fastfs`.
In case of problems, don't hesitate to ask for support.
@@ -80,7 +80,6 @@ nav:
   - BeeGFS 2: data_lifecycle/bee_gfs.md
   - WarmArchive: data_lifecycle/warm_archive.md
   - Intermediate Archive: data_lifecycle/intermediate_archive.md
-  - Quotas: data_lifecycle/quotas.md
   - Workspaces: data_lifecycle/workspaces.md
   - Preservation of Research Data: data_lifecycle/preservation_research_data.md
   - Structuring Experiments: data_lifecycle/experiments.md