From 935d19303fe5be5ce6d25c123bc23da3f3ad1988 Mon Sep 17 00:00:00 2001 From: Martin Schroschk <martin.schroschk@tu-dresden.de> Date: Thu, 16 Sep 2021 09:53:19 +0200 Subject: [PATCH] WIP: Start review --- .../docs/data_transfer/data_mover.md | 85 ------------------ .../docs/data_transfer/datamover.md | 86 +++++++++++++++++++ .../docs/data_transfer/overview.md | 54 ++++++------ doc.zih.tu-dresden.de/mkdocs.yml | 2 +- 4 files changed, 116 insertions(+), 111 deletions(-) delete mode 100644 doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md create mode 100644 doc.zih.tu-dresden.de/docs/data_transfer/datamover.md diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md b/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md deleted file mode 100644 index 856af9f30..000000000 --- a/doc.zih.tu-dresden.de/docs/data_transfer/data_mover.md +++ /dev/null @@ -1,85 +0,0 @@ -# Transferring files between HPC systems - -We provide a special data transfer machine providing the global file -systems of each ZIH HPC system. This machine is not accessible through -SSH as it is dedicated to data transfers. To move or copy files from one -file system to another file system you have to use the following -commands: - -- **dtcp**, **dtls, dtmv**, **dtrm, dtrsync**, **dttar** - -These commands submit a job to the data transfer machines performing the -selected command. Except the following options their syntax is the same -than the shell command without **dt** prefix (cp, ls, mv, rm, rsync, -tar). - -Additional options: - -| | | -|-------------------|-------------------------------------------------------------------------------| -| --account=ACCOUNT | Assign data transfer job to specified account. | -| --blocking | Do not return until the data transfer job is complete. (default for **dtls**) | -| --time=TIME | Job time limit (default 18h). 
| - -- **dtinfo**, **dtqueue**, **dtq**, **dtcancel** - -**dtinfo** shows information about the nodes of the data transfer -machine (like sinfo). **dtqueue** and **dtq** shows all the data -transfer jobs that belong to you (like squeue -u $USER). **dtcancel** -signals data transfer jobs (like scancel). - -To identify the mount points of the different HPC file systems on the -data transfer machine, please use **dtinfo**. It shows an output like -this (attention, the mount points can change without an update on this -web page) : - -| HPC system | Local directory | Directory on data transfer machine | -|:-------------------|:-----------------|:-----------------------------------| -| Taurus, Venus | /scratch/ws | /scratch/ws | -| | /ssd/ws | /ssd/ws | -| | /warm_archive/ws | /warm_archive/ws | -| | /home | /home | -| | /projects | /projects | -| **Archive** | | /archiv | -| **Group Storages** | | /grp/\<group storage> | - -## How to copy your data from an old scratch (Atlas, Triton, Venus) to our new scratch (Taurus) - -You can use our tool called Datamover to copy your data from A to B. 
- - dtcp -r /scratch/<project or user>/<directory> /projects/<project or user>/<directory> # or - dtrsync -a /scratch/<project or user>/<directory> /lustre/ssd/<project or user>/<directory> - -Options for dtrsync: - - -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) - - -r, --recursive recurse into directories - -l, --links copy symlinks as symlinks - -p, --perms preserve permissions - -t, --times preserve modification times - -g, --group preserve group - -o, --owner preserve owner (super-user only) - -D same as --devices --specials - -Example: - - dtcp -r /scratch/rotscher/results /luste/ssd/rotscher/ # or - new: dtrsync -a /scratch/rotscher/results /home/rotscher/results - -## Examples on how to use data transfer commands: - -Copying data from Taurus' /scratch to Taurus' /projects - - % dtcp -r /scratch/jurenz/results/ /home/jurenz/ - -Moving data from Venus' /sratch to Taurus' /luste/ssd - - % dtmv /scratch/jurenz/results/ /lustre/ssd/jurenz/results - -TGZ data from Taurus' /scratch to the Archive - - % dttar -czf /archiv/jurenz/taurus_results_20140523.tgz /scratch/jurenz/results - -**%RED%Note:<span class="twiki-macro ENDCOLOR"></span>**Please do not -generate files in the archive much larger that 500 GB. diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md b/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md new file mode 100644 index 000000000..4272d7ade --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md @@ -0,0 +1,86 @@ +# Transferring Files Between ZIH Systems + +With the **datamover**, we provide a special data transfer machine for transferring data with best +transfer speed between the filesystems of ZIH systems. The datamover machine is not accessible +through SSH as it is dedicated to data transfers. 
To move or copy files from one filesystem to +another filesystem, you have to use the following commands: + +- `dtcp`, `dtls`, `dtmv`, `dtrm`, `dtrsync`, `dttar`, and `dtwget` + +These commands submit a [batch job](jobs_and_resources/slurm.md) to the data transfer machines +performing the selected command. Except for the following options, their syntax is the same as the +well-known shell commands without the prefix *dt*. + +| Additional Option | Description | +|---------------------|-------------------------------------------------------------------------------| +| `--account=ACCOUNT` | Assign data transfer job to specified account. | +| `--blocking` | Do not return until the data transfer job is complete. (default for `dtls`) | +| `--time=TIME` | Job time limit (default: 18 h). | + +## Managing Transfer Jobs + +Use the commands `dtinfo`, `dtqueue`, `dtq`, and `dtcancel` to manage your transfer commands +and jobs. + +* `dtinfo` shows information about the nodes of the data transfer machine (like `sinfo`). +* `dtqueue` and `dtq` show all your data transfer jobs (like `squeue -u $USER`). +* `dtcancel` signals data transfer jobs (like `scancel`). + +To identify the mount points of the different filesystems on the data transfer machine, use +`dtinfo`. It shows an output like this: + +| ZIH system | Local directory | Directory on data transfer machine | +|:-------------------|:-------------------|:-----------------------------------| +| Taurus | `/scratch/ws` | `/scratch/ws` | +| | `/ssd/ws` | `/ssd/ws` | +| | `/warm_archive/ws` | `/warm_archive/ws` | +| | `/home` | `/home` | +| | `/projects` | `/projects` | +| **Archive** | | `/archiv` | +| **Group Storages** | | `/grp/<group storage>` | + +## How to copy your data from an old scratch (Atlas, Triton, Venus) to our new scratch (Taurus) + +You can use our tool called Datamover to copy your data from A to B. 
+ + dtcp -r /scratch/<project or user>/<directory> /projects/<project or user>/<directory> # or + dtrsync -a /scratch/<project or user>/<directory> /lustre/ssd/<project or user>/<directory> + +Options for dtrsync: + + -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) + + -r, --recursive recurse into directories + -l, --links copy symlinks as symlinks + -p, --perms preserve permissions + -t, --times preserve modification times + -g, --group preserve group + -o, --owner preserve owner (super-user only) + -D same as --devices --specials + +Example: + + dtcp -r /scratch/rotscher/results /lustre/ssd/rotscher/ # or + new: dtrsync -a /scratch/rotscher/results /home/rotscher/results + +## Examples on how to use data transfer commands: + +Copying data from Taurus' /scratch to Taurus' /projects + + % dtcp -r /scratch/jurenz/results/ /home/jurenz/ + +Moving data from Venus' /scratch to Taurus' /lustre/ssd + + % dtmv /scratch/jurenz/results/ /lustre/ssd/jurenz/results + +TGZ data from Taurus' /scratch to the Archive + + % dttar -czf /archiv/jurenz/taurus_results_20140523.tgz /scratch/jurenz/results + +**Note:** Please do not +generate files in the archive much larger than 500 GB. + +!!! hint + + The [warm archive](../data_lifecycle/warm_archive.md) is not writable from within batch jobs. + However, you can store the data in the warm archive with the datamover. diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/overview.md b/doc.zih.tu-dresden.de/docs/data_transfer/overview.md index 3f92972f3..0270effeb 100644 --- a/doc.zih.tu-dresden.de/docs/data_transfer/overview.md +++ b/doc.zih.tu-dresden.de/docs/data_transfer/overview.md @@ -1,37 +1,41 @@ # Transfer of Data -## Moving data to/from the HPC Machines +## Moving Data to/from ZIH Systems -To copy data to/from the HPC machines, the Taurus export nodes should be used as a preferred way. 
-There are three possibilities to exchanging data between your local machine (lm) and the HPC -machines (hm): SCP, RSYNC, SFTP. Type following commands in the terminal of the local machine. The -SCP command was used for the following example. Copy data from lm to hm +There are at least three tools to exchange data between your local machine (lm) and ZIH systems: +`scp`, `rsync`, and `sftp`. Please refer to the offline or online man pages of +[scp](https://www.man7.org/linux/man-pages/man1/scp.1.html), +[rsync](https://man7.org/linux/man-pages/man1/rsync.1.html), and +[sftp](https://man7.org/linux/man-pages/man1/sftp.1.html) for detailed information. -```Bash -# Copy file from your local machine. For example: scp helloworld.txt mustermann@taurusexport.hrsk.tu-dresden.de:/scratch/ws/mastermann-Macine_learning_project/ -scp <file> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> +!!! hint -scp -r <directory> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> #Copy directory from your local machine. -``` + No matter what tool you prefer, it is crucial that the **export nodes** are used as the preferred way to + copy data to/from ZIH systems. -Copy data from hm to lm +!!! example "Example using `scp` to copy a file from your workstation to ZIH systems" -```Bash -# Copy file. For example: scp mustermann@taurusexport.hrsk.tu-dresden.de:/scratch/ws/mastermann-Macine_learning_project/helloworld.txt /home/mustermann/Downloads -scp <zih-user>@taurusexport.hrsk.tu-dresden.de:<file> <target-location> + ```console + marie@local$ scp <file> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> -scp -r <zih-user>@taurusexport.hrsk.tu-dresden.de:<directory> <target-location> #Copy directory -``` + # Add -r to copy whole directory + marie@local$ scp -r <directory> <zih-user>@taurusexport.hrsk.tu-dresden.de:<target-location> + ``` -## Moving data inside the HPC machines: Datamover +!!! 
example "Example using `scp` to copy a file from ZIH systems to your workstation" -The best way to transfer data inside the Taurus is the datamover. It is the special data transfer -machine provides the best data speed. To load, move, copy etc. files from one file system to another -file system, you have to use commands with dt prefix, such as: dtcp, dtwget, dtmv, dtrm, dtrsync, -dttar, dtls. These commands submit a job to the data transfer machines that execute the selected -command. Except for the 'dt' prefix, their syntax is the same as the shell command without the 'dt'. +```console +marie@local$ scp <zih-user>@taurusexport.hrsk.tu-dresden.de:<file> <target-location> + +# Add -r to copy whole directory +marie@local$ scp -r <zih-user>@taurusexport.hrsk.tu-dresden.de:<directory> <target-location> +``` -Keep in mind: The warm_archive is not writable for jobs. However, you can store the data in the warm -archive with the datamover. +## Moving Data Inside ZIH Systems: Datamover -Useful links: [Data Mover]**todo link**, [Export Nodes]**todo link** +The recommended way for data transfer inside ZIH systems is the **datamover**. It is a special +data transfer machine that provides the best transfer speed. To load, move, copy etc. files from one +filesystem to another filesystem, you have to use commands prefixed with `dt`: `dtcp`, `dtwget`, +`dtmv`, `dtrm`, `dtrsync`, `dttar`, `dtls`. These commands submit a job to the data transfer machines that +execute the selected command. +Please refer to the detailed documentation regarding the [datamover](datamover.md). 
diff --git a/doc.zih.tu-dresden.de/mkdocs.yml b/doc.zih.tu-dresden.de/mkdocs.yml index a1ea32410..93dd9c0be 100644 --- a/doc.zih.tu-dresden.de/mkdocs.yml +++ b/doc.zih.tu-dresden.de/mkdocs.yml @@ -18,7 +18,7 @@ nav: - Security Restrictions: access/security_restrictions.md - Transfer of Data: - Overview: data_transfer/overview.md - - Data Mover: data_transfer/data_mover.md + - Data Mover: data_transfer/datamover.md - Export Nodes: data_transfer/export_nodes.md - Environment and Software: - Overview: software/overview.md -- GitLab