Commit ffda7d33 authored by Sebastian Döbel

enhance ratarmount

parent a3296869
2 merge requests: !1101 Automated merge from preview to main, !1091 ratarmount workflow
@@ -240,7 +240,6 @@ archive or in your home directory if the archive is in a non-writable location.
Subsequent mounts instantly load that sidecar file instead of reanalyzing the archive.
You will find further information on the [GitHub page](https://github.com/mxmlnkn/ratarmount).
#### Example Workflow for using Ratarmount
Ratarmount is installed globally on the HPC system.
@@ -255,12 +254,13 @@ marie@local$ tar cf dataset.tar folder_containing_my_small_files
marie@compute$ dttar cf dataset.tar folder_containing_my_small_files
```
For the latter, please make sure that you are on a datamover node and **not** on a login node. Depending on the number of files, the tar bundle process may take some time.
For the latter, please make sure that you are on a [Datamover node](../data_transfer/datamover.md)
and **not** on a login node.
Depending on the number of files, the tar bundle process may take some time.
We do not recommend to compress (e.g. gzip) the archive, as this will decrease the read performance substantially
We do not recommend compressing the archive (e.g. with Gzip), as this can decrease the read performance substantially,
e.g. for images, audio and video files.
Once the tar archive has been created, you can mount it on the compute node using `ratarmount`.
All files in the mount points can be accessed as normal files or directories
in the filesystem without any special treatment.
@@ -269,9 +269,17 @@ Note that the tar archive must be mounted on every compute node in your job.
!!! note
Mounting an archive for the first time can take some time because Ratarmount has to create an index of its contents to access it efficiently.
The index, named `.<name_of_the_archive>.index.sqlite`, will be placed
in the same directory as the archive if the directory is writable,
otherwise ratarmount will try to place the index in your home directory.
This indexing step could be done in a separate job to save resources.
It also prevents conflicting indexing by more than one process at the same time.
```bash
sbatch --ntasks=1 --mem=10G --time=5:00:00 ratarmount dataset.tar
```
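The separate indexing step described above could also be wrapped in a small helper that mounts the archive once and unmounts it again right away. This is only a minimal sketch: the function name `build_index` and the temporary mount point created with `mktemp -d` are assumptions for illustration and are not part of Ratarmount or of the documented workflow.

```bash
#!/bin/bash
# Hypothetical helper (not part of Ratarmount): mount the archive once so
# that Ratarmount writes its index, then unmount and clean up again.
build_index() {
    local archive="$1"
    local mnt
    # Assumption: a throwaway mount point is sufficient for indexing.
    mnt="$(mktemp -d)" || return 1
    # The first mount triggers the index creation.
    ratarmount "${archive}" "${mnt}" || { rmdir "${mnt}"; return 1; }
    # Unmount again; the index file stays next to the archive
    # (or in the home directory if that location is not writable).
    ratarmount -u "${mnt}"
    rmdir "${mnt}"
}
```

Inside a batch script with a single task, the helper would be called as `build_index dataset.tar` before any job that mounts the archive on several nodes.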
!!! example Example job script using Ratarmount
!!! example "Example job script using Ratarmount"
```bash
#!/bin/bash
@@ -287,7 +295,7 @@ Note that the tar archive must be mounted on every compute node in your job.
srun --ntasks-per-node=1 ratarmount dataset.tar ${DATASET}
# now it can be accessed like a normal directory
srun -ntasks=1 ls ${DATASET}
srun --ntasks=1 ls ${DATASET}
# start the application
srun ./my_application --input-directory ${DATASET}
@@ -296,28 +304,14 @@ Note that the tar archive must be mounted on every compute node in your job.
srun --ntasks-per-node=1 ratarmount -u ${DATASET}
```
The index, named `.<name_of_the_archive>.index.sqlite`, will be placed
in the same directory as the archive if the directory is writable,
otherwise ratarmount will try to place the index in your home directory.
This indexing step could be done in a separate job to save resources.
It also prevents conflicting indexing by more than one process at the same time.
```bash
sbatch --ntasks=1 --mem=10G --time=5:00:00 ratarmount dataset.tar
```
!!! hint
If you are starting many processes per node, Ratarmount could benefit from
having individual mount points for each process, rather than just one per node.
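One possible way to give every process its own mount point is to derive the directory name from the Slurm variables `SLURM_JOB_ID` and `SLURM_PROCID`. The following is a sketch under assumptions: the function names, the `/tmp` base directory, and the `dataset_` prefix are hypothetical, and the `--input-directory` option is taken from the example job script.

```bash
#!/bin/bash
# Hypothetical helper: derive a private mount point for one Slurm task.
# SLURM_JOB_ID and SLURM_PROCID are set by Slurm inside a job step; the
# fallbacks only keep the function usable outside of a job.
task_mount_point() {
    local base="${1:-/tmp}"
    echo "${base}/dataset_${SLURM_JOB_ID:-nojob}_${SLURM_PROCID:-0}"
}

# Hypothetical wrapper: mount the archive, run the given command on the
# mounted directory, and unmount afterwards even if the command failed.
run_with_private_mount() {
    local archive="$1"; shift
    local mnt
    mnt="$(task_mount_point)"
    mkdir -p "${mnt}" || return 1
    ratarmount "${archive}" "${mnt}" || return 1
    "$@" --input-directory "${mnt}"
    local status=$?
    ratarmount -u "${mnt}"
    return "${status}"
}
```

If these functions are stored in a wrapper script that ends with `run_with_private_mount "$@"` (here assumed to be called `private_mount.sh`), each task could be started with `srun ./private_mount.sh dataset.tar ./my_application`.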
In case of Ratarmount issues
please [open an issue](https://github.com/mxmlnkn/ratarmount/issues) on GitHub.
There is also a library interface called
[ratarmountcore](https://github.com/mxmlnkn/ratarmount/tree/master/core#example) that works
fully without FUSE, which might make access to files from Python even faster.
\ No newline at end of file
fully without FUSE, which might make access to files from Python even faster.