From 540446bf97d5b81229b747834707b45fbc890471 Mon Sep 17 00:00:00 2001
From: lazariv <taras.lazariv@tu-dresden.de>
Date: Fri, 27 Aug 2021 10:56:46 +0000
Subject: [PATCH] Fix checks

---
 .../docs/archive/install_jupyter.md           | 16 +++++-----
 .../docs/data_lifecycle/overview.md           | 29 +++++++++----------
 .../docs/software/overview.md                 |  6 ++--
 doc.zih.tu-dresden.de/wordlist.aspell         |  1 +
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/doc.zih.tu-dresden.de/docs/archive/install_jupyter.md b/doc.zih.tu-dresden.de/docs/archive/install_jupyter.md
index 993870ead..72dc74fe0 100644
--- a/doc.zih.tu-dresden.de/docs/archive/install_jupyter.md
+++ b/doc.zih.tu-dresden.de/docs/archive/install_jupyter.md
@@ -1,7 +1,7 @@
 # Jupyter Installation
 
 Jupyter notebooks are a great way for interactive computing in your web browser. Jupyter allows
-working with data cleaning and transformation, numerical simulation, statistical modelling, data
+working with data cleaning and transformation, numerical simulation, statistical modeling, data
 visualization and of course with machine learning.
 
 There are two general options on how to work Jupyter notebooks using HPC: remote Jupyter server and
@@ -26,7 +26,7 @@ environment:
 srun --pty -n 1 --cpus-per-task=2 --time=2:00:00 --mem-per-cpu=2500 --x11=first bash -l -i
 ```
 
-Create a new subdirectory in your home, e.g. Jupyter
+Create a new directory in your home, e.g. Jupyter
 ```Bash
 mkdir Jupyter
 cd Jupyter
@@ -52,7 +52,7 @@ Anaconda3-2019.03-Linux-x86_64.sh
 ./Anaconda3-2019.03-Linux-x86_64.sh
 ```
 Next step will install the anaconda environment into the home
-directory (/home/userxx/anaconda3). Create a new anaconda environment with the name "jnb".
+directory (`/home/userxx/anaconda3`). Create a new anaconda environment with the name `jnb`.
 
 ```Bash
 conda create --name jnb
@@ -67,8 +67,8 @@ deactivate it also manually) and install Jupyter packages for this python enviro
 source activate jnb
 conda install jupyter
 ```
-If you need to adjust the configuration, you should create the template. Generate config files for
-Jupyter notebook server:
+If you need to adjust the configuration, you should create the template. Generate configuration
+files for Jupyter notebook server:
 
 ```Bash
 jupyter notebook --generate-config
@@ -91,7 +91,7 @@ You get a message like that:
 /home/<zih_user>/.jupyter/jupyter_notebook_config.json
 ```
 
-I order to create an SSL certificate for secure connections, you can create a self-signed
+I order to create a certificate for secure connections, you can create a self-signed
 certificate:
 
 ```Bash
@@ -100,7 +100,7 @@ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mykey.key -out mycer
 
 Fill in the form with decent values.
 
-Possible entries for your Jupyter config (`.jupyter/jupyter_notebook*config.py*`).
+Possible entries for your Jupyter configuration (`.jupyter/jupyter_notebook*config.py*`).
 ```Bash
 c.NotebookApp.certfile = u'<path-to-cert>/mycert.pem'
 c.NotebookApp.keyfile =
@@ -127,7 +127,7 @@ unset XDG_RUNTIME_DIR # might be required when interactive instead of sbatch t
 'Permission denied error'
 srun jupyter notebook
 ```
-Start the script above (e.g. with the name jnotebook) with sbatch command:
+Start the script above (e.g. with the name `jnotebook`) with sbatch command:
 
 ```Bash
 sbatch jnotebook.slurm
diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md
index 0ef2d03db..e85307724 100644
--- a/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md
+++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md
@@ -24,13 +24,13 @@ properly:
 * use `workspaces` as a place for working data (i.e. datasets); Recommendations of choosing the
   correct storage system for workspace presented below.
 
-### Taxonomy of File Systems
+### Taxonomy of Filesystems
 
 It is important to design your data workflow according to characteristics, like I/O footprint
 (bandwidth/IOPS) of the application, size of the data, (number of files,) and duration of the
-storage to efficiently use the provided storage and file systems.
-The page [file systems](file_systems.md) holds a comprehensive documentation on the different file
-systems.
+storage to efficiently use the provided storage and filesystems.
+The page [filesystems](file_systems.md) holds a comprehensive documentation on the different
+filesystems.
 <!--In general, the mechanisms of so-called-->
 <!--[Workspaces](workspaces.md) are compulsory for all HPC users to store data for a defined duration --->
 <!--depending on the requirements and the storage system this time span might
@@ -48,7 +48,7 @@ range from days to a few--> <!--years.-->
    [warm_archive](file_systems.md#warm_archive) can be used.
    (Note that this is mounted **read-only** on the compute nodes).
  * For a series of calculations that works on the same data please use a `scratch` based [workspace](workspaces.md).
- * **SSD**, in its turn, is the fastest available file system made only for large parallel
+ * **SSD**, in its turn, is the fastest available filesystem made only for large parallel
    applications running with millions of small I/O (input, output operations).
  * If the batch job needs a directory for temporary data then **SSD** is a good choice as well. The
    data can be deleted afterwards.
@@ -60,17 +60,17 @@ otherwise it could vanish. The core data of your project should be [backed up](#
 ### Backup
 
 The backup is a crucial part of any project. Organize it at the beginning of the project. The
-backup mechanism on ZIH systems covers **only** the `/home` and `/projects` file systems. Backed up
+backup mechanism on ZIH systems covers **only** the `/home` and `/projects` filesystems. Backed up
 files can be restored directly by the users. Details can be found
 [here](file_systems.md#backup-and-snapshots-of-the-file-system).
 
 !!! warning
 
-    If you accidentally delete your data in the "no backup" file systems it **can not be restored**!
+    If you accidentally delete your data in the "no backup" filesystems it **can not be restored**!
 
 ### Folder Structure and Organizing Data
 
-Organizing of living data using the file system helps for consistency and structuredness of the
+Organizing of living data using the filesystem helps for consistency and structuredness of the
 project. We recommend following the rules for your work regarding:
 
 * Organizing the data: Never change the original data; Automatize the organizing the data; Clearly
@@ -130,7 +130,7 @@ you don’t need throughout its life cycle.
 
 <!--## Software Packages-->
 
-<!--As was written before the module concept is the basic concept for using software on Taurus.-->
+<!--As was written before the module concept is the basic concept for using software on ZIH system.-->
 <!--Uniformity of the project has to be achieved by using the same set of software on different levels.-->
 <!--It could be done by using environments. There are two types of environments should be distinguished:-->
 <!--runtime environment (the project level, use scripts to load [modules]**todo link**), Python virtual-->
@@ -144,16 +144,16 @@ you don’t need throughout its life cycle.
 
 <!--### Python Virtual Environment-->
 
-<!--If you are working with the Python then it is crucial to use the virtual environment on Taurus. The-->
+<!--If you are working with the Python then it is crucial to use the virtual environment on ZIH system. The-->
 <!--main purpose of Python virtual environments (don't mess with the software environment for modules)-->
 <!--is to create an isolated environment for Python projects (self-contained directory tree that-->
 <!--contains a Python installation for a particular version of Python, plus a number of additional-->
 <!--packages).-->
 
 <!--**Vitualenv (venv)** is a standard Python tool to create isolated Python environments. We-->
-<!--recommend using venv to work with Tensorflow and Pytorch on Taurus. It has been integrated into the-->
+<!--recommend using venv to work with Tensorflow and Pytorch on ZIH system. It has been integrated into the-->
 <!--standard library under the [venv module]**todo link**. **Conda** is the second way to use a virtual-->
-<!--environment on the Taurus. Conda is an open-source package management system and environment-->
+<!--environment on the ZIH system. Conda is an open-source package management system and environment-->
 <!--management system from the Anaconda.-->
 
 <!--[Detailed information]**todo link** about using the virtual environment.-->
@@ -168,9 +168,8 @@ you don’t need throughout its life cycle.
 
 The concept of **permissions** and **ownership** is crucial in Linux. See the
 [HPC-introduction]**todo link** slides for the understanding of the main concept. Standard Linux
-changing permission command (i.e `chmod`) valid for Taurus as well. The **group** access level
+changing permission command (i.e `chmod`) valid for ZIH system as well. The **group** access level
 contains members of your project group. Be careful with 'write' permission and never allow to change
 the original data.
 
-Useful links: [Data Management]**todo link**, [File Systems]**todo link**, [Get Started with
-HPC-DA]**todo link**, [Project Management]**todo link**, [Preservation research data[**todo link**
+Useful links: [Data Management]**todo link**, [Filesystems]**todo link**, [Project Management]**todo link**, [Preservation research data[**todo link**
diff --git a/doc.zih.tu-dresden.de/docs/software/overview.md b/doc.zih.tu-dresden.de/docs/software/overview.md
index bdf96b20b..f8f4bf32b 100644
--- a/doc.zih.tu-dresden.de/docs/software/overview.md
+++ b/doc.zih.tu-dresden.de/docs/software/overview.md
@@ -29,11 +29,11 @@ list]**todo link**.
 
 <!--After logging in, you are on one of the login nodes. They are not meant for work, but only for the-->
 <!--login process and short tests. Allocating resources will be done by batch system-->
-<!--[SLURM](../jobs_and_resources/slurm.md).-->
+<!--[Slurm](../jobs_and_resources/slurm.md).-->
 
 ## Modules
 
-Usage of software on HPC systems, e.g., frameworks, compilers, loader and libraries, is
+Usage of software on ZIH systems, e.g., frameworks, compilers, loader and libraries, is
 almost always managed by a **modules system**. Thus, it is crucial to be familiar with the
 [modules concept and its commands](modules.md). A module is a user interface that provides
 utilities for the dynamic modification of a user's environment without manual modifications.
@@ -47,7 +47,7 @@ The [Jupyter Notebook](https://jupyter.org/) is an open-source web application t
 documents containing live code, equations, visualizations, and narrative text. There is a
 [JupyterHub](../access/jupyterhub.md) service on ZIH systems, where you can simply run your Jupyter
 notebook on compute nodes using [modules](#modules), preloaded or custom virtual environments.
-Moreover, you can run a [manually created remote jupyter server](../archive/install_jupyter.md)
+Moreover, you can run a [manually created remote jupyter server](../archive/install_jupyter.md)
 for more specific cases.
 
 ## Containers
diff --git a/doc.zih.tu-dresden.de/wordlist.aspell b/doc.zih.tu-dresden.de/wordlist.aspell
index 2ec271c6c..5351b3d7c 100644
--- a/doc.zih.tu-dresden.de/wordlist.aspell
+++ b/doc.zih.tu-dresden.de/wordlist.aspell
@@ -156,3 +156,4 @@ workspaces
 stdout
 stderr
 multithreaded
+hostname
-- 
GitLab