Commit 540446bf authored by Taras Lazariv

Fix checks

parent 7cccbf6c
5 merge requests: !333 Draft: update NGC containers, !322 Merge preview into main, !319 Merge preview into main, !279 Draft: Machine Learning restructuring, !258 Data Analytics restructuring
# Jupyter Installation # Jupyter Installation
Jupyter notebooks are a great way for interactive computing in your web browser. Jupyter allows Jupyter notebooks are a great way for interactive computing in your web browser. Jupyter allows
working with data cleaning and transformation, numerical simulation, statistical modelling, data working with data cleaning and transformation, numerical simulation, statistical modeling, data
visualization and of course with machine learning. visualization and of course with machine learning.
There are two general options on how to work Jupyter notebooks using HPC: remote Jupyter server and There are two general options on how to work Jupyter notebooks using HPC: remote Jupyter server and
...@@ -26,7 +26,7 @@ environment: ...@@ -26,7 +26,7 @@ environment:
srun --pty -n 1 --cpus-per-task=2 --time=2:00:00 --mem-per-cpu=2500 --x11=first bash -l -i srun --pty -n 1 --cpus-per-task=2 --time=2:00:00 --mem-per-cpu=2500 --x11=first bash -l -i
``` ```
Create a new subdirectory in your home, e.g. Jupyter Create a new directory in your home, e.g. Jupyter
```Bash ```Bash
mkdir Jupyter cd Jupyter mkdir Jupyter cd Jupyter
...@@ -52,7 +52,7 @@ Anaconda3-2019.03-Linux-x86_64.sh ./Anaconda3-2019.03-Linux-x86_64.sh ...@@ -52,7 +52,7 @@ Anaconda3-2019.03-Linux-x86_64.sh ./Anaconda3-2019.03-Linux-x86_64.sh
``` ```
Next step will install the anaconda environment into the home Next step will install the anaconda environment into the home
directory (/home/userxx/anaconda3). Create a new anaconda environment with the name "jnb". directory (`/home/userxx/anaconda3`). Create a new anaconda environment with the name `jnb`.
```Bash ```Bash
conda create --name jnb conda create --name jnb
...@@ -67,8 +67,8 @@ deactivate it also manually) and install Jupyter packages for this python enviro ...@@ -67,8 +67,8 @@ deactivate it also manually) and install Jupyter packages for this python enviro
source activate jnb conda install jupyter source activate jnb conda install jupyter
``` ```
If you need to adjust the configuration, you should create the template. Generate config files for If you need to adjust the configuration, you should create the template. Generate configuration
Jupyter notebook server: files for Jupyter notebook server:
```Bash ```Bash
jupyter notebook --generate-config jupyter notebook --generate-config
...@@ -91,7 +91,7 @@ You get a message like that: ...@@ -91,7 +91,7 @@ You get a message like that:
/home/<zih_user>/.jupyter/jupyter_notebook_config.json /home/<zih_user>/.jupyter/jupyter_notebook_config.json
``` ```
I order to create an SSL certificate for secure connections, you can create a self-signed I order to create a certificate for secure connections, you can create a self-signed
certificate: certificate:
```Bash ```Bash
...@@ -100,7 +100,7 @@ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mykey.key -out mycer ...@@ -100,7 +100,7 @@ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mykey.key -out mycer
Fill in the form with decent values. Fill in the form with decent values.
Possible entries for your Jupyter config (`.jupyter/jupyter_notebook*config.py*`). Possible entries for your Jupyter configuration (`.jupyter/jupyter_notebook*config.py*`).
```Bash ```Bash
c.NotebookApp.certfile = u'<path-to-cert>/mycert.pem' c.NotebookApp.keyfile = c.NotebookApp.certfile = u'<path-to-cert>/mycert.pem' c.NotebookApp.keyfile =
...@@ -127,7 +127,7 @@ unset XDG_RUNTIME_DIR # might be required when interactive instead of sbatch t ...@@ -127,7 +127,7 @@ unset XDG_RUNTIME_DIR # might be required when interactive instead of sbatch t
'Permission denied error' srun jupyter notebook 'Permission denied error' srun jupyter notebook
``` ```
Start the script above (e.g. with the name jnotebook) with sbatch command: Start the script above (e.g. with the name `jnotebook`) with sbatch command:
```Bash ```Bash
sbatch jnotebook.slurm sbatch jnotebook.slurm
......
...@@ -24,13 +24,13 @@ properly: ...@@ -24,13 +24,13 @@ properly:
* use `workspaces` as a place for working data (i.e. datasets); Recommendations of choosing the * use `workspaces` as a place for working data (i.e. datasets); Recommendations of choosing the
correct storage system for workspace presented below. correct storage system for workspace presented below.
### Taxonomy of File Systems ### Taxonomy of Filesystems
It is important to design your data workflow according to characteristics, like I/O footprint It is important to design your data workflow according to characteristics, like I/O footprint
(bandwidth/IOPS) of the application, size of the data, (number of files,) and duration of the (bandwidth/IOPS) of the application, size of the data, (number of files,) and duration of the
storage to efficiently use the provided storage and file systems. storage to efficiently use the provided storage and filesystems.
The page [file systems](file_systems.md) holds a comprehensive documentation on the different file The page [filesystems](file_systems.md) holds a comprehensive documentation on the different
systems. filesystems.
<!--In general, the mechanisms of <!--In general, the mechanisms of
so-called--> <!--[Workspaces](workspaces.md) are compulsory for all HPC users to store data for a so-called--> <!--[Workspaces](workspaces.md) are compulsory for all HPC users to store data for a
defined duration ---> <!--depending on the requirements and the storage system this time span might defined duration ---> <!--depending on the requirements and the storage system this time span might
...@@ -48,7 +48,7 @@ range from days to a few--> <!--years.--> ...@@ -48,7 +48,7 @@ range from days to a few--> <!--years.-->
[warm_archive](file_systems.md#warm_archive) can be used. [warm_archive](file_systems.md#warm_archive) can be used.
(Note that this is mounted **read-only** on the compute nodes). (Note that this is mounted **read-only** on the compute nodes).
* For a series of calculations that works on the same data please use a `scratch` based [workspace](workspaces.md). * For a series of calculations that works on the same data please use a `scratch` based [workspace](workspaces.md).
* **SSD**, in its turn, is the fastest available file system made only for large parallel * **SSD**, in its turn, is the fastest available filesystem made only for large parallel
applications running with millions of small I/O (input, output operations). applications running with millions of small I/O (input, output operations).
* If the batch job needs a directory for temporary data then **SSD** is a good choice as well. * If the batch job needs a directory for temporary data then **SSD** is a good choice as well.
The data can be deleted afterwards. The data can be deleted afterwards.
...@@ -60,17 +60,17 @@ otherwise it could vanish. The core data of your project should be [backed up](# ...@@ -60,17 +60,17 @@ otherwise it could vanish. The core data of your project should be [backed up](#
### Backup ### Backup
The backup is a crucial part of any project. Organize it at the beginning of the project. The The backup is a crucial part of any project. Organize it at the beginning of the project. The
backup mechanism on ZIH systems covers **only** the `/home` and `/projects` file systems. Backed up backup mechanism on ZIH systems covers **only** the `/home` and `/projects` filesystems. Backed up
files can be restored directly by the users. Details can be found files can be restored directly by the users. Details can be found
[here](file_systems.md#backup-and-snapshots-of-the-file-system). [here](file_systems.md#backup-and-snapshots-of-the-file-system).
!!! warning !!! warning
If you accidentally delete your data in the "no backup" file systems it **can not be restored**! If you accidentally delete your data in the "no backup" filesystems it **can not be restored**!
### Folder Structure and Organizing Data ### Folder Structure and Organizing Data
Organizing of living data using the file system helps for consistency and structuredness of the Organizing of living data using the filesystem helps for consistency and structuredness of the
project. We recommend following the rules for your work regarding: project. We recommend following the rules for your work regarding:
* Organizing the data: Never change the original data; Automatize the organizing the data; Clearly * Organizing the data: Never change the original data; Automatize the organizing the data; Clearly
...@@ -130,7 +130,7 @@ you don’t need throughout its life cycle. ...@@ -130,7 +130,7 @@ you don’t need throughout its life cycle.
<!--## Software Packages--> <!--## Software Packages-->
<!--As was written before the module concept is the basic concept for using software on Taurus.--> <!--As was written before the module concept is the basic concept for using software on ZIH system.-->
<!--Uniformity of the project has to be achieved by using the same set of software on different levels.--> <!--Uniformity of the project has to be achieved by using the same set of software on different levels.-->
<!--It could be done by using environments. There are two types of environments should be distinguished:--> <!--It could be done by using environments. There are two types of environments should be distinguished:-->
<!--runtime environment (the project level, use scripts to load [modules]**todo link**), Python virtual--> <!--runtime environment (the project level, use scripts to load [modules]**todo link**), Python virtual-->
...@@ -144,16 +144,16 @@ you don’t need throughout its life cycle. ...@@ -144,16 +144,16 @@ you don’t need throughout its life cycle.
<!--### Python Virtual Environment--> <!--### Python Virtual Environment-->
<!--If you are working with the Python then it is crucial to use the virtual environment on Taurus. The--> <!--If you are working with the Python then it is crucial to use the virtual environment on ZIH system. The-->
<!--main purpose of Python virtual environments (don't mess with the software environment for modules)--> <!--main purpose of Python virtual environments (don't mess with the software environment for modules)-->
<!--is to create an isolated environment for Python projects (self-contained directory tree that--> <!--is to create an isolated environment for Python projects (self-contained directory tree that-->
<!--contains a Python installation for a particular version of Python, plus a number of additional--> <!--contains a Python installation for a particular version of Python, plus a number of additional-->
<!--packages).--> <!--packages).-->
<!--**Vitualenv (venv)** is a standard Python tool to create isolated Python environments. We--> <!--**Vitualenv (venv)** is a standard Python tool to create isolated Python environments. We-->
<!--recommend using venv to work with Tensorflow and Pytorch on Taurus. It has been integrated into the--> <!--recommend using venv to work with Tensorflow and Pytorch on ZIH system. It has been integrated into the-->
<!--standard library under the [venv module]**todo link**. **Conda** is the second way to use a virtual--> <!--standard library under the [venv module]**todo link**. **Conda** is the second way to use a virtual-->
<!--environment on the Taurus. Conda is an open-source package management system and environment--> <!--environment on the ZIH system. Conda is an open-source package management system and environment-->
<!--management system from the Anaconda.--> <!--management system from the Anaconda.-->
<!--[Detailed information]**todo link** about using the virtual environment.--> <!--[Detailed information]**todo link** about using the virtual environment.-->
...@@ -168,9 +168,8 @@ you don’t need throughout its life cycle. ...@@ -168,9 +168,8 @@ you don’t need throughout its life cycle.
The concept of **permissions** and **ownership** is crucial in Linux. See the The concept of **permissions** and **ownership** is crucial in Linux. See the
[HPC-introduction]**todo link** slides for the understanding of the main concept. Standard Linux [HPC-introduction]**todo link** slides for the understanding of the main concept. Standard Linux
changing permission command (i.e `chmod`) valid for Taurus as well. The **group** access level changing permission command (i.e `chmod`) valid for ZIH system as well. The **group** access level
contains members of your project group. Be careful with 'write' permission and never allow to change contains members of your project group. Be careful with 'write' permission and never allow to change
the original data. the original data.
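As a minimal sketch of the advice above, the following shows one way to protect original data from accidental changes with `chmod`. The directory and file names (`raw`, `measurements.csv`) are hypothetical and created in a temporary directory only for illustration; adapt the modes to your project's needs.

```Bash
# Hypothetical layout, created in a temporary directory for illustration only.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/raw"
echo "sample,value" > "$demo_dir/raw/measurements.csv"

# Start from: owner read/write, group read, no access for others.
chmod 640 "$demo_dir/raw/measurements.csv"

# Remove write permission for everyone so the original data is not changed by accident.
chmod a-w "$demo_dir/raw/measurements.csv"

# Inspect the resulting permissions.
ls -l "$demo_dir/raw/measurements.csv"
```

Members of the project group can still read the file, while nobody (including the owner) can modify it without first restoring write permission.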
Useful links: [Data Management]**todo link**, [File Systems]**todo link**, [Get Started with Useful links: [Data Management]**todo link**, [Filesystems]**todo link**, [Project Management]**todo link**, [Preservation research data[**todo link**
HPC-DA]**todo link**, [Project Management]**todo link**, [Preservation research data[**todo link**
...@@ -29,11 +29,11 @@ list]**todo link**. ...@@ -29,11 +29,11 @@ list]**todo link**.
<!--After logging in, you are on one of the login nodes. They are not meant for work, but only for the--> <!--After logging in, you are on one of the login nodes. They are not meant for work, but only for the-->
<!--login process and short tests. Allocating resources will be done by batch system--> <!--login process and short tests. Allocating resources will be done by batch system-->
<!--[SLURM](../jobs_and_resources/slurm.md).--> <!--[Slurm](../jobs_and_resources/slurm.md).-->
## Modules ## Modules
Usage of software on HPC systems, e.g., frameworks, compilers, loader and libraries, is Usage of software on ZIH systems, e.g., frameworks, compilers, loader and libraries, is
almost always managed by a **modules system**. Thus, it is crucial to be familiar with the almost always managed by a **modules system**. Thus, it is crucial to be familiar with the
[modules concept and its commands](modules.md). A module is a user interface that provides [modules concept and its commands](modules.md). A module is a user interface that provides
utilities for the dynamic modification of a user's environment without manual modifications. utilities for the dynamic modification of a user's environment without manual modifications.
...@@ -47,7 +47,7 @@ The [Jupyter Notebook](https://jupyter.org/) is an open-source web application t ...@@ -47,7 +47,7 @@ The [Jupyter Notebook](https://jupyter.org/) is an open-source web application t
documents containing live code, equations, visualizations, and narrative text. There is a documents containing live code, equations, visualizations, and narrative text. There is a
[JupyterHub](../access/jupyterhub.md) service on ZIH systems, where you can simply run your Jupyter [JupyterHub](../access/jupyterhub.md) service on ZIH systems, where you can simply run your Jupyter
notebook on compute nodes using [modules](#modules), preloaded or custom virtual environments. notebook on compute nodes using [modules](#modules), preloaded or custom virtual environments.
Moreover, you can run a [manually created remote jupyter server](../archive/install_jupyter.md) Moreover, you can run a [manually created remote jupyter server](../archive/install_jupyter.md)
for more specific cases. for more specific cases.
## Containers ## Containers
......
...@@ -156,3 +156,4 @@ workspaces ...@@ -156,3 +156,4 @@ workspaces
stdout stdout
stderr stderr
multithreaded multithreaded
hostname