diff --git a/README.md b/README.md index 17a8f9baadd7a1f34655f311518f8661eea754eb..b694e1223fb05f577ce6a6cc4a46f69a7230464d 100644 --- a/README.md +++ b/README.md @@ -20,7 +20,7 @@ Please check for any already existing issue before submitting your issue in orde issues. **Reminder:** Non-documentation issues and requests need to be send as ticket to -[hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de). +[hpcsupport@zih.tu-dresden.de](mailto:hpc-support@tu-dresden.de). ## Contributing diff --git a/doc.zih.tu-dresden.de/docs/access/desktop_cloud_visualization.md b/doc.zih.tu-dresden.de/docs/access/desktop_cloud_visualization.md index e3fe6e8f25e5a59c876454807410c05c2494f8d3..31dca5e0ce4d1aaf34754d998f934fad49ad4cb4 100644 --- a/doc.zih.tu-dresden.de/docs/access/desktop_cloud_visualization.md +++ b/doc.zih.tu-dresden.de/docs/access/desktop_cloud_visualization.md @@ -36,7 +36,7 @@ direct rendering: Yes [...] ``` -If direct rendering is not set to `Yes`, please contact [HPC support](mailto:hpcsupport@zih.tu-dresden.de). +If direct rendering is not set to `Yes`, please contact [HPC support](mailto:hpc-support@tu-dresden.de). - Expand LD_LIBRARY_PATH: diff --git a/doc.zih.tu-dresden.de/docs/access/key_fingerprints.md b/doc.zih.tu-dresden.de/docs/access/key_fingerprints.md index 2b849f3ee87c51b1e5a027a4512040c9a3732d41..d3b754f7d05d50e3ce254ecae0a3e5027cb82edf 100644 --- a/doc.zih.tu-dresden.de/docs/access/key_fingerprints.md +++ b/doc.zih.tu-dresden.de/docs/access/key_fingerprints.md @@ -5,56 +5,35 @@ The key fingerprints of login and export nodes can occasionally change. This page holds up-to-date fingerprints. -## Login Nodes +Each cluster can be accessed via so-called **login nodes** using specific hostnames. All login nodes +of a cluster share common SSH key fingerprints that are unique. When connecting to one of our HPC +systems via the corresponding login nodes, please make sure that the provided fingerprint matches +one of the table. This is the only way to ensure you are connecting to the correct server from the +very beginning. If the **fingerprint differs**, please contact the +[HPC support team](../support/support.md). -### Taurus +In case an additional login node of the same cluster is used, the key needs to be checked and +approved again. -The following hostnames can be used to access ZIH systems: - -- `taurus.hrsk.tu-dresden.de` -- `tauruslogin3.hrsk.tu-dresden.de` -- `tauruslogin4.hrsk.tu-dresden.de` -- `tauruslogin5.hrsk.tu-dresden.de` -- `tauruslogin6.hrsk.tu-dresden.de` - -All of these login nodes share common keys. When connecting, please make sure that the fingerprint -shown matches one of the table. - -| Key type | Fingerprint | -|:---------|:----------------------------------------------------| -| RSA | SHA256:/M1lW1KTOlxj8UFZJS4tLi+8TyndcDqrZfLGX7KAU8s | -| RSA | MD5:b8:e1:21:ed:38:1a:ba:1a:5b:2b:bc:35:31:62:21:49 | -| ECDSA | SHA256:PeCpW/gAFLvHDzTP2Rb93NxD+rpUsyQY8WebjQC7kz0 | -| ECDSA | MD5:47:7e:24:46:ab:30:59:2c:1f:e8:fd:37:2a:5d:ee:25 | -| ED25519 | SHA256:nNxjtCny1kB0N0epHaOPeY1YFd0ri2Dvt2CK7rOGlXg | -| ED25519 | MD5:7c:0c:2b:8b:83:21:b2:08:19:93:6d:03:80:76:8a:7b | -{: summary="List of valid fingerprints for login nodes"} - -??? example "Connecting with SSH" +??? example "Connecting with SSH to Barnard" ```console - marie@local$ ssh taurus.hrsk.tu-dresden.de - The authenticity of host 'taurus.hrsk.tu-dresden.de (141.30.73.105)' can't be established. - ECDSA key fingerprint is SHA256:PeCpW/gAFLvHDzTP2Rb93NxD+rpUsyQY8WebjQC7kz0. 
+ marie@local$ ssh login1.barnard.hpc.tu-dresden.de + The authenticity of host 'login1.barnard.hpc.tu-dresden.de (172.24.95.28)' can't be established. + ECDSA key fingerprint is SHA256:8Coljw7yoVH6HA8u+K3makRK9HfOSfe+BG8W/CUEPp0. Are you sure you want to continue connecting (yes/no)? ``` - In this case, the fingerprint matches the one given in the table. Thus, one can proceed by + In this case, the fingerprint matches the one given in the table. Thus, you can proceed by typing 'yes'. -### Barnard - -The following hostnames can be used to access ZIH systems: - -- `login1.barnard.hpc.tu-dresden.de` -- `login2.barnard.hpc.tu-dresden.de` +## Barnard -All of these login nodes share common keys. When connecting, please make sure that the fingerprint -shown matches one of the table. +The cluster [`Barnard`](../jobs_and_resources/barnard.md) can be accessed via the four login +nodes `login[1-4].barnard.hpc.tu-dresden.de`. (Please choose one concrete login node when +connecting, see example below.) -#### login[1-4].barnard.hpc.tu-dresden.de - -| Key type | Fingerprint | +| Key Type | Fingerprint | |:---------|:----------------------------------------------------| | RSA | SHA256:lVQOvnci07jkxmFnX58pQf3cD7lz1mf4K4b9jZrAlVU | | RSA | MD5:5b:39:ae:03:3a:60:15:21:4b:e8:ba:72:52:b8:a1:ad | @@ -62,32 +41,84 @@ shown matches one of the table. | ECDSA | MD5:02:fd:ab:c8:39:f9:94:cc:3f:e0:7e:78:5f:76:b8:4c | | ED25519 | SHA256:Gn4n5IX9eEvkpOGrtZzs9T9yAfJUB200bgRchchiKAQ | | ED25519 | MD5:e8:10:96:67:e8:4c:fd:87:f0:c6:4e:e8:1f:53:a9:be | -{: summary="List of valid fingerprints for Barnard login[1-4] node"} +{: summary="List of valid fingerprints for Barnard login[1-4] nodes"} -??? example "Connecting with SSH" +## Romeo - ```console - marie@local$ ssh login1.barnard.hpc.tu-dresden.de - The authenticity of host 'login1.barnard.hpc.tu-dresden.de (172.24.95.28)' can't be established. - ECDSA key fingerprint is SHA256:8Coljw7yoVH6HA8u+K3makRK9HfOSfe+BG8W/CUEPp0. - Are you sure you want to continue connecting (yes/no)? - ``` +The cluster [`Romeo`](../jobs_and_resources/romeo.md) can be accessed via the two +login nodes `login[1-2].romeo.hpc.tu-dresden.de`. (Please choose one concrete login node when +connecting, see example below.) - In this case, the fingerprint matches the one given in the table. Thus, one can proceed by - typing 'yes'. +| Key Type | Fingerprint | +|:---------|:----------------------------------------------------| +|RSA | SHA256:BvYEYJtIYDGr3U0up58q5F7aog7JA2RP+w53XKmwO8I | +|RSA | MD5:5d:dc:40:3b:8b:89:77:5d:0f:29:84:31:0f:73:25:9f | +|ECDSA | SHA256:lgxNRgGcKe7oDGuwf0WV9VPukA30kEqg0sNDLLQwu8Y | +|ECDSA | MD5:e1:bd:e4:77:06:97:f9:f3:03:18:56:66:14:5d:8d:18 | +|ED25519 | SHA256:QNjH0ulelqykywMkt3UNTG4W1HzRkHqrhu0f6oq302I | +|ED25519 | MD5:e4:4e:7a:76:aa:87:da:17:92:b1:17:c6:a1:25:29:7e | +{: summary="List of valid fingerprints for Romeo login[1-2] node"} + +## Alpha Centauri + +The cluster [`Alpha Centauri`](../jobs_and_resources/alpha_centauri.md) can be accessed via the two +login nodes `login[1-2].alpha.hpc.tu-dresden.de`. (Please choose one concrete login node when +connecting, see example below.) 
+ +| Key Type | Fingerprint | +|:---------|:----------------------------------------------------| +| RSA | SHA256:BvYEYJtIYDGr3U0up58q5F7aog7JA2RP+w53XKmwO8I | +| RSA | MD5:5d:dc:40:3b:8b:89:77:5d:0f:29:84:31:0f:73:25:9f | +| ECDSA | SHA256:lgxNRgGcKe7oDGuwf0WV9VPukA30kEqg0sNDLLQwu8Y | +| ECDSA | MD5:e1:bd:e4:77:06:97:f9:f3:03:18:56:66:14:5d:8d:18 | +| ED25519 | SHA256:QNjH0ulelqykywMkt3UNTG4W1HzRkHqrhu0f6oq302I | +| ED25519 | MD5:e4:4e:7a:76:aa:87:da:17:92:b1:17:c6:a1:25:29:7e | +{: summary="List of valid fingerprints for Alpha Centauri login[1-2] node"} + +## Julia + +The cluster [`Julia`](../jobs_and_resources/julia.md) can be accessed via `julia.hpc.tu-dresden.de`. +(Note, there is no separate login node.) + +| Key Type | Fingerprint | +|:---------|:----------------------------------------------------| +| RSA | | +| RSA | | +| ECDSA | | +| ECDSA | | +| ED25519 | | +| ED25519 | | +{: summary="List of valid fingerprints for Julia login node"} + +## IBM Power9 + +The cluster [`Power9`](../jobs_and_resources/power9.md) can be accessed via the four Taurus login +nodes `tauruslogin[3-6].hrsk.tu-dresden.de` or shortly `taurus.hrsk.tu-dresden.de`. +All of these login nodes share common keys. When connecting, please make sure that the fingerprint +shown matches one of the table. + +| Key Type | Fingerprint | +|:---------|:----------------------------------------------------| +| RSA | SHA256:/M1lW1KTOlxj8UFZJS4tLi+8TyndcDqrZfLGX7KAU8s | +| RSA | MD5:b8:e1:21:ed:38:1a:ba:1a:5b:2b:bc:35:31:62:21:49 | +| ECDSA | SHA256:PeCpW/gAFLvHDzTP2Rb93NxD+rpUsyQY8WebjQC7kz0 | +| ECDSA | MD5:47:7e:24:46:ab:30:59:2c:1f:e8:fd:37:2a:5d:ee:25 | +| ED25519 | SHA256:nNxjtCny1kB0N0epHaOPeY1YFd0ri2Dvt2CK7rOGlXg | +| ED25519 | MD5:7c:0c:2b:8b:83:21:b2:08:19:93:6d:03:80:76:8a:7b | +{: summary="List of valid fingerprints for login nodes"} ## Export Nodes +All of these export nodes share common keys. When transfering files, please make sure that the +fingerprint shown matches one of the table. + The following hostnames can be used to transfer files to/from ZIH systems: -- `taurusexport.hrsk.tu-dresden.de` +- `taurusexport.hpc.tu-dresden.de` - `taurusexport3.hrsk.tu-dresden.de` - `taurusexport4.hrsk.tu-dresden.de` -All of these export nodes share common keys. When transfering files, please make sure that the -fingerprint shown matches one of the table. - -| Key type | Fingerprint | +| Key Type | Fingerprint | |:---------|:----------------------------------------------------| | RSA | SHA256:Qjg79R+5x8jlyHhLBZYht599vRk+SujnG1yT1l2dYUM | | RSA | MD5:1e:4c:2d:81:ee:58:1b:d1:3c:0a:18:c4:f7:0b:23:20 | @@ -96,3 +127,28 @@ fingerprint shown matches one of the table. | ED25519 | SHA256:jxWiddvDe0E6kpH55PHKF0AaBg/dQLefQaQZ2P4mb3o | | ED25519 | MD5:fe:0a:d2:46:10:4a:08:40:fd:e1:99:b7:f2:06:4f:bc | {: summary="List of valid fingerprints for export nodes"} + +## Dataport Nodes + +When you transfer files using the [dataport nodes](../data_transfer/dataport_nodes.md), please make +sure that the fingerprint shown matches one of the table. 
+ +| Key Type | Fingerprint | +|:---------|:----------------------------------------------------| +| RSA | SHA256:t4Vl4rHRHbglZIm+hck2MWld+0smYAb2rx7EGWWmya0 | +| RSA | MD5:59:7f:c1:7b:34:ec:c8:07:3d:fe:b8:b5:6a:96:ea:0c | +| ECDSA | SHA256:Ga1dXpp1yM5GRJC77PgDCQwDy7oHdrTY7z11V1Eq1L8 | +| ECDSA | MD5:43:1b:b8:f6:23:ab:4a:08:dc:a1:b3:09:8c:8c:be:f9 | +| ED25519 | SHA256:SRYDLKt7YTQkXmkv+doWV/b55xQz1nT4ZZtXYGhydg4 | +| ED25519 | MD5:eb:96:21:d7:61:9f:39:10:82:b9:21:e9:4a:87:c2:9a | +{: summary="List of valid fingerprints for dataport1.hpc.tu-dresden.de"} + +| Key Type | Fingerprint | +|:---------|:----------------------------------------------------| +| RSA | SHA256:aMVmc3E0+ndXPiQ8EpY6lFk5CFdfGJfjy/0UBcxor58 | +| RSA | MD5:03:b8:8a:2b:10:e5:6b:c8:0b:78:ab:4e:5b:2c:0e:2c | +| ECDSA | SHA256:6t69R86zhJjGDlIobdsSbKFn8Km3cs9JYWlDW22nEvI | +| ECDSA | MD5:58:70:7d:df:e5:c8:43:cf:a5:75:ad:f5:da:9f:1a:6d | +| ED25519 | SHA256:KAD9xwCRK5C8Ch6Idfnfy88XrzZcqEJ9Ms6O+AzGfDE | +| ED25519 | MD5:2b:61:5a:89:fe:96:95:0d:3d:f6:29:40:55:ea:e6:11 | +{: summary="List of valid fingerprints for dataport2.hpc.tu-dresden.de"} diff --git a/doc.zih.tu-dresden.de/docs/access/ssh_login.md b/doc.zih.tu-dresden.de/docs/access/ssh_login.md index 760c752a2399cc8a41c5f7a921bfada5ba473465..6208725610d9e0104d6f98e399eaf7d09918dc53 100644 --- a/doc.zih.tu-dresden.de/docs/access/ssh_login.md +++ b/doc.zih.tu-dresden.de/docs/access/ssh_login.md @@ -27,11 +27,11 @@ Enter same passphrase again: ``` Type in a passphrase for the protection of your key. The passphrase should be **non-empty**. -Copy the public key to the ZIH system (Replace placeholder `marie` with your ZIH login): +Copy the **public key** to the ZIH system (Replace placeholder `marie` with your ZIH login): ```console -marie@local$ ssh-copy-id -i ~/.ssh/id_ed25519.pub marie@taurus.hrsk.tu-dresden.de -The authenticity of host 'taurus.hrsk.tu-dresden.de (141.30.73.104)' can't be established. +marie@local$ ssh-copy-id -i ~/.ssh/id_ed25519.pub marie@login2.barnard.hpc.tu-dresden.de +The authenticity of host 'barnard.hpc.tu-dresden.de (141.30.73.104)' can't be established. RSA key fingerprint is SHA256:HjpVeymTpk0rqoc8Yvyc8d9KXQ/p2K0R8TJ27aFnIL8. Are you sure you want to continue connecting (yes/no)? ``` @@ -39,24 +39,31 @@ Are you sure you want to continue connecting (yes/no)? Compare the shown fingerprint with the [documented fingerprints](key_fingerprints.md). Make sure they match. Then you can accept by typing `yes`. -!!! info +!!! note "One `ssh-copy-id` command for all clusters" + + Since your home directory, where the file `.ssh/authorized_keys` is stored, is available on all HPC + systems, this task is only required once and you can freely choose a target system for the + `ssh-copy-id` command. Afterwards, you can access all clusters with this key file. + +??? info "ssh-copy-id is not available" + If `ssh-copy-id` is not available, you need to do additional steps: ```console - marie@local$ scp ~/.ssh/id_ed25519.pub marie@taurus.hrsk.tu-dresden.de: - The authenticity of host 'taurus.hrsk.tu-dresden.de (141.30.73.104)' can't be established. - RSA key fingerprint is SHA256:HjpVeymTpk0rqoc8Yvyc8d9KXQ/p2K0R8TJ27aFnIL8. + marie@local$ scp ~/.ssh/id_ed25519.pub marie@login2.barnard.hpc.tu-dresden.de: + The authenticity of host 'barnard.hpc.tu-dresden.de (141.30.73.104)' can't be established. + RSA key fingerprint is SHA256:Gn4n5IX9eEvkpOGrtZzs9T9yAfJUB200bgRchchiKAQ. Are you sure you want to continue connecting (yes/no)? 
``` After that, you need to manually copy the key to the right place: ```console - marie@local$ ssh marie@taurus.hrsk.tu-dresden.de + marie@local$ ssh marie@login2.barnard.hpc.tu-dresden.de [...] - marie@login$ mkdir -p ~/.ssh - marie@login$ touch ~/.ssh/authorized_keys - marie@login$ cat id_ed25519.pub >> ~/.ssh/authorized_keys + marie@login.barnard$ mkdir -p ~/.ssh + marie@login.barnard$ touch ~/.ssh/authorized_keys + marie@login.barnard$ cat id_ed25519.pub >> ~/.ssh/authorized_keys ``` ### Configuring Default Parameters for SSH @@ -64,20 +71,20 @@ they match. Then you can accept by typing `yes`. After you have copied your key to the ZIH system, you should be able to connect using: ```console -marie@local$ ssh marie@taurus.hrsk.tu-dresden.de +marie@local$ ssh marie@login2.barnard.hpc.tu-dresden.de [...] -marie@login$ exit +marie@login.barnard$ exit ``` However, you can make this more comfortable if you prepare an SSH configuration on your local workstation. Navigate to the subdirectory `.ssh` in your home directory and open the file `config` (`~/.ssh/config`) in your favorite editor. If it does not exist, create it. Put the following lines -in it (you can omit lines starting with `#`): +in it (you can omit lines starting with `#`):<!--TODO: Host taurusexport to be removed ~May 2024--> ```bash -Host taurus +Host barnard #For login (shell access) - HostName taurus.hrsk.tu-dresden.de + HostName login1.barnard.hpc.tu-dresden.de #Put your ZIH-Login after keyword "User": User marie #Path to private key: @@ -87,6 +94,15 @@ Host taurus #Enable X11 forwarding for graphical applications and compression. You don't need parameter -X and -C when invoking ssh then. ForwardX11 yes Compression yes +Host dataport + #For copying data without shell access + HostName dataport1.hpc.tu-dresden.de + #Put your ZIH-Login after keyword "User": + User marie + #Path to private key: + IdentityFile ~/.ssh/id_ed25519 + #Don't try other keys if you have more: + IdentitiesOnly yes Host taurusexport #For copying data without shell access HostName taurusexport.hrsk.tu-dresden.de @@ -101,11 +117,16 @@ Host taurusexport Afterwards, you can connect to the ZIH system using: ```console -marie@local$ ssh taurus +marie@local$ ssh barnard ``` -If you want to copy data from/to ZIH systems, please refer to [Export Nodes: Transfer Data to/from -ZIH's Filesystems](../data_transfer/export_nodes.md) for more information on export nodes. +If you want to copy data from/to ZIH systems, please refer to [Dataport Nodes: Transfer Data to/from +ZIH's Filesystems](../data_transfer/dataport_nodes.md) for more information on dataport nodes. + +!!! note "HPC systems `Alpha`, `Julia`, `Romeo`" + + In the above `.ssh/config` file, the HPC system `Barnard` is chosen as an example. + The very same settings can be made for the HPC systems `Alpha`, `Julia`, `Romeo`, etc. ## X11-Forwarding @@ -114,7 +135,7 @@ X11-forwarding for the connection. If you use the SSH configuration described ab already prepared and you can simply use: ```console -marie@local$ ssh taurus +marie@local$ ssh barnard ``` If you have omitted the last two lines in the default configuration above, you need to add the @@ -122,7 +143,7 @@ option `-X` or `-XC` to your SSH command. The `-C` enables compression which usu usability in this case: ```console -marie@local$ ssh -XC taurus +marie@local$ ssh -XC barnard ``` !!! 
info diff --git a/doc.zih.tu-dresden.de/docs/access/ssh_mobaxterm.md b/doc.zih.tu-dresden.de/docs/access/ssh_mobaxterm.md index 85d5f9cfc1e55078ba9c79e881d7156dacf9ca0a..5978b9b31165864c952c69d82e79259f8516cda6 100644 --- a/doc.zih.tu-dresden.de/docs/access/ssh_mobaxterm.md +++ b/doc.zih.tu-dresden.de/docs/access/ssh_mobaxterm.md @@ -38,9 +38,10 @@ Here you can set different options in the following tabs:  -1. Select a SSH section. Insert "Remote host" (`taurus.hrsk.tu-dresden.de`), "Username" (replace - `marie` with your ZIH login), and "Port" 22. Using the button right from the username option, - you can store and manage credentials. +1. Select an SSH section. Insert "Remote host" (`login2.barnard.hpc.tu-dresden.de`), "Username" + (replace `marie` with your ZIH login), and "Port" 22. Using the button right from the username + option, you can store and manage credentials. To access a different cluster, change the name + accordingly (e.g. `login1.alpha.hpc.tu-dresden.de`).  diff --git a/doc.zih.tu-dresden.de/docs/access/ssh_putty.md b/doc.zih.tu-dresden.de/docs/access/ssh_putty.md index 28105fef81b555f6a4af28fc8fdcc128797ea65a..5fd33790caee0d97c7dbab8322a227a3cd43f8d3 100644 --- a/doc.zih.tu-dresden.de/docs/access/ssh_putty.md +++ b/doc.zih.tu-dresden.de/docs/access/ssh_putty.md @@ -13,8 +13,8 @@ instructions for installation. ## Start a new SSH session -1. Start PuTTY and insert the "Host Name" (`taurus.hrsk.tu-dresden.de`) and leave the default - port (22). +1. Start PuTTY and insert the "Host Name" (`login2.barnard.hpc.tu-dresden.de`) and leave the default + port (22). To access a different cluster, change the name accordingly (e.g. `login1.alpha.hpc.tu-dresden.de`).  diff --git a/doc.zih.tu-dresden.de/docs/application/acknowledgement.md b/doc.zih.tu-dresden.de/docs/application/acknowledgement.md index dfe9fd0e42bc6d97aadada80e499ac9dac2d3b50..f76a8ab81ccb81e1a0181f749019ea9b84a72f23 100644 --- a/doc.zih.tu-dresden.de/docs/application/acknowledgement.md +++ b/doc.zih.tu-dresden.de/docs/application/acknowledgement.md @@ -4,22 +4,26 @@ To provide you with modern and powerful HPC systems in future as well, we have t systems help to advance research. For that purpose we rely on your help. In most cases, the results of your computations are used for presentations and publications, especially in peer-reviewed magazines, journals, and conference proceedings. We kindly ask you to mention the HPC resource usage -in the acknowledgment section of all publications that are based on granted HPC resources of the TU -Dresden. Examples: +in the acknowledgment section of all publications that are based on granted HPC resources of the +[NHR center at TU Dresden](https://tu-dresden.de/zih/hochleistungsrechnen/nhr-center). Examples: -!!! example +!!! example "Standard case" - The authors gratefully acknowledge the GWK support for funding this project by providing - computing time through the Center for Information Services and HPC (ZIH) at TU Dresden. + The authors gratefully acknowledge the computing time made available to them on + the high-performance computer <XY> at the NHR Center of TU Dresden. This center is jointly + supported by the Federal Ministry of Education and Research and the state governments + participating in the NHR (www.nhr-verein.de/unsere-partner). -!!! example +!!! 
example "Two NHR centers" - The authors are grateful to the Center for Information Services and High Performance Computing - [Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)] at TU Dresden for providing its - facilities for high throughput calculations. + The authors gratefully acknowledge the computing time made available to them on + the high-performance computers <XY> and <YZ> at the NHR Centers of TU Dresden and <YZ>. + These centers are jointly supported by the Federal Ministry of Education and Research + and the state governments participating in the NHR (www.nhr-verein.de/unsere-partner). -!!! example +!!! example "German" - Die Autoren danken Bund und Land Sachsen für die Unterstützung bei der Finanzierung dieses - Projektes mit der Bereitstellung von Rechenzeit durch das Zentrum für Informationsdienste und - Hochleistungsrechnen (ZIH) an der TU Dresden. + Die Autoren bedanken sich für die ihnen zur Verfügung gestellte Rechenzeit auf + dem Hochleistungsrechner <XY> am NHR-Zentrum der TU Dresden. Dieses wird gemeinsam durch + das Bundesministerium für Bildung und Forschung und die am NHR beteiligten + Landesregierungen (www.nhr-verein.de/unsere-partner) unterstützt. diff --git a/doc.zih.tu-dresden.de/docs/archive/beegfs_on_demand.md b/doc.zih.tu-dresden.de/docs/archive/beegfs_on_demand.md index f7804c5fc7e5792b9dab3ef722e6a6e8bf83a754..d44116c0b97ff204bafc3cf5240340627769bec5 100644 --- a/doc.zih.tu-dresden.de/docs/archive/beegfs_on_demand.md +++ b/doc.zih.tu-dresden.de/docs/archive/beegfs_on_demand.md @@ -10,8 +10,8 @@ search: This documentation page is outdated. Please see the [new BeeGFS page](../data_lifecycle/beegfs.md). -**Prerequisites:** To work with TensorFlow you obviously need a [login](../application/overview.md) to -the ZIH systems and basic knowledge about Linux, mounting, and batch system Slurm. +**Prerequisites:** To work with TensorFlow you obviously need a [login](../application/overview.md) +to the ZIH systems and basic knowledge about Linux, mounting, and batch system Slurm. **Aim** of this page is to introduce users how to start working with the BeeGFS filesystem - a high-performance parallel filesystem. @@ -67,7 +67,7 @@ Check the status of the job with `squeue -u \<username>`. ## Mount BeeGFS Filesystem You can mount BeeGFS filesystem on the partition ml (PowerPC architecture) or on the -partition haswell (x86_64 architecture), more information about [partitions](../jobs_and_resources/partitions_and_limits.md). +partition haswell (x86_64 architecture). ### Mount BeeGFS Filesystem on the Partition `ml` diff --git a/doc.zih.tu-dresden.de/docs/archive/hardware_overview_2022.md b/doc.zih.tu-dresden.de/docs/archive/hardware_overview_2022.md deleted file mode 100644 index 3974a2524de36f7dd2b2fdb03f5a54fa3375f6a0..0000000000000000000000000000000000000000 --- a/doc.zih.tu-dresden.de/docs/archive/hardware_overview_2022.md +++ /dev/null @@ -1,109 +0,0 @@ -# HPC Resources - -HPC resources in ZIH systems comprise the *High Performance Computing and Storage Complex* and its -extension *High Performance Computing – Data Analytics*. In total it offers scientists -about 60,000 CPU cores and a peak performance of more than 1.5 quadrillion floating point -operations per second. 
The architecture specifically tailored to data-intensive computing, Big Data -analytics, and artificial intelligence methods with extensive capabilities for energy measurement -and performance monitoring provides ideal conditions to achieve the ambitious research goals of the -users and the ZIH. - -## Login and Export Nodes - -- 4 Login-Nodes `tauruslogin[3-6].hrsk.tu-dresden.de` - - Each login node is equipped with 2x Intel(R) Xeon(R) CPU E5-2680 v3 with 24 cores in total @ - 2.50 GHz, Multithreading disabled, 64 GB RAM, 128 GB SSD local disk - - IPs: 141.30.73.\[102-105\] -- 2 Data-Transfer-Nodes `taurusexport[3-4].hrsk.tu-dresden.de` - - DNS Alias `taurusexport.hrsk.tu-dresden.de` - - 2 Servers without interactive login, only available via file transfer protocols - (`rsync`, `ftp`) - - IPs: 141.30.73.\[82,83\] - - Further information on the usage is documented on the site - [Export Nodes](../data_transfer/export_nodes.md) - -## AMD Rome CPUs + NVIDIA A100 - -- 34 nodes, each with - - 8 x NVIDIA A100-SXM4 Tensor Core-GPUs - - 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading available - - 1 TB RAM - - 3.5 TB local memory on NVMe device at `/tmp` -- Hostnames: `taurusi[8001-8034]` -- Slurm partition: `alpha` -- Further information on the usage is documented on the site [Alpha Centauri Nodes](../jobs_and_resources/alpha_centauri.md) - -## Island 7 - AMD Rome CPUs - -- 192 nodes, each with - - 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading available - - 512 GB RAM - - 200 GB local memory on SSD at `/tmp` -- Hostnames: `taurusi[7001-7192]` -- Slurm partition: `romeo` -- Further information on the usage is documented on the site [AMD Rome Nodes](../jobs_and_resources/rome_nodes.md) - -## Large SMP System HPE Superdome Flex - -- 1 node, with - - 32 x Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20 GHz (28 cores) - - 47 TB RAM -- Configured as one single node -- 48 TB RAM (usable: 47 TB - one TB is used for cache coherence protocols) -- 370 TB of fast NVME storage available at `/nvme/<projectname>` -- Hostname: `taurussmp8` -- Slurm partition: `julia` -- Further information on the usage is documented on the site [HPE Superdome Flex](../jobs_and_resources/sd_flex.md) - -## IBM Power9 Nodes for Machine Learning - -For machine learning, we have IBM AC922 nodes installed with this configuration: - -- 32 nodes, each with - - 2 x IBM Power9 CPU (2.80 GHz, 3.10 GHz boost, 22 cores) - - 256 GB RAM DDR4 2666 MHz - - 6 x NVIDIA VOLTA V100 with 32 GB HBM2 - - NVLINK bandwidth 150 GB/s between GPUs and host -- Hostnames: `taurusml[1-32]` -- Slurm partition: `ml` - -## Island 6 - Intel Haswell CPUs - -- 612 nodes, each with - - 2 x Intel(R) Xeon(R) CPU E5-2680 v3 (12 cores) @ 2.50 GHz, Multithreading disabled - - 128 GB local memory on SSD -- Varying amounts of main memory (selected automatically by the batch system for you according to - your job requirements) - * 594 nodes with 2.67 GB RAM per core (64 GB in total): `taurusi[6001-6540,6559-6612]` - - 18 nodes with 10.67 GB RAM per core (256 GB in total): `taurusi[6541-6558]` -- Hostnames: `taurusi[6001-6612]` -- Slurm Partition: `haswell` - -??? 
hint "Node topology" - -  - {: align=center} - -## Island 2 Phase 2 - Intel Haswell CPUs + NVIDIA K80 GPUs - -- 64 nodes, each with - - 2 x Intel(R) Xeon(R) CPU E5-E5-2680 v3 (12 cores) @ 2.50 GHz, Multithreading disabled - - 64 GB RAM (2.67 GB per core) - - 128 GB local memory on SSD - - 4 x NVIDIA Tesla K80 (12 GB GDDR RAM) GPUs -- Hostnames: `taurusi[2045-2108]` -- Slurm Partition: `gpu2` -- Node topology, same as [island 4 - 6](#island-6-intel-haswell-cpus) - -## SMP Nodes - up to 2 TB RAM - -- 5 Nodes, each with - - 4 x Intel(R) Xeon(R) CPU E7-4850 v3 (14 cores) @ 2.20 GHz, Multithreading disabled - - 2 TB RAM -- Hostnames: `taurussmp[3-7]` -- Slurm partition: `smp2` - -??? hint "Node topology" - -  - {: align=center} diff --git a/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md b/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md index ab44cf9bcba0898c0461be3b0d3774d61cd8d6a4..af0a0ca650fa4ca18b5131d2960feea946de783e 100644 --- a/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md +++ b/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md @@ -90,7 +90,7 @@ re-compile made sense. ### Applications As a default, all applications provided by ZIH should run an atlas -without problems. Please [tell us](mailto:hpcsupport@zih.tu-dresden.de) +without problems. Please [tell us](mailto:hpc-support@tu-dresden.de) if you are missing your application or experience severe performance degradation. Please include "Atlas" in your subject. diff --git a/doc.zih.tu-dresden.de/docs/archive/no_ib_jobs.md b/doc.zih.tu-dresden.de/docs/archive/no_ib_jobs.md index 93394d5b8f14c12d8acfda5604066d14e39790ad..249f86aea7d879120941d1a48a2b8d418c5a0617 100644 --- a/doc.zih.tu-dresden.de/docs/archive/no_ib_jobs.md +++ b/doc.zih.tu-dresden.de/docs/archive/no_ib_jobs.md @@ -20,7 +20,7 @@ search: At the moment when parts of the IB stop we will start batch system plugins to parse for this batch system option: `--comment=NO_IB`. Jobs with this option set can run on nodes without -Infiniband access if (and only if) they have set the `--tmp`-option as well: +InfiniBand access if (and only if) they have set the `--tmp`-option as well: *From the Slurm documentation:* diff --git a/doc.zih.tu-dresden.de/docs/archive/overview.md b/doc.zih.tu-dresden.de/docs/archive/overview.md index 5da5b965a4ebc6751c5bd8f8f496a502a7ba6748..a8f48182ba3ebd12c7381f3c7608f7fcc7f2908a 100644 --- a/doc.zih.tu-dresden.de/docs/archive/overview.md +++ b/doc.zih.tu-dresden.de/docs/archive/overview.md @@ -5,7 +5,6 @@ search: # Archive (Outdated) -A warm welcome to the **archive**. You probably got here by following a link from within the compendium -or by purpose. -The archive holds outdated documentation for future reference. -Hence, documentation in the archive is not further updated. +A warm welcome to the **archive**. You probably got here by following a link from within the +compendium or by purpose. The archive holds outdated documentation for future reference. Hence, +documentation in the archive is not further updated. 
diff --git a/doc.zih.tu-dresden.de/docs/archive/scs5_software.md b/doc.zih.tu-dresden.de/docs/archive/scs5_software.md index 5aadc995ef4c07d5009cc959388e61889edd3a86..79c7ba16d393f0f31963e9d6fe4e69dfcefcffd3 100644 --- a/doc.zih.tu-dresden.de/docs/archive/scs5_software.md +++ b/doc.zih.tu-dresden.de/docs/archive/scs5_software.md @@ -17,7 +17,7 @@ Here are the major changes from the user's perspective: | Red Hat Enterprise Linux (RHEL) | 6.x | 7.x | | Linux kernel | 2.26 | 3.10 | | glibc | 2.12 | 2.17 | -| Infiniband stack | OpenIB | Mellanox | +| InfiniBand stack | OpenIB | Mellanox | | Lustre client | 2.5 | 2.10 | ## Host Keys @@ -73,7 +73,7 @@ that was built especially for SCS5. ### Which modules should I use? If possible, please use the modules from **modenv/scs5**. In case there is a certain software -missing, you can write an [email to hpcsupport](mailto:hpcsupport@zih.tu-dresden.de) and we will try +missing, you can write an [email to hpcsupport](mailto:hpc-support@tu-dresden.de) and we will try to install the latest version of this particular software for you. However, if you still need *older* versions of some software, you have to resort to using the diff --git a/doc.zih.tu-dresden.de/docs/archive/slurm_profiling.md b/doc.zih.tu-dresden.de/docs/archive/slurm_profiling.md index 3ca0a8e2b6e4618923a379ed8bcec854256b7fbf..7602e45320f536baf022cc669e4458680ec2f18d 100644 --- a/doc.zih.tu-dresden.de/docs/archive/slurm_profiling.md +++ b/doc.zih.tu-dresden.de/docs/archive/slurm_profiling.md @@ -1,4 +1,4 @@ -# Job Profiling +# Job Profiling (Outdated) !!! info "2022-05-24" @@ -14,7 +14,7 @@ The following data can be gathered: * Task data, such as CPU frequency, CPU utilization, memory consumption (RSS and VMSize), I/O * Energy consumption of the nodes -* Infiniband data (currently deactivated) +* InfiniBand data (currently deactivated) * Lustre filesystem data (currently deactivated) The data is sampled at a fixed rate (i.e. every 5 seconds) and is stored in a HDF5 file. diff --git a/doc.zih.tu-dresden.de/docs/archive/system_atlas.md b/doc.zih.tu-dresden.de/docs/archive/system_atlas.md index e8a63ab236196b0853b9a1e6f90c809cfc567be5..94e34d7cdd918483b9392bea94dbf2809c8369d3 100644 --- a/doc.zih.tu-dresden.de/docs/archive/system_atlas.md +++ b/doc.zih.tu-dresden.de/docs/archive/system_atlas.md @@ -32,7 +32,7 @@ node has 180 GB local disk space for scratch mounted on `/tmp`. The jobs for the scheduled by the [Platform LSF](platform_lsf.md) batch system from the login nodes `atlas.hrsk.tu-dresden.de` . -A QDR Infiniband interconnect provides the communication and I/O infrastructure for low latency / +A QDR InfiniBand interconnect provides the communication and I/O infrastructure for low latency / high throughput data traffic. Users with a login on the [SGI Altix](system_altix.md) can access their home directory via NFS diff --git a/doc.zih.tu-dresden.de/docs/archive/system_deimos.md b/doc.zih.tu-dresden.de/docs/archive/system_deimos.md index b36a9348138dc808273c83501afe92c99872a155..50682072db8e3c3d3bfba88ec9dfce6897b55c7f 100644 --- a/doc.zih.tu-dresden.de/docs/archive/system_deimos.md +++ b/doc.zih.tu-dresden.de/docs/archive/system_deimos.md @@ -29,7 +29,7 @@ mounted on `/tmp`. The jobs for the compute nodes are scheduled by the [Platform LSF](platform_lsf.md) batch system from the login nodes `deimos.hrsk.tu-dresden.de` . 
-Two separate Infiniband networks (10 Gb/s) with low cascading switches provide the communication and +Two separate InfiniBand networks (10 Gb/s) with low cascading switches provide the communication and I/O infrastructure for low latency / high throughput data traffic. An additional gigabit Ethernet network is used for control and service purposes. diff --git a/doc.zih.tu-dresden.de/docs/archive/system_phobos.md b/doc.zih.tu-dresden.de/docs/archive/system_phobos.md index 3519c36b876b15ea8b57146f112207ad0b5dd9f7..833c23d66d7c365a9c90d27fc067ff20175b9b34 100644 --- a/doc.zih.tu-dresden.de/docs/archive/system_phobos.md +++ b/doc.zih.tu-dresden.de/docs/archive/system_phobos.md @@ -25,7 +25,7 @@ All nodes share a 4.4 TB SAN. Each node has additional local disk space mounted jobs for the compute nodes are scheduled by a [Platform LSF](platform_lsf.md) batch system running on the login node `phobos.hrsk.tu-dresden.de`. -Two separate Infiniband networks (10 Gb/s) with low cascading switches provide the infrastructure +Two separate InfiniBand networks (10 Gb/s) with low cascading switches provide the infrastructure for low latency / high throughput data traffic. An additional GB/Ethernetwork is used for control and service purposes. diff --git a/doc.zih.tu-dresden.de/docs/archive/system_taurus.md b/doc.zih.tu-dresden.de/docs/archive/system_taurus.md new file mode 100644 index 0000000000000000000000000000000000000000..857eaba9f13d0974dd7c8c980243ee86ccb9f3dd --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/archive/system_taurus.md @@ -0,0 +1,31 @@ +--- +search: + boost: 0.00001 +--- + +# System Taurus (Outdated) + +!!! warning + + **This page is deprecated! The system Taurus is decommissioned by the end of 2023!** + +HPC resources in ZIH systems comprise the *High Performance Computing and Storage Complex* and its +extension *High Performance Computing – Data Analytics*. In total it offers scientists +about 60,000 CPU cores and a peak performance of more than 1.5 quadrillion floating point +operations per second. The architecture specifically tailored to data-intensive computing, Big Data +analytics, and artificial intelligence methods with extensive capabilities for energy measurement +and performance monitoring provides ideal conditions to achieve the ambitious research goals of the +users and the ZIH. + +## SMP Nodes - up to 2 TB RAM + +- 5 Nodes, each with + - 4 x Intel(R) Xeon(R) CPU E7-4850 v3 (14 cores) @ 2.20 GHz, Multithreading disabled + - 2 TB RAM +- Hostnames: `taurussmp[3-7]` +- Slurm partition: `smp2` + +??? 
hint "Node topology" + +  + {: align=center} diff --git a/doc.zih.tu-dresden.de/docs/archive/systems_switched_off.md b/doc.zih.tu-dresden.de/docs/archive/systems_switched_off.md index 34880dfe3da0a57c6729ae813ac14142d96f460a..f2c381e57465fa3f1b9e6d10d64310cb1085faec 100644 --- a/doc.zih.tu-dresden.de/docs/archive/systems_switched_off.md +++ b/doc.zih.tu-dresden.de/docs/archive/systems_switched_off.md @@ -15,6 +15,7 @@ Documentation on former systems for future reference can be found on the followi - [Windows-HPC-Server Titan](system_titan.md) - [PC-Cluster Triton](system_triton.md) - [Shared-Memory-System Venus](system_venus.md) +- [Bull Taurus](system_taurus.md) ## Historical Overview diff --git a/doc.zih.tu-dresden.de/docs/contrib/content_rules.md b/doc.zih.tu-dresden.de/docs/contrib/content_rules.md index 1f54cd072b71e92e975864be51f2e7701a0bcd1c..e4de5420ca847cb4e3331e6aa0b14bf614891221 100644 --- a/doc.zih.tu-dresden.de/docs/contrib/content_rules.md +++ b/doc.zih.tu-dresden.de/docs/contrib/content_rules.md @@ -33,7 +33,7 @@ These licenses also apply to your contributions. If you are in doubt, please contact us either via [GitLab Issue](https://gitlab.hrz.tu-chemnitz.de/zih/hpcsupport/hpc-compendium/-/issues) -or via [Email](mailto:hpcsupport@zih.tu-dresden.de). +or via [Email](mailto:hpc-support@tu-dresden.de). ## Quick Overview @@ -397,11 +397,15 @@ This should help to avoid errors. | Localhost | `marie@local$` | | Login nodes | `marie@login$` | | Arbitrary compute node | `marie@compute$` | -| Partition `haswell` | `marie@haswell$` | -| Partition `ml` | `marie@ml$` | -| Partition `alpha` | `marie@alpha$` | -| Partition `romeo` | `marie@romeo$` | -| Partition `julia` | `marie@julia$` | +| Compute node `Barnard` | `marie@barnard$` | +| Login node `Barnard` | `marie@login.barnard$` | +| Compute node `Power9` | `marie@power9$` | +| Login node `Power9` | `marie@login.power9$` | +| Compute node `Alpha` | `marie@alpha$` | +| Login node `Alpha` | `marie@login.alpha$` | +| Compute node `Romeo` | `marie@romeo$` | +| Login node `Romeo` | `marie@login.romeo$` | +| Node `Julia` | `marie@julia$` | | Partition `dcv` | `marie@dcv$` | * **Always use a prompt**, even if there is no output provided for the shown command. diff --git a/doc.zih.tu-dresden.de/docs/contrib/howto_contribute.md b/doc.zih.tu-dresden.de/docs/contrib/howto_contribute.md index 065f40b75db35cd0ec4ac5c3e1a8fa51347c546e..6061c6fd2d589eec9509b3e946ac0d8926087057 100644 --- a/doc.zih.tu-dresden.de/docs/contrib/howto_contribute.md +++ b/doc.zih.tu-dresden.de/docs/contrib/howto_contribute.md @@ -42,7 +42,7 @@ merge request back to the `preview` branch. A member of the ZIH team will review (four-eyes principle) and finally merge your changes to `preview`. All contributions need to pass through the CI pipeline consisting of several checks to ensure compliance with the content rules. Please, don't worry too much about the checks. The ZIH staff will help you with that. You can find -more information about the [CI/CD pipeline](cicd-pipeline) in the eponymous subsection. +more information about the [CI/CD pipeline](#cicd-pipeline) in the eponymous subsection. In order to publish the updates and make them visible in the compendium, the changes on `preview` branch are either automatically merged into the `main` branch on every @@ -114,7 +114,7 @@ documentation. !!! warning "HPC support" Non-documentation issues and requests need to be send as ticket to - [hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de). 
+ [hpc-support@tu-dresden.de](mailto:hpc-support@tu-dresden.de). ## Contribute via Web IDE diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md index f39b03c7737a765b976f398304c6b2efbe164126..3af08d9ea429e6ce0afbab33a3e3118391f18335 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/file_systems.md @@ -4,15 +4,13 @@ As soon as you have access to ZIH systems, you have to manage your data. Several available. Each filesystem serves for special purpose according to their respective capacity, performance and permanence. -## Work Directories - -| Filesystem | Usable directory | Capacity | Availability | Backup | Remarks | -|:------------|:------------------|:---------|:-------------|:-------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `Lustre` | `/scratch/` | 4 PB | global | No | Only accessible via [Workspaces](workspaces.md). Not made for billions of files! | -| `Lustre` | `/lustre/ssd` | 40 TB | global | No | Only accessible via [Workspaces](workspaces.md). For small I/O operations | -| `BeeGFS` | `/beegfs/global0` | 280 TB | global | No | Only accessible via [Workspaces](workspaces.md). Fastest available filesystem, only for large parallel applications running with millions of small I/O operations | -| `BeeGFS` | `/beegfs/global1` | 232 TB | global | No | Only accessible via [Workspaces](workspaces.md). Fastest available filesystem, only for large parallel applications running with millions of small I/O operations | -| `ext4` | `/tmp` | 95 GB | local | No | is cleaned up after the job automatically | +We differentiate between **permanent filesystems** and **working filesystems**: + +* The [permanent filesystems](permanent.md), i.e., `/home` and `/projects`, are meant to hold your +source code, configuration files, and other permanent data. +* The [working filesystems](working.md), i.e., `horse`, `walrus`, etc., are designed as scratch +filesystems holding your working and temporary data, e.g., input and output of your compute +jobs. ## Recommendations for Filesystem Usage diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md index 6aca684354ab9beb0b0f338c86f0524bf9827e48..8355168e7b9f62570c39c46c564606af75f998b8 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/overview.md @@ -22,7 +22,7 @@ properly: of calculations. The home directory is not a working directory! However, `/home` filesystem is [backed up](#backup) using snapshots; * use `workspaces` as a place for working data (i.e. data sets); Recommendations of choosing the - correct storage system for workspace presented below. + correct storage system for a workspace are presented below. ### Taxonomy of Filesystems diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/permanent.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/permanent.md index 3c75c37faf6650adbbad9a87bcfd228c9b55473d..fd300eee4121f752ca6eaa85e12961342b821924 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/permanent.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/permanent.md @@ -8,9 +8,15 @@ senselessly filling the disks, - By the sheer number and volume of work files, they may keep the backup from working efficiently. 
+| Filesystem Name | Usable Directory | Availability | Type | Quota | +|:------------------|:------------------|:-------------|:---------|:-------------------| +| Home | `/home` | global (w/o Power9) | Lustre | per user: 20 GB | +| Projects | `/projects` | global (w/o Power9) | Lustre | per project | +| (Taurus/old) Home | `/home` | [Power9](../jobs_and_resources/power9.md) | NFS | per user: 20 GB | + ## Global /home Filesystem -Each user has 50 GiB in a `/home` directory independent of the granted capacity for the project. +Each user has 20 GiB in a `/home` directory independent of the granted capacity for the project. The home directory is mounted with read-write permissions on all nodes of the ZIH system. Hints for the usage of the global home directory: diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/warm_archive.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/warm_archive.md index 01c6e319ea575ca971cd52bc7c9dca3f5fd85ff3..04d84c53d38f3868179b88844c6678bfa0e66f2b 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/warm_archive.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/warm_archive.md @@ -1,5 +1,16 @@ # Warm Archive +!!! danger "Warm Archive is End of Life" + + The `warm_archive` storage system will be decommissioned for good together with the Taurus + system (end of 2023). Thus, please **do not use** `warm_archive` any longer and **migrate your + data from** `warm_archive` to the new filesystems. We provide a quite comprehensive + documentation on the + [data migration process to the new filesystems](../jobs_and_resources/barnard.md#data-migration-to-new-filesystems). + + You should consider the new `walrus` storage as a substitute for jobs with moderately low + bandwidth and low IOPS. + The warm archive is intended as a storage space for the duration of a running HPC project. It does **not** substitute a long-term archive, though. diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/working.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/working.md new file mode 100644 index 0000000000000000000000000000000000000000..395d32ae71e9256271dd3d047879ea538fa69cd4 --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/working.md @@ -0,0 +1,77 @@ +# Working Filesystems + +As soon as you have access to ZIH systems, you have to manage your data. Several filesystems are +available. Each filesystem serves a special purpose according to its respective capacity, +performance and permanence. + +!!! danger "End of life of `scratch` and `ssd`" + + The filesystems `/lustre/scratch` and `/lustre/ssd` will be turned off on January 3, 2024 for good + (no data access afterwards!). + + The `/beegfs` filesystem will remain available to + [Alpha Centauri](../jobs_and_resources/hardware_overview.md#alpha-centauri) + and + [Power](../jobs_and_resources/hardware_overview.md#power9) + users only. + + All others need to migrate their data to Barnard’s new filesystem `/horse`. Please follow the + detailed instructions on how to [migrate to Barnard](../jobs_and_resources/barnard.md). + +| Filesystem Type | Usable Directory | Capacity | Availability | Remarks | +|:----------------|:------------------|:---------|:-------------------|:----------------------------------------------------------| +| `Lustre` | `/data/horse` | 20 PB | global | Only accessible via [Workspaces](workspaces.md). **The(!)** working directory to meet almost all demands | +| `Lustre` | `/data/walrus` | 20 PB | global | Only accessible via [Workspaces](workspaces.md). For moderately low bandwidth, low IOPS. 
Mounted read-only on compute nodes. | +| `WEKAio` | `/data/weasel` | 1 PB | global (w/o Power) | *Coming 2024!* For high IOPS | +| `BeeGFS` | `/beegfs/.global0` | 280 TB | [Alpha](../jobs_and_resources/alpha_centauri.md) and [Power9](../jobs_and_resources/power9.md) | Only accessible via [Workspaces](workspaces.md). Fastest available filesystem, only for large parallel applications running with millions of small I/O operations | +| `BeeGFS` | `/beegfs/.global1` | 232 TB | [Alpha](../jobs_and_resources/alpha_centauri.md) and [Power9](../jobs_and_resources/power9.md) | Only accessible via [Workspaces](workspaces.md). Fastest available filesystem, only for large parallel applications running with millions of small I/O operations | +| `ext4` | `/tmp` | 95 GB | node local | Systems: tbd. Is cleaned up after the job automatically. | + +??? "Outdated filesystems `/lustre/scratch` and `/lustre/ssd`" + + | Filesystem | Usable directory | Capacity | Availability | Backup | Remarks | + |:------------|:------------------|:---------|:-------------|:-------|:---------------------------------------------------------------------------------| + | `Lustre` | `/scratch/` | 4 PB | global | No | Only accessible via [Workspaces](workspaces.md). Not made for billions of files! | + | `Lustre` | `/lustre/ssd` | 40 TB | global | No | Only accessible via [Workspaces](workspaces.md). For small I/O operations | + +## Recommendations for Filesystem Usage + +To work as efficient as possible, consider the following points + +- Save source code etc. in `/home` or `/projects/...` +- Store checkpoints and other temporary data in [workspaces](workspaces.md) on `horse` +- Compilation in `/dev/shm` or `/tmp` + +Getting high I/O-bandwidth + +- Use many clients +- Use many processes (writing in the same file at the same time is possible) +- Use large I/O transfer blocks + +## Cheat Sheet for Debugging Filesystem Issues + +Users can select from the following commands to get some idea about +their data. + +### General + +For the first view, you can use the command `df`. + +```console +marie@login$ df +``` + +Alternatively, you can use the command `findmnt`, which is also able to report space usage +by adding the parameter `-D`: + +```console +marie@login$ findmnt -D +``` + +Optionally, you can use the parameter `-t` to specify the filesystem type or the parameter `-o` to +alter the output. + +!!! important + + Do **not** use the `du`-command for this purpose. It is able to cause issues + for other users, while reading data from the filesystem. diff --git a/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md b/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md index 924d98077b2489ba5f2516f3e21fe49004747ad2..769e8164a79f0932702a481e331acd64603d25d3 100644 --- a/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md +++ b/doc.zih.tu-dresden.de/docs/data_lifecycle/workspaces.md @@ -2,9 +2,9 @@ Storage systems differ in terms of capacity, streaming bandwidth, IOPS rate, etc. Price and efficiency don't allow to have it all in one. That is why fast parallel filesystems at ZIH have -restrictions with regards to **age of files** and [quota](permanent.md#quotas). The mechanism of -workspaces enables you to better manage your HPC data. It is common and used at a large number -of HPC centers. +restrictions with regards to **lifetime** and volume **[quota](permanent.md#quotas)**. The mechanism +of using _workspaces_ enables you to better manage your HPC data. It is common and used at a large +number of HPC centers. !!! 
note @@ -25,26 +25,89 @@ times. ## Workspace Management +### Workspace Lifetimes + +Since the workspace filesystems are intended for different use cases and thus differ in +performance, their granted timespans differ accordingly. The maximum lifetime and number of +renewals are provided in the following table. + +| Filesystem (use with parameter `--filesystem=<filesystem>`) | Max. Duration in Days | Extensions | Keeptime | [Filesystem Feature](../jobs_and_resources/slurm.md#filesystem-features) | +|:------------------------------------------------------------|---------------:|-----------:|---------:|:-------------------------------------------------------------------------| +| `horse` | 100 | 10 | 30 | | +| `walrus` | 100 | 10 | 60 | | +| `beegfs_global0` (deprecated) | 30 | 2 | 30 | `fs_beegfs_global0` | +| `beegfs` | 30 | 2 | 30 | `fs_beegfs` | +{: summary="Settings for Workspace Filesystems."} + +!!! note + + Currently, not all filesystems are available on all of our five clusters. The page + [Working Filesystems](working.md) provides the necessary information. + +??? warning "End-of-life filesystems" + + The filesystems `warm_archive`, `ssd` and `scratch` will be switched off at the end of 2023. Do not use + them anymore! + + | Filesystem (use with parameter `--filesystem=<filesystem>`) | Duration, days | Extensions | [Filesystem Feature](../jobs_and_resources/slurm.md#filesystem-features) | Remarks | + |:-------------------------------------|---------------:|-----------:|:-------------------------------------------------------------------------|:--------| + | `scratch` (default) | 100 | 10 | `fs_lustre_scratch2` | Scratch filesystem (`/lustre/scratch2`, symbolic link: `/scratch`) with high streaming bandwidth, based on spinning disks | + | `ssd` | 30 | 2 | `fs_lustre_ssd` | High-IOPS filesystem (`/lustre/ssd`, symbolic link: `/ssd`) on SSDs. | + | `warm_archive` | 365 | 2 | `fs_warm_archive_ws` | Capacity filesystem based on spinning disks | + ### List Available Filesystems To list all available filesystems for using workspaces, you can either invoke `ws_list -l` or -`ws_find --list`, e.g., +`ws_find --list`. Since not all workspace filesystems are available on all HPC systems, the concrete +output differs depending on the system you are logged in to. The page [Working Filesystems](working.md) +provides information on which filesystem is available on which cluster. -```console -marie@login$ ws_find --list -available filesystems: -scratch (default) -warm_archive -ssd -beegfs_global0 -beegfs -``` +=== "Barnard" + + ```console + marie@login.barnard$ ws_list -l + available filesystems: + horse (default) + walrus + ``` -!!! note "Default is `scratch`" +=== "Alpha Centauri" - The default filesystem is `scratch`. If you prefer another filesystem (cf. section + ```console + marie@login.alpha$ ws_list -l + available filesystems: + ssd + beegfs_global0 + beegfs (default) + ``` + +=== "Romeo" + + ```console + marie@login.romeo$ ws_list -l + available filesystems: + horse + ``` + +=== "Taurus (deprecated)" + + ```console + marie@login.taurus$ ws_list -l + available filesystems: + scratch (default) + warm_archive + ssd + beegfs_global0 + beegfs + ``` + +!!! note "Default filesystem" + + The output of the commands `ws_find --list` and `ws_list -l` will indicate the + **default filesystem**. If you prefer another filesystem (cf. section [List Available Filesystems](#list-available-filesystems)), you have to explictly - provide the option `--filesystem=<filesystem>` to the workspace commands. +
provide the option `--filesystem=<filesystem>` to the workspace commands. If the default + filesystem is the one you want to work with, you can omit this option. ### List Current Workspaces @@ -53,12 +116,12 @@ The command `ws_list` lists all your currently active (,i.e, not expired) worksp ```console marie@login$ ws_list id: test-workspace - workspace directory : /scratch/ws/0/marie-test-workspace - remaining time : 89 days 23 hours - creation time : Thu Jul 29 10:30:04 2021 - expiration date : Wed Oct 27 10:30:04 2021 - filesystem name : scratch - available extensions : 10 + workspace directory : /data/horse/ws/marie-test-workspace + remaining time : 89 days 23 hours + creation time : Wed Dec 6 14:46:12 2023 + expiration date : Tue Mar 5 14:46:12 2024 + filesystem name : horse + available extensions : 10 ``` The output of `ws_list` can be customized via several options. The following switch tab provides a @@ -66,53 +129,53 @@ overview of some of these options. All available options can be queried by `ws_l === "Certain filesystem" - ``` - marie@login$ ws_list --filesystem=scratch_fast - id: numbercrunch - workspace directory : /lustre/ssd/ws/marie-numbercrunch - remaining time : 2 days 23 hours - creation time : Thu Mar 2 14:15:33 2023 - expiration date : Sun Mar 5 14:15:33 2023 - filesystem name : ssd - available extensions : 2 + ```console + marie@login$ ws_list --filesystem=walrus + id: numbercrunch + workspace directory : /data/walrus/ws/marie-numbercrunch + remaining time : 89 days 23 hours + creation time : Wed Dec 6 14:49:55 2023 + expiration date : Tue Mar 5 14:49:55 2024 + filesystem name : walrus + available extensions : 2 ``` === "Verbose output" - ``` + ```console marie@login$ ws_list -v id: test-workspace - workspace directory : /scratch/ws/0/marie-test-workspace - remaining time : 89 days 23 hours - creation time : Thu Jul 29 10:30:04 2021 - expiration date : Wed Oct 27 10:30:04 2021 - filesystem name : scratch - available extensions : 10 - acctcode : p_numbercrunch - reminder : Sat Oct 20 10:30:04 2021 - mailaddress : marie@tu-dresden.de + workspace directory : /data/horse/ws/marie-test-workspace + remaining time : 89 days 23 hours + creation time : Wed Dec 6 14:46:12 2023 + expiration date : Tue Mar 5 14:46:12 2024 + filesystem name : horse + available extensions : 10 + acctcode : p_numbercrunch + reminder : Tue Feb 27 14:46:12 2024 + mailaddress : marie@tu-dresden.de ``` === "Terse output" - ``` + ```console marie@login$ ws_list -t id: test-workspace - workspace directory : /scratch/ws/0/marie-test-workspace - remaining time : 89 days 23 hours - available extensions : 10 - id: foo - workspace directory : /scratch/ws/0/marie-foo - remaining time : 3 days 22 hours - available extensions : 10 + workspace directory : /data/horse/ws/marie-test-workspace + remaining time : 89 days 23 hours + available extensions : 10 + id: numbercrunch + workspace directory : /data/walrus/ws/marie-numbercrunch + remaining time : 89 days 23 hours + available extensions : 2 ``` === "Show only names" - ``` + ```console marie@login$ ws_list -s test-workspace - foo + numbercrunch ``` === "Sort by remaining time" @@ -123,13 +186,13 @@ overview of some of these options. 
All available options can be queried by `ws_l ``` marie@login$ ws_list -R -t id: test-workspace - workspace directory : /scratch/ws/0/marie-test-workspace + workspace directory : /data/horse/ws/marie-test-workspace remaining time : 89 days 23 hours available extensions : 10 - id: foo - workspace directory : /scratch/ws/0/marie-foof - remaining time : 3 days 22 hours - available extensions : 10 + id: numbercrunch + workspace directory : /data/walrus/ws/marie-numbercrunch + remaining time : 89 days 23 hours + available extensions : 2 ``` ### Allocate a Workspace @@ -154,7 +217,13 @@ Options: -c [ --comment ] arg comment ``` -!!! example "Simple workspace allocation" +!!! Note "Name of a workspace" + + The workspace name should help you to remember the experiment and data stored here. It has to + be unique on a certain filesystem. On the other hand it is possible to use the very same name + for workspaces on different filesystems. + +=== "Simple allocation" The simple way to allocate a workspace is calling `ws_allocate` command with two arguments, where the first specifies the workspace name and the second the duration. This allocates a @@ -163,67 +232,55 @@ Options: ```console marie@login$ ws_allocate test-workspace 90 Info: creating workspace. - /scratch/ws/marie-test-workspace + /data/horse/ws/marie-test-workspace remaining extensions : 10 remaining time in days: 90 ``` -!!! example "Workspace allocation on specific filesystem" +=== "Specific filesystem" In order to allocate a workspace on a non-default filesystem, the option `--filesystem=<filesystem>` is required. ```console - marie@login$ ws_allocate --filesystem=scratch_fast test-workspace 3 + marie@login$ ws_allocate --filesystem=walrus test-workspace 99 Info: creating workspace. - /lustre/ssd/ws/marie-test-workspace + /data/walrus/ws/marie-test-workspace remaining extensions : 2 - remaining time in days: 3 + remaining time in days: 99 ``` -!!! example "Workspace allocation with e-mail reminder" +=== "With e-mail reminder" - This command will create a workspace with the name `test-workspace` on the `/scratch` filesystem - with a duration of 90 days and send an e-mail reminder. The e-mail reminder will be sent every + This command will create a workspace with the name `test-workspace` on the `horse` filesystem + (default) with a duration of 99 days and send an e-mail reminder. The e-mail reminder will be sent every + day starting 7 days prior to expiration. We strongly recommend setting this e-mail reminder. ```console - marie@login$ ws_allocate --reminder=7 --mailaddress=marie.testuser@tu-dresden.de test-workspace 90 + marie@login$ ws_allocate --reminder=7 --mailaddress=marie@tu-dresden.de test-workspace 99 Info: creating workspace. - /scratch/ws/marie-test-workspace + /data/horse/ws/marie-test-workspace remaining extensions : 10 - remaining time in days: 90 + remaining time in days: 99 ``` -!!! Note "Name of a workspace" - - The workspace name should help you to remember the experiment and data stored here. It has to - be unique on a certain filesystem. On the other hand it is possible to use the very same name - for workspaces on different filesystems. - Please refer to the [section Cooperative Usage](#cooperative-usage-group-workspaces) for group workspaces. ### Extension of a Workspace The lifetime of a workspace is finite and different filesystems (storage systems) have different -maximum durations. A workspace can be extended multiple times, depending on the filesystem. 
- -| Filesystem (use with parameter `--filesystem=<filesystem>`) | Duration, days | Extensions | [Filesystem Feature](../jobs_and_resources/slurm.md#filesystem-features) | Remarks | -|:-------------------------------------|---------------:|-----------:|:-------------------------------------------------------------------------|:--------| -| `scratch` (default) | 100 | 10 | `fs_lustre_scratch2` | Scratch filesystem (`/lustre/scratch2`, symbolic link: `/scratch`) with high streaming bandwidth, based on spinning disks | -| `ssd` | 30 | 2 | `fs_lustre_ssd` | High-IOPS filesystem (`/lustre/ssd`, symbolic link: `/ssd`) on SSDs. | -| `beegfs_global0` (deprecated) | 30 | 2 | `fs_beegfs_global0` | High-IOPS filesystem (`/beegfs/global0`) on NVMes. | -| `beegfs` | 30 | 2 | `fs_beegfs` | High-IOPS filesystem (`/beegfs`) on NVMes. | -| `warm_archive` | 365 | 2 | `fs_warm_archive_ws` | Capacity filesystem based on spinning disks | -{: summary="Settings for Workspace Filesystem."} +maximum durations. The life time of a workspace can be adjusted multiple times, depending on the +filesystem. You can find the concrete values in the +[section settings for workspaces](#workspace-lifetimes). -Use the command `ws_extend` to extend your workspace: +Use the command `ws_extend [-F filesystem] workspace days` to extend your workspace: ```console marie@login$ ws_extend -F scratch test-workspace 100 Info: extending workspace. -/scratch/ws/marie-test-workspace +/data/horse/ws/marie-test-workspace remaining extensions : 1 remaining time in days: 100 ``` @@ -239,10 +296,10 @@ workspace, too. This means when you extend a workspace that expires in 90 days with the command ```console -marie@login$ ws_extend -F scratch my-workspace 40 +marie@login$ ws_extend -F scratch test-workspace 40 ``` -it will now expire in 40 days **not** 130 days. +it will now expire in 40 days, **not** in 130 days! ### Send Reminder for Workspace Expiration Date @@ -263,8 +320,10 @@ See the [example above](#allocate-a-workspace) for reference. If you missed setting an e-mail reminder at workspace allocation, you can add a reminder later, e.g. ``` +# initial allocation marie@login$ ws_allocate --name=FancyExp --duration=17 [...] +# add e-mail reminder marie@login$ ws_allocate --name=FancyExp --duration=17 --reminder=7 --mailaddress=marie@dlr.de --extension ``` @@ -279,38 +338,33 @@ The command `ws_send_ical` sends you an ical event on the expiration date of a s as follows: ```console - ws_send_ical --filesystem=<filesystem> --mail=<e-mail-address> --workspace=<workspace name> + ws_send_ical [--filesystem <filesystem>] --mail <e-mail-address> --workspace <workspace name> ``` ### Deletion of a Workspace To delete a workspace use the `ws_release` command. It is mandatory to specify the name of the -workspace and the filesystem in which it is located: +workspace and the filesystem in which it is allocated: ```console -marie@login$ ws_release --filesystem=scratch --name=my-workspace +marie@login$ ws_release --filesystem=horse --name=test-workspace ``` You can list your already released or expired workspaces using the `ws_restore --list` command. 
```console marie@login$ ws_restore --list -warm_archive: -scratch: -marie-my-workspace-1665014486 - unavailable since Thu Oct 6 02:01:26 2022 -marie-foo-647085320 - unavailable since Sat Mar 12 12:42:00 2022 -ssd: -marie-bar-1654074660 - unavailable since Wen Jun 1 11:11:00 2022 -beegfs_global0: -beegfs: +horse: +marie-test-workspace-1701873807 + unavailable since Wed Dec 6 15:43:27 2023 +walrus: +marie-numbercrunch-1701873907 + unavailable since Wed Dec 6 15:45:07 2023 ``` -In this example, the user `marie` has three inactive, i.e., expired, workspaces namely -`my-workspace` in `scratch`, as well as `foo` and `bar` in `ssd` filesystem. The command -`ws_restore --list` lists the name of the workspace and the expiration date. As you can see, the +In this example, the user `marie` has two inactive, i.e., expired, workspaces namely +`test-workspace` in `horse`, as well as numbercrunch in the `walrus` filesystem. The command +`ws_restore --list` lists the name of the workspace and its expiration date. As you can see, the expiration date is added to the workspace name as Unix timestamp. !!! hint "Deleting data in in an expired workspace" @@ -320,44 +374,47 @@ expiration date is added to the workspace name as Unix timestamp. rights remain unchanged. I.e., you can delete the data inside the workspace directory but you must not delete the workspace directory itself! -#### Expirer Process +#### Expire Process The clean up process of expired workspaces is automatically handled by a so-called expirer process. It performs the following steps once per day and filesystem: - Check for remaining life time of all workspaces. - - If the workspaces expired, move it to a hidden directory so that it becomes inactive. + - If the workspaces expired, move it to a hidden directory so that it becomes inactive. - Send reminder e-mails to users if the reminder functionality was configured for their particular workspaces. - Scan through all workspaces in grace period. - - If a workspace exceeded the grace period, the workspace and its data are deleted. + - If a workspace exceeded the grace period, the workspace and its data are permanently deleted. ### Restoring Expired Workspaces -At expiration time your workspace will be moved to a special, hidden directory. For a month (in -warm_archive: 2 months), you can still restore your data **into an existing workspace**. +At expiration time your workspace will be moved to a special, hidden directory. For a month, +you can still restore your data **into an existing workspace**. !!! warning When you release a workspace **by hand**, it will not receive a grace period and be **permanently deleted** the **next day**. The advantage of this design is that you can create - and release workspaces inside jobs and not swamp the filesystem with data no one needs anymore + and release workspaces inside jobs and not flood the filesystem with data no one needs anymore in the hidden directories (when workspaces are in the grace period). 
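If you still need results from a workspace that you are going to release by hand, it might be worth archiving them first, e.g., with the [Datamover](../data_transfer/datamover.md) commands. The following is only a sketch; it assumes that a target workspace `marie-archive` already exists on the `walrus` filesystem:

```console
# Pack the workspace data into an archive on the walrus filesystem (runs as a data transfer job)
marie@login$ dttar -czf /data/walrus/ws/marie-archive/test-workspace.tgz /data/horse/ws/marie-test-workspace
# Release the workspace only after the transfer job has finished successfully
marie@login$ ws_release --filesystem=horse --name=test-workspace
```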
Use ```console -marie@login$ ws_restore --list --filesystem=scratch -scratch: -marie-my-workspace-1665014486 - unavailable since Thu Oct 6 02:01:26 2022 +marie@login$ ws_restore --list +horse: +marie-test-workspace-1701873807 + unavailable since Wed Dec 6 15:43:27 2023 +walrus: +marie-numbercrunch-1701873907 + unavailable since Wed Dec 6 15:45:07 2023 ``` to get a list of your expired workspaces, and then restore them like that into an existing, active workspace 'new_ws': ```console -marie@login$ ws_restore --filesystem=scratch marie-my-workspace-1665014486 new_ws +marie@login$ ws_restore --filesystem=horse marie-test-workspace-1701873807 new_ws ``` The expired workspace has to be specified by its full name as listed by `ws_restore --list`, @@ -400,16 +457,15 @@ the following example (which works [for the program g16](../software/nanoscale_s it to your needs and workflow, e.g. * adopt Slurm options for ressource specification, - * inserting the path to your input file, - * what software you want to [load](../software/modules.md), - * and calling the actual software to do your computation. + * insert the path to your input file, + * specify what software you want to [load](../software/modules.md), + * and call the actual software to do your computation. !!! example "Using temporary workspaces for I/O intensive tasks" ```bash #!/bin/bash - #SBATCH --partition=haswell #SBATCH --time=48:00:00 #SBATCH --nodes=1 #SBATCH --ntasks=1 @@ -473,26 +529,26 @@ the following example (which works [for the program g16](../software/nanoscale_s ### Data for a Campaign For a series of jobs or calculations that work on the same data, you should allocate a workspace -once, e.g., in `scratch` for 100 days: +once, e.g., in `horse` for 100 days: ```console -marie@login$ ws_allocate --filesystem=scratch my_scratchdata 100 +marie@login$ ws_allocate --filesystem=horse my_scratchdata 100 Info: creating workspace. -/scratch/ws/marie-my_scratchdata -remaining extensions : 2 +/data/horse/ws/marie-my_scratchdata +remaining extensions : 10 remaining time in days: 99 ``` You can grant your project group access rights: -``` -chmod g+wrx /scratch/ws/marie-my_scratchdata +```console +marie@login$ chmod g+wrx /data/horse/ws/marie-my_scratchdata ``` And verify it with: ```console -marie@login$ ls -la /scratch/ws/marie-my_scratchdata +marie@login$ ls -la /data/horse/ws/marie-my_scratchdata total 8 drwxrwx--- 2 marie hpcsupport 4096 Jul 10 09:03 . drwxr-xr-x 5 operator adm 4096 Jul 10 09:01 .. @@ -500,39 +556,44 @@ drwxr-xr-x 5 operator adm 4096 Jul 10 09:01 .. ### Mid-Term Storage -For data that seldom changes but consumes a lot of space, the warm archive can be used. Note that -this is mounted read-only on the compute nodes, so you cannot use it as a work directory for your -jobs! +<!-- TODO: to be confirmed - is walrus really intended for this purpose? --> +For data that rarely changes but consumes a lot of space, the `walrus` filesystem can be used. Note +that this is mounted read-only on the compute nodes, so you cannot use it as a work directory for +your jobs! ```console -marie@login$ ws_allocate --filesystem=warm_archive my_inputdata 365 -/warm_archive/ws/marie-my_inputdata +marie@login$ ws_allocate --filesystem=walrus my_inputdata 100 +/data/walrus/ws/marie-my_inputdata remaining extensions : 2 -remaining time in days: 365 +remaining time in days: 100 ``` +<!-- TODO to be confirmed for walrus / warm_archive replacement !!!Attention The warm archive is not built for billions of files. 
There is a quota for 100.000 files per group. Please archive data. +--> +<!-- TODO command not found - not available yet for walrus?! To see your active quota use ```console -marie@login$ qinfo quota /warm_archive/ws/ +marie@login$ qinfo quota /data/walrus/ws/ ``` Note that the workspaces reside under the mountpoint `/warm_archive/ws/` and not `/warm_archive` anymore. +--> ## Cooperative Usage (Group Workspaces) When a workspace is created with the option `-g, --group`, it gets a group workspace that is visible to others (if in the same group) via `ws_list -g`. -!!! hint "Chose group" +!!! hint "Choose group" - If you are member of multiple groups, than the group workspace is visible for your primary + If you are member of multiple groups, then the group workspace is visible for your primary group. You can list all groups you belong to via `groups`, and the first entry is your primary group. @@ -553,7 +614,7 @@ to others (if in the same group) via `ws_list -g`. ```console marie@login$ ws_allocate --group --name=numbercrunch --duration=30 Info: creating workspace. - /scratch/ws/0/marie-numbercrunch + /data/horse/ws/marie-numbercrunch remaining extensions : 10 remaining time in days: 30 ``` @@ -561,8 +622,8 @@ to others (if in the same group) via `ws_list -g`. This workspace directory is readable for the group, e.g., ```console - marie@login$ ls -ld /scratch/ws/0/marie-numbercrunch - drwxr-x--- 2 marie p_number_crunch 4096 Mar 2 15:24 /scratch/ws/0/marie-numbercrunch + marie@login$ ls -ld /data/horse/ws/marie-numbercrunch + drwxr-x--- 2 marie p_number_crunch 4096 Mar 2 15:24 /data/horse/ws/marie-numbercrunch ``` All members of the project group `p_number_crunch` can now list this workspace using @@ -571,7 +632,7 @@ to others (if in the same group) via `ws_list -g`. ```console martin@login$ ws_list -g -t id: numbercrunch - workspace directory : /scratch/ws/0/marie-numbercrunch + workspace directory : /data/horse/ws/marie-numbercrunch remaining time : 29 days 23 hours available extensions : 10 ``` @@ -604,10 +665,11 @@ workspace. **A**: The workspace you want to restore into is either not on the same filesystem or you used the wrong name. Use only the short name that is listed after `id:` when using `ws_list`. +See section [restoring expired workspaces](#restoring-expired-workspaces). ---- -**Q**: I forgot to specify an e-mail alert when allocating my workspace. How can I add the +**Q**: I forgot to specify an e-mail reminder when allocating my workspace. How can I add the e-mail alert functionality to an existing workspace? **A**: You can add the e-mail alert by "overwriting" the workspace settings via diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md b/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md index 80631a56987f8b5f67fca331d65d558740ec80e2..2be180379f6c86c6a22cd3e1ad865736193f58b9 100644 --- a/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md +++ b/doc.zih.tu-dresden.de/docs/data_transfer/datamover.md @@ -1,15 +1,15 @@ # Transfer Data Inside ZIH Systems with Datamover -With the **Datamover**, we provide a special data transfer machine for transferring data with best -transfer speed between the filesystems of ZIH systems. The Datamover machine is not accessible +With the **Datamover**, we provide special data transfer machines for transferring data between +the ZIH filesystems with best transfer speed. The Datamover machine is not accessible through SSH as it is dedicated to data transfers. 
To move or copy files from one filesystem to -another filesystem, you have to use the following commands: +another, you have to use the following commands after logging in to any of the ZIH HPC systems: - `dtcp`, `dtls`, `dtmv`, `dtrm`, `dtrsync`, `dttar`, and `dtwget` -These commands submit a [batch job](../jobs_and_resources/slurm.md) to the data transfer machines -performing the selected command. Except the following options their syntax is the very same as the -well-known shell commands without the prefix *dt*. +These special commands submit a [batch job](../jobs_and_resources/slurm.md) to the data transfer machines +performing the selected command. Their syntax and behavior is the very same as the +well-known shell commands without the prefix *`dt`*, except for the following options. | Additional Option | Description | |---------------------|-------------------------------------------------------------------------------| @@ -31,33 +31,49 @@ To identify the mount points of the different filesystems on the data transfer m | ZIH system | Local directory | Directory on data transfer machine | |:-------------------|:---------------------|:-----------------------------------| -| Taurus | `/scratch/ws` | `/scratch/ws` | -| | `/ssd/ws` | `/ssd/ws` | -| | `/beegfs/global0/ws` | `/beegfs/global0/ws` | -| | `/warm_archive/ws` | `/warm_archive/ws` | -| | `/home` | `/home` | +| **Barnard** | `/data/horse` | `/data/horse` | +| | `/data/walrus` | `/data/walrus` | +| *outdated: Taurus* | `/home` | `/data/old/home` | +| | `/scratch/ws` | `/data/old/lustre/scratch2/ws` | +| | `/ssd/ws` | `/data/old/lustre/ssd/ws` | +| | `/beegfs/global0/ws` | `/data/old/beegfs/global0/ws` | +| | `/warm_archive/ws` | `/data/old/warm_archive/ws` | | | `/projects` | `/projects` | -| **Archive** | | `/archiv` | -| **Group storage** | | `/grp/<group storage>` | +| **Archive** | | `/data/archiv` | ## Usage of Datamover -!!! example "Copying data from `/beegfs/global0` to `/projects` filesystem." +<!--TODO: remove when released in May 2024--> +??? "Data on outdated filesystems" - ``` console - marie@login$ dtcp -r /beegfs/global0/ws/marie-workdata/results /projects/p_number_crunch/. - ``` + !!! example "Copying data from `/beegfs/.global0` to `/projects` filesystem." + + ``` console + marie@login$ dtcp -r /data/old/beegfs/.global0/ws/marie-workdata/results /projects/p_number_crunch/. + ``` + + !!! example "Archive data from `/beegfs/.global0` to `/archiv` filesystem." + + ``` console + marie@login$ dttar -czf /data/archiv/p_number_crunch/results.tgz /data/old/beegfs/global0/ws/marie-workdata/results + ``` + +!!! example "Copy data from `/data/horse` to `/projects` filesystem." + + ``` console + marie@login$ dtcp -r /data/horse/ws/marie-workdata/results /projects/p_number_crunch/. + ``` -!!! example "Moving data from `/beegfs/global0` to `/warm_archive` filesystem." +!!! example "Move data from `/data/horse` to `/data/walrus` filesystem." ``` console - marie@login$ dtmv /beegfs/global0/ws/marie-workdata/results /warm_archive/ws/marie-archive/. + marie@login$ dtmv /data/horse/ws/marie-workdata/results /data/walrus/ws/marie-archive/. ``` -!!! example "Archive data from `/beegfs/global0` to `/archiv` filesystem." +!!! example "Archive data from `/data/walrus` to `/archiv` filesystem." ``` console - marie@login$ dttar -czf /archiv/p_number_crunch/results.tgz /beegfs/global0/ws/marie-workdata/results + marie@login$ dttar -czf /archiv/p_number_crunch/results.tgz /data/walrus/ws/marie-workdata/results ``` !!! 
warning @@ -66,34 +82,28 @@ To identify the mount points of the different filesystems on the data transfer m !!! note The [warm archive](../data_lifecycle/warm_archive.md) and the `projects` filesystem are not writable from within batch jobs. - However, you can store the data in the `warm_archive` using the Datamover. + However, you can store the data in the [`walrus` filesystem](../data_lifecycle/working.md) + using the Datamover nodes via `dt*` commands. ## Transferring Files Between ZIH Systems and Group Drive -1. Copy your public SSH key from ZIH system to `login1.zih.tu-dresden.de`. +In order to let the datamover have access to your group drive, copy your public SSH key from ZIH +system to `login1.zih.tu-dresden.de`, first. ``` console marie@login$ ssh-copy-id -i ~/.ssh/id_rsa.pub login1.zih.tu-dresden.de + # Export the name of your group drive for reuse of example commands + marie@login$ export GROUP_DRIVE_NAME=<my-drive-name> ``` -1. Now you can access your group drive with the Datamover commands. -!!! example "Export the name of your group drive." - - ``` console - marie@login$ export GROUP_DRIVE_NAME=??? - ``` - -!!! note - Please replace `???` with the name of your group drive. - -!!! example "Copying data from your group drive to `/beegfs/global0` filesystem." +!!! example "Copy data from your group drive to `/data/horse` filesystem." ``` console - marie@login$ dtrsync -av dgw.zih.tu-dresden.de:/glw/${GROUP_DRIVE_NAME}/inputfile /beegfs/global0/ws/marie-workdata/. + marie@login$ dtrsync -av dgw.zih.tu-dresden.de:/glw/${GROUP_DRIVE_NAME}/inputfile /data/horse/ws/marie-workdata/. ``` -!!! example "Copying data from `/beegfs/global0` filesystem to your group drive." +!!! example "Copy data from `/data/horse` filesystem to your group drive." ``` console - marie@login$ dtrsync -av /beegfs/global0/ws/marie-workdata/resultfile dgw.zih.tu-dresden.de:/glw/${GROUP_DRIVE_NAME}/. + marie@login$ dtrsync -av /data/horse/ws/marie-workdata/resultfile dgw.zih.tu-dresden.de:/glw/${GROUP_DRIVE_NAME}/. ``` diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/dataport_nodes.md b/doc.zih.tu-dresden.de/docs/data_transfer/dataport_nodes.md new file mode 100644 index 0000000000000000000000000000000000000000..0ad33a08f0c2db4336afbb88d0a12e2bee103751 --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/data_transfer/dataport_nodes.md @@ -0,0 +1,190 @@ +# Transfer Data to/from ZIH Systems via Dataport Nodes + +To copy large data to/from ZIH systems, the so-called **dataport nodes** should be used. While it is +possible to transfer small files directly via the login nodes, they are not intended to be used that +way. Furthermore, longer transfers will hit the CPU time limit on the login nodes, i.e. the process +get killed. The **dataport nodes** have a better uplink (10 GBit/s) allowing for higher bandwidth. +Note that you cannot log in via SSH to the dataport nodes, but only use +<!-- [NT] currently not available:`scp`, --> +`rsync` or `sftp` (incl. FTP-clients like e.g. +[FileZilla](https://filezilla-project.org/)) on them. + +The dataport nodes are reachable under the hostnames + +- `dataport1.hpc.tu-dresden.de` (IP: 141.30.73.4) +- `dataport2.hpc.tu-dresden.de` (IP: 141.30.73.5) + +Through the usage of these dataport nodes, you can bring your data to ZIH HPC systems or get data +from there - they have access to the different HPC +[filesystems](../data_lifecycle/file_systems.md#recommendations-for-filesystem-usage). 
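For example, a single file can be copied from your local workstation to your home directory on the ZIH systems with `rsync` via one of the dataport nodes. This is only a minimal sketch, assuming your ZIH login and SSH key are already set up; the available tools and further examples are described below:

```console
marie@local$ rsync -av mydata.csv marie@dataport1.hpc.tu-dresden.de:/home/marie/
```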
+As of 11/2023, the following directories are accessible: + +- `/home` +- `/projects` +- `/data/horse` +- `/data/walrus` +- `/data/archiv` +- (`/data/software`) +- (`/data/old/home`) +- (`/data/old/software`) + +## Access From Linux + +There are at least three tools to exchange data between your local workstation and ZIH systems. They +are explained in the following section in more detail. + +!!! important + The following explanations require that you have already set up your + [SSH configuration](../access/ssh_login.md#configuring-default-parameters-for-ssh). +<!-- [NT] scp currently not available + +### SCP + +The tool [`scp`](https://www.man7.org/linux/man-pages/man1/scp.1.html) +(OpenSSH secure file copy) copies files between hosts on a network. To copy all files +in a directory, the option `-r` has to be specified. + +??? example "Example: Copy a file from your workstation to ZIH systems" + + ```bash + marie@local$ scp <file> dataport:<target-location> + + # Add -r to copy whole directory + marie@local$ scp -r <directory> dataport:<target-location> + ``` + + For example, if you want to copy your data file `mydata.csv` to the directory `input` in your + home directory, you would use the following: + + ```console + marie@local$ scp mydata.csv dataport:input/ + ``` + +??? example "Example: Copy a file from ZIH systems to your workstation" + + ```bash + marie@local$ scp dataport:<file> <target-location> + + # Add -r to copy whole directory + marie@local$ scp -r dataport:<directory> <target-location> + ``` + + For example, if you have a directory named `output` in your home directory on ZIH systems and + you want to copy it to the directory `/tmp` on your workstation, you would use the following: + + ```console + marie@local$ scp -r dataport:output /tmp + ``` +--> + +### SFTP + +The tool [`sftp`](https://man7.org/linux/man-pages/man1/sftp.1.html) (OpenSSH secure file transfer) +is a file transfer program, which performs all operations over an encrypted SSH transport. It may +use compression to increase performance. + +`sftp` is basically a virtual command line, which you could access and exit as follows. + +!!! warning "Note" + It is important from which point in your directory tree you 'enter' `sftp`! + The current working directory (double ckeck with `pwd`) will be the target folder on your local + machine from/to which remote files from the ZIH system will be put/get by `sftp`. + The local folder might also be changed during a session with special commands. + During the `sftp` session, you can use regular commands like `ls`, `cd`, `pwd` etc. + But if you wish to access your local workstation, these must be prefixed with the letter `l` + (`l`ocal), e.g., `lls`, `lcd` or `lpwd`. + +```console +# Enter virtual command line +marie@local$ sftp dataport +# Exit virtual command line +sftp> exit +# or +sftp> <Ctrl+D> +``` + +??? example "Example: Copy a file from your workstation to ZIH systems" + + ```console + marie@local$ cd my/local/work + marie@local$ sftp dataport + # Copy file + sftp> put <file> + # Copy directory + sftp> put -r <directory> + ``` + +??? example "Example: Copy a file from ZIH systems to your local workstation" + + ```console + marie@local$ sftp dataport + # Copy file + sftp> get <file> + # change local (target) directory + sftp> lcd /my/local/work + # Copy directory + sftp> get -r <directory> + ``` + +### Rsync + +[`Rsync`](https://man7.org/linux/man-pages/man1/rsync.1.html), is a fast and extraordinarily +versatile file copying tool. 
It can copy locally, to/from another host over any remote shell, or +to/from a remote `rsync` daemon. It is famous for its delta-transfer algorithm, which reduces the +amount of data sent over the network by sending only the differences between the source files and +the existing files in the destination. + +Type following commands in the terminal when you are in the directory of +the local machine. + +??? example "Example: Copy a file from your workstation to ZIH systems" + + ```console + # Copy file + marie@local$ rsync <file> dataport:<target-location> + # Copy directory + marie@local$ rsync -r <directory> dataport:<target-location> + ``` + +??? example "Example: Copy a file from ZIH systems to your local workstation" + + ```console + # Copy file + marie@local$ rsync dataport:<file> <target-location> + # Copy directory + marie@local$ rsync -r dataport:<directory> <target-location> + ``` + +## Access From Windows + +### Command Line + +Windows 10 (1809 and higher) comes with a +[built-in OpenSSH support](https://docs.microsoft.com/en-us/windows-server/administration/openssh/openssh_overview) +including the above described <!--[SCP](#scp) and -->[SFTP](#sftp). + +### GUI - Using WinSCP + +First you have to install [WinSCP](http://winscp.net/eng/download.php). + +Then you have to execute the WinSCP application and configure some +option as described below. + +<!-- screenshots will have to be updated--> + +{: align="center"} + + +{: align="center"} + + +{: align="center"} + + +{: align="center"} + +After your connection succeeded, you can copy files from your local workstation to ZIH systems and +the other way around. + + +{: align="center"} diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/export_nodes.md b/doc.zih.tu-dresden.de/docs/data_transfer/export_nodes.md index 2b3a3da9e005352b1c2165afa3ce184486b89e30..64823121378f21dd951abd62b81d83d8a279251b 100644 --- a/doc.zih.tu-dresden.de/docs/data_transfer/export_nodes.md +++ b/doc.zih.tu-dresden.de/docs/data_transfer/export_nodes.md @@ -1,4 +1,4 @@ -# Transfer Data to/from ZIH Systems via Export Nodes +# Transfer Data to/from old ZIH Systems via Export Nodes To copy large data to/from ZIH systems, the so-called **export nodes** should be used. While it is possible to transfer small files directly via the login nodes, they are not intended to be used that diff --git a/doc.zih.tu-dresden.de/docs/data_transfer/overview.md b/doc.zih.tu-dresden.de/docs/data_transfer/overview.md index 6e8a1bf1cc12e36e4aa15bd46b9eaf84e24171bc..28b458b6b831a36469f30018755bd89fa011a20a 100644 --- a/doc.zih.tu-dresden.de/docs/data_transfer/overview.md +++ b/doc.zih.tu-dresden.de/docs/data_transfer/overview.md @@ -1,6 +1,6 @@ # Data Transfer -## Data Transfer to/from ZIH Systems: Export Nodes +## Data Transfer to/from ZIH Systems: Dataport Nodes There are at least three tools for exchanging data between your local workstation and ZIH systems: `scp`, `rsync`, and `sftp`. Please refer to the offline or online man pages of @@ -8,9 +8,14 @@ There are at least three tools for exchanging data between your local workstatio [rsync](https://man7.org/linux/man-pages/man1/rsync.1.html), and [sftp](https://man7.org/linux/man-pages/man1/sftp.1.html) for detailed information. -No matter what tool you prefer, it is crucial that the **export nodes** are used as preferred way to +No matter what tool you prefer, it is crucial that the **dataport nodes** are used as preferred way to copy data to/from ZIH systems. 
Please follow the link to the documentation on -[export nodes](export_nodes.md) for further reference and examples. +[dataport nodes](dataport_nodes.md) for further reference and examples. + +!!! warning "Note" + + The former **export nodes** are still available as long as the outdated filesystems (`scratch`, + `ssd`, etc.) are accessible. Their operation will end on January 3rd, 2024. ## Data Transfer Inside ZIH Systems: Datamover diff --git a/doc.zih.tu-dresden.de/docs/index.md b/doc.zih.tu-dresden.de/docs/index.md index 64802daa9ae83761fb147961185fec3322880f0b..addbc83cb956f789f40da89239bdc76b753bc805 100644 --- a/doc.zih.tu-dresden.de/docs/index.md +++ b/doc.zih.tu-dresden.de/docs/index.md @@ -27,13 +27,19 @@ Please also find out the other ways you could contribute in our !!! tip "Reminder" Non-documentation issues and requests need to be send to - [hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de). + [hpc-support@tu-dresden.de](mailto:hpc-support@tu-dresden.de). ## News -* **2023-11-16** [OpenMPI 4.1.x - Workaround for MPI-IO Performance Loss](jobs_and_resources/mpi_issues/#openmpi-v41x-performance-loss-with-mpi-io-module-ompio) -* **2023-10-04** [User tests on Barnard](jobs_and_resources/barnard_test.md) -* **2023-06-01** [New hardware and complete re-design](jobs_and_resources/architecture_2023.md) +* **2023-12-07** [Maintenance finished: CPU cluster `Romeo` is now available](jobs_and_resources/romeo.md) +* **2023-12-01** [Maintenance finished: GPU cluster `Alpha Centauri` is now available](jobs_and_resources/alpha_centauri.md) +* **2023-11-25** [Data transfer available for Barnard via Dataport Nodes](data_transfer/dataport_nodes.md) +* **2023-11-14** [End of life of `scratch` and `ssd` filesystems is January 3 2024](data_lifecycle/file_systems.md) +* **2023-11-14** [End of life of Taurus system is December 11 2023](jobs_and_resources/hardware_overview.md) +* **2023-11-14** [Update on maintenance dates and work w.r.t. redesign of HPC systems](jobs_and_resources/hardware_overview.md) +* **2023-11-06** [Substantial update on "How-To: Migration to Barnard"](jobs_and_resources/barnard.md) +* **2023-10-16** [Open MPI 4.1.x - Workaround for MPI-IO Performance Loss](jobs_and_resources/mpi_issues.md#performance-loss-with-mpi-io-module-ompio) +* **2023-06-01** [New hardware and complete re-design](jobs_and_resources/hardware_overview.md#architectural-re-design-2023) * **2023-01-04** [New hardware: NVIDIA Arm HPC Developer Kit](jobs_and_resources/arm_hpc_devkit.md) ## Training and Courses diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md index b5b09281dd9ab9fdde89c7ae4ffe9ad4ec48c089..4cfccc786620a9ec70258f7eee0243bd413d9281 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/alpha_centauri.md @@ -1,9 +1,45 @@ -# Alpha Centauri +# GPU Cluster Alpha Centauri -The multi-GPU sub-cluster "Alpha Centauri" has been installed for AI-related computations (ScaDS.AI). +The multi-GPU cluster `Alpha Centauri` has been installed for AI-related computations (ScaDS.AI). The hardware specification is documented on the page -[HPC Resources](hardware_overview.md#amd-rome-cpus-nvidia-a100). +[HPC Resources](hardware_overview.md#alpha-centauri). + +## Becoming a Stand-Alone Cluster + +The former HPC system Taurus is partly switched-off and partly split up into separate clusters +until the end of 2023. 
One such upcoming separate cluster is what you have known as partition +`alpha` so far. With the end of the maintenance at November 30 2023, `Alpha Centauri` is now a +stand-alone cluster with + +* homogenous hardware resources incl. two login nodes `login[1-2].alpha.hpc.tu-dresden.de`, +* and own Slurm batch system. + +### Filesystems + +Your new `/home` directory (from `Barnard`) is also your `/home` on `Alpha Centauri`. +If you have not +[migrated your `/home` from Taurus to your **new** `/home` on Barnard](barnard.md#data-management-and-data-transfer) +, please do so as soon as possible! + +!!! warning "Current limititations w.r.t. filesystems" + + For now, `Alpha Centauri` will not be integrated in the InfiniBand fabric of Barnard. With this + comes a dire restriction: **the only work filesystems for Alpha Centauri** will be the `/beegfs` + filesystems. (`/scratch` and `/lustre/ssd` are not usable any longer.) + + Please, prepare your + stage-in/stage-out workflows using our [datamovers](../data_transfer/datamover.md) to enable the + work with larger datasets that might be stored on Barnard’s new capacity filesystem + `/data/walrus`. The datamover commands are not yet running. Thus, you need to use them from + Barnard! + + The new Lustre filesystems, namely `horse` and `walrus`, will be mounted as soon as `Alpha` is + recabled (planned for May 2024). + +!!! warning "Current limititations w.r.t. workspace management" + + Workspace management commands do not work for `beegfs` yet. (Use them from Taurus!) ## Usage @@ -23,80 +59,102 @@ cores are available per node. ### Modules The easiest way is using the [module system](../software/modules.md). -The software for the partition `alpha` is available in module environment `modenv/hiera`. +All software available from the module system has been specifically build for the cluster `Alpha` +i.e., with optimzation for Zen2 microarchitecture and CUDA-support enabled. -To check the available modules for `modenv/hiera`, use the command +To check the available modules for `Alpha`, use the command ```console -marie@alpha$ module spider <module_name> +marie@login.alpha$ module spider <module_name> ``` -For example, to check whether PyTorch is available in version 1.7.1: +??? example "Example: Searching and loading PyTorch" -```console -marie@alpha$ module spider PyTorch/1.7.1 + For example, to check which `PyTorch` versions are available you can invoke ------------------------------------------------------------------------------------------------------------------------------------------ - PyTorch: PyTorch/1.7.1 ------------------------------------------------------------------------------------------------------------------------------------------ - Description: - Tensors and Dynamic neural networks in Python with strong GPU acceleration. PyTorch is a deep learning framework that puts Python - first. + ```console + marie@login.alpha$ module spider PyTorch + ------------------------------------------------------------------------------------------------------------------------- + PyTorch: + ------------------------------------------------------------------------------------------------------------------------- + Description: + Tensors and Dynamic neural networks in Python with strong GPU acceleration. PyTorch is a deep learning framework + that puts Python first. + Versions: + PyTorch/1.12.0 + PyTorch/1.12.1-CUDA-11.7.0 + PyTorch/1.12.1 + [...] 
+ ``` - You will need to load all module(s) on any one of the lines below before the "PyTorch/1.7.1" module is available to load. + Not all modules can be loaded directly. Most modules are build with a certain compiler or toolchain + that need to be loaded beforehand. + Luckely, the module system can tell us, what we need to do for a specific module or software version - modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 + ```console + marie@login.alpha$ module spider PyTorch/1.12.1-CUDA-11.7.0 -[...] -``` + ------------------------------------------------------------------------------------------------------------------------- + PyTorch: PyTorch/1.12.1-CUDA-11.7.0 + ------------------------------------------------------------------------------------------------------------------------- + Description: + Tensors and Dynamic neural networks in Python with strong GPU acceleration. PyTorch is a deep learning framework + that puts Python first. -The output of `module spider <module_name>` provides hints which dependencies should be loaded beforehand: -```console -marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 -Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5 and 15 dependencies loaded. -marie@alpha$ module avail PyTorch --------------------------------------- /sw/modules/hiera/all/MPI/GCC-CUDA/10.2.0-11.1.1/OpenMPI/4.0.5 --------------------------------------- - PyTorch/1.7.1 (L) PyTorch/1.9.0 (D) -marie@alpha$ module load PyTorch/1.7.1 -Module PyTorch/1.7.1 and 39 dependencies loaded. -marie@alpha$ python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())" -1.7.1 -True -``` + You will need to load all module(s) on any one of the lines below before the "PyTorch/1.12.1" module is available to load. -### Python Virtual Environments + release/23.04 GCC/11.3.0 OpenMPI/4.1.4 + [...] + ``` -[Virtual environments](../software/python_virtual_environments.md) allow users to install -additional python packages and create an isolated -runtime environment. We recommend using `virtualenv` for this purpose. + Finaly, the commandline to load the `PyTorch/1.12.1-CUDA-11.7.0` module is -```console -marie@login$ srun --partition=alpha-interactive --nodes=1 --cpus-per-task=1 --gres=gpu:1 --time=01:00:00 --pty bash -marie@alpha$ mkdir python-environments # please use workspaces -marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch -Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies loaded. -marie@alpha$ which python -/sw/installed/Python/3.8.6-GCCcore-10.2.0/bin/python -marie@alpha$ pip list -[...] -marie@alpha$ virtualenv --system-site-packages python-environments/my-torch-env -created virtual environment CPython3.8.6.final.0-64 in 42960ms - creator CPython3Posix(dest=~/python-environments/my-torch-env, clear=False, global=True) - seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=~/.local/share/virtualenv) - added seed packages: pip==21.1.3, setuptools==57.2.0, wheel==0.36.2 - activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator -marie@alpha$ source python-environments/my-torch-env/bin/activate -(my-torch-env) marie@alpha$ pip install torchvision -[...] -Installing collected packages: torchvision -Successfully installed torchvision-0.10.0 -[...] 
-(my-torch-env) marie@alpha$ python -c "import torchvision; print(torchvision.__version__)" -0.10.0+cu102 -(my-torch-env) marie@alpha$ deactivate -``` + ```console + marie@login.alpha$ module load release/23.04 GCC/11.3.0 OpenMPI/4.1.4 PyTorch/1.12.1-CUDA-11.7.0 + Module GCC/11.3.0, OpenMPI/4.1.4, PyTorch/1.12.1-CUDA-11.7.0 and 64 dependencies loaded. + ``` + + ```console + marie@login.alpha$ python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())" + 1.12.1 + True + ``` + +### Python Virtual Environments + +[Virtual environments](../software/python_virtual_environments.md) allow you to install +additional Python packages and create an isolated runtime environment. We recommend using +`virtualenv` for this purpose. + +??? example "Example: Creating virtual environment and installing `torchvision` package" + + ```console + marie@login.alpha$ srun --nodes=1 --cpus-per-task=1 --gres=gpu:1 --time=01:00:00 --pty bash -l + marie@alpha$ mkdir python-environments # please use workspaces + marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch + Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies loaded. + marie@alpha$ which python + /sw/installed/Python/3.8.6-GCCcore-10.2.0/bin/python + marie@alpha$ pip list + [...] + marie@alpha$ virtualenv --system-site-packages python-environments/my-torch-env + created virtual environment CPython3.8.6.final.0-64 in 42960ms + creator CPython3Posix(dest=~/python-environments/my-torch-env, clear=False, global=True) + seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=~/.local/share/virtualenv) + added seed packages: pip==21.1.3, setuptools==57.2.0, wheel==0.36.2 + activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator + marie@alpha$ source python-environments/my-torch-env/bin/activate + (my-torch-env) marie@alpha$ pip install torchvision + [...] + Installing collected packages: torchvision + Successfully installed torchvision-0.10.0 + [...] + (my-torch-env) marie@alpha$ python -c "import torchvision; print(torchvision.__version__)" + 0.10.0+cu102 + (my-torch-env) marie@alpha$ deactivate + ``` ### JupyterHub diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md deleted file mode 100644 index b0d23e2e789719ed0ff95a84f8f1056753cbb60c..0000000000000000000000000000000000000000 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/architecture_2023.md +++ /dev/null @@ -1,58 +0,0 @@ -# Architectural Re-Design 2023 - -With the replacement of the Taurus system by the cluster `Barnard` in 2023, -the rest of the installed hardware had to be re-connected, both with -Infiniband and with Ethernet. - - -{: align=center} - -## Compute Systems - -All compute clusters now act as separate entities having their own -login nodes of the same hardware and their very own Slurm batch systems. The different hardware, -e.g. Romeo and Alpha Centauri, is no longer managed via a single Slurm instance with -corresponding partitions. Instead, you as user now chose the hardware by the choice of the -correct login node. - -The login nodes can be used for smaller interactive jobs on the clusters. There are -restrictions in place, though, wrt. usable resources and time per user. For larger -computations, please use interactive jobs. 
- -## Storage Systems - -### Permanent Filesystems - -We now have `/home`, `/projects` and `/software` in a Lustre filesystem. Snapshots -and tape backup are configured. For convenience, we will make the old home available -read-only as `/home_old` on the data mover nodes for the data migration period. - -`/warm_archive` is mounted on the data movers, only. - -### Work Filesystems - -With new players with new software in the filesystem market it is getting more and more -complicated to identify the best suited filesystem for temporary data. In many cases, -only tests can provide the right answer, for a short time. - -For an easier grasp on the major categories (size, speed), the work filesystems now come -with the names of animals: - -* `/data/horse` - 20 PB - high bandwidth (Lustre) -* `/data/octopus` - 0.5 PB - for interactive usage (Lustre) -* `/data/weasel` - 1 PB - for high IOPS (WEKA) - coming soon - -### Difference Between "Work" And "Permanent" - -A large number of changing files is a challenge for any backup system. To protect -our snapshots and backup from work data, -`/projects` cannot be used for temporary data on the compute nodes - it is mounted read-only. - -Please use our data mover mechanisms to transfer worthy data to permanent -storages. - -## Migration Phase - -For about one month, the new cluster Barnard, and the old cluster Taurus -will run side-by-side - both with their respective filesystems. You can find a comprehensive -[description of the migration phase here](migration_2023.md). diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/arm_hpc_devkit.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/arm_hpc_devkit.md index 0f358bec3d47224da657a34a8d38b57bb0aa4354..7b10490891b70ba2ad548d1f448c2706038749c7 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/arm_hpc_devkit.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/arm_hpc_devkit.md @@ -12,7 +12,7 @@ This Arm HPC Developer kit offers: * 512G DDR4 memory (8x 64G) * 6TB SAS/ SATA 3.5″ * 2x NVIDIA A100 GPU -* 2x NVIDIA BlueField-2 E-Series DPU: 200GbE/HDR single-port, both connected to the Infiniband network +* 2x NVIDIA BlueField-2 E-Series DPU: 200GbE/HDR single-port, both connected to the InfiniBand network ## Further Information @@ -24,7 +24,7 @@ Further information about this new system can be found on the following websites ## Getting Access To get access to the developer kit, write a mail to -[the hpcsupport team](mailto:hpcsupport@zih.tu-dresden.de) +[the hpcsupport team](mailto:hpc-support@tu-dresden.de) with your ZIH login and a short description, what you want to use the developer kit for. After you have gained access, you can log into the developer kit system via SSH from the login @@ -34,6 +34,11 @@ nodes: marie@login$ ssh taurusa1 ``` +!!! note + + November 2023: After the migration to Barnard: for now, until further notice, access remains via + the Taurus login nodes. + ## Running Applications !!! warning "Not under Slurm control" diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md new file mode 100644 index 0000000000000000000000000000000000000000..87b034f7bef81caa7a00951d20739bcba7252d3c --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard.md @@ -0,0 +1,343 @@ +# CPU Cluster Barnard + +All HPC users are cordially invited to migrate to our new HPC system **Barnard** and prepare your +software and workflows for production there. + +!!! 
note "Migration Phase" + + Please make sure to have read the details on the overall + [Architectural Re-Design 2023](hardware_overview.md) before further reading. + +The migration from Taurus to Barnard comprises the following steps: + +* [Prepare login to Barnard](#login-to-barnard) +* [Data management and data transfer to new filesystems](#data-management-and-data-transfer) +* [Update job scripts and workflow to new software](#software) +* [Update job scripts and workflow w.r.t. Slurm](#slurm) + +!!! note + + We highly recommand to first read the entire page carefully, and then execute the steps. + +The migration can only be successful as a joint effort of HPC team and users. +We value your feedback. Please provide it directly via our ticket system. For better processing, +please add "Barnard:" as a prefix to the subject of the [support ticket](../support/support.md). + +## Login to Barnard + +!!! hint + + All users and projects from Taurus now can work on Barnard. + +You use `login[1-4].barnard.hpc.tu-dresden.de` to access the system +from campus (or VPN). In order to verify the SSH fingerprints of the login nodes, please refer to +the page [Fingerprints](../access/key_fingerprints.md#barnard). + +All users have **new empty HOME** file systems, this means you have first to ... + +??? "... install your public SSH key on Barnard" + + - Please create a new SSH keypair with ed25519 encryption, secured with + a passphrase. Please refer to this + [page for instructions](../access/ssh_login.md#before-your-first-connection). + - After login, add the public key to your `.ssh/authorized_keys` file on Barnard. + +## Data Management and Data Transfer + +### Filesystems on Barnard + +Our new HPC system Barnard also comes with **two new Lustre filesystems**, namely `/data/horse` and +`/data/walrus`. Both have a capacity of 20 PB, but differ in performance and intended usage, see +below. In order to support the data life cycle management, the well-known +[workspace concept](#workspaces-on-barnard) is applied. + +* The `/project` filesystem is the same on Taurus and Barnard +(mounted read-only on the compute nodes). +* The new work filesystem is `/data/horse`. +* The slower `/data/walrus` can be considered as a substitute for the old + `/warm_archive`- mounted **read-only** on the compute nodes. + It can be used to store e.g. results. + +!!! Warning + + All old filesystems, i.e., `ssd`, `beegfs`, and `scratch`, will be shutdown by the end of 2023. + To work with your data from Taurus you might have to move/copy them to the new storages. + + Please, carefully read the following documentation and instructions. + +### Workspaces on Barnard + +The filesystems `/data/horse` and `/data/walrus` can only be accessed via workspaces. Please refer +to the [workspace page](../data_lifecycle/workspaces.md), if you are not familiar with the +workspace concept and the corresponding commands. You can find the settings for +workspaces on these two filesystems in the +[section Settings for Workspaces](../data_lifecycle/workspaces.md#settings-for-workspaces). + +### Data Migration to New Filesystems + +Since all old filesystems of Taurus will be shutdown by the end of 2023, your data needs to be +migrated to the new filesystems on Barnard. This migration comprises + +* your personal `/home` directory, +* your workspaces on `/ssd`, `/beegfs` and `/scratch`. + +!!! note "It's your turn" + + **You are responsible for the migration of your data**. With the shutdown of the old + filesystems, all data will be deleted. + +!!! 
note "Make a plan" + + We highly recommand to **take some minutes for planing the transfer process**. Do not act with + precipitation. + + Please **do not copy your entire data** from the old to the new filesystems, but consider this + opportunity for **cleaning up your data**. E.g., it might make sense to delete outdated scripts, + old log files, etc., and move other files, e.g., results, to the `/data/walrus` filesystem. + +!!! hint "Generic login" + + In the following we will use the generic login `marie` and workspace `numbercrunch` + ([cf. content rules on generic names](../contrib/content_rules.md#data-privacy-and-generic-names)). + **Please make sure to replace it with your personal login.** + +We have four new [datamover nodes](../data_transfer/datamover.md) that have mounted all storages +of the old Taurus and new Barnard system. Do not use the datamovers from Taurus, i.e., all data +transfer need to be invoked from Barnard! Thus, the very first step is to +[login to Barnard](#login-to-barnard). + +The command `dtinfo` will provide you the mount points of the old filesystems + +```console +marie@barnard$ dtinfo +[...] +directory on datamover mounting clusters directory on cluster + +/data/old/home Taurus /home +/data/old/lustre/scratch2 Taurus /scratch +/data/old/lustre/ssd Taurus /lustre/ssd +[...] +``` + +In the following, we will provide instructions with comprehensive examples for the data transfer of +your data to the new `/home` filesystem, as well as the working filesystems `/data/horse` and +`/data/walrus`. + +??? "Migration of Your Home Directory" + + Your personal (old) home directory at Taurus will not be automatically transferred to the new + Barnard system. Please do not copy your entire home, but clean up your data. E.g., it might + make sense to delete outdated scripts, old log files, etc., and move other files to an archive + filesystem. Thus, please transfer only selected directories and files that you need on the new + system. + + The steps are as follows: + + 1. Login to Barnard, i.e., + + ``` + ssh login[1-4].barnard.tu-dresden.de + ``` + + 1. The command `dtinfo` will provide you the mountpoint + + ```console + marie@barnard$ dtinfo + [...] + directory on datamover mounting clusters directory on cluster + + /data/old/home Taurus /home + [...] + ``` + + 1. Use the `dtls` command to list your files on the old home directory + + ``` + marie@barnard$ dtls /data/old/home/marie + [...] + ``` + + 1. Use the `dtcp` command to invoke a transfer job, e.g., + + ```console + marie@barnard$ dtcp --recursive /data/old/home/marie/<useful data> /home/marie/ + ``` + + **Note**, please adopt the source and target paths to your needs. All available options can be + queried via `dtinfo --help`. + + !!! warning + + Please be aware that there is **no synchronisation process** between your home directories + at Taurus and Barnard. Thus, after the very first transfer, they will become divergent. + +Please follow these instructions for transferring you data from `ssd`, `beegfs` and `scratch` to the +new filesystems. The instructions and examples are divided by the target not the source filesystem. + +This migration task requires a preliminary step: You need to allocate workspaces on the +target filesystems. + +??? Note "Preliminary Step: Allocate a workspace" + + Both `/data/horse/` and `/data/walrus` can only be used with + [workspaces](../data_lifecycle/workspaces.md). 
Before you invoke any data transer from the old + working filesystems to the new ones, you need to allocate a workspace first. + + The command `ws_list --list` lists the available and the default filesystem for workspaces. + + ``` + marie@barnard$ ws_list --list + available filesystems: + horse (default) + walrus + ``` + + As you can see, `/data/horse` is the default workspace filesystem at Barnard. I.e., if you + want to allocate, extend or release a workspace on `/data/walrus`, you need to pass the + option `--filesystem=walrus` explicitly to the corresponding workspace commands. Please + refer to our [workspace documentation](../data_lifecycle/workspaces.md), if you need refresh + your knowledge. + + The most simple command to allocate a workspace is as follows + + ``` + marie@barnard$ ws_allocate numbercrunch 90 + ``` + + Please refer to the table holding the settings + (cf. [subsection workspaces on Barnard](#workspaces-on-barnard)) for the max. duration and + `ws_allocate --help` for all available options. + +??? "Migration to work filesystem `/data/horse`" + + === "Source: old `/scratch`" + + We are synchronizing the old `/scratch` to `/data/horse/lustre/scratch2/` (**last: October + 18**). + If you transfer data from the old `/scratch` to `/data/horse`, it is sufficient to use + `dtmv` instead of `dtcp` since this data has already been copied to a special directory on + the new `horse` filesystem. Thus, you just need to move it to the right place (the Lustre + metadata system will update the correspoding entries). + + The workspaces within the subdirectories `ws/0` and `ws/1`, respectively. A corresponding + data transfer using `dtmv` looks like + + ```console + marie@barnard$ dtmv /data/horse/lustre/scratch2/ws/0/marie-numbercrunch/<useful data> /data/horse/ws/marie-numbercrunch/ + ``` + + Please do **NOT** copy those data yourself. Instead check if it is already sychronized + to `/data/horse/lustre/scratch2/ws/0/marie-numbercrunch`. + + In case you need to update this (Gigabytes, not Terabytes!) please run `dtrsync` like in + + ``` + marie@barnard$ dtrsync -a /data/old/lustre/scratch2/ws/0/marie-numbercrunch/<useful data> /data/horse/ws/marie-numbercrunch/ + ``` + + === "Source: old `/ssd`" + + The old `ssd` filesystem is mounted at `/data/old/lustre/ssd` on the datamover nodes and the + workspaces are within the subdirectory `ws/`. A corresponding data transfer using `dtcp` + looks like + + ```console + marie@barnard$ dtcp --recursive /data/old/lustre/ssd/ws/marie-numbercrunch/<useful data> /data/horse/ws/marie-numbercrunch/ + ``` + + === "Source: old `/beegfs`" + + The old `beegfs` filesystem is mounted at `/data/old/beegfs` on the datamover nodes and the + workspaces are within the subdirectories `ws/0` and `ws/1`, respectively. A corresponding + data transfer using `dtcp` looks like + + ```console + marie@barnard$ dtcp --recursive /data/old/beegfs/ws/0/marie-numbercrunch/<useful data> /data/horse/ws/marie-numbercrunch/ + ``` + +??? "Migration to `/data/walrus`" + + === "Source: old `/scratch`" + + We are synchronizing the old `/scratch` to `/data/horse/lustre/scratch2/` (**last: October + 18**). The old `scratch` filesystem has been already synchronized to + `/data/horse/lustre/scratch2` nodes and the workspaces are within the subdirectories `ws/0` + and `ws/1`, respectively. 
A corresponding data transfer using `dtcp` looks like + + ```console + marie@barnard$ dtcp --recursive /data/horse/lustre/scratch2/ws/0/marie-numbercrunch/<useful data> /data/walrus/ws/marie-numbercrunch/ + ``` + + Please do **NOT** copy those data yourself. Instead check if it is already sychronized + to `/data/horse/lustre/scratch2/ws/0/marie-numbercrunch`. + + In case you need to update this (Gigabytes, not Terabytes!) please run `dtrsync` like in + + ``` + marie@barnard$ dtrsync -a /data/old/lustre/scratch2/ws/0/marie-numbercrunch/<useful data> /data/walrus/ws/marie-numbercrunch/ + ``` + + === "Source: old `/ssd`" + + The old `ssd` filesystem is mounted at `/data/old/lustre/ssd` on the datamover nodes and the + workspaces are within the subdirectory `ws/`. A corresponding data transfer using `dtcp` + looks like + + ```console + marie@barnard$ dtcp --recursive /data/old/lustre/ssd/<useful data> /data/walrus/ws/marie-numbercrunch/ + ``` + + === "Source: old `/beegfs`" + + The old `beegfs` filesystem is mounted at `/data/old/beegfs` on the datamover nodes and the + workspaces are within the subdirectories `ws/0` and `ws/1`, respectively. A corresponding + data transfer using `dtcp` looks like + + ```console + marie@barnard$ dtcp --recursive /data/old/beegfs/ws/0/marie-numbercrunch/<useful data> /data/walrus/ws/marie-numbercrunch/ + ``` + +??? "Migration from `/warm_archive`" + + We are synchronizing the old `/warm_archive` to `/data/walrus/warm_archive/`. Therefor, it can + be sufficient to use `dtmv` instead of `dtcp` (No data will be copied, but the Lustre system + will update the correspoding metadata entries). A corresponding data transfer using `dtmv` looks + like + + ```console + marie@barnard$ dtmv /data/walrus/warm_archive/ws/marie-numbercrunch/<useful data> /data/walrus/ws/marie-numbercrunch/ + ``` + + Please do **NOT** copy those data yourself. Instead check if it is already sychronized + to `/data/walrus/warm_archive/ws`. + + In case you need to update this (Gigabytes, not Terabytes!) please run `dtrsync` like in + + ``` + marie@barnard$ dtrsync -a /data/old/warm_archive/ws/marie-numbercrunch/<useful data> /data/walrus/ws/marie-numbercrunch/ + ``` + +When the last compute system will have been migrated the old file systems will be +set write-protected and we start a final synchronization (scratch+walrus). +The target directories for synchronization `/data/horse/lustre/scratch2/ws` and +`/data/walrus/warm_archive/ws/` will not be deleted automatically in the meantime. + +## Software + +Barnard is running on Linux RHEL 8.7. All application software was re-built consequently using Git +and CI/CD pipelines for handling the multitude of versions. + +We start with `release/23.10` which is based on software requests from user feedbacks of our +HPC users. Most major software versions exist on all hardware platforms. + +Please use `module spider` to identify the software modules you need to load. + +## Slurm + +* We are running the most recent Slurm version. +* You must not use the old partition names. +* Not all things are tested. + +Note that most nodes on Barnard don't have a local disk and space in `/tmp` is **very** limited. +If you need a local disk request this with the +[Slurm feature](slurm.md#node-features-for-selective-job-submission) `--constraint=local_disk`. 
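For example, an interactive job on a node that provides a local disk might be requested as follows (only a sketch, please adjust the resource requests and time limit to your needs):

```console
marie@barnard$ srun --nodes=1 --ntasks=1 --cpus-per-task=4 --constraint=local_disk --time=01:00:00 --pty bash -l
```

The same `--constraint=local_disk` option can also be added to the `#SBATCH` header of a batch script.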
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard_test.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard_test.md deleted file mode 100644 index c9d503b5f0337eaa2d51b5e162793bb0217e6a0d..0000000000000000000000000000000000000000 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/barnard_test.md +++ /dev/null @@ -1,163 +0,0 @@ -# Tests on Barnard - -All HPC users are invited to test our new HPC system Barnard and prepare your software -and workflows for production there. For general hints please refer to these sites: - -* [Details on architecture](/jobs_and_resources/architecture_2023), -* [Description of the migration](migration_2023.md). - -We value your feedback. Please provide it directly via our ticket system. For better processing, -please add "Barnard:" as a prefix to the subject of the [support ticket](../support/support). - -Here, you can find few hints which might help you with the first steps. - -## Login to Barnard - -All users and projects from Taurus now can work on Barnard. - -They can use `login[2-4].barnard.hpc.tu-dresden.de` to access the system -from campus (or VPN). [Fingerprints](/access/key_fingerprints/#barnard) - -All users have **new empty HOME** file systems, this means you have first to... - -??? "... install your public ssh key on the system" - - - Please create a new SSH keypair with ed25519 encryption, secured with - a passphrase. Please refer to this - [page for instructions](../../access/ssh_login#before-your-first-connection). - - After login, add the public key to your `.ssh/authorized_keys` file - on Barnard. - -## Data Management - -* The `/project` filesystem is the same on Taurus and Barnard -(mounted read-only on the compute nodes). -* The new work filesystem is `/data/horse`. -* The slower `/data/walrus` can be considered as a substitute for the old - `/warm_archive`- mounted **read-only** on the compute nodes. - It can be used to store e.g. results. - -These `/data/horse` and `/data/walrus` can be accesed via workspaces. Please refer to the -[workspace page](../../data_lifecycle/workspaces/), if you are not familiar with workspaces. - -??? "Tips on workspaces" - * To list all available workspace filessystem, invoke the command `ws_list -l`." - * Please use the command `dtinfo` to get the current mount points: - ``` - marie@login1> dtinfo - [...] - directory on datamover mounting clusters directory on cluster - - /data/old/home Taurus /home - /data/old/lustre/scratch2 Taurus /scratch - /data/old/lustre/ssd Taurus /lustre/ssd - [...] - ``` - -!!! Warning - - All old filesystems fill be shutdown by the end of 2023. - - To work with your data from Taurus you might have to move/copy them to the new storages. - -For this, we have four new [datamover nodes](/data_transfer/datamover) that have mounted all storages -of the old and new system. (Do not use the datamovers from Taurus!) - -??? "Migration from Home Directory" - - Your personal (old) home directory at Taurus will not be automatically transferred to the new Barnard - system. **You are responsible for this task.** Please do not copy your entire home, but consider - this opportunity for cleaning up you data. E.g., it might make sense to delete outdated scripts, old - log files, etc., and move other files to an archive filesystem. Thus, please transfer only selected - directories and files that you need on the new system. - - The well-known [datamover tools](../../data_transfer/datamover/) are available to run such transfer - jobs under Slurm. The steps are as follows: - - 1. 
Login to Barnard: `ssh login[1-4].barnard.tu-dresden.de` - 1. The command `dtinfo` will provide you the mountpoints - - ```console - marie@barnard$ dtinfo - [...] - directory on datamover mounting clusters directory on cluster - - /data/old/home Taurus /home - /data/old/lustre/scratch2 Taurus /scratch - /data/old/lustre/ssd Taurus /lustre/ssd - [...] - ``` - - 1. Use the `dtls` command to list your files on the old home directory: `marie@barnard$ dtls - /data/old/home/marie` - 1. Use `dtcp` command to invoke a transfer job, e.g., - - ```console - marie@barnard$ dtcp --recursive /data/old/home/marie/<useful data> /home/marie/ - ``` - - **Note**, please adopt the source and target paths to your needs. All available options can be - queried via `dtinfo --help`. - - !!! warning - - Please be aware that there is **no synchronisation process** between your home directories at - Taurus and Barnard. Thus, after the very first transfer, they will become divergent. - - We recommand to **take some minutes for planing the transfer process**. Do not act with - precipitation. - -??? "Migration from `/lustre/ssd` or `/beegfs`" - - **You** are entirely responsible for the transfer of these data to the new location. - Start the dtrsync process as soon as possible. (And maybe repeat it at a later time.) - -??? "Migration from `/lustre/scratch2` aka `/scratch`" - - We are synchronizing this (**last: October 18**) to `/data/horse/lustre/scratch2/`. - - Please do **NOT** copy those data yourself. Instead check if it is already sychronized - to `/data/walrus/warm_archive/ws`. - - In case you need to update this (Gigabytes, not Terabytes!) please run `dtrsync` like in - `dtrsync -a /data/old/lustre/scratch2/ws/0/my-workspace/newest/ /data/horse/lustre/scratch2/ws/0/my-workspace/newest/` - -??? "Migration from `/warm_archive`" - - We are preparing another sync from `/warm_archive` to `The process of syncing data from `/warm_archive` to `/data/walrus/warm_archive` is still ongoing. - - Please do **NOT** copy those data yourself. Instead check if it is already sychronized - to `/data/walrus/warm_archive/ws`. - - In case you need to update this (Gigabytes, not Terabytes!) please run `dtrsync` like in - `dtrsync -a /data/old/warm_archive/ws/my-workspace/newest/ /data/walrus/warm_archive/ws/my-workspace/newest/` - -When the last compute system will have been migrated the old file systems will be -set write-protected and we start a final synchronization (sratch+walrus). -The target directories for synchronization `/data/horse/lustre/scratch2/ws` and -`/data/walrus/warm_archive/ws/` will not be deleted automatically in the mean time. - -## Software - -Please use `module spider` to identify the software modules you need to load.Like -on Taurus. - - The default release version is 23.10. - -## Slurm - -* We are running the most recent Slurm version. -* You must not use the old partition names. -* Not all things are tested. - -## Updates after your feedback (state: October 19) - -* A **second synchronization** from `/scratch` has started on **October, 18**, and is - now nearly done. -* A first, and incomplete synchronization from `/warm_archive` has been done (see above). - With support from NEC we are transferring the rest in the next weeks. -* The **data transfer tools** now work fine. -* After fixing too tight security restrictions, **all users can login** now. -* **ANSYS** now starts: please check if your specific use case works. -* **login1** is under construction, do not use it at the moment. 
Workspace creation does - not work there. diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/checkpoint_restart.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/checkpoint_restart.md index 9df02cc5dcc2029294b1a7598946b842ee496078..6c0033d46cf3f121a1a6d6f8976aaafc8b40fb13 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/checkpoint_restart.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/checkpoint_restart.md @@ -56,7 +56,7 @@ checkpoint/restart bits transparently to your batch script. You just have to spe total runtime of your calculation and the interval in which you wish to do checkpoints. The latter (plus the time it takes to write the checkpoint) will then be the runtime of the individual jobs. This should be targeted at below 24 hours in order to be able to run on all -[partitions haswell64](../jobs_and_resources/partitions_and_limits.md#runtime-limits). For +[partitions haswell64](../jobs_and_resources/slurm_limits.md#slurm-resource-limits-table). For increased fault-tolerance, it can be chosen even shorter. To use it, first add a `dmtcp_launch` before your application call in your batch script. In the case @@ -193,3 +193,54 @@ have been exported by the `start_coordinator` function). ./dmtcp_restart_script.sh -h $DMTCP_COORD_HOST -p $DMTCP_COORD_PORT ``` + +## Signal Handler + +If for some reason your job is taking unexpectedly long and would be killed by Slurm +due to reaching its time limit, you can use `--signal=<sig_num>[@sig_time]` to make +Slurm sent your processes a Unix signal `sig_time` seconds before. +Your application should take care of this signal and can write some checkpoints +or output intermediate results and terminate gracefully. +`sig_num` can be any numeric signal number or name, e.g. `10` and `USR1`. You will find a +comprehensive list of Unix signals including documentation in the +[signal man page](https://man7.org/linux/man-pages/man7/signal.7.html). +`sig_time` has to be an integer value between 0 and 65535 representing seconds +Slurm sends the signal before the time limit is reached. Due to resolution effects +the signal may be sent up to 60 seconds earlier than specified. + +The command line + +```console +marie@login$ srun --ntasks=1 --time=00:05:00 --signal=USR1@120 ./signal-handler +``` + +makes Slurm send `./signal-handler` the signal `USR1` 120 seconds before +the time limit is reached. The following example provides a skeleton implementation of a +signal-aware application. + +???+ example "Example signal-handler.c" + + ```C hl_lines="15" + #include <stdio.h> + #include <stdlib.h> + #include <signal.h> + + void sigfunc(int sig) { + if(sig == SIGUSR1) { + printf("Allocation's time limit reached. Saving checkpoint and exit\n"); + exit(EXIT_SUCCESS); + } + + return; + } + + int main(void) { + signal(SIGUSR1, sigfunc); + printf("do number crunching\n"); + while(1) { + ; + } + + return EXIT_SUCCESS; + } + ``` diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md index bf5f25146730bceb6b442bd20d4e08e73e0863fc..10fe69e2e0f158f5e7048d7702afa46163264d91 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview.md @@ -1,5 +1,6 @@ # HPC Resources +<!--TODO: Update this introduction--> HPC resources in ZIH systems comprise the *High Performance Computing and Storage Complex* and its extension *High Performance Computing – Data Analytics*. 
In total it offers scientists about 60,000 CPU cores and a peak performance of more than 1.5 quadrillion floating point @@ -8,42 +9,172 @@ analytics, and artificial intelligence methods with extensive capabilities for e and performance monitoring provides ideal conditions to achieve the ambitious research goals of the users and the ZIH. -## Login and Export Nodes +!!! danger "HPC Systems Migration Phase" -- 4 Login-Nodes `tauruslogin[3-6].hrsk.tu-dresden.de` - - Each login node is equipped with 2x Intel(R) Xeon(R) CPU E5-2680 v3 with 24 cores in total @ - 2.50 GHz, Multithreading disabled, 64 GB RAM, 128 GB SSD local disk - - IPs: 141.30.73.\[102-105\] -- 2 Data-Transfer-Nodes `taurusexport[3-4].hrsk.tu-dresden.de` + **On December 11 2023 Taurus will be decommissioned for good**. + + With our new HPC system Barnard comes a significant change in HPC system landscape at ZIH: We + will have five homogeneous clusters with their own Slurm instances and with cluster specific + login nodes running on the same CPU. + +With the installation and start of operation of the [new HPC system Barnard](#barnard), +quite significant changes w.r.t. HPC system landscape at ZIH follow. The former HPC system Taurus is +partly switched-off and partly split up into separate clusters. In the end, from the users' +perspective, there will be **five separate clusters**: + +| Name | Description | Year of Installation | DNS | +| ----------------------------------- | ----------------------| -------------------- | --- | +| [`Barnard`](#barnard) | CPU cluster | 2023 | `n[1001-1630].barnard.hpc.tu-dresden.de` | +| [`Alpha Centauri`](#alpha-centauri) | GPU cluster | 2021 | `i[8001-8037].alpha.hpc.tu-dresden.de` | +| [`Julia`](#julia) | Single SMP system | 2021 | `smp8.julia.hpc.tu-dresden.de` | +| [`Romeo`](#romeo) | CPU cluster | 2020 | `i[8001-8190].romeo.hpc.tu-dresden.de` | +| [`Power9`](#power9) | IBM Power/GPU cluster | 2018 | `ml[1-29].power9.hpc.tu-dresden.de` | + +All clusters will run with their own [Slurm batch system](slurm.md) and job submission is possible +only from their respective login nodes. + +## Architectural Re-Design 2023 + +Over the last decade we have been running our HPC system of high heterogeneity with a single +Slurm batch system. This made things very complicated, especially to inexperienced users. With +the replacement of the Taurus system by the cluster [Barnard](#barnard) we +**now create homogeneous clusters with their own Slurm instances and with cluster specific login +nodes** running on the same CPU. Job submission will be possible only from within the cluster +(compute or login node). + +All clusters will be integrated to the new InfiniBand fabric and have then the same access to +the shared filesystems. This recabling will require a brief downtime of a few days. + + +{: align=center} + +### Compute Systems + +All compute clusters now act as separate entities having their own +login nodes of the same hardware and their very own Slurm batch systems. The different hardware, +e.g. Romeo and Alpha Centauri, is no longer managed via a single Slurm instance with +corresponding partitions. Instead, you as user now chose the hardware by the choice of the +correct login node. + +The login nodes can be used for smaller interactive jobs on the clusters. There are +restrictions in place, though, wrt. usable resources and time per user. For larger +computations, please use interactive jobs. 
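As a rough sketch of this workflow (the chosen cluster, login node, and job script name are merely examples), you pick the hardware by logging in to the corresponding cluster's login node and submit your job from there:

```console
marie@local$ ssh login1.alpha.hpc.tu-dresden.de
marie@login$ sbatch my_jobfile.sh
```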
+ +### Storage Systems + +For an easier grasp on the major categories (size, speed), the +work filesystems now come with the names of animals. + +#### Permanent Filesystems + +We now have `/home` and `/software` in a Lustre filesystem. Snapshots +and tape backup are configured. (`/projects` remains the same until a recabling.) + +The Lustre filesystem `/data/walrus` is meant for larger data with a slow +access. It is installed to replace `/warm_archive`. + +#### Work Filesystems + +In the filesystem market with new players it is getting more and more +complicated to identify the best suited filesystem for a specific use case. Often, +only tests can find the best setup for a specific workload. + +* `/data/horse` - 20 PB - high bandwidth (Lustre) +* `/data/octopus` - 0.5 PB - for interactive usage (Lustre) - to be mounted on Alpha Centauri +* `/data/weasel` - 1 PB - for high IOPS (WEKA) - coming 2024. + +#### Difference Between "Work" And "Permanent" + +A large number of changing files is a challenge for any backup system. To protect +our snapshots and backup from work data, +`/projects` cannot be used for temporary data on the compute nodes - it is mounted read-only. + +For `/home`, we create snapshots and tape backups. That's why working there, +with a high frequency of changing files is a bad idea. + +Please use our data mover mechanisms to transfer worthy data to permanent +storages or long-term archives. + +### Migration Phase + +For about one month, the new cluster Barnard, and the old cluster Taurus +will run side-by-side - both with their respective filesystems. We provide a comprehensive +[description of the migration to Barnard](barnard.md). + +<! -- +The follwing figure provides a graphical overview of the overall process (red: user action +required): + + +{: align=center} +--> + +## Login and Dataport Nodes + +!!! danger "**On December 11 2023 Taurus will be decommissioned for good**." + + Do not use Taurus for production anymore. + +- Login-Nodes + - Individual for each cluster. See sections below. +- 2 Data-Transfer-Nodes + - 2 servers without interactive login, only available via file transfer protocols + (`rsync`, `ftp`) + - `dataport[3-4].hpc.tu-dresden.de` + - IPs: 141.30.73.\[4,5\] + - Further information on the usage is documented on the site + [dataport Nodes](../data_transfer/dataport_nodes.md) +- *outdated*: 2 Data-Transfer-Nodes `taurusexport[3-4].hrsk.tu-dresden.de`<!--TODO: remove after release in May 2024--> - DNS Alias `taurusexport.hrsk.tu-dresden.de` - 2 Servers without interactive login, only available via file transfer protocols (`rsync`, `ftp`) - - IPs: 141.30.73.\[82,83\] - - Further information on the usage is documented on the site - [Export Nodes](../data_transfer/export_nodes.md) + - available as long as outdated filesystems (e.g. `scratch`) are accessible + +## Barnard + +The cluster `Barnard` is a general purpose cluster by Bull. It is based on Intel Sapphire Rapids +CPUs. + +- 630 diskless nodes, each with + - 2 x Intel Xeon Platinum 8470 (52 cores) @ 2.00 GHz, Multithreading enabled + - 512 GB RAM +- Login nodes: `login[1-4].barnard.hpc.tu-dresden.de` +- Hostnames: `n[1001-1630].barnard.hpc.tu-dresden.de` +- Operating system: Red Hat Enterpise Linux 8.7 +- Further information on the usage is documented on the site [CPU Cluster Barnard](barnard.md) + +## Alpha Centauri -## AMD Rome CPUs + NVIDIA A100 +The cluster `Alpha Centauri` (short: `Alpha`) by NEC provides AMD Rome CPUs and NVIDIA A100 GPUs +and is designed for AI and ML tasks. 
- 34 nodes, each with - 8 x NVIDIA A100-SXM4 Tensor Core-GPUs - 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading available - 1 TB RAM - 3.5 TB local memory on NVMe device at `/tmp` -- Hostnames: `taurusi[8001-8034]` -- Slurm partition: `alpha` -- Further information on the usage is documented on the site [Alpha Centauri Nodes](alpha_centauri.md) +- Login nodes: `login[1-2].alpha.hpc.tu-dresden.de` +- Hostnames: `i[8001-8037].alpha.hpc.tu-dresden.de` +- Operating system: Rocky Linux 8.7 +- Further information on the usage is documented on the site [GPU Cluster Alpha Centauri](alpha_centauri.md) -## Island 7 - AMD Rome CPUs +## Romeo + +The cluster `Romeo` is a general purpose cluster by NEC based on AMD Rome CPUs. - 192 nodes, each with - 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading available - 512 GB RAM - 200 GB local memory on SSD at `/tmp` -- Hostnames: `taurusi[7001-7192]` -- Slurm partition: `romeo` -- Further information on the usage is documented on the site [AMD Rome Nodes](rome_nodes.md) +- Login nodes: `login[1-2].romeo.hpc.tu-dresden.de` +- Hostnames: `i[7001-7190].romeo.hpc.tu-dresden.de` +- Operating system: Rocky Linux 8.7 +- Further information on the usage is documented on the site [CPU Cluster Romeo](romeo.md) + +## Julia -## Large SMP System HPE Superdome Flex +The cluster `Julia` is a large SMP (shared memory parallel) system by HPE based on Superdome Flex +architecture. - 1 node, with - 32 x Intel(R) Xeon(R) Platinum 8276M CPU @ 2.20 GHz (28 cores) @@ -51,59 +182,58 @@ users and the ZIH. - Configured as one single node - 48 TB RAM (usable: 47 TB - one TB is used for cache coherence protocols) - 370 TB of fast NVME storage available at `/nvme/<projectname>` -- Hostname: `taurussmp8` -- Slurm partition: `julia` -- Further information on the usage is documented on the site [HPE Superdome Flex](sd_flex.md) +- Hostname: `smp8.julia.hpc.tu-dresden.de` +- Further information on the usage is documented on the site [SMP System Julia](julia.md) -## IBM Power9 Nodes for Machine Learning +??? note "Maintenance from November 27 to December 12" -For machine learning, we have IBM AC922 nodes installed with this configuration: + The recabling will take place from November 27 to December 12. These works are planned: -- 32 nodes, each with - - 2 x IBM Power9 CPU (2.80 GHz, 3.10 GHz boost, 22 cores) - - 256 GB RAM DDR4 2666 MHz - - 6 x NVIDIA VOLTA V100 with 32 GB HBM2 - - NVLINK bandwidth 150 GB/s between GPUs and host -- Hostnames: `taurusml[1-32]` -- Slurm partition: `ml` + * update the software stack (OS, firmware, software), + * change the ethernet access (new VLANs), + * complete integration of Romeo and Julia into the Barnard Infiniband network to get full + bandwidth access to all Barnard filesystems, + * configure and deploy stand-alone Slurm batch systems. -## Island 6 - Intel Haswell CPUs + After the maintenance, the Julia system reappears as a stand-alone cluster that can be reached + via `smp8.julia.hpc.tu-dresden.de`. -- 612 nodes, each with - - 2 x Intel(R) Xeon(R) CPU E5-2680 v3 (12 cores) @ 2.50 GHz, Multithreading disabled - - 128 GB local memory on SSD -- Varying amounts of main memory (selected automatically by the batch system for you according to - your job requirements) - * 594 nodes with 2.67 GB RAM per core (64 GB in total): `taurusi[6001-6540,6559-6612]` - - 18 nodes with 10.67 GB RAM per core (256 GB in total): `taurusi[6541-6558]` -- Hostnames: `taurusi[6001-6612]` -- Slurm Partition: `haswell` + **Changes w.r.t. 
filesystems:** + Your new `/home` directory (from Barnard) will become your `/home` on Romeo, *Julia*, Alpha + Centauri and the Power9 system. Thus, please [migrate your `/home` from Taurus to your **new** + `/home` on Barnard](barnard.md#data-management-and-data-transfer). -??? hint "Node topology" + The old work filesystems `/lustre/scratch` and `/lustre/ssd will` be turned off on January 1 + 2024 for good (no data access afterwards!). The new work filesystem available on the Julia + system will be `/horse`. Please + [migrate your working data to `/horse`](barnard.md#data-migration-to-new-filesystems). -  - {: align=center} +## Power9 -## Island 2 Phase 2 - Intel Haswell CPUs + NVIDIA K80 GPUs +The cluster `Power9` by IBM is based on Power9 CPUs and provides NVIDIA V100 GPUs. +`Power9` is specifically designed for machine learning (ML) tasks. -- 64 nodes, each with - - 2 x Intel(R) Xeon(R) CPU E5-E5-2680 v3 (12 cores) @ 2.50 GHz, Multithreading disabled - - 64 GB RAM (2.67 GB per core) - - 128 GB local memory on SSD - - 4 x NVIDIA Tesla K80 (12 GB GDDR RAM) GPUs -- Hostnames: `taurusi[2045-2108]` -- Slurm Partition: `gpu2` -- Node topology, same as [island 4 - 6](#island-6-intel-haswell-cpus) +- 32 nodes, each with + - 2 x IBM Power9 CPU (2.80 GHz, 3.10 GHz boost, 22 cores) + - 256 GB RAM DDR4 2666 MHz + - 6 x NVIDIA VOLTA V100 with 32 GB HBM2 + - NVLINK bandwidth 150 GB/s between GPUs and host +- Login nodes: `login[1-2].power9.hpc.tu-dresden.de` +- Hostnames: `ml[1-29].power9.hpc.tu-dresden.de` (after recabling phase; expected January '24) +- Further information on the usage is documented on the site [GPU Cluster Power9](power9.md) -## SMP Nodes - up to 2 TB RAM +??? note "Maintenance" -- 5 Nodes, each with - - 4 x Intel(R) Xeon(R) CPU E7-4850 v3 (14 cores) @ 2.20 GHz, Multithreading disabled - - 2 TB RAM -- Hostnames: `taurussmp[3-7]` -- Slurm partition: `smp2` + The recabling will take place from November 27 to December 12. After the maintenance, the Power9 + system reappears as a stand-alone cluster that can be reached via + `ml[1-29].power9.hpc.tu-dresden.de`. -??? hint "Node topology" + **Changes w.r.t. filesystems:** + Your new `/home` directory (from Barnard) will become your `/home` on Romeo, Julia, Alpha + Centauri and the *Power9* system. Thus, please [migrate your `/home` from Taurus to your **new** + `/home` on Barnard](barnard.md#data-management-and-data-transfer). -  - {: align=center} + The old work filesystems `/lustre/scratch` and `/lustre/ssd will` be turned off on January 1 + 2024 for good (no data access afterwards!). The only work filesystem available on the Power9 + system will be `/beegfs`. Please + [migrate your working data to `/horse`](barnard.md#data-migration-to-new-filesystems). diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md deleted file mode 100644 index c888857b47414e2c068cac78f9ca9804efb056b5..0000000000000000000000000000000000000000 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/hardware_overview_2023.md +++ /dev/null @@ -1,81 +0,0 @@ -# HPC Resources - -The architecture specifically tailored to data-intensive computing, Big Data -analytics, and artificial intelligence methods with extensive capabilities -for performance monitoring provides ideal conditions to achieve the ambitious -research goals of the users and the ZIH. 
- -## Overview - -From the users' perspective, there are separate clusters, all of them with their subdomains: - -| Name | Description | Year| DNS | -| --- | --- | --- | --- | -| **Barnard** | CPU cluster |2023| n[1001-1630].barnard.hpc.tu-dresden.de | -| **Romeo** | CPU cluster |2020|i[8001-8190].romeo.hpc.tu-dresden.de | -| **Alpha Centauri** | GPU cluster |2021|i[8001-8037].alpha.hpc.tu-dresden.de | -| **Julia** | single SMP system |2021|smp8.julia.hpc.tu-dresden.de | -| **Power** | IBM Power/GPU system |2018|ml[1-29].power9.hpc.tu-dresden.de | - -They run with their own Slurm batch system. Job submission is possible only from -their respective login nodes. - -All clusters will have access to these shared parallel filesystems: - -| Filesystem | Usable directory | Type | Capacity | Purpose | -| --- | --- | --- | --- | --- | -| Home | `/home` | Lustre | quota per user: 20 GB | permanent user data | -| Project | `/projects` | Lustre | quota per project | permanent project data | -| Scratch for large data / streaming | `/data/horse` | Lustre | 20 PB | | - -## Barnard - Intel Sapphire Rapids CPUs - -- 630 diskless nodes, each with - - 2 x Intel Xeon Platinum 8470 (52 cores) @ 2.00 GHz, Multithreading enabled - - 512 GB RAM -- Hostnames: `n[1001-1630].barnard.hpc.tu-dresden.de` -- Login nodes: `login[1-4].barnard.hpc.tu-dresden.de` - -## AMD Rome CPUs + NVIDIA A100 - -- 34 nodes, each with - - 8 x NVIDIA A100-SXM4 Tensor Core-GPUs - - 2 x AMD EPYC CPU 7352 (24 cores) @ 2.3 GHz, Multithreading available - - 1 TB RAM - - 3.5 TB local memory on NVMe device at `/tmp` -- Hostnames: `taurusi[8001-8034]` -> `i[8001-8037].alpha.hpc.tu-dresden.de` -- Login nodes: `login[1-2].alpha.hpc.tu-dresden.de` -- Further information on the usage is documented on the site [Alpha Centauri Nodes](alpha_centauri.md) - -## Island 7 - AMD Rome CPUs - -- 192 nodes, each with - - 2 x AMD EPYC CPU 7702 (64 cores) @ 2.0 GHz, Multithreading available - - 512 GB RAM - - 200 GB local memory on SSD at `/tmp` -- Hostnames: `taurusi[7001-7192]` -> `i[7001-7190].romeo.hpc.tu-dresden.de` -- Login nodes: `login[1-2].romeo.hpc.tu-dresden.de` -- Further information on the usage is documented on the site [AMD Rome Nodes](rome_nodes.md) - -## Large SMP System HPE Superdome Flex - -- 1 node, with - - 32 x Intel Xeon Platinum 8276M CPU @ 2.20 GHz (28 cores) - - 47 TB RAM -- Configured as one single node -- 48 TB RAM (usable: 47 TB - one TB is used for cache coherence protocols) -- 370 TB of fast NVME storage available at `/nvme/<projectname>` -- Hostname: `taurussmp8` -> `smp8.julia.hpc.tu-dresden.de` -- Further information on the usage is documented on the site [HPE Superdome Flex](sd_flex.md) - -## IBM Power9 Nodes for Machine Learning - -For machine learning, we have IBM AC922 nodes installed with this configuration: - -- 32 nodes, each with - - 2 x IBM Power9 CPU (2.80 GHz, 3.10 GHz boost, 22 cores) - - 256 GB RAM DDR4 2666 MHz - - 6 x NVIDIA VOLTA V100 with 32 GB HBM2 - - NVLINK bandwidth 150 GB/s between GPUs and host -- Hostnames: `taurusml[1-32]` -> `ml[1-29].power9.hpc.tu-dresden.de` -- Login nodes: `login[1-2].power9.hpc.tu-dresden.de` diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/julia.md similarity index 77% rename from doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md rename to doc.zih.tu-dresden.de/docs/jobs_and_resources/julia.md index 27a39b06a6444ebd13a5a4e86c74cf1b17317e8d..10db447e9e74dd39c3bdd922d1425e3cedb4208d 100644 --- 
a/doc.zih.tu-dresden.de/docs/jobs_and_resources/sd_flex.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/julia.md @@ -1,4 +1,4 @@ -# HPE Superdome Flex +# SMP Cluster Julia The HPE Superdome Flex is a large shared memory node. It is especially well suited for data intensive application scenarios, for example to process extremely large data sets completely in main @@ -13,7 +13,7 @@ There are 370 TB of NVMe devices installed. For immediate access for all project of fast NVMe storage is available at `/nvme/1/<projectname>`. A quota of 100 GB per project on this NVMe storage is set. -With a more detailed proposal to [hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de) +With a more detailed proposal to [hpc-support@tu-dresden.de](mailto:hpc-support@tu-dresden.de) on how this unique system (large shared memory + NVMe storage) can speed up their computations, a project's quota can be increased or dedicated volumes of up to the full capacity can be set up. @@ -21,8 +21,8 @@ project's quota can be increased or dedicated volumes of up to the full capacity - Granularity should be a socket (28 cores) - Can be used for OpenMP applications with large memory demands -- To use OpenMPI it is necessary to export the following environment - variables, so that OpenMPI uses shared-memory instead of Infiniband +- To use Open MPI it is necessary to export the following environment + variables, so that Open MPI uses shared-memory instead of InfiniBand for message transport: ``` @@ -31,4 +31,4 @@ project's quota can be increased or dedicated volumes of up to the full capacity ``` - Use `I_MPI_FABRICS=shm` so that Intel MPI doesn't even consider - using Infiniband devices itself, but only shared-memory instead + using InfiniBand devices itself, but only shared-memory instead diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_2023.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_2023.md deleted file mode 100644 index 3a6749cff0814d2dbf54d53288fbcaa7fcb85818..0000000000000000000000000000000000000000 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/migration_2023.md +++ /dev/null @@ -1,82 +0,0 @@ -# Migration 2023 - -## Brief Overview over Coming Changes - -All components of Taurus will be dismantled step by step. - -### New Hardware - -The new HPC system "Barnard" from Bull comes with these main properties: - -* 630 compute nodes based on Intel Sapphire Rapids -* new Lustre-based storage systems -* HDR Infiniband network large enough to integrate existing and near-future non-Bull hardware -* To help our users to find the best location for their data we now use the name of -animals (size, speed) as mnemonics. - -More details can be found in the [overview](/jobs_and_resources/hardware_overview_2023). - -### New Architecture - -Over the last decade we have been running our HPC system of high heterogeneity with a single -Slurm batch system. This made things very complicated, especially to inexperienced users. -To lower this hurdle we now create homogenous clusters with their own Slurm instances and with -cluster specific login nodes running on the same CPU. Job submission is possible only -from within the cluster (compute or login node). - -All clusters will be integrated to the new Infiniband fabric and have then the same access to -the shared filesystems. This recabling requires a brief downtime of a few days. - -[Details on architecture](/jobs_and_resources/architecture_2023). - -### New Software - -The new nodes run on Linux RHEL 8.7. 
For a seamless integration of other compute hardware, -all operating system will be updated to the same versions of OS, Mellanox and Lustre drivers. -With this all application software was re-built consequently using GIT and CI for handling -the multitude of versions. - -We start with `release/23.10` which is based on software reqeusts from user feedbacks of our -HPC users. Most major software versions exist on all hardware platforms. - -## Migration Path - -Please make sure to have read [Details on architecture](/jobs_and_resources/architecture_2023) before -further reading. - -The migration can only be successful as a joint effort of HPC team and users. Here is a description -of the action items. - -|When?|TODO ZIH |TODO users |Remark | -|---|---|---|---| -| done (May 2023) |first sync /scratch to /data/horse/old_scratch2| |copied 4 PB in about 3 weeks| -| done (June 2023) |enable access to Barnard| |initialized LDAP tree with Taurus users| -| done (July 2023) | |install new software stack|tedious work | -| ASAP | |adapt scripts|new Slurm version, new resources, no partitions| -| August 2023 | |test new software stack on Barnard|new versions sometimes require different prerequisites| -| August 2023| |test new software stack on other clusters|a few nodes will be made available with the new sw stack, but with the old filesystems| -| ASAP | |prepare data migration|The small filesystems `/beegfs` and `/lustre/ssd`, and `/home` are mounted on the old systems "until the end". They will *not* be migrated to the new system.| -| July 2023 | sync `/warm_archive` to new hardware| |using datamover nodes with Slurm jobs | -| September 2023 |prepare recabling of older hardware (Bull)| |integrate other clusters in the IB infrastructure | -| Autumn 2023 |finalize integration of other clusters (Bull)| |**~2 days downtime**, final rsync and migration of `/projects`, `/warm_archive`| -| Autumn 2023 ||transfer last data from old filesystems | `/beegfs`, `/lustre/scratch`, `/lustre/ssd` are no longer available on the new systems| - -### Data Migration - -Why do users need to copy their data? Why only some? How to do it best? - -* The sync of hundreds of terabytes can only be done planned and carefully. -(`/scratch`, `/warm_archive`, `/projects`). The HPC team will use multiple syncs -to not forget the last bytes. During the downtime, `/projects` will be migrated. -* User homes (`/home`) are relatively small and can be copied by the scientists. -Keeping in mind that maybe deleting and archiving is a better choice. -* For this, datamover nodes are available to run transfer jobs under Slurm. Please refer to the -section [Transfer Data to New Home Directory](../barnard_test#transfer-data-to-new-home-directory) -for more detailed instructions. 
- -### A Graphical Overview - -(red: user action required): - - -{: align=center} diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/misc/architecture_2023.png b/doc.zih.tu-dresden.de/docs/jobs_and_resources/misc/architecture_2023.png index bc1083880f5172240dd78f57dd8b1a7bac39dab5..bf5235a6e75b516cd096877e59787e5e3c5c1c0b 100644 Binary files a/doc.zih.tu-dresden.de/docs/jobs_and_resources/misc/architecture_2023.png and b/doc.zih.tu-dresden.de/docs/jobs_and_resources/misc/architecture_2023.png differ diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md index 95f6eb58990233e85c5dfa535e0c1bde0c29ade6..bad0021077e40458ee342bf1e88c858f07949c5e 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/mpi_issues.md @@ -1,36 +1,38 @@ -# Known Issues when Using MPI +# Known Issues with MPI This pages holds known issues observed with MPI and concrete MPI implementations. -## OpenMPI v4.1.x - Performance Loss with MPI-IO-Module OMPIO +## Open MPI -OpenMPI v4.1.x introduced a couple of major enhancements, e.g., the `OMPIO` module is now the +### Performance Loss with MPI-IO-Module OMPIO + +Open MPI v4.1.x introduced a couple of major enhancements, e.g., the `OMPIO` module is now the default module for MPI-IO on **all** filesystems incl. Lustre (cf. -[NEWS file in OpenMPI source code](https://raw.githubusercontent.com/open-mpi/ompi/v4.1.x/NEWS)). +[NEWS file in Open MPI source code](https://raw.githubusercontent.com/open-mpi/ompi/v4.1.x/NEWS)). Prior to this, `ROMIO` was the default MPI-IO module for Lustre. Colleagues of ZIH have found that some MPI-IO access patterns suffer a significant performance loss -using `OMPIO` as MPI-IO module with OpenMPI/4.1.x modules on ZIH systems. At the moment, the root +using `OMPIO` as MPI-IO module with `OpenMPI/4.1.x` modules on ZIH systems. At the moment, the root cause is unclear and needs further investigation. -**A workaround** for this performance loss is to use "old", i.e., `ROMIO` MPI-IO-module. This +**A workaround** for this performance loss is to use the "old", i.e., `ROMIO` MPI-IO-module. This is achieved by setting the environment variable `OMPI_MCA_io` before executing the application as follows ```console -export OMPI_MCA_io=^ompio -srun ... +marie@login$ export OMPI_MCA_io=^ompio +marie@login$ srun [...] ``` or setting the option as argument, in case you invoke `mpirun` directly ```console -mpirun --mca io ^ompio ... +marie@login$ mpirun --mca io ^ompio [...] ``` -## Mpirun on partition `alpha` and `ml` - -Using `mpirun` on partitions `alpha` and `ml` leads to wrong resource distribution when more than +### Mpirun on clusters `alpha` and `power9` +<!-- laut max möglich dass es nach dem update von alpha und power9 das problem nicht mehr relevant ist.--> +Using `mpirun` on clusters `alpha` and `power` leads to wrong resource distribution when more than one node is involved. This yields a strange distribution like e.g. `SLURM_NTASKS_PER_NODE=15,1` even though `--tasks-per-node=8` was specified. Unless you really know what you're doing (e.g. use rank pinning via perl script), avoid using mpirun. @@ -39,23 +41,22 @@ Another issue arises when using the Intel toolchain: mpirun calls a different MP 8-9x slowdown in the PALM app in comparison to using srun or the GCC-compiled version of the app (which uses the correct MPI). 
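For orientation, a sketch of launching a multi-node MPI application with `srun` instead of `mpirun`; the node and task counts are placeholders and have to be adapted to your application:

```console
marie@login$ srun --nodes=2 --ntasks-per-node=8 ./my_mpi_application
```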
-## R Parallel Library on Multiple Nodes +### R Parallel Library on Multiple Nodes Using the R parallel library on MPI clusters has shown problems when using more than a few compute -nodes. The error messages indicate that there are buggy interactions of R/Rmpi/OpenMPI and UCX. +nodes. The error messages indicate that there are buggy interactions of R/Rmpi/Open MPI and UCX. Disabling UCX has solved these problems in our experiments. We invoked the R script successfully with the following command: ```console -mpirun -mca btl_openib_allow_ib true --mca pml ^ucx --mca osc ^ucx -np 1 Rscript ---vanilla the-script.R +marie@login$ mpirun -mca btl_openib_allow_ib true --mca pml ^ucx --mca osc ^ucx -np 1 Rscript --vanilla the-script.R ``` where the arguments `-mca btl_openib_allow_ib true --mca pml ^ucx --mca osc ^ucx` disable usage of UCX. -## MPI Function `MPI_Win_allocate` +### MPI Function `MPI_Win_allocate` The function `MPI_Win_allocate` is a one-sided MPI call that allocates memory and returns a window object for RDMA operations (ref. [man page](https://www.open-mpi.org/doc/v3.0/man3/MPI_Win_allocate.3.php)). @@ -65,6 +66,6 @@ object for RDMA operations (ref. [man page](https://www.open-mpi.org/doc/v3.0/ma It was observed for at least for the `OpenMPI/4.0.5` module that using `MPI_Win_Allocate` instead of `MPI_Alloc_mem` in conjunction with `MPI_Win_create` leads to segmentation faults in the calling -application . To be precise, the segfaults occurred at partition `romeo` when about 200 GB per node +application. To be precise, the segfaults occurred at partition `romeo` when about 200 GB per node where allocated. In contrast, the segmentation faults vanished when the implementation was -refactored to call the `MPI_Alloc_mem + MPI_Win_create` functions. +refactored to call the `MPI_Alloc_mem` + `MPI_Win_create` functions. diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/nvme_storage.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/nvme_storage.md index 78b8175ccbba3fb0eee8be7b946ebe2bee31219b..4a72b115e9b6433c889f28802ccf685209396d98 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/nvme_storage.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/nvme_storage.md @@ -4,7 +4,7 @@ - 8x Intel NVMe Datacenter SSD P4610, 3.2 TB - 3.2 GB/s (8x 3.2 =25.6 GB/s) -- 2 Infiniband EDR links, Mellanox MT27800, ConnectX-5, PCIe x16, 100 +- 2 InfiniBand EDR links, Mellanox MT27800, ConnectX-5, PCIe x16, 100 Gbit/s - 2 sockets Intel Xeon E5-2620 v4 (16 cores, 2.10GHz) - 64 GB RAM diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md index 20a542d3abed3cd59b299c5d6560bc451f3eead0..2c14a21dd9bbff2a4c43e7de88bee11a95d74296 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/overview.md @@ -1,33 +1,108 @@ -# HPC Resources and Jobs - -ZIH operates high performance computing (HPC) systems with more than 90.000 cores, 500 GPUs, and -a flexible storage hierarchy with about 20 PB total capacity. The HPC system provides an optimal -research environment especially in the area of data analytics and machine learning as well as for -processing extremely large data sets. Moreover it is also a perfect platform for highly scalable, -data-intensive and compute-intensive applications. 
- -With shared [login nodes](hardware_overview.md#login-nodes) and -[filesystems](../data_lifecycle/file_systems.md) our HPC system enables users to easily switch -between [the components](hardware_overview.md), each specialized for different application -scenarios. - -When log in to ZIH systems, you are placed on a login node where you can -[manage data life cycle](../data_lifecycle/overview.md), -setup experiments, -execute short tests and compile moderate projects. The login nodes cannot be used for real -experiments and computations. Long and extensive computational work and experiments have to be -encapsulated into so called **jobs** and scheduled to the compute nodes. - -Follow the page [Slurm](slurm.md) for comprehensive documentation using the batch system at -ZIH systems. There is also a page with extensive set of [Slurm examples](slurm_examples.md). +# Introduction HPC Resources and Jobs + +ZIH operates high performance computing (HPC) systems with more than 90.000 cores, 500 GPUs, and a +flexible storage hierarchy with about 20 PB total capacity. The HPC system provides an optimal +research environment especially in the area of data analytics, artificial intelligence methods and +machine learning as well as for processing extremely large data sets. Moreover it is also a perfect +platform for highly scalable, data-intensive and compute-intensive applications and has extensive +capabilities for energy measurement and performance monitoring. Therefore provides ideal conditions +to achieve the ambitious research goals of the users and the ZIH. + +The HPC system, redesigned in December 2023, consists of five homogeneous clusters with their own +[Slurm](slurm.md) instances and cluster specific +[login nodes](hardware_overview.md#login-nodes). The clusters share one +[filesystem](../data_lifecycle/file_systems.md) which enables users to easily switch between the +components. ## Selection of Suitable Hardware +The five clusters [`Barnard`](barnard.md), +[`Alpha Centauri`](alpha_centauri.md), +[`Romeo`](romeo.md), +[`Power9`](power9.md) and +[`Julia`](julia.md) +differ, among others, in number of nodes, cores per node, and GPUs and memory. The particular +[characteristica](hardware_overview.md) qualify them for different applications. + +### Which Cluster Do I Need? + +The majority of the basic tasks can be executed on the conventional nodes like on `Barnard`. When +log in to ZIH systems, you are placed on a login node where you can execute short tests and compile +moderate projects. The login nodes cannot be used for real experiments and computations. Long and +extensive computational work and experiments have to be encapsulated into so called **jobs** and +scheduled to the compute nodes. + +There is no such thing as free lunch at ZIH systems. Since compute nodes are operated in multi-user +node by default, jobs of several users can run at the same time at the very same node sharing +resources, like memory (but not CPU). On the other hand, a higher throughput can be achieved by +smaller jobs. Thus, restrictions w.r.t. [memory](#memory-limits) and +[runtime limits](#runtime-limits) have to be respected when submitting jobs. + +The following questions may help to decide which cluster to use + +- my application + - is [interactive or a batch job](slurm.md)? + - requires [parallelism](#parallel-jobs)? + - requires [multithreading (SMT)](#multithreading)? +- Do I need [GPUs](#what-do-i-need-a-cpu-or-gpu)? +- How much [run time](#runtime-limits) do I need? 
+- How many [cores](#how-many-cores-do-i-need) do I need? +- How much [memory](#how-much-memory-do-i-need) do I need? +- Which [software](#available-software) is required? + +<!-- cluster_overview_table --> +|Name|Description| DNS | Nodes | # Nodes | Cores per Node | Threads per Core | Memory per Node [in MB] | Memory per Core [in MB] | GPUs per Node +|---|---|----|:---|---:|---:|---:|---:|---:|---:| +|**Barnard**<br>_2023_| CPU|`n[node].barnard.hpc.tu-dresden.de` |n[1001-1630] | 630 |104| 2 |515,000 |2,475 | 0 | +|**Alpha**<br>_2021_| GPU |`i[node].alpha.hpc.tu-dresden.de`|taurusi[8001-8034] | 34 | 48 | 2 | 990,000 | 10,312| 8 | +|**Romeo**<br>_2020_| CPU |`i[node].romeo.hpc.tu-dresden.de`|taurusi[7001-7192] | 192|128 | 2 | 505,000| 1,972 | 0 | +|**Julia**<br>_2021_| single SMP system |`smp8.julia.hpc.tu-dresden.de`|taurusa[3-16] | 14 | 12 | 1 | 95,000 | 7,916 | 3 | +|**Power**<br>_2018_|IBM Power/GPU system |`ml[node].power9.hpc.tu-dresden.de`|taurusml[3-32] | 30 | 44 | 4 | 254,000 | 1,443 | 6 | +{: summary="cluster overview table" align="bottom"} + +### Interactive or Batch Mode + +**Interactive jobs:** An interactive job is the best choice for testing and development. See + [interactive-jobs](slurm.md). +Slurm can forward your X11 credentials to the first node (or even all) for a job +with the `--x11` option. To use an interactive job you have to specify `-X` flag for the ssh login. + +However, using `srun` directly on the Shell will lead to blocking and launch an interactive job. +Apart from short test runs, it is recommended to encapsulate your experiments and computational +tasks into batch jobs and submit them to the batch system. For that, you can conveniently put the +parameters directly into the job file which you can submit using `sbatch [options] <job file>`. + +### Parallel Jobs + +**MPI jobs:** For MPI jobs typically allocates one core per task. Several nodes could be allocated +if it is necessary. The batch system [Slurm](slurm.md) will automatically find suitable hardware. + +**OpenMP jobs:** SMP-parallel applications can only run **within a node**, so it is necessary to +include the [batch system](slurm.md) options `-N 1` and `-n 1`. Using `--cpus-per-task N` Slurm will +start one task and you will have `N` CPUs. The maximum number of processors for an SMP-parallel +program is 896 on partition `julia` (be aware that +the application has to be developed with that large number of threads in mind). + +Partitions with GPUs are best suited for **repetitive** and **highly-parallel** computing tasks. If +you have a task with potential [data parallelism](../software/gpu_programming.md) most likely that +you need the GPUs. Beyond video rendering, GPUs excel in tasks such as machine learning, financial +simulations and risk modeling. Use the cluster `power` only if you need GPUs! Otherwise +using the x86-based partitions most likely would be more beneficial. + +### Multithreading + +Some cluster/nodes have Simultaneous Multithreading (SMT) enabled, e.g [`alpha`](slurm.md) You +request for this additional threads using the Slurm option `--hint=multithread` or by setting the +environment variable `SLURM_HINT=multithread`. Besides the usage of the threads to speed up the +computations, the memory of the other threads is allocated implicitly, too, and you will always get +`Memory per Core`*`number of threads` as memory pledge. + ### What do I need, a CPU or GPU? 
If an application is designed to run on GPUs this is normally announced unmistakable since the efforts of adapting an existing software to make use of a GPU can be overwhelming. -And even if the software was listed in [NVIDIA's list of GPU-Accelerated Applications](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/tesla-product-literature/gpu-applications-catalog.pdf) +And even if the software was listed in +[NVIDIA's list of GPU-Accelerated Applications](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/tesla-product-literature/gpu-applications-catalog.pdf) only certain parts of the computations may run on the GPU. To answer the question: The easiest way is to compare a typical computation @@ -43,45 +118,100 @@ by a significant factor then this might be the obvious choice. (but the amount of data which a single GPU's core can handle is small), GPUs are not as versatile as CPUs. -### Available Hardware +### How much time do I need? -ZIH provides a broad variety of compute resources ranging from normal server CPUs of different -manufactures, large shared memory nodes, GPU-assisted nodes up to highly specialized resources for -[Machine Learning](../software/machine_learning.md) and AI. -The page [ZIH Systems](hardware_overview.md) holds a comprehensive overview. +#### Runtime limits -The desired hardware can be specified by the partition `-p, --partition` flag in Slurm. -The majority of the basic tasks can be executed on the conventional nodes like a Haswell. Slurm will -automatically select a suitable partition depending on your memory and GPU requirements. +!!! warning "Runtime limits on login nodes" -### Parallel Jobs + There is a time limit of 600 seconds set for processes on login nodes. Each process running + longer than this time limit is automatically killed. The login nodes are shared ressources + between all users of ZIH system and thus, need to be available and cannot be used for productive + runs. -**MPI jobs:** For MPI jobs typically allocates one core per task. Several nodes could be allocated -if it is necessary. The batch system [Slurm](slurm.md) will automatically find suitable hardware. + ``` + CPU time limit exceeded + ``` -**OpenMP jobs:** SMP-parallel applications can only run **within a node**, so it is necessary to -include the [batch system](slurm.md) options `-N 1` and `-n 1`. Using `--cpus-per-task N` Slurm will -start one task and you will have `N` CPUs. The maximum number of processors for an SMP-parallel -program is 896 on partition `julia`, see [partitions](partitions_and_limits.md) (be aware that -the application has to be developed with that large number of threads in mind). + Please submit extensive application runs to the compute nodes using the [batch system](slurm.md). -Partitions with GPUs are best suited for **repetitive** and **highly-parallel** computing tasks. If -you have a task with potential [data parallelism](../software/gpu_programming.md) most likely that -you need the GPUs. Beyond video rendering, GPUs excel in tasks such as machine learning, financial -simulations and risk modeling. Use the partitions `gpu2` and `ml` only if you need GPUs! Otherwise -using the x86-based partitions most likely would be more beneficial. +!!! note "Runtime limits are enforced." -**Interactive jobs:** An interactive job is the best choice for testing and development. See - [interactive-jobs](slurm.md). -Slurm can forward your X11 credentials to the first node (or even all) for a job -with the `--x11` option. 
To use an interactive job you have to specify `-X` flag for the ssh login. + A job is canceled as soon as it exceeds its requested limit. Currently, the maximum run time + limit is 7 days. -## Interactive vs. Batch Mode +Shorter jobs come with multiple advantages: -However, using `srun` directly on the Shell will lead to blocking and launch an interactive job. -Apart from short test runs, it is recommended to encapsulate your experiments and computational -tasks into batch jobs and submit them to the batch system. For that, you can conveniently put the -parameters directly into the job file which you can submit using `sbatch [options] <job file>`. +- lower risk of loss of computing time, +- shorter waiting time for scheduling, +- higher job fluctuation; thus, jobs with high priorities may start faster. + +To bring down the percentage of long running jobs we restrict the number of cores with jobs longer +than 2 days to approximately 50% and with jobs longer than 24 to 75% of the total number of cores. +(These numbers are subject to change.) As best practice we advise a run time of about 8h. + +!!! hint "Please always try to make a good estimation of your needed time limit." + + For this, you can use a command line like this to compare the requested timelimit with the + elapsed time for your completed jobs that started after a given date: + + ```console + marie@login$ sacct -X -S 2021-01-01 -E now --format=start,JobID,jobname,elapsed,timelimit -s COMPLETED + ``` + +Instead of running one long job, you should split it up into a chain job. Even applications that are +not capable of checkpoint/restart can be adapted. Please refer to the section +[Checkpoint/Restart](../jobs_and_resources/checkpoint_restart.md) for further documentation. + +### How many cores do I need? + +ZIH systems are focused on data-intensive computing. They are meant to be used for highly +parallelized code. Please take that into account when migrating sequential code from a local machine +to our HPC systems. To estimate your execution time when executing your previously sequential +program in parallel, you can use [Amdahl's law](https://en.wikipedia.org/wiki/Amdahl%27s_law). +Think in advance about the parallelization strategy for your project and how to effectively use HPC +resources. + +However, this is highly depending on the used software, investigate if your application supports a +parallel execution. + +### How much memory do I need? + +#### Memory Limits + +!!! note "Memory limits are enforced." + + Jobs which exceed their per-node memory limit are killed automatically by the batch system. + +Memory requirements for your job can be specified via the `sbatch/srun` parameters: + +`--mem-per-cpu=<MB>` or `--mem=<MB>` (which is "memory per node"). The **default limit** regardless +of the partition it runs on is quite low at **300 MB** per CPU. If you need more memory, you need +to request it. + +ZIH systems comprise different sets of nodes with different amount of installed memory which affect +where your job may be run. To achieve the shortest possible waiting time for your jobs, you should +be aware of the limits shown in the +[Slurm resource limits table](../jobs_and_resources/slurm_limits.md#slurm-resource-limits-table). + +Follow the page [Slurm](slurm.md) for comprehensive documentation using the batch system at +ZIH systems. There is also a page with extensive set of [Slurm examples](slurm_examples.md). + +### Which software is required? 
+ +#### Available software + +Pre-installed software on our HPC systems is managed via [modules](../software/modules.md). +You can see the +[list of software that's already installed and accessible via modules](https://gauss-allianz.de/de/application?organizations%5B0%5D=1200). +However, there are many different variants of these modules available. Each cluster has its own set +of installed modules, depending on their purpose. + +Specific modules can be found with: + +```console +marie@compute$ module spider <software_name> +``` ## Processing of Data for Input and Output @@ -99,7 +229,7 @@ project. Please send your request **7 working days** before the reservation should start (as that's our maximum time limit for jobs and it is therefore not guaranteed that resources are available on shorter notice) with the following information to the -[HPC support](mailto:hpcsupport@zih.tu-dresden.de?subject=Request%20for%20a%20exclusive%20reservation%20of%20hardware&body=Dear%20HPC%20support%2C%0A%0AI%20have%20the%20following%20request%20for%20a%20exclusive%20reservation%20of%20hardware%3A%0A%0AProject%3A%0AReservation%20owner%3A%0ASystem%3A%0AHardware%20requirements%3A%0ATime%20window%3A%20%3C%5Byear%5D%3Amonth%3Aday%3Ahour%3Aminute%20-%20%5Byear%5D%3Amonth%3Aday%3Ahour%3Aminute%3E%0AReason%3A): +[HPC support](mailto:hpc-support@tu-dresden.de?subject=Request%20for%20a%20exclusive%20reservation%20of%20hardware&body=Dear%20HPC%20support%2C%0A%0AI%20have%20the%20following%20request%20for%20a%20exclusive%20reservation%20of%20hardware%3A%0A%0AProject%3A%0AReservation%20owner%3A%0ASystem%3A%0AHardware%20requirements%3A%0ATime%20window%3A%20%3C%5Byear%5D%3Amonth%3Aday%3Ahour%3Aminute%20-%20%5Byear%5D%3Amonth%3Aday%3Ahour%3Aminute%3E%0AReason%3A): - `Project:` *Which project will be credited for the reservation?* - `Reservation owner:` *Who should be able to run jobs on the diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/power9.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/power9.md new file mode 100644 index 0000000000000000000000000000000000000000..20de974b266395c789f09a292e7dc629b28a3b00 --- /dev/null +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/power9.md @@ -0,0 +1 @@ +# GPU Cluster Power9 diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md similarity index 72% rename from doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md rename to doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md index 4347dd6b0e64005a67f4c60627a2002138a00631..ba63480e7b6ba582788a8b6d6651d80fe1ffebe3 100644 --- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/rome_nodes.md +++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/romeo.md @@ -1,7 +1,32 @@ -# AMD Rome Nodes +# CPU Cluster Romeo + +## Overview + +The HPC system `Romeo` is a general purpose cluster based on AMD Rome CPUs. From 2019 till the end +of 2023, it was available as partition `romeo` within `Taurus`. With the decommission of `Taurus`, +`Romeo` has been re-engineered and is now a homogeneous, standalone cluster with own +[Slurm batch system](slurm.md) and own login nodes. This maintenance also comprised: + + * change the ethernet access (new VLANs), + * complete integration of `Romeo` into the `Barnard` InfiniBand network to get full + bandwidth access to all new filesystems, + * configure and deploy stand-alone Slurm batch system, + * newly build software within separate software and module system. + +!!! note "Changes w.r.t. 
filesystems"
+
+    Your new `/home` directory (from `Barnard`) is now your `/home` on *Romeo*, too.
+    Thus, please
+    [migrate your `/home` from Taurus to your **new** `/home` on Barnard](barnard.md#data-management-and-data-transfer).
+
+    The old work filesystems `/lustre/scratch` and `/lustre/ssd` will be turned off on January 1,
+    2024 for good (no data access afterwards!). The new work filesystem available on `Romeo` is
+    `horse`. Please [migrate your working data to `/horse`](barnard.md#data-migration-to-new-filesystems).
+
+## Hardware Resources

 The hardware specification is documented on the page
-[HPC Resources](hardware_overview.md#island-7-amd-rome-cpus).
+[HPC Resources](hardware_overview.md#romeo).

 ## Usage

@@ -102,6 +127,6 @@ case on Rome. You might want to try `-mavx2 -fma` instead.

 ### Intel MPI

-We have seen only half the theoretical peak bandwidth via Infiniband between two nodes, whereas
-OpenMPI got close to the peak bandwidth, so you might want to avoid using Intel MPI on partition
+We have seen only half the theoretical peak bandwidth via InfiniBand between two nodes, whereas
+Open MPI got close to the peak bandwidth, so you might want to avoid using Intel MPI on partition
 `rome` if your application heavily relies on MPI communication until this issue is resolved.
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
index 98063cb50337c5396ee3125e245d9d929abe0679..eea01a7ce3540c33e0cca0a0d7fdf1a205ff6c28 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm.md
@@ -110,7 +110,7 @@ can find it via `squeue --me`. The job ID allows you to
 The following table contains the most important options for `srun`, `sbatch`, `salloc` to specify
 resource requirements and control communication.

-??? tip "Options Table (see `man sbatch`)"
+??? tip "Options Table (see `man sbatch` for all available options)"

     | Slurm Option | Description |
     |:---------------------------|:------------|
@@ -118,7 +118,6 @@ resource requirements and control communication.
     | `-N, --nodes=<N>` | Number of compute nodes |
     | `--ntasks-per-node=<N>` | Number of tasks per allocated node to start (default: 1) |
     | `-c, --cpus-per-task=<N>` | Number of CPUs per task; needed for multithreaded (e.g. OpenMP) jobs; typically `N` should be equal to `OMP_NUM_THREADS` |
-    | `-p, --partition=<name>` | Type of nodes where you want to execute your job (refer to [partitions](partitions_and_limits.md)) |
     | `--mem-per-cpu=<size>` | Memory need per allocated CPU in MB |
     | `-t, --time=<HH:MM:SS>` | Maximum runtime of the job |
     | `--mail-user=<your email>` | Get updates about the status of the jobs |
@@ -132,6 +131,8 @@ resource requirements and control communication.
     | `-a, --array=<arg>` | Submit an array job ([examples](slurm_examples.md#array-jobs)) |
     | `-w <node1>,<node2>,...` | Restrict job to run on specific nodes only |
     | `-x <node1>,<node2>,...` | Exclude specific nodes from job |
+    | `--switches=<count>[@max-time]` | Optimum switches and max time to wait for optimum |
+    | `--signal=<sig_num>[@sig_time]` | Send signal `sig_num` to job `sig_time` before time limit (see [Checkpoint/Restart page](checkpoint_restart.md#signal-handler)) |
     | `--test-only` | Retrieve estimated start time of a job considering the job queue; does not actually submit the job nor run the application |

 !!! note "Output and Error Files"
@@ -143,14 +144,23 @@ resource requirements and control communication.

 !!! 
note "No free lunch"

-    Runtime and memory limits are enforced. Please refer to the section on [partitions and
-    limits](partitions_and_limits.md) for a detailed overview.
+    Runtime and memory limits are enforced. Please refer to the page
+    [Slurm resource limits](slurm_limits.md) for a detailed overview.

 ### Host List

-If you want to place your job onto specific nodes, there are two options for doing this. Either use
-`-p, --partition=<name>` to specify a host group aka. [partition](partitions_and_limits.md) that fits
-your needs. Or, use `-w, --nodelist=<host1,host2,..>` with a list of hosts that will work for you.
+If you want to place your job onto specific nodes, use `-w, --nodelist=<host1,host2,..>` with a
+list of hosts that will work for you.
+
+### Number of Switches
+
+You can fine-tune your job by specifying the number of switches desired for the job allocation and
+optionally the maximum time to wait for that number of switches. The corresponding option to
+`sbatch` is `--switches=<count>[@max-time]`. The job remains pending until it either finds an
+allocation with the desired switch count or the time limit expires. Acceptable time formats include
+"minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and
+"days-hours:minutes:seconds". For a detailed explanation, please refer to the
+[sbatch online documentation](https://slurm.schedmd.com/sbatch.html#OPT_switches).

 ## Interactive Jobs

@@ -545,8 +555,12 @@ constraints, please refer to the [Slurm documentation](https://slurm.schedmd.com

 ### Filesystem Features

-A feature `fs_*` is active if a certain filesystem is mounted and available on a node. Access to
-these filesystems are tested every few minutes on each node and the Slurm features are set accordingly.
+If you need a local disk (i.e. `/tmp`) on a diskless cluster (e.g. [Barnard](barnard.md)),
+use the feature `local_disk`.
+
+A feature `fs_*` is active if a certain (global) filesystem is mounted and available on a node.
+Access to these filesystems is tested every few minutes on each node and the Slurm features are
+set accordingly.

 | Feature | Description | [Workspace Name](../data_lifecycle/workspaces.md#extension-of-a-workspace) |
 |:---------------------|:--------------------------------------------------------------------|:---------------------------------------------------------------------------|
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_examples.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_examples.md
index e35bf836d0dbd56a0a03a0c87eb12fa1064f38e0..d705c86314b6d06a11b1ca1a061f802ed7c59905 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_examples.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_examples.md
@@ -8,9 +8,9 @@ depend on the type of parallelization and architecture.

 ### OpenMP Jobs

 An SMP-parallel job can only run within a node, so it is necessary to include the options `--node=1`
-and `--ntasks=1`. The maximum number of processors for an SMP-parallel program is 896 and 56 on
-partition `taurussmp8` and `smp2`, respectively, as described in the
-[section on memory limits](partitions_and_limits.md#memory-limits). Using the option
+and `--ntasks=1`. The maximum number of processors for an SMP-parallel program is 896 on
+partition `taurussmp8`, as described in the
+[section on memory limits](slurm_limits.md#slurm-resource-limits-table). Using the option
 `--cpus-per-task=<N>` Slurm will start one task and you will have `N` CPUs available for your job.
An example job file would look like:
diff --git a/doc.zih.tu-dresden.de/docs/jobs_and_resources/partitions_and_limits.md b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_limits.md
similarity index 59%
rename from doc.zih.tu-dresden.de/docs/jobs_and_resources/partitions_and_limits.md
rename to doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_limits.md
index f887a766218fb2003d8e1e4226cf7b74706c8484..702028219c6018c82572de71313d4886fc888d11 100644
--- a/doc.zih.tu-dresden.de/docs/jobs_and_resources/partitions_and_limits.md
+++ b/doc.zih.tu-dresden.de/docs/jobs_and_resources/slurm_limits.md
@@ -1,4 +1,4 @@
-# Partitions and Limits
+# Slurm Resource Limits

 There is no such thing as free lunch at ZIH systems. Since compute nodes are operated in multi-user
 node by default, jobs of several users can run at the same time at the very same node sharing
@@ -64,45 +64,30 @@ to request it.

 ZIH systems comprise different sets of nodes with different amount of installed memory which affect
 where your job may be run. To achieve the shortest possible waiting time for your jobs, you should
 be aware of the limits shown in the
-[Partitions and limits table](../jobs_and_resources/partitions_and_limits.md#slurm-partitions).
-
-## Slurm Partitions
-
-The available compute nodes are grouped into logical (possibly overlapping) sets, the so-called
-**partitions**. You can submit your job to a certain partition using the Slurm option
-`--partition=<partition-name>`.
-
-Some partitions/nodes have Simultaneous Multithreading (SMT) enabled. You request for this
+[Slurm resource limits table](#slurm-resource-limits-table).
+
+## Slurm Resource Limits Table
+
+The physically installed memory might differ from the amount available for Slurm jobs. One reason is
+so-called diskless compute nodes, i.e., nodes without an additional local drive. At these nodes, the
+operating system and other components reside in the main memory, lowering the available memory for
+jobs. The reserved amount of memory for the system operation might vary slightly over time. The
+following table depicts the resource limits for [all our HPC systems](hardware_overview.md).
+
+| HPC System | Nodes | # Nodes | Cores per Node | Threads per Core | Memory per Node [in MB] | Memory per (SMT) Core [in MB] | GPUs per Node | Job Max Time |
+|:-----------|:------|--------:|---------------:|-----------------:|------------------------:|------------------------------:|--------------:|-------------:|
+| gpu2 | taurusi[2045-2103] | 59 | 24 | 1 | 62,000 | 2,583 | 4 | |
+| gpu2-interactive | taurusi[2045-2103] | 59 | 24 | 1 | 62,000 | 2,583 | 4 | |
+| hpdlf | taurusa[3-16] | 14 | 12 | 1 | 95,000 | 7,916 | 3 | |
+| [`Barnard`](barnard.md) | `n[1001-1630].barnard` | 630 | 104 | 2 | 515,000 | 4,951 | - | unlimited |
+| [`Power9`](power9.md) | `ml[1-29].power9` | 29 | 44 | 4 | 254,000 | 1,443 | 6 | unlimited |
+| [`Romeo`](romeo.md) | `i[8001-8190].romeo` | 190 | 128 | 2 | 505,000 | 1,972 | - | unlimited |
+| [`Julia`](julia.md) | `smp8.julia` | 1 | 896 | 1 | 48,390,000 | 54,006 | - | unlimited |
+| [`Alpha Centauri`](alpha_centauri.md) | `i[8001-8037].alpha` | 37 | 48 | 2 | 990,000 | 10,312 | 8 | unlimited |
+{: summary="Slurm resource limits table" align="bottom"}
+
+All HPC systems have Simultaneous Multithreading (SMT) enabled. You can request these
 additional threads using the Slurm option `--hint=multithread` or by setting the environment
 variable `SLURM_HINT=multithread`.
Besides the usage of the threads to speed up the computations, the memory of the other threads is allocated implicitly, too, and you will always get `Memory per Core`*`number of threads` as memory pledge. - -Some partitions have a *interactive* counterpart for interactive jobs. The corresponding partitions -are suffixed with `-interactive` (e.g. `ml-interactive`) and have the same configuration. - -There is also a meta partition `haswell`, which contains the partitions `haswell64`, -`haswell256` and `smp2`. `haswell` is also the default partition. If you specify no partition or -partition `haswell` a Slurm plugin will choose the partition which fits to your memory requirements. -There are some other partitions, which are not specified in the table above, but those partitions -should not be used directly. - -<!-- partitions_and_limits_table --> -| Partition | Nodes | # Nodes | Cores per Node | Threads per Core | Memory per Node [in MB] | Memory per Core [in MB] | GPUs per Node -|:--------|:------|--------:|---------------:|------------:|------------:|--------------:|--------------:| -| gpu2 | taurusi[2045-2103] | 59 | 24 | 1 | 62,000 | 2,583 | 4 | -| gpu2-interactive | taurusi[2045-2103] | 59 | 24 | 1 | 62,000 | 2,583 | 4 | -| haswell | taurusi[6001-6604],taurussmp[3-7] | 609 | | | | | | -| haswell64 | taurusi[6001-6540,6559-6604] | 586 | 24 | 1 | 61,000 | 2,541 | | -| haswell256 | taurusi[6541-6558] | 18 | 24 | 1 | 254,000 | 10,583 | | -| interactive | taurusi[6605-6612] | 8 | 24 | 1 | 61,000 | 2,541 | | -| smp2 | taurussmp[3-7] | 5 | 56 | 1 | 2,044,000 | 36,500 | | -| hpdlf | taurusa[3-16] | 14 | 12 | 1 | 95,000 | 7,916 | 3 | -| ml | taurusml[3-32] | 30 | 44 | 4 | 254,000 | 1,443 | 6 | -| ml-interactive | taurusml[1-2] | 2 | 44 | 4 | 254,000 | 1,443 | 6 | -| romeo | taurusi[7003-7192] | 190 | 128 | 2 | 505,000 | 1,972 | | -| romeo-interactive | taurusi[7001-7002] | 2 | 128 | 2 | 505,000 | 1,972 | | -| julia | taurussmp8 | 1 | 896 | 1 | 48,390,000 | 54,006 | | -| alpha | taurusi[8003-8034] | 32 | 48 | 2 | 990,000 | 10,312 | 8 | -| alpha-interactive | taurusi[8001-8002] | 2 | 48 | 2 | 990,000 | 10,312 | 8 | -{: summary="Partitions and limits table" align="bottom"} diff --git a/doc.zih.tu-dresden.de/docs/quickstart/getting_started.md b/doc.zih.tu-dresden.de/docs/quickstart/getting_started.md index 89cf13134f570bd45042580d586cd362f1c46ab6..2aef8ee741d6c712bf80477f957038c0d56d1025 100644 --- a/doc.zih.tu-dresden.de/docs/quickstart/getting_started.md +++ b/doc.zih.tu-dresden.de/docs/quickstart/getting_started.md @@ -1,27 +1,23 @@ # Quick Start -This page is intended to provide the key information on starting to work on the ZIH High -Performance Computing (HPC) system and is of particular importance to new users. -It is a map of the compendium as it provides an overview of the most relevant topics and -directs to the corresponding detailed articles within the compendium. 
+This page will give new users guidance through the steps needed to submit a High Performance +Computing (HPC) job: -The topics covered include: - -* Applying for the ZIH HPC login: things to know about obtaining access to the ZIH HPC -* Accessing the ZIH HPC system: the list of options and corresponding instructions -* Handling Data: the do's and don'ts of importing, transferring, managing data of your project -* Accessing software: understanding ZIH HPC software options for your software needs -* Running a job: linking all of the above together to successfully setup and execute your code/program +* Applying for the ZIH HPC login +* Accessing the ZIH HPC systems +* Transferring code/data to ZIH HPC systems +* Accessing software +* Running a parallel HPC job ## Introductory Instructions -The ZIH HPC system is a Linux system (as most HPC systems). Some basic Linux knowledge is -therefore needed. In preparation, explore the [collection](https://hpc-wiki.info/hpc/Shell) -of the most important Linux commands needed on the HPC system. +The ZIH HPC systems are Linux systems (as most HPC systems). Basic Linux knowledge will +be needed. Being familiar with this [collection](https://hpc-wiki.info/hpc/Shell) +of the most important Linux commands is helpful. -To work on the ZIH HPC system and to follow the instructions on this page as well as other +To work on the ZIH HPC systems and to follow the instructions on this page as well as other compendium pages, it is important to be familiar with the -[basic terminology](https://hpc-wiki.info/hpc/HPC-Dictionary) such as +[basic terminology](https://hpc-wiki.info/hpc/HPC-Dictionary) in HPC such as [SSH](https://hpc-wiki.info/hpc/SSH), [cluster](https://hpc-wiki.info/hpc/HPC-Dictionary#Cluster), [login node](https://hpc-wiki.info/hpc/HPC-Dictionary#Login_Node), [compute node](https://hpc-wiki.info/hpc/HPC-Dictionary#Backend_Node), @@ -31,38 +27,35 @@ compendium pages, it is important to be familiar with the If you are new to HPC, we recommend visiting the introductory article about HPC at [https://hpc-wiki.info/hpc/Getting_Started](https://hpc-wiki.info/hpc/Getting_Started). -Throughout the compendium `marie@login` is used as an indication of working on the ZIH HPC command +Throughout the compendium, `marie@login` is used as an indication of working on the ZIH HPC command line and `marie@local` as working on your local machine's command line. `marie` stands-in for your username. ## Obtaining Access -To use the ZIH HPC system, an ZIH HPC login is needed. It is different from the ZIH login (which -members of the TU Dresden have), but has the same credentials. +A ZIH HPC login is needed to use the systems. It is different from the ZIH login (which +members of the TU Dresden have), but has the same credentials. Apply for it via the +[HPC login application form](https://selfservice.zih.tu-dresden.de/index.php/hpclogin/noLogin). -The ZIH HPC system is structured by so-called HPC projects. To work on the ZIH HPC system, there -are two possibilities: +Since HPC is structured in projects, there are two possibilities to work on the ZIH HPC systems: * Creating a [new project](../application/project_request_form.md) * Joining an existing project: e.g. new researchers in an existing project, students in projects for teaching purposes. The details will be provided to you by the project administrator. 
-A HPC project on the ZIH HPC system includes: a project directory, project group, project members +A HPC project on the ZIH HPC systems includes: a project directory, project group, project members (at least admin and manager), and resource quotas for compute time (CPU/GPU hours) and storage. -One important aspect for HPC projects is a collaborative working style (research groups, student -groups for teaching purposes). Thus, granting appropriate file permissions and creating a unified -and consistent software environment for multiple users is essential. -This aspect is considered for all the following recommendations. +It is essential to grant appropriate file permissions so that newly added users can access a +project appropriately. -## Accessing the ZIH HPC System +## Accessing ZIH HPC Systems -The ZIH HPC system can be accessed only within the TU Dresden campus networks. -Access from outside is possible by establishing a +ZIH provides five homogeneous compute systems, called clusters. These can only be accessed +within the TU Dresden campus networks. Access from outside is possible by establishing a [VPN connection](https://tu-dresden.de/zih/dienste/service-katalog/arbeitsumgebung/zugang_datennetz/vpn#section-4). - -There are different ways to access the ZIH HPC system (which are described in more detail below), -depending on the user's needs and previous knowledge: +Each of these clusters can be accessed in the three ways described below, depending on the user's +needs and previous knowledge: * [JupyterHub](../access/jupyterhub.md): browser based connection, easiest way for beginners * [SSH connection](../access/ssh_login.md) (command line/terminal/console): "classical" connection, @@ -75,7 +68,9 @@ Next, the mentioned access methods are described step by step. ### JupyterHub -1. Access JupyterHub here [https://taurus.hrsk.tu-dresden.de/jupyter](https://taurus.hrsk.tu-dresden.de/jupyter). +1. Access JupyterHub at +[https://taurus.hrsk.tu-dresden.de/jupyter](https://taurus.hrsk.tu-dresden.de/jupyter) +(not yet available for barnard). 1. Start by clicking on the button `Start My Server` and you will see two Spawner Options, `Simple` and `Advanced`. 1. The `Simple` view offers a minimal selection of parameters to choose from. The `Advanced` @@ -87,10 +82,10 @@ for choice of parameters and then click `Spawn`  1. Once it loads, you will see the possibility between opening a `Notebook`, `Console` or `Other`. Note that you will now be working in your home directory as opposed to a specific workspace -(see [Data Management and Data Transfer](#data-management-and-data-transfer) section below for more details). +(see [Data Management and Data Transfer](#data-transfer-and-data-management) section below for more details). !!! caution "Stopping session on JupyterHub" - Once you are done with your work on the ZIH HPC system, explicitly stop the session by logging + Once you are done with your work on the ZIH HPC systems, explicitly stop the session by logging out by clicking `File` → `Log Out` → `Stop My Server`. Alternatively, choose `File` → `Hub Control Panel` → `Stop My Server`. @@ -100,14 +95,14 @@ Explore the [JupyterHub](../access/jupyterhub.md) page for more information. The more "classical" way to work with HPC is based on the command line. After following the instructions below, you will be on one of the login nodes. -This is the starting point for many tasks such as running programs and data management. 
+This is the starting point for many tasks such as launching jobs and doing data management. !!! hint "Using SSH key pair" We recommend to create an SSH key pair by following the [instructions here](../access/ssh_login.md#before-your-first-connection). Using an SSH key pair is beneficial for security reasons, although it is not necessary to work - with the ZIH HPC system. + with ZIH HPC systems. === "Windows 10 and higher/Mac/Linux users" @@ -115,7 +110,7 @@ This is the starting point for many tasks such as running programs and data mana 1. Open a terminal/shell/console and type in ```console - marie@local$ ssh marie@taurus.hrsk.tu-dresden.de + marie@local$ ssh marie@login2.barnard.hpc.tu-dresden.de ``` 1. After typing in your password, you end up seeing something like the following image. @@ -131,29 +126,29 @@ For more information explore the [access compendium page](../access/ssh_login.md [Configuring default parameters](../access/ssh_login.md#configuring-default-parameters-for-ssh) makes connecting more comfortable. -## Data Management and Data Transfer +## Data Transfer and Data Management First, it is shown how to create a workspace, then how to transfer data within and to/from the ZIH HPC system. Also keep in mind to set the file permissions when collaborating with other researchers. ### Create a Workspace -There are different areas for storing your data on the ZIH HPC system, called [Filesystems](../data_lifecycle/file_systems.md). -You need to create a [workspace](../data_lifecycle/workspaces.md) for your data (see example -below) on one of these filesystems. +There are different places for storing your data on ZIH HPC systems, called [Filesystems](../data_lifecycle/file_systems.md). +You need to create a [workspace](../data_lifecycle/workspaces.md) for your data on one of these +(see example below). The filesystems have different [properties](../data_lifecycle/file_systems.md) (available space, storage time limit, permission rights). Therefore, choose the one that fits your project best. -To start we recommend the Lustre filesystem **scratch**. +To start we recommend the Lustre filesystem **horse**. -!!! example "Creating a workspace on Lustre filesystem scratch" +!!! example "Creating a workspace on Lustre filesystem horse" The following command creates a workspace ```console - marie@login$ ws_allocate -F scratch -r 7 -m marie.testuser@tu-dresden.de -n test-workspace -d 90 + marie@login$ ws_allocate -F horse -r 7 -m marie@tu-dresden.de -n test-workspace -d 90 Info: creating workspace. - /scratch/ws/marie-test-workspace + /data/horse/ws/marie-test-workspace remaining extensions : 10 remaining time in days: 90 ``` @@ -161,17 +156,17 @@ To start we recommend the Lustre filesystem **scratch**. To explain: - `ws_allocate` - command to allocate - - `-F scratch` - on the scratch filesystem - - `-r 7 -m marie.testuser@tu-dresden.de` - send a reminder to `marie.testuser@tu-dresden.de` 7 days before expiration - - `-n test-workspace` - workspace's name + - `-F horse` - on the horse filesystem + - `-r 7 -m marie@tu-dresden.de` - send a reminder to `marie@tu-dresden.de` 7 days before expiration + - `-n test-workspace` - workspace name - `-d 90` - a life time of 90 days - The path to this workspace is `/scratch/ws/marie-test-workspace`. You will need it when + The path to this workspace is `/data/horse/ws/marie-test-workspace`. You will need it when transferring data or running jobs. Find more [information on workspaces in the compendium](../data_lifecycle/workspaces.md). 
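+
+If you later want to check which workspaces you have, or free one that is no longer needed, the
+workspace tools offer further commands. The following is only a sketch, assuming the standard tools
+`ws_list` and `ws_release` with the same `horse` filesystem and workspace name as in the example
+above; please verify the exact options on the [workspaces page](../data_lifecycle/workspaces.md)
+before use:
+
+```console
+marie@login$ ws_list                             # list your workspaces and their remaining lifetime
+marie@login$ ws_release -F horse test-workspace  # release the workspace created in the example above
+```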
-### Transferring Data **Within** the ZIH HPC System +### Transferring Data *Within* ZIH HPC Systems The approach depends on the data volume: up to 100 MB or above. @@ -180,7 +175,7 @@ The approach depends on the data volume: up to 100 MB or above. Use the command `cp` to copy the file `example.R` from your ZIH home directory to a workspace: ```console - marie@login$ cp /home/marie/example.R /scratch/ws/marie-test-workspace + marie@login$ cp /home/marie/example.R /data/horse/ws/marie-test-workspace ``` Analogously use command `mv` to move a file. @@ -190,48 +185,47 @@ The approach depends on the data volume: up to 100 MB or above. ???+ example "`dtcp`/`dtmv` for medium to large data (above 100 MB)" - Use the command `dtcp` to copy the directory `/warm_archive/ws/large-dataset` from one + Use the command `dtcp` to copy the directory `/walrus/ws/large-dataset` from one filesystem location to another: ```console - marie@login$ dtcp -r /warm_archive/ws/large-dataset /scratch/ws/marie-test-workspace/data + marie@login$ dtcp -r /walrus/ws/large-dataset /data/horse/ws/marie-test-workspace/data ``` - Analogously use the command `dtmv` to move a file. + Analogously use the command `dtmv` to move a file or folder. More details on the [datamover](../data_transfer/datamover.md) are available in the data transfer section. -### Transferring Data **To/From** the ZIH HPC System - -???+ example "`scp` for transferring data to the ZIH HPC system" +### Transferring Data *To/From* ZIH HPC Systems +<!-- [NT] currently not available +???+ example "`scp` for transferring data to ZIH HPC systems" - Copy the file `example.R` from your local machine to a workspace on the ZIH system: + Copy the file `example.R` from your local machine to a workspace on the ZIH systems: ```console - marie@local$ scp /home/marie/Documents/example.R marie@taurusexport.hrsk.tu-dresden.de:/scratch/ws/0/your_workspace/ + marie@local$ scp /home/marie/Documents/example.R marie@dataport1.hpc.tu-dresden.de:/data/horse/ws/your_workspace/ Password: example.R 100% 312 32.2KB/s 00:00`` ``` - Note, the target path contains `taurusexport.hrsk.tu-dresden.de`, which is one of the - so called [export nodes](../data_transfer/export_nodes.md) that allows for data transfer from/to the outside. + Note, the target path contains `dataport1.hpc.tu-dresden.de`, which is one of the + so called [dataport nodes](../data_transfer/dataport_nodes.md) that allows for data transfer from/to the outside. -???+ example "`scp` to transfer data from the ZIH HPC system to local machine" +???+ example "`scp` to transfer data from ZIH HPC systems to local machine" - Copy the file `results.csv` from a workspace on the ZIH HPC system to your local machine: + Copy the file `results.csv` from a workspace on the ZIH HPC systems to your local machine: ```console - marie@local$ scp marie@taurusexport.hrsk.tu-dresden.de:/scratch/ws/0/marie-test-workspace/results.csv /home/marie/Documents/ + marie@local$ scp marie@dataport1.hpc.tu-dresden.de:/data/horse/ws/marie-test-workspace/results.csv /home/marie/Documents/ ``` - Feel free to explore further [examples](http://bropages.org/scp) of the `scp` command. - Furthermore, checkout other possibilities on the compendium for working with the - [export nodes](../data_transfer/export_nodes.md). - + Feel free to explore further [examples](http://bropages.org/scp) of the `scp` command + and possibilities of the [dataport nodes](../data_transfer/dataport_nodes.md). +--> !!! 
caution "Terabytes of data" - If you are planning to move terabytes or even more from an outside machine into the ZIH system, - please contact the ZIH [HPC support](mailto:hpcsupport@tu-dresden.de) in advance. + If you are planning to move terabytes or even more from an outside machine into ZIH systems, + please contact the ZIH [HPC support](mailto:hpc-support@tu-dresden.de) in advance. ### Permission Rights @@ -252,13 +246,13 @@ in Linux. permissions for write access for the group (`chmod g+w`). ```console - marie@login$ ls -la /scratch/ws/0/marie-training-data/dataset.csv # list file permissions - -rw-r--r-- 1 marie p_number_crunch 0 12. Jan 15:11 /scratch/ws/0/marie-training-data/dataset.csv + marie@login$ ls -la /data/horse/ws/marie-training-data/dataset.csv # list file permissions + -rw-r--r-- 1 marie p_number_crunch 0 12. Jan 15:11 /data/horse/ws/marie-training-data/dataset.csv - marie@login$ chmod g+w /scratch/ws/0/marie-training-data/dataset.csv # add write permissions + marie@login$ chmod g+w /data/horse/ws/marie-training-data/dataset.csv # add write permissions - marie@login$ ls -la /scratch/ws/0/marie-training-data/dataset.csv # list file permissions again - -rw-rw-r-- 1 marie p_number_crunch 0 12. Jan 15:11 /scratch/ws/0/marie-training-data/dataset.csv + marie@login$ ls -la /data/horse/ws/marie-training-data/dataset.csv # list file permissions again + -rw-rw-r-- 1 marie p_number_crunch 0 12. Jan 15:11 /data/horse/ws/marie-training-data/dataset.csv ``` ??? hint "GUI-based data management" @@ -276,7 +270,7 @@ in Linux. ## Software Environment -The [software](../software/overview.md) on the ZIH HPC system is not installed system-wide, +The [software](../software/overview.md) on the ZIH HPC systems is not installed system-wide, but is provided within so-called [modules](../software/modules.md). In order to use specific software you need to "load" the respective module. This modifies the current environment (so only for the current user in the current session) @@ -284,10 +278,11 @@ such that the software becomes available. !!! note - Different partitions might have different versions available of the same software. - See [software](../software/overview.md) for more details. + Different clusters (HPC systems) have different software or might have different versions of + the same available software. See [software](../software/overview.md) for more details. -- Use `module spider <software>` command to check all available versions of the software. +Use the command `module spider <software>` to check all available versions of a software that is +available on the one specific system you are currently on: ```console marie@login$ module spider Python @@ -320,9 +315,9 @@ marie@login$ module spider Python -------------------------------------------------------------------------------------------------------------------------------- ``` -We now see the list of versions of Python that are available. +We now see the list of available Python versions. -- To get information on a specific module, use `module spider <software>/<version>` call. +- To get information on a specific module, use `module spider <software>/<version>`: ```console hl_lines="9 10 11" marie@login$ module spider Python/3.9.5 @@ -396,10 +391,10 @@ For additional information refer to the detailed documentation on [modules](../s e.g. `numpy`, `tensorflow` or `pytorch`. 
Those modules may provide much better performance than the packages found on PyPi (installed via `pip`) which have to work on any system while our installation is optimized for - the ZIH system to make the best use of the specific CPUs and GPUs found here. + each ZIH system to make the best use of the specific CPUs and GPUs found here. However the Python package ecosystem (like others) is very heterogeneous and dynamic, with daily updates. - The central update cycle for software on the ZIH HPC system is approximately every six months. + The central update cycle for software on ZIH HPC systems is approximately every six months. So the software installed as modules might be a bit older. !!! warning @@ -412,25 +407,25 @@ For additional information refer to the detailed documentation on [modules](../s At HPC systems, computational work and resource requirements are encapsulated into so-called jobs. Since all computational resources are shared with other users, these resources need to be -allocated. For managing these allocations a so-called job scheduler or a batch system is used. -On the ZIH system, the job scheduler used is [Slurm](https://slurm.schedmd.com/quickstart.html). +allocated. For managing these allocations a so-called job scheduler or a batch system is used - +on ZIH systems this is [Slurm](https://slurm.schedmd.com/quickstart.html). It is possible to run a job [interactively](../jobs_and_resources/slurm.md#interactive-jobs) -(real time execution) or submit a [batch job](../jobs_and_resources/slurm.md#batch-jobs) +(real time execution) or to submit it as a [batch job](../jobs_and_resources/slurm.md#batch-jobs) (scheduled execution). -For beginners, we highly advise to run the job interactively. To do so, use the `srun` command. - -Here, among the other options it is possible to define a partition you would like to work on -(`--partition`), the number of tasks (`--ntasks`), number of CPUs per task (`--cpus-per-task`), +For beginners, we highly advise to run the job interactively. To do so, use the `srun` command +on any of the ZIH HPC clusters (systems). +For this `srun` command, it is possible to define options like the number of tasks (`--ntasks`), +number of CPUs per task (`--cpus-per-task`), the amount of time you would like to keep this interactive session open (`--time`), memory per CPU (`--mem-per-cpu`) and many others. See [Slurm documentation](../jobs_and_resources/slurm.md#interactive-jobs) for more details. ```console -marie@login$ srun --partition=haswell --ntasks=1 --cpus-per-task=4 --time=1:00:00 --mem-per-cpu=1700 --pty bash -l #allocate 4 cores for the interactive job -marie@haswell$ module load Python #load necessary packages -marie@haswell$ cd /scratch/ws/0/marie-test-workspace/ #go to your created workspace -marie@haswell$ python test.py #execute your file +marie@login$ srun --ntasks=1 --cpus-per-task=4 --time=1:00:00 --mem-per-cpu=1700 --pty bash -l #allocate 4 cores for the interactive job +marie@compute$ module load Python #load necessary packages +marie@compute$ cd /data/horse/ws/marie-test-workspace/ #go to your created workspace +marie@compute$ python test.py #execute your file Hello, World! 
``` diff --git a/doc.zih.tu-dresden.de/docs/software/big_data_frameworks.md b/doc.zih.tu-dresden.de/docs/software/big_data_frameworks.md index 47c7567b1a063a4b67cca2982d53bf729b288295..c4e7fad813a4e41cec05d63bb27c53d7b383e0d9 100644 --- a/doc.zih.tu-dresden.de/docs/software/big_data_frameworks.md +++ b/doc.zih.tu-dresden.de/docs/software/big_data_frameworks.md @@ -37,16 +37,16 @@ as via [Jupyter notebooks](#jupyter-notebook). All three ways are outlined in th ### Default Configuration -The Spark and Flink modules are available in both `scs5` and `ml` environments. -Thus, Spark and Flink can be executed using different CPU architectures, e.g., Haswell and Power9. +The Spark and Flink modules are available in the `power` environment. +Thus, Spark and Flink can be executed using different CPU architectures, e.g., Power. Let us assume that two nodes should be used for the computation. Use a `srun` command similar to -the following to start an interactive session using the partition `haswell`. The following code -snippet shows a job submission to haswell nodes with an allocation of two nodes with 60000 MB main +the following to start an interactive session. The following code +snippet shows a job submission with an allocation of two nodes with 60000 MB main memory exclusively for one hour: ```console -marie@login$ srun --partition=haswell --nodes=2 --mem=60000M --exclusive --time=01:00:00 --pty bash -l +marie@login.power$ srun --nodes=2 --mem=60000M --exclusive --time=01:00:00 --pty bash -l ``` Once you have the shell, load desired Big Data framework using the command @@ -117,11 +117,11 @@ can start with a copy of the default configuration ahead of your interactive ses === "Spark" ```console - marie@login$ cp -r $SPARK_HOME/conf my-config-template + marie@login.power$ cp -r $SPARK_HOME/conf my-config-template ``` === "Flink" ```console - marie@login$ cp -r $FLINK_ROOT_DIR/conf my-config-template + marie@login.power$ cp -r $FLINK_ROOT_DIR/conf my-config-template ``` After you have changed `my-config-template`, you can use your new template in an interactive job @@ -175,7 +175,6 @@ example below: ```bash #!/bin/bash -l #SBATCH --time=01:00:00 - #SBATCH --partition=haswell #SBATCH --nodes=2 #SBATCH --exclusive #SBATCH --mem=60000M @@ -205,7 +204,6 @@ example below: ```bash #!/bin/bash -l #SBATCH --time=01:00:00 - #SBATCH --partition=haswell #SBATCH --nodes=2 #SBATCH --exclusive #SBATCH --mem=60000M diff --git a/doc.zih.tu-dresden.de/docs/software/building_software.md b/doc.zih.tu-dresden.de/docs/software/building_software.md index 73952b06efde809b7e91e936be0fbf9b240f88a8..e61a8a52d2e8d5ec07883eae32f31dd36edba9ab 100644 --- a/doc.zih.tu-dresden.de/docs/software/building_software.md +++ b/doc.zih.tu-dresden.de/docs/software/building_software.md @@ -3,9 +3,9 @@ While it is possible to do short compilations on the login nodes, it is generally considered good practice to use a job for that, especially when using many parallel make processes. Since 2016, the `/projects` filesystem is mounted read-only on all compute -nodes in order to prevent users from doing large I/O there (which is what the `/scratch` is for). +nodes in order to prevent users from doing large I/O there (which is what the `/data/horse` is for). In consequence, you cannot compile in `/projects` within a job. 
If you wish to install -software for your project group anyway, you can use a build directory in the `/scratch` filesystem +software for your project group anyway, you can use a build directory in the `/data/horse` filesystem instead. Every sane build system should allow you to keep your source code tree and your build directory @@ -19,11 +19,11 @@ For instance, when using CMake and keeping your source in `/projects`, you could # save path to your source directory: marie@login$ export SRCDIR=/projects/p_number_crunch/mysource -# create a build directory in /scratch: -marie@login$ mkdir /scratch/p_number_crunch/mysoftware_build +# create a build directory in /data/horse: +marie@login$ mkdir /data/horse/p_number_crunch/mysoftware_build -# change to build directory within /scratch: -marie@login$ cd /scratch/p_number_crunch/mysoftware_build +# change to build directory within /data/horse: +marie@login$ cd /data/horse/p_number_crunch/mysoftware_build # create Makefiles: marie@login$ cmake -DCMAKE_INSTALL_PREFIX=/projects/p_number_crunch/mysoftware $SRCDIR @@ -35,5 +35,5 @@ marie@login$ srun --mem-per-cpu=1500 --cpus-per-task=12 --pty make -j 12 marie@login$ make install ``` -As a bonus, your compilation should also be faster in the parallel `/scratch` filesystem than it +As a bonus, your compilation should also be faster in the parallel `/data/horse` filesystem than it would be in the comparatively slow NFS-based `/projects` filesystem. diff --git a/doc.zih.tu-dresden.de/docs/software/cfd.md b/doc.zih.tu-dresden.de/docs/software/cfd.md index 7cb4b2521eaddee6e7997b2fab109e31a7e4c5bf..7e0eebc74081f188a1441eea1136df0ba636458e 100644 --- a/doc.zih.tu-dresden.de/docs/software/cfd.md +++ b/doc.zih.tu-dresden.de/docs/software/cfd.md @@ -41,7 +41,7 @@ marie@login$ # source $FOAM_CSH OUTFILE="Output" module load OpenFOAM source $FOAM_BASH - cd /scratch/ws/1/marie-example-workspace # work directory using workspace + cd /horse/ws/marie-example-workspace # work directory using workspace srun pimpleFoam -parallel > "$OUTFILE" ``` @@ -61,7 +61,7 @@ geometry and mesh generator cfx5pre, and the post-processor cfx5post. #SBATCH --mail-type=ALL module load ANSYS - cd /scratch/ws/1/marie-example-workspace # work directory using workspace + cd /horse/ws/marie-example-workspace # work directory using workspace cfx-parallel.sh -double -def StaticMixer.def ``` @@ -116,3 +116,24 @@ Slurm job environment that can be passed to `starccm+`, enabling it to run acros INPUT_FILE="your_simulation.sim" starccm+ -collab -rsh ssh -cpubind off -np $SLURM_NTASKS -on $(/sw/taurus/tools/slurmtools/default/bin/create_rankfile -f CCM) -batch -power -licpath $LICPATH -podkey $PODKEY $INPUT_FILE ``` + +!!! note + The software path of the script `create_rankfile -f CCM` is different on the + [new HPC system Barnard](../jobs_and_resources/hardware_overview.md#barnard). + +???+ example + ```bash + #!/bin/bash + #SBATCH --time=12:00 # walltime + #SBATCH --ntasks=32 # number of processor cores (i.e. 
tasks)
+    #SBATCH --mem-per-cpu=2500M   # memory per CPU core
+    #SBATCH --mail-user=marie@tu-dresden.de   # email address (only tu-dresden)
+    #SBATCH --mail-type=ALL
+
+    module load STAR-CCM+
+
+    LICPATH="port@host"
+    PODKEY="your podkey"
+    INPUT_FILE="your_simulation.sim"
+    starccm+ -collab -rsh ssh -cpubind off -np $SLURM_NTASKS -on $(/software/util/slurm/bin/create_rankfile -f CCM) -batch -power -licpath $LICPATH -podkey $PODKEY $INPUT_FILE
+    ```
diff --git a/doc.zih.tu-dresden.de/docs/software/compilers.md b/doc.zih.tu-dresden.de/docs/software/compilers.md
index 1ee00ce46b589b4b65cba4ded2af46a1b8e6b9a5..5206e9ddd06f6110aff347863b22f421252132c1 100644
--- a/doc.zih.tu-dresden.de/docs/software/compilers.md
+++ b/doc.zih.tu-dresden.de/docs/software/compilers.md
@@ -67,17 +67,28 @@ Different architectures of CPUs feature different vector extensions (like SSE4.2
 to accelerate computations. The following matrix shows proper compiler flags for the architectures
 at the ZIH:

-| Architecture | GCC | Intel | PGI |
-|--------------------|----------------------|----------------------|-----|
-| Intel Haswell | `-march=haswell` | `-march=haswell` | `-tp=haswell` |
-| AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen` |
-| Intel Cascade Lake | `-march=cascadelake` | `-march=cascadelake` | `-tp=skylake` |
-| Host's architecture | `-march=native` | `-xHost` | |
-
-To build an executable for different node types (e.g. Cascade Lake with AVX512 and
-Haswell without AVX512) the option `-march=haswell -axcascadelake` (for Intel compilers)
-uses vector extension up to AVX2 as default path and runs along a different execution
-path if AVX512 is available.
+| HPC System | Architecture | GCC | Intel | Nvidia HPC |
+|------------|--------------------|----------------------|----------------------|-----|
+| [`Alpha Centauri`](../jobs_and_resources/alpha_centauri.md) | AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen2` |
+| [`Barnard`](../jobs_and_resources/barnard.md) | Intel Sapphire Rapids | `-march=sapphirerapids` | `-march=core-sapphirerapids` | |
+| [`Julia`](../jobs_and_resources/julia.md) | Intel Cascade Lake | `-march=cascadelake` | `-march=cascadelake` | `-tp=cascadelake` |
+| [`Romeo`](../jobs_and_resources/romeo.md) | AMD Rome | `-march=znver2` | `-march=core-avx2` | `-tp=zen2` |
+| All x86 | Host's architecture | `-march=native` | `-xHost` or `-march=native` | `-tp=host` |
+| [`Power9`](../jobs_and_resources/power9.md) | IBM Power9 | `-mcpu=power9` or `-mcpu=native` | | `-tp=pwr9` or `-tp=host` |
+
+To build an executable for different node types with the Intel compiler, use
+`-axcode`, where `code` is to be replaced with one or more target architectures.
+For Cascade Lake and Sapphire Rapids, the option `-axcascadelake,sapphirerapids`
+(for Intel compilers) instructs the compiler to generate optimized code paths for the
+specified architecture(s), if possible.
+If the application is executed on one of these architectures, the optimized code
+path will be chosen.
+A baseline code path will also be generated.
+This path is used on architectures other than the specified ones and
+in code sections that were not optimized by the compiler for a specific architecture.
+Other optimization flags, e.g. `-O3`, can be used as well.
+However, the `-march` option cannot be used here, as this will overwrite the
+`-axcode` option.
 This increases the size of the program code
 (might result in poorer L1 instruction cache hits) but enables to run the same program on
-different hardware types.
+different hardware types with compiler optimizations. diff --git a/doc.zih.tu-dresden.de/docs/software/containers.md b/doc.zih.tu-dresden.de/docs/software/containers.md index e40242e9a6531512965693e2a610c46e8eff02ef..402dedb5e1b668d3c8d71398a78bb97a2fb0319e 100644 --- a/doc.zih.tu-dresden.de/docs/software/containers.md +++ b/doc.zih.tu-dresden.de/docs/software/containers.md @@ -101,7 +101,7 @@ You can create a new custom container on your workstation, if you have root righ !!! attention "Respect the micro-architectures" - You cannot create containers for the partition `ml`, as it bases on Power9 micro-architecture + You cannot create containers for the cluster `Power`, as it bases on Power9 micro-architecture which is different to the x86 architecture in common computers/laptops. For that you can use the [VM Tools](singularity_power9.md). @@ -260,12 +260,12 @@ marie@compute$ singularity shell my-container.sif automatically and instead set up your binds manually via `-B` parameter. Example: ```console - marie@compute$ singularity shell --contain -B /scratch,/my/folder-on-host:/folder-in-container my-container.sif + marie@compute$ singularity shell --contain -B /data/horse,/my/folder-on-host:/folder-in-container my-container.sif ``` You can write into those folders by default. If this is not desired, add an `:ro` for read-only to -the bind specification (e.g. `-B /scratch:/scratch:ro\`). Note that we already defined bind paths -for `/scratch`, `/projects` and `/sw` in our global `singularity.conf`, so you needn't use the `-B` +the bind specification (e.g. `-B /data/horse:/data/horse:ro\`). Note that we already defined bind paths +for `/data/horse`, `/projects` and `/sw` in our global `singularity.conf`, so you needn't use the `-B` parameter for those. If you wish to install additional packages, you have to use the `-w` parameter to @@ -280,7 +280,7 @@ Singularity.my-container.sif> yum install htop The `-w` parameter should only be used to make permanent changes to your container, not for your productive runs (it can only be used writable by one user at the same time). You should write your -output to the usual ZIH filesystems like `/scratch`. Launching applications in your container +output to the usual ZIH filesystems like `/data/horse`. Launching applications in your container #### Run a Command Inside the Container @@ -351,9 +351,12 @@ One common use-case for containers is that you need an operating system with a n [glibc](https://www.gnu.org/software/libc/) version than what is available on ZIH systems. E.g., the bullx Linux on ZIH systems used to be based on RHEL 6 having a rather dated glibc version 2.12, some binary-distributed applications didn't work on that anymore. You can use one of our pre-made CentOS -7 container images (`/scratch/singularity/centos7.img`) to circumvent this problem. Example: +7 container images (`/data/horse/lustre/scratch2/singularity/centos7.img`) to circumvent this +problem. -```console -marie@compute$ singularity exec /scratch/singularity/centos7.img ldd --version -ldd (GNU libc) 2.17 +!!! 
example + + ```console + marie@compute$ singularity exec /data/horse/lustre/scratch2/singularity/centos7.img ldd --version + ldd (GNU libc) 2.17 ``` diff --git a/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md b/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md index 15063e28c0d378c0a64c3f4bf86cd85190e2605c..983e35d9833eec679eb02633dbaaac96257fbcc3 100644 --- a/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md +++ b/doc.zih.tu-dresden.de/docs/software/custom_easy_build_environment.md @@ -52,21 +52,21 @@ install your modules. You need a place where your modules are placed. This needs once: ```console -marie@login$ ws_allocate -F scratch EasyBuild 50 +marie@login$ ws_allocate EasyBuild 50 marie@login$ ws_list | grep 'directory.*EasyBuild' - workspace directory : /scratch/ws/1/marie-EasyBuild + workspace directory : /data/horse/ws/marie-EasyBuild ``` **Step 2:** Allocate nodes. You can do this with interactive jobs (see the example below) and/or put commands in a batch file and source it. The latter is recommended for non-interactive jobs, using the command `sbatch` instead of `srun`. For the sake of illustration, we use an interactive job as an example. Depending on the partitions that you want the module to be usable on -later, you need to select nodes with the same architecture. Thus, use nodes from partition `ml` for -building, if you want to use the module on nodes of that partition. In this example, we assume -that we want to use the module on nodes with x86 architecture and thus, we use Haswell nodes. +later, you need to select nodes with the same architecture. Thus, use nodes from cluster `power` for +building, if you want to use the module on nodes of that cluster. ~~In this example, we assume +that we want to use the module on nodes with x86 architecture and thus, we use Haswell nodes.~~ ```console -marie@login$ srun --partition=haswell --nodes=1 --cpus-per-task=4 --time=08:00:00 --pty /bin/bash -l +marie@login$ srun --nodes=1 --cpus-per-task=4 --time=08:00:00 --pty /bin/bash -l ``` !!! warning @@ -76,8 +76,12 @@ marie@login$ srun --partition=haswell --nodes=1 --cpus-per-task=4 --time=08:00:0 **Step 3:** Specify the workspace. The rest of the guide is based on it. Please create an environment variable called `WORKSPACE` with the path to your workspace: +_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are +momentarily under construction. The script will be updated after completion of the redesign +accordingly_ + ```console -marie@compute$ export WORKSPACE=/scratch/ws/1/marie-EasyBuild #see output of ws_list above +marie@compute$ export WORKSPACE=/data/horse/ws/marie-EasyBuild #see output of ws_list above ``` **Step 4:** Load the correct module environment `modenv` according to your current or target @@ -98,17 +102,61 @@ architecture: marie@compute$ module load EasyBuild ``` -**Step 6:** Set up your environment: +**Step 6:** Set up the EasyBuild configuration. 
+
+This can either be done via environment variables:

 ```console
-marie@compute$ export EASYBUILD_ALLOW_LOADED_MODULES=EasyBuild,modenv/scs5
-marie@compute$ export EASYBUILD_DETECT_LOADED_MODULES=unload
-marie@compute$ export EASYBUILD_BUILDPATH="/tmp/${USER}-EasyBuild${SLURM_JOB_ID:-}"
-marie@compute$ export EASYBUILD_SOURCEPATH="${WORKSPACE}/sources"
-marie@compute$ export EASYBUILD_INSTALLPATH="${WORKSPACE}/easybuild-$(basename $(readlink -f /sw/installed))"
-marie@compute$ export EASYBUILD_INSTALLPATH_MODULES="${EASYBUILD_INSTALLPATH}/modules"
-marie@compute$ module use "${EASYBUILD_INSTALLPATH_MODULES}/all"
-marie@compute$ export LMOD_IGNORE_CACHE=1
+marie@compute$ export EASYBUILD_ALLOW_LOADED_MODULES=EasyBuild \
+    EASYBUILD_DETECT_LOADED_MODULES=unload \
+    EASYBUILD_BUILDPATH="/tmp/${USER}-EasyBuild${SLURM_JOB_ID:-}" \
+    EASYBUILD_SOURCEPATH="${WORKSPACE}/sources" \
+    EASYBUILD_INSTALLPATH="${WORKSPACE}/easybuild" \
+    EASYBUILD_SUBDIR_MODULES="modules" \
+    EASYBUILD_MODULE_NAMING_SCHEME="HierarchicalMNS" \
+    EASYBUILD_MODULE_DEPENDS_ON=1 \
+    EASYBUILD_HOOKS="/software/util/easybuild/ebhooks.py"
+```
+
+Or you can do that via the configuration file at `$HOME/.config/easybuild/config.cfg`.
+An initial file can be generated with:
+
+```console
+marie@compute$ eb --confighelp > ~/.config/easybuild/config.cfg
+```
+
+Edit this file by uncommenting the above settings and specifying the respective values.
+Note the difference in naming: each setting in the environment has the `EASYBUILD_` prefix
+and is uppercase, while it is lowercase in the config.
+For example, `$EASYBUILD_MODULE_NAMING_SCHEME` above corresponds to `module-naming-scheme`
+in the config file.
+
+Note that you cannot use environment variables (like `$WORKSPACE` or `$USER`) in the config file.
+So the approach with the `$EASYBUILD_` variables is more flexible, but it needs to be done before
+each use of EasyBuild and can easily be forgotten.
+
+You can also combine both approaches, setting some options in the config file and some in the
+environment; the latter will take precedence.
+The configuration used can be shown via:
+
+```console
+marie@compute$ eb --show-config
+```
+
+This shows all changed/non-default options, while the parameter `--show-full-config` shows all options.
+
+The hierarchical module naming scheme (used on Barnard) affects, e.g., the location and naming of modules.
+In order for EasyBuild to use the existing modules,
+you need to use the "all" modules folder of the main tree.
+But likely only the "Core" subdirectory is set in `$MODULEPATH`.
+Nonetheless, the module search path can be extended easily with `module use`:
+
+```console
+marie@compute$ echo $MODULEPATH
+/software/modules/rapids/r23.10/all/Core:/software/modules/releases/rapids
+marie@compute$ module use /software/modules/rapids/r23.10/all
+marie@compute$ echo $MODULEPATH
+/software/modules/rapids/r23.10/all:/software/modules/rapids/r23.10/all/Core:/software/modules/releases/rapids
 ```

 **Step 7:** Now search for an existing EasyConfig:
@@ -123,18 +171,41 @@ marie@compute$ eb --search TensorFlow
 marie@compute$ eb TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4.eb -r

-This may take a long time. After this is done, you can load it just like any other module.
+This may take a long time.
+
+If you want to investigate what would be built by that command, first run it with `-D`:
+
+```console
+marie@compute$ eb TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4.eb -Dr
+```
+
+**Step 9:** To use your custom-built modules you need to load the "base" modenv (see step 4)
+and add your custom modules to the search path.
+
+Using the variable from step 6:
+
+```console
+marie@compute$ module use "${EASYBUILD_INSTALLPATH}/modules/all"
+marie@compute$ export LMOD_IGNORE_CACHE=1
+```
+
+**OR** directly use the path from step 1:
+
+```console
+marie@compute$ module use "/data/horse/ws/marie-EasyBuild/easybuild/modules/all"
+marie@compute$ export LMOD_IGNORE_CACHE=1
+```

-**Step 9:** To use your custom build modules you only need to rerun steps 3, 4, 5, 6 and execute
-the usual:
+Then you can load it just like any other module:

 ```console
 marie@compute$ module load TensorFlow-1.8.0-fosscuda-2018a-Python-3.6.4    #replace with the name of your module
 ```

 The key is the `module use` command, which brings your modules into scope, so `module load` can find
-them. The `LMOD_IGNORE_CACHE` line makes `LMod` pick up the custom modules instead of searching the
-system cache.
+them.
+The `LMOD_IGNORE_CACHE` line makes `LMod` pick up the custom modules instead of searching the
+system cache, which doesn't include your new modules.

 ## Troubleshooting
diff --git a/doc.zih.tu-dresden.de/docs/software/data_analytics_with_python.md b/doc.zih.tu-dresden.de/docs/software/data_analytics_with_python.md
index 38d198969801a913287d92ffc300b0447bfacddb..28f7de81c34ad0d7c9eb6b649e073bb65b737b90 100644
--- a/doc.zih.tu-dresden.de/docs/software/data_analytics_with_python.md
+++ b/doc.zih.tu-dresden.de/docs/software/data_analytics_with_python.md
@@ -18,9 +18,9 @@ a research group and/or teaching class. For this purpose,
 The interactive Python interpreter can also be used on ZIH systems via an interactive job:

 ```console
-marie@login$ srun --partition=haswell --gres=gpu:1 --ntasks=1 --cpus-per-task=7 --pty --mem-per-cpu=8000 bash
-marie@haswell$ module load Python
-marie@haswell$ python
+marie@login$ srun --gres=gpu:1 --ntasks=1 --cpus-per-task=7 --pty --mem-per-cpu=8000 bash
+marie@compute$ module load Python
+marie@compute$ python
 Python 3.8.6 (default, Feb 17 2021, 11:48:51)
 [GCC 10.2.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
@@ -50,7 +50,7 @@ threads that can be used in parallel depends on the number of cores (parameter `
 within the Slurm request, e.g.

 ```console
-marie@login$ srun --partition=haswell --cpus-per-task=4 --mem=2G --hint=nomultithread --pty --time=8:00:00 bash
+marie@login$ srun --cpus-per-task=4 --mem=2G --hint=nomultithread --pty --time=8:00:00 bash
 ```

 The above request allows to use 4 parallel threads.
@@ -239,7 +239,7 @@ from distributed import Client
 from dask_jobqueue import SLURMCluster
 from dask import delayed

-cluster = SLURMCluster(queue='alpha',
+cluster = SLURMCluster(
     cores=8,
     processes=2,
     project='p_number_crunch',
@@ -294,7 +294,7 @@ for the Monte-Carlo estimation of Pi.
#create a Slurm cluster, please specify your project - cluster = SLURMCluster(queue='alpha', cores=2, project='p_number_crunch', memory="8GB", walltime="00:30:00", extra=['--resources gpu=1'], scheduler_options={"dashboard_address": f":{portdash}"}) + cluster = SLURMCluster(cores=2, project='p_number_crunch', memory="8GB", walltime="00:30:00", extra=['--resources gpu=1'], scheduler_options={"dashboard_address": f":{portdash}"}) #submit the job to the scheduler with the number of nodes (here 2) requested: @@ -434,12 +434,15 @@ comm = MPI.COMM_WORLD print("%d of %d" % (comm.Get_rank(), comm.Get_size())) ``` +_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are +momentarily under construction. The script will be updated after completion of the redesign +accordingly_ + For the multi-node case, use a script similar to this: ```bash #!/bin/bash #SBATCH --nodes=2 -#SBATCH --partition=ml #SBATCH --tasks-per-node=2 #SBATCH --cpus-per-task=1 diff --git a/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md b/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md index c7334d918cdba1ff5e97c2f4cd0ea3b788b2c26d..35d0b78e2ec0dfd1804b185a8f9c4e66dd1aa8d1 100644 --- a/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md +++ b/doc.zih.tu-dresden.de/docs/software/data_analytics_with_r.md @@ -6,7 +6,7 @@ classical statistical tests, time-series analysis, classification, etc.), machin algorithms and graphical techniques. R is an integrated suite of software facilities for data manipulation, calculation and graphing. -We recommend using the partitions Haswell and/or Romeo to work with R. For more details +We recommend using the clusters Barnard and/or Romeo to work with R. For more details see our [hardware documentation](../jobs_and_resources/hardware_overview.md). ## R Console @@ -14,14 +14,18 @@ see our [hardware documentation](../jobs_and_resources/hardware_overview.md). In the following example, the `srun` command is used to start an interactive job, so that the output is visible to the user. Please check the [Slurm page](../jobs_and_resources/slurm.md) for details. +_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are +momentarily under construction. The script will be updated after completion of the redesign +accordingly_ + ```console -marie@login$ srun --partition=haswell --ntasks=1 --nodes=1 --cpus-per-task=4 --mem-per-cpu=2541 --time=01:00:00 --pty bash -marie@haswell$ module load modenv/scs5 -marie@haswell$ module load R/3.6 +marie@login.barnard$ srun --ntasks=1 --nodes=1 --cpus-per-task=4 --mem-per-cpu=2541 --time=01:00:00 --pty bash +marie@barnard$ module load modenv/scs5 +marie@barnard$ module load R/3.6 [...] Module R/3.6.0-foss-2019a and 56 dependencies loaded. -marie@haswell$ which R -marie@haswell$ /sw/installed/R/3.6.0-foss-2019a/bin/R +marie@barnard$ which R +marie@barnard$ /sw/installed/R/3.6.0-foss-2019a/bin/R ``` Using interactive sessions is recommended only for short test runs, while for larger runs batch jobs @@ -30,7 +34,7 @@ should be used. 
Examples can be found on the [Slurm page](../jobs_and_resources/ It is also possible to run `Rscript` command directly (after loading the module): ```console -marie@haswell$ Rscript </path/to/script/your_script.R> <param1> <param2> +marie@barnard$ Rscript </path/to/script/your_script.R> <param1> <param2> ``` ## R in JupyterHub @@ -263,6 +267,10 @@ Submitting a multicore R job to Slurm is very similar to submitting an [OpenMP Job](../jobs_and_resources/binding_and_distribution_of_tasks.md), since both are running multicore jobs on a **single** node. Below is an example: +_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are +momentarily under construction. The script will be updated after completion of the redesign +accordingly_ + ```Bash #!/bin/bash #SBATCH --nodes=1 @@ -391,7 +399,7 @@ Another example: list_of_averages <- parLapply(X=sample_sizes, fun=average, cl=cl) # shut down the cluster - #snow::stopCluster(cl) # usually it hangs over here with OpenMPI > 2.0. In this case this command may be avoided, Slurm will clean up after the job finishes + #snow::stopCluster(cl) # usually it hangs over here with Open MPI > 2.0. In this case this command may be avoided, Slurm will clean up after the job finishes ``` To use Rmpi and MPI please use one of these partitions: `haswell`, `broadwell` or `rome`. diff --git a/doc.zih.tu-dresden.de/docs/software/distributed_training.md b/doc.zih.tu-dresden.de/docs/software/distributed_training.md index 4e8fc427e71bd28ad1a3b663aba82d11bad088e6..096281640d192b434714cfecb30628026137d1a3 100644 --- a/doc.zih.tu-dresden.de/docs/software/distributed_training.md +++ b/doc.zih.tu-dresden.de/docs/software/distributed_training.md @@ -99,13 +99,12 @@ Each worker runs the training loop independently. TensorFlow is available as a module. Check for the version. The `TF_CONFIG` environment variable can be set as a prefix to the command. - Now, run the script on the partition `alpha` simultaneously on both nodes: + Now, run the script on the cluster `alpha` simultaneously on both nodes: ```bash #!/bin/bash #SBATCH --job-name=distr - #SBATCH --partition=alpha #SBATCH --output=%j.out #SBATCH --error=%j.err #SBATCH --mem=64000 @@ -121,8 +120,8 @@ Each worker runs the training loop independently. } NODE_1=$(print_nodelist | awk '{print $1}' | sort -u | head -n 1) NODE_2=$(print_nodelist | awk '{print $1}' | sort -u | tail -n 1) - IP_1=$(dig +short ${NODE_1}.taurus.hrsk.tu-dresden.de) - IP_2=$(dig +short ${NODE_2}.taurus.hrsk.tu-dresden.de) + IP_1=$(dig +short ${NODE_1}.alpha.hpc.tu-dresden.de) + IP_2=$(dig +short ${NODE_2}.alpha.hpc.tu-dresden.de) module load modenv/hiera module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 TensorFlow/2.4.1 @@ -257,7 +256,7 @@ marie@compute$ module spider Horovod # Check available modules marie@compute$ module load Horovod/0.19.5-fosscuda-2019b-TensorFlow-2.2.0-Python-3.7.4 ``` -Or if you want to use Horovod on the partition `alpha`, you can load it with the dependencies: +Or if you want to use Horovod on the cluster `alpha`, you can load it with the dependencies: ```console marie@alpha$ module spider Horovod #Check available modules @@ -300,8 +299,8 @@ Available Tensor Operations: [ ] Gloo ``` -If you want to use OpenMPI then specify `HOROVOD_GPU_ALLREDUCE=MPI`. -To have better performance it is recommended to use NCCL instead of OpenMPI. +If you want to use Open MPI then specify `HOROVOD_GPU_ALLREDUCE=MPI`. +To have better performance it is recommended to use NCCL instead of Open MPI. 
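+
+As a rough sketch of how such a build could look (the Horovod version, the framework flag and the
+use of a virtual environment are only assumptions and need to be adapted to your setup):
+
+```console
+marie@alpha$ HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_WITH_TENSORFLOW=1 pip install --no-cache-dir horovod
+```
+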
##### Verify Horovod Works @@ -324,7 +323,7 @@ Hello from: 0 [official examples](https://github.com/horovod/horovod/tree/master/examples) to parallelize your code. In Horovod, each GPU gets pinned to a process. - You can easily start your job with the following bash script with four processes on two nodes: + You can easily start your job with the following bash script with four processes on two nodes using the cluster Power: ```bash #!/bin/bash @@ -332,7 +331,6 @@ Hello from: 0 #SBATCH --ntasks=4 #SBATCH --ntasks-per-node=2 #SBATCH --gres=gpu:2 - #SBATCH --partition=ml #SBATCH --mem=250G #SBATCH --time=01:00:00 #SBATCH --output=run_horovod.out diff --git a/doc.zih.tu-dresden.de/docs/software/fem_software.md b/doc.zih.tu-dresden.de/docs/software/fem_software.md index e2e48c8b541eb08695189cfedf2fe1841c78defe..285b7439dee3320e9c8724766f1e002416dc31c9 100644 --- a/doc.zih.tu-dresden.de/docs/software/fem_software.md +++ b/doc.zih.tu-dresden.de/docs/software/fem_software.md @@ -60,7 +60,7 @@ Slurm or [writing job files](../jobs_and_resources/slurm.md#job-files). #SBATCH --time=00:04:00 #SBATCH --job-name=yyyy # give a name, what ever you want #SBATCH --mail-type=END,FAIL # send email when the job finished or failed - #SBATCH --mail-user=<name>@mailbox.tu-dresden.de # set your email + #SBATCH --mail-user=marie@tu-dresden.de # set your email #SBATCH --account=p_number_crunch # charge compute time to project p_number_crunch @@ -205,7 +205,7 @@ below. ##### Interactive mode ```console -marie@login$ srun --partition=haswell --nodes 1 --ntasks-per-node=4 --time=0:20:00 --mem-per-cpu=1700 --pty bash -l +marie@login$ srun --nodes 1 --ntasks-per-node=4 --time=0:20:00 --mem-per-cpu=1700 --pty bash -l ``` ```console @@ -249,7 +249,7 @@ line option. ##### Interactive Mode ```console -marie@login$ srun --partition=haswell --nodes 4 --ntasks-per-node=4 --time=0:20:00 --mem-per-cpu=1700 --pty bash -l +marie@login$ srun --nodes 4 --ntasks-per-node=4 --time=0:20:00 --mem-per-cpu=1700 --pty bash -l # generate node list marie@node$ NODELIST=$(for node in $( scontrol show hostnames $SLURM_JOB_NODELIST | uniq ); do echo -n "${node}:${SLURM_NTASKS_PER_NODE}:"; done | sed 's/:$//') diff --git a/doc.zih.tu-dresden.de/docs/software/gpu_programming.md b/doc.zih.tu-dresden.de/docs/software/gpu_programming.md index 2e5b57422a0472650a6fc64c5c4bfeac433e5801..84eb94c667e550c55e6f8c73cb3672a95c88b278 100644 --- a/doc.zih.tu-dresden.de/docs/software/gpu_programming.md +++ b/doc.zih.tu-dresden.de/docs/software/gpu_programming.md @@ -200,12 +200,12 @@ detail in [nvcc documentation](https://docs.nvidia.com/cuda/cuda-compiler-driver This compiler is available via several `CUDA` packages, a default version can be loaded via `module load CUDA`. Additionally, the `NVHPC` modules provide CUDA tools as well. -For using CUDA with OpenMPI at multiple nodes, the OpenMPI module loaded shall have be compiled with -CUDA support. If you aren't sure if the module you are using has support for it you can check it as -following: +For using CUDA with Open MPI at multiple nodes, the `OpenMPI` module loaded shall have be compiled +with CUDA support. 
If you aren't sure if the module you are using has support for it you can check +it as following: ```console -ompi_info --parsable --all | grep mpi_built_with_cuda_support:value | awk -F":" '{print "OpenMPI supports CUDA:",$7}' +ompi_info --parsable --all | grep mpi_built_with_cuda_support:value | awk -F":" '{print "Open MPI supports CUDA:",$7}' ``` #### Usage of the CUDA Compiler @@ -260,7 +260,7 @@ metrics and `--export-profile` to generate a report file, like this: marie@compute$ nvprof --analysis-metrics --export-profile <output>.nvvp ./application [options] ``` -[Transfer the report file to your local system](../data_transfer/export_nodes.md) and analyze it in +[Transfer the report file to your local system](../data_transfer/dataport_nodes.md) and analyze it in the Visual Profiler (`nvvp`) locally. This will give the smoothest user experience. Alternatively, you can use [X11-forwarding](../access/ssh_login.md). Refer to the documentation for details about the individual @@ -317,7 +317,7 @@ needs, this analysis may be sufficient to identify optimizations targets. The graphical user interface version can be used for a thorough analysis of your previously generated report file. For an optimal user experience, we recommend a local installation of NVIDIA Nsight Systems. In this case, you can -[transfer the report file to your local system](../data_transfer/export_nodes.md). +[transfer the report file to your local system](../data_transfer/dataport_nodes.md). Alternatively, you can use [X11-forwarding](../access/ssh_login.md). The graphical user interface is usually available as `nsys-ui`. @@ -361,7 +361,7 @@ manually. This report file can be analyzed in the graphical user interface profiler. Again, we recommend you generate a report file on a compute node and -[transfer the report file to your local system](../data_transfer/export_nodes.md). +[transfer the report file to your local system](../data_transfer/dataport_nodes.md). Alternatively, you can use [X11-forwarding](../access/ssh_login.md). The graphical user interface is usually available as `ncu-ui` or `nv-nsight-cu`. diff --git a/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md b/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md index 885e617f3f47797acd8858e18c363807e77bde67..35104f1d28f97f29dd9ca2069cd9f3600841dae9 100644 --- a/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md +++ b/doc.zih.tu-dresden.de/docs/software/hyperparameter_optimization.md @@ -190,13 +190,13 @@ There are the following script preparation steps for OmniOpt: ``` 1. Testing script functionality and determine software requirements for the chosen - [partition](../jobs_and_resources/partitions_and_limits.md). In the following, the - partition `alpha` is used. Please note the parameters `--out-layer1`, `--batchsize`, `--epochs` when + [cluster](../jobs_and_resources/hardware_overview.md). In the following, the + cluster `Alpha` is used. Please note the parameters `--out-layer1`, `--batchsize`, `--epochs` when calling the Python script. Additionally, note the `RESULT` string with the output for OmniOpt. ??? hint "Hint for installing Python modules" - Note that for this example the module `torchvision` is not available on the partition `alpha` + Note that for this example the module `torchvision` is not available on the cluster `alpha` and it is installed by creating a [virtual environment](python_virtual_environments.md). 
It is recommended to install such a virtual environment into a [workspace](../data_lifecycle/workspaces.md). @@ -211,7 +211,7 @@ There are the following script preparation steps for OmniOpt: ```console # Job submission on alpha nodes with 1 GPU on 1 node with 800 MB per CPU - marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash + marie@login$ srun --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0 # Activate virtual environment marie@alpha$ source </path/to/workspace/python-environments/torchvision_env>/bin/activate @@ -323,7 +323,7 @@ For getting informed about the current status of OmniOpt or for looking into res tool of OmniOpt is used. Switch to the OmniOpt folder and run `evaluate-run.sh`. ``` console -marie@login$ bash </scratch/ws/omniopt-workdir/>evaluate-run.sh +marie@login$ bash </data/horse/ws/omniopt-workdir/>evaluate-run.sh ``` After initializing and checking for updates in the background, OmniOpt is asking to select the @@ -356,7 +356,7 @@ In order to look into the results, there are the following basic approaches. After creating a 2D scatter plot or a parallel plot, OmniOpt will try to display the corresponding file (`html`, `png`) directly on the ZIH system. Therefore, X11 forwarding must be enabled, either by [SSH configuration - ](../access/ssh_login.md#configuring-default-parameters-for-ssh) or by using `ssh -X taurus` + ](../access/ssh_login.md#configuring-default-parameters-for-ssh) or by using e.g. `ssh -XC alpha` while logging in. Nevertheless, because of latency using X11 forwarding, it is recommended to download the created files and explore them on the local machine (esp. for the parallel plot). The created files are saved at diff --git a/doc.zih.tu-dresden.de/docs/software/licenses.md b/doc.zih.tu-dresden.de/docs/software/licenses.md index 5eface00968891d163982e3ed836a87a56b927c8..e0bd6c89770c70feadb3d2389b4e9eadafae218d 100644 --- a/doc.zih.tu-dresden.de/docs/software/licenses.md +++ b/doc.zih.tu-dresden.de/docs/software/licenses.md @@ -4,8 +4,7 @@ It is possible (please [contact the support team](../support/support.md) first) their own software and use their own license servers, e.g. FlexLM. The outbound IP addresses from ZIH systems are: -- compute nodes: NAT via 141.76.3.193 -- login nodes: 141.30.73.102-141.30.73.105 +- NAT via 141.76.3.203 and 141.76.3.204 (for both login and compute nodes) The IT department of the external institute has to open the firewall for license communications (might be multiple ports) from ZIH systems and enable handing-out license to these IPs and login. diff --git a/doc.zih.tu-dresden.de/docs/software/machine_learning.md b/doc.zih.tu-dresden.de/docs/software/machine_learning.md index 09049a94349433dadb78cb5c6dfd01ef6a2b4df2..ed57e1799196c97afda68df26a605a061938f1ad 100644 --- a/doc.zih.tu-dresden.de/docs/software/machine_learning.md +++ b/doc.zih.tu-dresden.de/docs/software/machine_learning.md @@ -1,51 +1,51 @@ # Machine Learning This is an introduction of how to run machine learning applications on ZIH systems. -For machine learning purposes, we recommend to use the partitions `alpha` and/or `ml`. +For machine learning purposes, we recommend to use the cluster `alpha` and/or `power`. 
-## Partition `ml` +## Cluster: `power` -The compute nodes of the partition `ml` are built on the base of +The compute nodes of the cluster `power` are built on the base of [Power9 architecture](https://www.ibm.com/it-infrastructure/power/power9) from IBM. The system was created for AI challenges, analytics and working with data-intensive workloads and accelerated databases. The main feature of the nodes is the ability to work with the [NVIDIA Tesla V100](https://www.nvidia.com/en-gb/data-center/tesla-v100/) GPU with **NV-Link** support that allows a total bandwidth with up to 300 GB/s. Each node on the -partition `ml` has 6x Tesla V-100 GPUs. You can find a detailed specification of the partition in our +cluster `power` has 6x Tesla V-100 GPUs. You can find a detailed specification of the cluster in our [Power9 documentation](../jobs_and_resources/hardware_overview.md). !!! note - The partition `ml` is based on the Power9 architecture, which means that the software built - for x86_64 will not work on this partition. Also, users need to use the modules which are + The cluster `power` is based on the Power9 architecture, which means that the software built + for x86_64 will not work on this cluster. Also, users need to use the modules which are specially build for this architecture (from `modenv/ml`). ### Modules -On the partition `ml` load the module environment: +On the cluster `power` load the module environment: ```console -marie@ml$ module load modenv/ml +marie@power$ module load modenv/ml The following have been reloaded with a version change: 1) modenv/scs5 => modenv/ml ``` ### Power AI -There are tools provided by IBM, that work on partition `ml` and are related to AI tasks. +There are tools provided by IBM, that work on cluster `power` and are related to AI tasks. For more information see our [Power AI documentation](power_ai.md). -## Partition: Alpha +## Cluster: Alpha -Another partition for machine learning tasks is `alpha`. It is mainly dedicated to -[ScaDS.AI](https://scads.ai/) topics. Each node on partition `alpha` has 2x AMD EPYC CPUs, 8x NVIDIA +Another cluster for machine learning tasks is `alpha`. It is mainly dedicated to +[ScaDS.AI](https://scads.ai/) topics. Each node on the cluster `alpha` has 2x AMD EPYC CPUs, 8x NVIDIA A100-SXM4 GPUs, 1 TB RAM and 3.5 TB local space (`/tmp`) on an NVMe device. You can find more -details of the partition in our [Alpha Centauri](../jobs_and_resources/alpha_centauri.md) +details of the cluster in our [Alpha Centauri](../jobs_and_resources/alpha_centauri.md) documentation. ### Modules -On the partition `alpha` load the module environment: +On the cluster `alpha` load the module environment: ```console marie@alpha$ module load modenv/hiera @@ -54,7 +54,7 @@ The following have been reloaded with a version change: 1) modenv/ml => modenv/ !!! note - On partition `alpha`, the most recent modules are build in `hiera`. Alternative modules might be + On cluster `alpha`, the most recent modules are build in `hiera`. Alternative modules might be build in `scs5`. ## Machine Learning via Console @@ -83,7 +83,7 @@ create documents containing live code, equations, visualizations, and narrative TensorFlow or PyTorch) on ZIH systems and to run your Jupyter notebooks on HPC nodes. After accessing JupyterHub, you can start a new session and configure it. For machine learning -purposes, select either partition `alpha` or `ml` and the resources, your application requires. 
+purposes, select either cluster `alpha` or `power` and the resources, your application requires. In your session you can use [Python](data_analytics_with_python.md#jupyter-notebooks), [R](data_analytics_with_r.md#r-in-jupyterhub) or [RStudio](data_analytics_with_rstudio.md) for your @@ -158,7 +158,7 @@ still need to download some datasets use [Datamover](../data_transfer/datamover. The ImageNet project is a large visual database designed for use in visual object recognition software research. In order to save space in the filesystem by avoiding to have multiple duplicates of this lying around, we have put a copy of the ImageNet database (ILSVRC2012 and ILSVR2017) under -`/scratch/imagenet` which you can use without having to download it again. For the future, the +`/data/horse/imagenet` which you can use without having to download it again. For the future, the ImageNet dataset will be available in [Warm Archive](../data_lifecycle/workspaces.md#mid-term-storage). ILSVR2017 also includes a dataset for recognition objects from a video. Please respect the corresponding diff --git a/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-alpha.cfg b/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-alpha.cfg index 18743ba58e98e299227fc0273cf301b52330ed4c..a0a1d8670b942ffdba5dc7fd3a3a2f6c5e0048c5 100644 --- a/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-alpha.cfg +++ b/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-alpha.cfg @@ -172,7 +172,7 @@ preEnv_MPICH_GPU_EAGER_DEVICE_MEM=0 %endif %ifdef %{ucx} -# if using OpenMPI with UCX support, these settings are needed with use of CUDA Aware MPI +# if using Open MPI with UCX support, these settings are needed with use of CUDA Aware MPI # without these flags, LBM is known to hang when using OpenACC and OpenMP Target to GPUs preENV_UCX_MEMTYPE_CACHE=n preENV_UCX_TLS=self,shm,cuda_copy diff --git a/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-ppc.cfg b/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-ppc.cfg index 06b9e1b85549892df1880e9ae2c461276ac95a2d..6e6112b1a8f81e01836541fe8f2257c215eb2fa7 100644 --- a/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-ppc.cfg +++ b/doc.zih.tu-dresden.de/docs/software/misc/spec_nvhpc-ppc.cfg @@ -217,7 +217,7 @@ preEnv_MPICH_GPU_EAGER_DEVICE_MEM=0 %endif %ifdef %{ucx} -# if using OpenMPI with UCX support, these settings are needed with use of CUDA Aware MPI +# if using Open MPI with UCX support, these settings are needed with use of CUDA Aware MPI # without these flags, LBM is known to hang when using OpenACC and OpenMP Target to GPUs preENV_UCX_MEMTYPE_CACHE=n preENV_UCX_TLS=self,shm,cuda_copy diff --git a/doc.zih.tu-dresden.de/docs/software/modules.md b/doc.zih.tu-dresden.de/docs/software/modules.md index c743e60767089f96fd2085c8054ba63b86a30d40..d5e42e1882e4b35ed4923642ffac0400fd6c33dd 100644 --- a/doc.zih.tu-dresden.de/docs/software/modules.md +++ b/doc.zih.tu-dresden.de/docs/software/modules.md @@ -5,7 +5,7 @@ Usage of software on HPC systems is managed by a **modules system**. !!! note "Module" A module is a user interface that provides utilities for the dynamic modification of a user's - environment, e.g. prepending paths to: + environment, e.g. 
prepending paths to * `PATH` * `LD_LIBRARY_PATH` @@ -40,6 +40,41 @@ certain module, you can use `module avail softwarename` and it will display the ### Examples +???+ example "Searching for software" + + The process of searching for a particular software you want to use on an HPC system consits of + two steps: Login to the target HPC system and invoke `module spider` command to search for the + software and list available versions. + + For example, if you want to search for available Matlab versions on `Barnard`, the steps might + be: + + ```console + marie@login.barnard$ module spider matlab + + --------------------------------------------------------------------------------------------------------------------------------------------------------- + MATLAB: MATLAB/2022b + --------------------------------------------------------------------------------------------------------------------------------------------------------- + Description: + MATLAB is a high-level language and interactive environment that enables you to perform computationally intensive tasks faster than with + traditional programming languages such as C, C++, and Fortran. + + + You will need to load all module(s) on any one of the lines below before the "MATLAB/2022b" module is available to load. + + release/23.04 + release/23.10 + [...] + ``` + + As you can see, `MATLAB` in version `2022b` is available on Barnard within the releases `23.04` + and`23.10`. Additionally, the output provides the information how to load it: + + ```console + marie@login.barnard$ module load release/23.10 MATLAB/2022b + Module MATLAB/2022b and 1 dependency loaded. + ``` + ???+ example "Finding available software" This examples illustrates the usage of the command `module avail` to search for available Matlab @@ -148,9 +183,9 @@ There is a front end for the module command, which helps you to type less. It is ## Module Environments -On ZIH systems, there exist different **module environments**, each containing a set of software modules. -They are activated via the meta module `modenv` which has different versions, one of which is loaded -by default. You can switch between them by simply loading the desired modenv-version, e.g. +On ZIH systems, there exist different **module environments**, each containing a set of software +modules. They are activated via the meta module `modenv` which has different versions, one of which +is loaded by default. You can switch between them by simply loading the desired modenv-version, e.g. ```console marie@compute$ module load modenv/ml @@ -159,12 +194,12 @@ marie@compute$ module load modenv/ml ### modenv/scs5 (default) * SCS5 software -* usually optimized for Intel processors (partitions `haswell`, `broadwell`, `gpu2`, `julia`) +* usually optimized for Intel processors (Cluster `Barnard`, `Julia`) ### modenv/ml -* data analytics software (for use on the partition `ml`) -* necessary to run most software on the partition `ml` +* data analytics software (for use on the Cluster `ml`) +* necessary to run most software on the cluster `ml` (The instruction set [Power ISA](https://en.wikipedia.org/wiki/Power_ISA#Power_ISA_v.3.0) is different from the usual x86 instruction set. Thus the 'machine code' of other modenvs breaks). @@ -172,7 +207,7 @@ Thus the 'machine code' of other modenvs breaks). 
### modenv/hiera * uses a hierarchical module load scheme -* optimized software for AMD processors (partitions `romeo` and `alpha`) +* optimized software for AMD processors (Cluster `Romeo` and `Alpha`) ### modenv/classic @@ -183,8 +218,8 @@ Thus the 'machine code' of other modenvs breaks). ### Searching for Software The command `module spider <modname>` allows searching for a specific software across all modenv -environments. It will also display information on how to load a particular module when giving a precise -module (with version) as the parameter. +environments. It will also display information on how to load a particular module when giving a +precise module (with version) as the parameter. ??? example "Spider command" @@ -259,18 +294,16 @@ In some cases a desired software is available as an extension of a module. ## Toolchains -A program or library may break in various ways -(e.g. not starting, crashing or producing wrong results) -when it is used with a software of a different version than it expects. -So each module specifies the exact other modules it depends on. -They get loaded automatically when the dependent module is loaded. +A program or library may break in various ways (e.g. not starting, crashing or producing wrong +results) when it is used with a software of a different version than it expects. So each module +specifies the exact other modules it depends on. They get loaded automatically when the dependent +module is loaded. -Loading a single module is easy as there can't be any conflicts between dependencies. -However when loading multiple modules they can require different versions of the same software. -This conflict is currently handled in that loading the same software with a different version -automatically unloads the earlier loaded module. -As the dependents of that module are **not** automatically unloaded this means they now have a -wrong dependency (version) which can be a problem (see above). +Loading a single module is easy as there can't be any conflicts between dependencies. However when +loading multiple modules they can require different versions of the same software. This conflict is +currently handled in that loading the same software with a different version automatically unloads +the earlier loaded module. As the dependents of that module are **not** automatically unloaded this +means they now have a wrong dependency (version) which can be a problem (see above). To avoid this there are (versioned) toolchains and for each toolchain there is (usually) at most one version of each software. @@ -309,12 +342,12 @@ As you can see `GCC` and `intel-compilers` are on the same level, as are `gompi` although they are one level higher than the former. You can load and use modules from a lower toolchain with modules from -one of its parent toolchains. +one of its parent toolchains. For example `Python/3.6.6-foss-2019a` can be used with `Boost/1.70.0-gompi-2019a`. But you cannot combine toolchains or toolchain versions. So `QuantumESPRESSO/6.5-intel-2019a` and `OpenFOAM/8-foss-2020a` -are both incompatible with `Python/3.6.6-foss-2019a`. +are both incompatible with `Python/3.6.6-foss-2019a`. However `LLVM/7.0.1-GCCcore-8.2.0` can be used with either `QuantumESPRESSO/6.5-intel-2019a` or `Python/3.6.6-foss-2019a` because `GCCcore-8.2.0` is a sub-toolchain of `intel-2019a` and `foss-2019a`. 
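+
+A minimal sanity check of these rules could look like the following (the module versions are taken
+from the examples above and might not exist in the module environment you are using):
+
+```console
+marie@compute$ module load Python/3.6.6-foss-2019a Boost/1.70.0-gompi-2019a    # works: gompi-2019a is a sub-toolchain of foss-2019a
+marie@compute$ module load QuantumESPRESSO/6.5-intel-2019a                     # clashes with the foss-2019a modules loaded above
+```
+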
diff --git a/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md b/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md index 630e19e8fe6d7ca70a89175662ab8e79b9adceac..c22c1cd6b34e79baeb91b1dc41feee491432172c 100644 --- a/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md +++ b/doc.zih.tu-dresden.de/docs/software/mpi_usage_error_detection.md @@ -211,7 +211,7 @@ from the [MUST documentation v1.7.2](https://hpc.rwth-aachen.de/must/files/Docum html files. Copy the files to your local host, e.g. ```console - marie@local$ scp -r taurus.hrsk.tu-dresden.de:/scratch/ws/0/marie-must/{MUST_Output-files,MUST_Output.html} + marie@local$ scp -r dataport1.hpc.tu-dresden.de:/data/horse/ws/marie-must/{MUST_Output-files,MUST_Output.html} ``` and open the file `MUST_Output.html` using a webbrowser. Alternativly, you can open the html file with a diff --git a/doc.zih.tu-dresden.de/docs/software/nanoscale_simulations.md b/doc.zih.tu-dresden.de/docs/software/nanoscale_simulations.md index 392124c49f16e5fc4c2dcb1774782d70180682c6..d3886abc4ccd99803d8bd4b6682db099c8348f58 100644 --- a/doc.zih.tu-dresden.de/docs/software/nanoscale_simulations.md +++ b/doc.zih.tu-dresden.de/docs/software/nanoscale_simulations.md @@ -53,7 +53,7 @@ particularly designed for ab-initio molecular dynamics. For examples and documen [CPMD homepage](https://www.lcrc.anl.gov/for-users/software/available-software/cpmd/). CPMD is currently not installed as a module. -Please, contact [hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de) if you need assistance. +Please, contact [hpc-support@tu-dresden.de](mailto:hpc-support@tu-dresden.de) if you need assistance. ## GAMESS @@ -64,6 +64,10 @@ please look at the [GAMESS home page](https://www.msg.chem.iastate.edu/gamess/in GAMESS is available as [modules](modules.md) within the classic environment. Available packages can be listed and loaded with the following commands: +_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are +momentarily under construction. The script will be updated after completion of the redesign +accordingly_ + ```console marie@login$ module load modenv/classic [...] @@ -89,8 +93,8 @@ For runs with [Slurm](../jobs_and_resources/slurm.md), please use a script like module load modenv/classic module load gamess -rungms.slurm cTT_M_025.inp /scratch/ws/0/marie-gamess -# the third parameter is the location of your scratch directory +rungms.slurm cTT_M_025.inp /data/horse/ws/marie-gamess +# the third parameter is the location of your horse directory ``` *GAMESS should be cited as:* "General Atomic and Molecular Electronic Structure System", @@ -108,7 +112,7 @@ intermediates and transition structures. Access to the Gaussian installation on our system is limited to members of the UNIX group `s_gaussian`. Please, contact -[hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de) if you can't +[hpc-support@tu-dresden.de](mailto:hpc-support@tu-dresden.de) if you can't access it, yet wish to use it. 
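+
+You can check beforehand whether your login already belongs to this group, for example by listing
+your UNIX groups on a login node:
+
+```console
+marie@login$ groups | grep -o s_gaussian
+```
+
+If the command prints nothing, you are not a member of the group yet.
+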
### Guidance on Data Management with Gaussian @@ -122,7 +126,6 @@ However hereafter we have an example on how that might look like for Gaussian: ``` #!/bin/bash - #SBATCH --partition=haswell #SBATCH --time=96:00:00 #SBATCH --nodes=1 #SBATCH --ntasks=1 diff --git a/doc.zih.tu-dresden.de/docs/software/ngc_containers.md b/doc.zih.tu-dresden.de/docs/software/ngc_containers.md index f19612d9a3310f869a483c20328d51168317552a..9e51fc1b1f9766f0c3dc8be728f316bd04fe9a40 100644 --- a/doc.zih.tu-dresden.de/docs/software/ngc_containers.md +++ b/doc.zih.tu-dresden.de/docs/software/ngc_containers.md @@ -50,14 +50,14 @@ If you are not familiar with Singularity's syntax, please find the information o However, some main commands will be explained. Create a container from the image from the NGC catalog. -(For this example, the alpha is used): +(For this example, the cluster alpha is used): ```console -marie@login$ srun --partition=alpha --nodes=1 --ntasks-per-node=1 --ntasks=1 --gres=gpu:1 --time=08:00:00 --pty --mem=50000 bash - -marie@compute$ cd /scratch/ws/<name_of_your_workspace>/containers #please create a Workspace - -marie@compute$ singularity pull pytorch:21.08-py3.sif docker://nvcr.io/nvidia/pytorch:21.08-py3 +marie@login.alpha$ srun --nodes=1 --ntasks-per-node=1 --ntasks=1 --gres=gpu:1 --time=08:00:00 --pty --mem=50000 bash +[...] +marie@alpha$ cd /data/horse/ws/<name_of_your_workspace>/containers #please create a Workspace +[...] +marie@alpha$ singularity pull pytorch:21.08-py3.sif docker://nvcr.io/nvidia/pytorch:21.08-py3 ``` Now, you have a fully functional PyTorch container. @@ -73,20 +73,20 @@ To download the dataset, please follow the Also, you can find the instructions in a README file which you can find inside the container: ```console -marie@compute$ singularity exec pytorch:21.06-py3_beegfs vim /workspace/examples/resnet50v1.5/README.md +marie@alpha$ singularity exec pytorch:21.06-py3_beegfs vim /workspace/examples/resnet50v1.5/README.md ``` It is recommended to run the container with a single command. 
However, for the educational purpose, the separate commands will be presented below: ```console -marie@login$ srun --partition=alpha --nodes=1 --ntasks-per-node=1 --ntasks=1 --gres=gpu:1 --time=08:00:00 --pty --mem=50000 bash +marie@login.alpha$ srun --nodes=1 --ntasks-per-node=1 --ntasks=1 --gres=gpu:1 --time=08:00:00 --pty --mem=50000 bash ``` Run a shell within a container with the `singularity shell` command: ```console -marie@compute$ singularity shell --nv -B /scratch/imagenet:/data/imagenet pytorch:21.06-py3 +marie@alpha$ singularity shell --nv -B /data/horse/imagenet:/data/imagenet pytorch:21.06-py3 ``` The flag `--nv` in the command above was used to enable Nvidia support for GPU usage @@ -112,8 +112,8 @@ As an example, please find the full command to run the ResNet50 model on the ImageNet dataset inside the PyTorch container: ```console -marie@login$ srun --partition=alpha --nodes=1 --ntasks-per-node=1 --ntasks=1 --gres=gpu:1 --time=08:00:00 --pty --mem=50000 \ - singularity exec --nv -B /scratch/ws/0/anpo879a-ImgNet/imagenet:/data/imagenet pytorch:21.06-py3 \ +marie@login.alpha$ srun --nodes=1 --ntasks-per-node=1 --ntasks=1 --gres=gpu:1 --time=08:00:00 --pty --mem=50000 \ + singularity exec --nv -B /data/horse/ws/anpo879a-ImgNet/imagenet:/data/imagenet pytorch:21.06-py3 \ python /workspace/examples/resnet50v1.5/multiproc.py --nnodes=1 --nproc_per_node 1 \ --node_rank=0 /workspace/examples/resnet50v1.5/main.py --data-backend dali-cpu --raport-file raport.json \ -j16 -p 100 --lr 2.048 --optimizer-batch-size 2048 --warmup 8 --arch resnet50 -c fanin --label-smoothing 0.1 \ @@ -136,11 +136,11 @@ An example of using the PyTorch container for the training of the ResNet50 model on the classification task on the ImageNet dataset is presented below: ```console -marie@login$ srun --partition=alpha --nodes=1 --ntasks-per-node=8 --ntasks=8 --gres=gpu:8 --time=08:00:00 --pty --mem=700G bash +marie@login.alpha$ srun --nodes=1 --ntasks-per-node=8 --ntasks=8 --gres=gpu:8 --time=08:00:00 --pty --mem=700G bash ``` ```console -marie@alpha$ singularity exec --nv -B /scratch/ws/0/marie-ImgNet/imagenet:/data/imagenet pytorch:21.06-py3 \ +marie@alpha$ singularity exec --nv -B /data/horse/ws/marie-ImgNet/imagenet:/data/imagenet pytorch:21.06-py3 \ python /workspace/examples/resnet50v1.5/multiproc.py --nnodes=1 --nproc_per_node 8 \ --node_rank=0 /workspace/examples/resnet50v1.5/main.py --data-backend dali-cpu \ --raport-file raport.json -j16 -p 100 --lr 2.048 --optimizer-batch-size 2048 --warmup 8 \ diff --git a/doc.zih.tu-dresden.de/docs/software/overview.md b/doc.zih.tu-dresden.de/docs/software/overview.md index 2c560c30951ae3ec3419e01a19ce8716c8d441da..4fa3543aa6455017a73cbe4920168bae9dd5fdcf 100644 --- a/doc.zih.tu-dresden.de/docs/software/overview.md +++ b/doc.zih.tu-dresden.de/docs/software/overview.md @@ -15,12 +15,6 @@ There are different options to work with software on ZIH systems: [modules](#mod [Jupyter Notebook](#jupyter-notebook) and [containers](#containers). Brief descriptions and related links on these options are provided below. -!!! note - There are two different software environments: - - * `scs5` environment for the x86 architecture based compute resources - * and `ml` environment for the Machine Learning partition based on the Power9 architecture. 
- ## Modules Usage of software on ZIH systems, e.g., frameworks, compilers, loader and libraries, is diff --git a/doc.zih.tu-dresden.de/docs/software/papi.md b/doc.zih.tu-dresden.de/docs/software/papi.md index c5f0e7cfaf6260323a8fb572832e9f0a44f792a4..68d94c78528f469e1d30ddddd75bdabd447ba1c4 100644 --- a/doc.zih.tu-dresden.de/docs/software/papi.md +++ b/doc.zih.tu-dresden.de/docs/software/papi.md @@ -78,11 +78,11 @@ measurements, especially for MPI applications via environment variable `PAPI_OUT ```bash export PAPI_EVENTS="PAPI_TOT_INS,PAPI_TOT_CYC" - export PAPI_OUTPUT_DIRECTORY="/scratch/measurement" + export PAPI_OUTPUT_DIRECTORY="/data/horse/measurement" ``` -This will generate a directory called `papi_hl_output` in `scratch/measurement` that contains one or -more output files in JSON format. +This will generate a directory called `papi_hl_output` in `/data/horse/measurement` that contains +one or more output files in JSON format. ### Low-Level API @@ -98,21 +98,22 @@ Before you start a PAPI measurement, check which events are available on the des For this purpose, PAPI offers the tools `papi_avail` and `papi_native_avail`. If you want to measure multiple events, please check which events can be measured concurrently using the tool `papi_event_chooser`. The PAPI wiki contains more details on -[the PAPI tools](https://bitbucket.org/icl/papi/wiki/PAPI-Overview.md#markdown-header-papi-utilities). +[the PAPI tools](https://bitbucket.org/icl/papi/wiki/PAPI-Overview.md#markdown-header-papi-utilities) +. !!! hint The PAPI tools must be run on the compute node, using an interactive shell or job. -!!! example "Example: Determine the events on the partition `romeo` from a login node" +!!! example "Example: Determine the events on the cluster `romeo` from a login node" Let us assume, that you are in project `p_number_crunch`. Then, use the following commands: ```console - marie@login$ module load PAPI - marie@login$ salloc --account=p_number_crunch --partition=romeo + marie@login.romeo$ module load PAPI + marie@login.romeo$ salloc --account=p_number_crunch [...] - marie@compute$ srun papi_avail - marie@compute$ srun papi_native_avail + marie@romeo$ srun papi_avail + marie@romeo$ srun papi_native_avail [...] # Exit with Ctrl+D ``` @@ -124,10 +125,10 @@ compile your application against the PAPI library. Assuming that you are in project `p_number_crunch`, use the following commands: ```console - marie@login$ module load PAPI - marie@login$ gcc app.c -o app -lpapi - marie@login$ salloc --account=p_number_crunch --partition=romeo - marie@compute$ srun ./app + marie@login.romeo$ module load PAPI + marie@login.romeo$ gcc app.c -o app -lpapi + marie@login.romeo$ salloc --account=p_number_crunch + marie@romeo$ srun ./app [...] # Exit with Ctrl+D ``` diff --git a/doc.zih.tu-dresden.de/docs/software/performance_engineering_overview.md b/doc.zih.tu-dresden.de/docs/software/performance_engineering_overview.md index b8c79afa6c2ab2b224eed7361b84a1e8a57501ae..81c6bc1ec989ac50d40c4e3af5a1c19479748d36 100644 --- a/doc.zih.tu-dresden.de/docs/software/performance_engineering_overview.md +++ b/doc.zih.tu-dresden.de/docs/software/performance_engineering_overview.md @@ -230,7 +230,7 @@ jobs. The data analysis of the given set of metrics is fully integrated and does not require any user actions. Performance metrics are accessible via the -[PIKA web service](https://selfservice.tu-dresden.de/services/1663599/). +[PIKA web service](https://pika.zih.tu-dresden.de/). 
### Score-P diff --git a/doc.zih.tu-dresden.de/docs/software/pika.md b/doc.zih.tu-dresden.de/docs/software/pika.md index f84460f8056d8d010406dccc89a9270131cf87d5..117f8d00c635666c5938229630a8a31d5a75b310 100644 --- a/doc.zih.tu-dresden.de/docs/software/pika.md +++ b/doc.zih.tu-dresden.de/docs/software/pika.md @@ -1,21 +1,31 @@ +--- +search: + boost: 4.0 +--- + # Track Slurm Jobs with PIKA PIKA is a hardware performance monitoring stack to identify inefficient HPC jobs. Users of ZIH systems have the possibility to visualize and analyze the efficiency of their jobs via the -[PIKA web interface](https://selfservice.zih.tu-dresden.de/l/index.php/hpcportal/jobmonitoring/zih/jobs). +[PIKA web interface](https://pika.zih.tu-dresden.de). !!! hint To understand this guide, it is recommended that you open the - [web interface](https://selfservice.zih.tu-dresden.de/l/index.php/hpcportal/jobmonitoring/zih/jobs) + [web interface](https://pika.zih.tu-dresden.de) in a separate window. Furthermore, you should have submitted at least one real HPC job at ZIH systems. + If you are outside the TUD network, you will need to establish a VPN connection. For more + information on our VPN and how to set it up, please visit the corresponding + [ZIH service catalog + page](https://tu-dresden.de/zih/dienste/service-katalog/arbeitsumgebung/zugang_datennetz/vpn). + ## Overview PIKA consists of several components and tools. It uses the collection daemon collectd, InfluxDB to store time-series data and MariaDB to store job metadata. Furthermore, it provides a powerful -[web interface](https://selfservice.zih.tu-dresden.de/l/index.php/hpcportal/jobmonitoring/zih/jobs) +[web interface](https://pika.zih.tu-dresden.de) for the visualization and analysis of job performance data. ## Table View and Job Search @@ -145,7 +155,7 @@ flags in the job script: ```Bash #SBATCH --exclusive -#SBATCH --comment=no_monitoring +#SBATCH --constraint=no_monitoring ``` **Note:** Disabling PIKA monitoring is possible only for exclusive jobs! diff --git a/doc.zih.tu-dresden.de/docs/software/python_virtual_environments.md b/doc.zih.tu-dresden.de/docs/software/python_virtual_environments.md index 99e61332b8773e24e0ba7be606674c38c552db50..bfd9e0e0d22c0e867a932fb39a23aba7fb518769 100644 --- a/doc.zih.tu-dresden.de/docs/software/python_virtual_environments.md +++ b/doc.zih.tu-dresden.de/docs/software/python_virtual_environments.md @@ -48,13 +48,13 @@ marie@compute$ which python #Check which python are you using Then create the virtual environment and activate it. ```console -marie@compute$ ws_allocate -F scratch python_virtual_environment 1 +marie@compute$ ws_allocate python_virtual_environment 1 Info: creating workspace. -/scratch/ws/1/marie-python_virtual_environment +/data/horse/ws/marie-python_virtual_environment [...] -marie@compute$ virtualenv --system-site-packages /scratch/ws/1/marie-python_virtual_environment/env #Create virtual environment +marie@compute$ virtualenv --system-site-packages /data/horse/ws/marie-python_virtual_environment/env #Create virtual environment [...] -marie@compute$ source /scratch/ws/1/marie-python_virtual_environment/env/bin/activate #Activate virtual environment. Example output: (env) bash-4.2$ +marie@compute$ source /data/horse/ws/marie-python_virtual_environment/env/bin/activate #Activate virtual environment. Example output: (env) bash-4.2$ ``` Now you can work in this isolated environment, without interfering with other @@ -71,9 +71,9 @@ the environment as follows: This is an example on partition `alpha`. 
The example creates a python virtual environment, and installs the package `torchvision` with pip. ```console - marie@login$ srun --partition=alpha-interactive --nodes=1 --gres=gpu:1 --time=01:00:00 --pty bash - marie@alpha$ ws_allocate -F scratch my_python_virtualenv 100 # use a workspace for the environment - marie@alpha$ cd /scratch/ws/1/marie-my_python_virtualenv + marie@login.alpha$ srun --nodes=1 --gres=gpu:1 --time=01:00:00 --pty bash + marie@alpha$ ws_allocate my_python_virtualenv 100 # use a workspace for the environment + marie@alpha$ cd /data/horse/ws/marie-my_python_virtualenv marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0 Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies loaded. marie@alpha$ which python @@ -113,9 +113,9 @@ packages from the file: ```console marie@compute$ module load Python #Load default Python [...] -marie@compute$ virtualenv --system-site-packages /scratch/ws/1/marie-python_virtual_environment/env_post #Create virtual environment +marie@compute$ virtualenv --system-site-packages /data/horse/ws/marie-python_virtual_environment/env_post #Create virtual environment [...] -marie@compute$ source /scratch/ws/1/marie-python_virtual_environment/env/bin/activate #Activate virtual environment. Example output: (env_post) bash-4.2$ +marie@compute$ source /data/horse/ws/marie-python_virtual_environment/env/bin/activate #Activate virtual environment. Example output: (env_post) bash-4.2$ (env_post) marie@compute$ pip install -r requirements.txt #Install packages from the created requirements.txt file ``` @@ -145,9 +145,9 @@ This example shows how to start working with **conda** and virtual environment directory for the conda virtual environment: ```console -marie@compute$ ws_allocate -F scratch conda_virtual_environment 1 +marie@compute$ ws_allocate conda_virtual_environment 1 Info: creating workspace. -/scratch/ws/1/marie-conda_virtual_environment +/data/horse/ws/marie-conda_virtual_environment [...] ``` @@ -156,8 +156,8 @@ environment: ```console marie@compute$ module load Anaconda3 #load Anaconda module -marie@compute$ conda create --prefix /scratch/ws/1/marie-conda_virtual_environment/conda-env python=3.6 #create virtual environment with Python version 3.6 -marie@compute$ conda activate /scratch/ws/1/marie-conda_virtual_environment/conda-env #activate conda-env virtual environment +marie@compute$ conda create --prefix /data/horse/ws/marie-conda_virtual_environment/conda-env python=3.6 #create virtual environment with Python version 3.6 +marie@compute$ conda activate /data/horse/ws/marie-conda_virtual_environment/conda-env #activate conda-env virtual environment ``` Now you can work in this isolated environment, without interfering with other @@ -183,9 +183,9 @@ can deactivate the conda environment as follows: This is an example on partition `alpha`. The example creates a conda virtual environment, and installs the package `torchvision` with conda. 
```console - marie@login$ srun --partition=alpha-interactive --nodes=1 --gres=gpu:1 --time=01:00:00 --pty bash - marie@alpha$ ws_allocate -F scratch my_conda_virtualenv 100 # use a workspace for the environment - marie@alpha$ cd /scratch/ws/1/marie-my_conda_virtualenv + marie@login.alpha$ srun --nodes=1 --gres=gpu:1 --time=01:00:00 --pty bash + marie@alpha$ ws_allocate my_conda_virtualenv 100 # use a workspace for the environment + marie@alpha$ cd /data/horse/ws/marie-my_conda_virtualenv marie@alpha$ module load Anaconda3 Module Anaconda3/2021.11 loaded. marie@alpha$ conda create --prefix my-torch-env python=3.8 @@ -250,8 +250,8 @@ Recreate the conda virtual environment with the packages from the created `environment.yml` file: ```console -marie@compute$ mkdir /scratch/ws/1/marie-conda_virtual_environment/conda-env #Create directory for environment +marie@compute$ mkdir /data/horse/ws/marie-conda_virtual_environment/conda-env #Create directory for environment marie@compute$ module load Anaconda3 #Load Anaconda marie@compute$ conda config --set channel_priority strict -marie@compute$ conda env create --prefix /scratch/ws/1/marie-conda_virtual_environment/conda-env --file environment.yml #Create conda env in directory with packages from environment.yml file +marie@compute$ conda env create --prefix /data/horse/ws/marie-conda_virtual_environment/conda-env --file environment.yml #Create conda env in directory with packages from environment.yml file ``` diff --git a/doc.zih.tu-dresden.de/docs/software/pytorch.md b/doc.zih.tu-dresden.de/docs/software/pytorch.md index 4d03aec66b6e8b68071179534f558d3245645745..f942f3035e3b2072ab9d2cb77331319c624eb4a3 100644 --- a/doc.zih.tu-dresden.de/docs/software/pytorch.md +++ b/doc.zih.tu-dresden.de/docs/software/pytorch.md @@ -15,18 +15,23 @@ marie@login$ module spider pytorch to find out, which PyTorch modules are available. -We recommend using partitions `alpha` and/or `ml` when working with machine learning workflows +We recommend using the cluster `alpha` and/or `power` when working with machine learning workflows and the PyTorch library. You can find detailed hardware specification in our [hardware documentation](../jobs_and_resources/hardware_overview.md). +_The module environments /hiera, /scs5, /classic and /ml originated from the taurus system are +momentarily under construction. The script will be updated after completion of the redesign +accordingly_ + ## PyTorch Console -On the partition `alpha`, load the module environment: +On the cluster `alpha`, load the module environment: ```console # Job submission on alpha nodes with 1 gpu on 1 node with 800 Mb per CPU -marie@login$ srun -p alpha --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash + +marie@login.alpha$ srun --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash marie@alpha$ module load modenv/hiera GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5 PyTorch/1.9.0 Die folgenden Module wurden in einer anderen Version erneut geladen: 1) modenv/scs5 => modenv/hiera @@ -34,9 +39,9 @@ Die folgenden Module wurden in einer anderen Version erneut geladen: Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies loaded. ``` -??? hint "Torchvision on partition `alpha`" +??? hint "Torchvision on the cluster `alpha`" - On the partition `alpha`, the module torchvision is not yet available within the module + On the cluster `alpha`, the module torchvision is not yet available within the module system. 
(19.08.2021) Torchvision can be made available by using a virtual environment: @@ -49,46 +54,46 @@ Module GCC/10.2.0, CUDA/11.1.1, OpenMPI/4.0.5, PyTorch/1.9.0 and 54 dependencies Using the **--no-deps** option for "pip install" is necessary here as otherwise the PyTorch version might be replaced and you will run into trouble with the CUDA drivers. -On the partition `ml`: +On the cluster `power`: ```console -# Job submission in ml nodes with 1 gpu on 1 node with 800 Mb per CPU -marie@login$ srun -p ml --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash +# Job submission in power nodes with 1 gpu on 1 node with 800 Mb per CPU +marie@login.power$ srun --gres=gpu:1 -n 1 -c 7 --pty --mem-per-cpu=800 bash ``` After calling ```console -marie@login$ module spider pytorch +marie@login.power$ module spider pytorch ``` we know that we can load PyTorch (including torchvision) with ```console -marie@ml$ module load modenv/ml torchvision/0.7.0-fossCUDA-2019b-Python-3.7.4-PyTorch-1.6.0 +marie@power$ module load modenv/ml torchvision/0.7.0-fossCUDA-2019b-Python-3.7.4-PyTorch-1.6.0 Module torchvision/0.7.0-fossCUDA-2019b-Python-3.7.4-PyTorch-1.6.0 and 55 dependencies loaded. ``` Now, we check that we can access PyTorch: ```console -marie@{ml,alpha}$ python -c "import torch; print(torch.__version__)" +marie@{power,alpha}$ python -c "import torch; print(torch.__version__)" ``` The following example shows how to create a python virtual environment and import PyTorch. ```console # Create folder -marie@ml$ mkdir python-environments +marie@power$ mkdir python-environments # Check which python are you using -marie@ml$ which python +marie@power$ which python /sw/installed/Python/3.7.4-GCCcore-8.3.0/bin/python # Create virtual environment "env" which inheriting with global site packages -marie@ml$ virtualenv --system-site-packages python-environments/env +marie@power$ virtualenv --system-site-packages python-environments/env [...] # Activate virtual environment "env". Example output: (env) bash-4.2$ -marie@ml$ source python-environments/env/bin/activate -marie@ml$ python -c "import torch; print(torch.__version__)" +marie@power$ source python-environments/env/bin/activate +marie@power$ python -c "import torch; print(torch.__version__)" ``` ## PyTorch in JupyterHub diff --git a/doc.zih.tu-dresden.de/docs/software/singularity_power9.md b/doc.zih.tu-dresden.de/docs/software/singularity_power9.md index 080314e52f349f94caf3a1e4ca018807797fd0fa..5abb5c5019a8b0505ad24b947fb72551e4d35aed 100644 --- a/doc.zih.tu-dresden.de/docs/software/singularity_power9.md +++ b/doc.zih.tu-dresden.de/docs/software/singularity_power9.md @@ -9,10 +9,10 @@ The solution is to build your container on your local Linux workstation using Singularity and copy it to ZIH systems for execution. -**This does not work on the partition `ml`** as it uses the Power9 architecture which your +**This does not work on the cluster `power`** as it uses the Power9 architecture which your workstation likely doesn't. -For this, we provide a Virtual Machine (VM) on the partition `ml` which allows users to gain root +For this, we provide a Virtual Machine (VM) on the cluster `power` which allows users to gain root permissions in an isolated environment. The workflow to use this manually is described for [virtual machines](virtual_machines.md) but is quite cumbersome. @@ -29,7 +29,7 @@ what they say. The latter is for more advanced use cases, so you should be fine a base Docker container. Those typically exist for different architectures but with a common name (e.g. 
`ubuntu:18.04`). Singularity automatically uses the correct Docker container for your current architecture when building. So, in most cases, you can write your definition file, build it and test -it locally, then move it to ZIH systems and build it on Power9 (partition `ml`) without any further +it locally, then move it to ZIH systems and build it on Power9 (cluster `power`) without any further changes. However, sometimes Docker containers for different architectures have different suffixes, in which case you'd need to change that when moving to ZIH systems. @@ -38,20 +38,20 @@ in which case you'd need to change that when moving to ZIH systems. To build a Singularity container for the Power9 architecture on ZIH systems simply run: ```console -marie@login$ buildSingularityImage --arch=power9 myContainer.sif myDefinition.def +marie@login.power$ buildSingularityImage --arch=power9 myContainer.sif myDefinition.def ``` To build a singularity image on the x86-architecture, run: ```console -marie@login$ buildSingularityImage --arch=x86 myContainer.sif myDefinition.def +marie@login.power$ buildSingularityImage --arch=x86 myContainer.sif myDefinition.def ``` These commands will submit a batch job and immediately return. If you want it to block while the image is built and see live output, add the option `--interactive`: ```console -marie@login$ buildSingularityImage --arch=power9 --interactive myContainer.sif myDefinition.def +marie@login.power$ buildSingularityImage --arch=power9 --interactive myContainer.sif myDefinition.def ``` There are more options available, which can be shown by running `buildSingularityImage --help`. All @@ -82,18 +82,18 @@ As the build starts in a VM, you may not have access to all your files. It is us to refer to local files from inside a definition file anyway as this reduces reproducibility. However, common directories are available by default. For others, care must be taken. In short: -* `/home/$USER`, `/scratch/$USER` are available and should be used `/scratch/<group>` also works for - all groups the users is in +* `/home/$USER`, `/data/horse/$USER` are available and should be used `/data/horse/<group>` also + works for all groups the users is in * `/projects/<group>` similar, but is read-only! So don't use this to store your generated - container directly, but rather move it here afterwards + container directly, but rather move it here afterwards * `/tmp` is the VM local temporary directory. All files put here will be lost! -If the current directory is inside (or equal to) one of the above (except `/tmp`), then relative paths -for container and definition work as the script changes to the VM equivalent of the current +If the current directory is inside (or equal to) one of the above (except `/tmp`), then relative +paths for container and definition work as the script changes to the VM equivalent of the current directory. Otherwise, you need to use absolute paths. Using `~` in place of `$HOME` does work too. -Under the hood, the filesystem of ZIH systems is mounted via SSHFS at `/host_data`. So if you need any -other files, they can be found there. +Under the hood, the filesystem of ZIH systems is mounted via SSHFS at `/host_data`. So if you need +any other files, they can be found there. There is also a new SSH key named `kvm` which is created by the scripts and authorized inside the VM to allow for password-less access to SSHFS. 
This is stored at `~/.ssh/kvm` and regenerated if it @@ -128,7 +128,8 @@ As usual, more options can be shown by running `startInVM --help`, the most impo There are two special use cases for this script: 1. Execute an arbitrary command inside the VM instead of getting a bash by appending the command to - the script. Example: `startInVM --arch=power9 singularity build ~/myContainer.sif ~/myDefinition.de` + the script. + Example: `startInVM --arch=power9 singularity build ~/myContainer.sif ~/myDefinition.de` 1. Use the script in a job manually allocated via srun/sbatch. This will work the same as when running outside a job but will **not** start a new job. This is useful for using it inside batch scripts, when you already have an allocation or need special arguments for the job system. Again, diff --git a/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md b/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md index 6f60c340c53b97da31529a4e21a4cf9b20761063..1dc36a50a8bd17556e08aea9458e6db31cf47d59 100644 --- a/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md +++ b/doc.zih.tu-dresden.de/docs/software/singularity_recipe_hints.md @@ -74,9 +74,9 @@ From: ubuntu:20.04 %environment export MPICH_DIR=/opt/mpich - export SINGULARITY_MPICH_DIR=$MPICH_DIR - export SINGULARITYENV_APPEND_PATH=$MPICH_DIR/bin - export SINGULAIRTYENV_APPEND_LD_LIBRARY_PATH=$MPICH_DIR/lib + export SINGULARITY_MPICH_DIR=${MPICH_DIR} + export SINGULARITYENV_APPEND_PATH=${MPICH_DIR}/bin + export SINGULAIRTYENV_APPEND_LD_LIBRARY_PATH=${MPICH_DIR}/lib %post echo "Installing required packages..." @@ -88,23 +88,23 @@ From: ubuntu:20.04 echo "Installing MPICH" export MPICH_DIR=/opt/mpich export MPICH_VERSION=4.1 - export MPICH_URL="https://www.mpich.org/static/downloads/$MPICH_VERSION/mpich-$MPICH_VERSION.tar.gz" + export MPICH_URL="https://www.mpich.org/static/downloads/${MPICH_VERSION}/mpich-${MPICH_VERSION}.tar.gz" mkdir -p /tmp/mpich mkdir -p /opt # Download - cd /tmp/mpich && wget -O mpich-$MPICH_VERSION.tar.gz $MPICH_URL && tar -xf mpich-$MPICH_VERSION.tar.gz - + cd /tmp/mpich && wget -O mpich-${MPICH_VERSION}.tar.gz ${MPICH_URL} && tar -xf mpich-${MPICH_VERSION}.tar.gz + # Configure and compile/install - cd /tmp/mpich/mpich-$MPICH_VERSION - ./configure --prefix=$MPICH_DIR && make install - - + cd /tmp/mpich/mpich-${MPICH_VERSION} + ./configure --prefix=${MPICH_DIR} && make install + + # Set env variables so we can compile our application - export PATH=$MPICH_DIR/bin:$PATH - export LD_LIBRARY_PATH=$MPICH_DIR/lib:$LD_LIBRARY_PATH - export MANPATH=$MPICH_DIR/share/man:$MANPATH - - + export PATH=${MPICH_DIR}/bin:${PATH} + export LD_LIBRARY_PATH=${MPICH_DIR}/lib:${LD_LIBRARY_PATH} + export MANPATH=${MPICH_DIR}/share/man:${MANPATH} + + echo "Compiling the MPI application..." 
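+    # mpicc is the MPICH compiler wrapper installed into ${MPICH_DIR}/bin above; it is found via the PATH export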
    cd /opt && mpicc -o mpitest mpitest.c
```
@@ -123,12 +123,12 @@ At the HPC system run as following:
marie@login$ srun -n 4 --ntasks-per-node 2 --time=00:10:00 singularity exec ubuntu_mpich.sif /opt/mpitest
```

-### CUDA + CuDNN + OpenMPI
+### CUDA + CuDNN + Open MPI

* Chosen CUDA version depends on installed driver of host
-* OpenMPI needs PMI for Slurm integration
-* OpenMPI needs CUDA for GPU copy-support
-* OpenMPI needs `ibverbs` library for Infiniband
+* Open MPI needs PMI for Slurm integration
+* Open MPI needs CUDA for GPU copy-support
+* Open MPI needs `ibverbs` library for InfiniBand
* `openmpi-mca-params.conf` required to avoid warnings on fork (OK on ZIH systems)
* Environment variables `SLURM_VERSION` and `OPENMPI_VERSION` can be set to choose different version
  when building the container
@@ -195,13 +195,13 @@ This image may be run with
singularity exec xeyes.sif xeyes.
```

-This works because all the magic is done by Singularity already like setting `$DISPLAY` to the outside
-display and mounting `$HOME` so `$HOME/.Xauthority` (X11 authentication cookie) is found. When you are
-using `--contain` or `--no-home` you have to set that cookie yourself or mount/copy it inside
-the container. Similar for `--cleanenv` you have to set `$DISPLAY`, e.g., via
+This works because all the magic is done by Singularity already like setting `${DISPLAY}` to the
+outside display and mounting `${HOME}` so `${HOME}/.Xauthority` (X11 authentication cookie) is
+found. When you are using `--contain` or `--no-home` you have to set that cookie yourself or
+mount/copy it inside the container. Similar for `--cleanenv` you have to set `${DISPLAY}`, e.g., via

```console
-export SINGULARITY_DISPLAY=$DISPLAY
+export SINGULARITY_DISPLAY=${DISPLAY}
```

When you run a container as root (via `sudo`) you may need to allow root for your local display
diff --git a/doc.zih.tu-dresden.de/docs/software/tensorboard.md b/doc.zih.tu-dresden.de/docs/software/tensorboard.md
index a397498edc467bee26a23e3c9e5c870bb016e514..fa134117baf0b44180e38ad036d3ce2b26b10233 100644
--- a/doc.zih.tu-dresden.de/docs/software/tensorboard.md
+++ b/doc.zih.tu-dresden.de/docs/software/tensorboard.md
@@ -61,9 +61,9 @@ Then, create a workspace for the event data, that should be visualized in Tensor
already have an event data directory, you can skip that step.

```console
-marie@compute$ ws_allocate -F scratch tensorboard_logdata 1
+marie@compute$ ws_allocate -F horse tensorboard_logdata 1
Info: creating workspace.
-/scratch/ws/1/marie-tensorboard_logdata
+/data/horse/ws/marie-tensorboard_logdata
[...]
```
@@ -72,7 +72,7 @@ accessible for TensorBoard. Please find further information on the official [Ten
Then, you can start TensorBoard and pass the directory of the event data:

```console
-marie@compute$ tensorboard --logdir /scratch/ws/1/marie-tensorboard_logdata --bind_all
+marie@compute$ tensorboard --logdir /data/horse/ws/marie-tensorboard_logdata --bind_all
[...]
TensorBoard 2.3.0 at http://taurusi8034.taurus.hrsk.tu-dresden.de:6006/
[...]
diff --git a/doc.zih.tu-dresden.de/docs/software/tensorflow.md b/doc.zih.tu-dresden.de/docs/software/tensorflow.md
index f11ecb3ac94e3cc65cf671815d813bacc9b9815f..ce83dce50ec3584575307302351ee6a89cf974c1 100644
--- a/doc.zih.tu-dresden.de/docs/software/tensorflow.md
+++ b/doc.zih.tu-dresden.de/docs/software/tensorflow.md
@@ -17,13 +17,17 @@ to find out, which TensorFlow modules are available on your partition.
On ZIH systems, TensorFlow 2 is the default module version.
For compatibility hints between TensorFlow 2 and TensorFlow 1, see the corresponding
[section below](#compatibility-tf2-and-tf1).

-We recommend using partitions `alpha` and/or `ml` when working with machine learning workflows
+We recommend using the clusters `alpha` and/or `power` when working with machine learning workflows
and the TensorFlow library. You can find detailed hardware specification in our
[Hardware](../jobs_and_resources/hardware_overview.md) documentation.

## TensorFlow Console

-On the partition `alpha`, load the module environment:
+_The module environments /hiera, /scs5, /classic and /ml, which originate from the old Taurus
+system, are currently under construction. This section will be updated accordingly once the
+redesign is complete._
+
+On the cluster `alpha`, load the module environment:

```console
marie@alpha$ module load modenv/scs5
@@ -47,17 +51,17 @@ marie@alpha$ module avail TensorFlow
[...]
```

-On the partition `ml` load the module environment:
+On the cluster `power`, load the module environment:

```console
-marie@ml$ module load modenv/ml
+marie@power$ module load modenv/ml
The following have been reloaded with a version change:
  1) modenv/scs5 => modenv/ml
```

This example shows how to install and start working with TensorFlow using the modules system.

```console
-marie@ml$ module load TensorFlow
+marie@power$ module load TensorFlow
Module TensorFlow/2.3.1-fosscuda-2019b-Python-3.7.4 and 47 dependencies loaded.
```
@@ -68,16 +72,16 @@ import TensorFlow:
!!! example

    ```console
-    marie@ml$ ws_allocate -F scratch python_virtual_environment 1
+    marie@power$ ws_allocate -F horse python_virtual_environment 1
    Info: creating workspace.
-    /scratch/ws/1/python_virtual_environment
+    /data/horse/ws/marie-python_virtual_environment
    [...]
-    marie@ml$ which python #check which python are you using
+    marie@power$ which python #check which python you are using
    /sw/installed/Python/3.7.2-GCCcore-8.2.0
-    marie@ml$ virtualenv --system-site-packages /scratch/ws/1/marie-python_virtual_environment/env
+    marie@power$ virtualenv --system-site-packages /data/horse/ws/marie-python_virtual_environment/env
    [...]
-    marie@ml$ source /scratch/ws/1/marie-python_virtual_environment/env/bin/activate
-    marie@ml$ python -c "import tensorflow as tf; print(tf.__version__)"
+    marie@power$ source /data/horse/ws/marie-python_virtual_environment/env/bin/activate
+    marie@power$ python -c "import tensorflow as tf; print(tf.__version__)"
    [...]
    2.3.1
    ```
@@ -105,7 +109,7 @@ Another option to use TensorFlow are containers. In the HPC domain, the
following example, we use the tensorflow-test in a Singularity container:

```console
-marie@ml$ singularity shell --nv /scratch/singularity/powerai-1.5.3-all-ubuntu16.04-py3.img
+marie@power$ singularity shell --nv /data/horse/singularity/powerai-1.5.3-all-ubuntu16.04-py3.img
Singularity>$ export PATH=/opt/anaconda3/bin:$PATH
Singularity>$ source activate /opt/anaconda3 #activate conda environment
(base) Singularity>$ . /opt/DL/tensorflow/bin/tensorflow-activate
@@ -156,5 +160,5 @@ marie@compute$ module spider Keras
[...]
```

-to find out, which Keras modules are available on your partition. TensorFlow should be automatically
+to find out which Keras modules are available on your cluster. TensorFlow should be automatically
loaded as a dependency. After loading the module, you can use Keras as usual.
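
Whether the loaded TensorFlow module (or the virtual environment from the example above) actually
sees the GPUs of the cluster can be verified with a one-liner analogous to the version check. This
is only a sketch, assuming a TensorFlow 2 module is already loaded and the command is run inside a
job allocation that includes GPUs; on a login node or in a CPU-only job the printed list is simply
empty:

```console
marie@power$ python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
[...]
```

The function `tf.config.list_physical_devices` is available in TensorFlow 2.1 and later, so it
works with the `TensorFlow/2.3.1` module shown above.
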
diff --git a/doc.zih.tu-dresden.de/docs/software/vampir.md b/doc.zih.tu-dresden.de/docs/software/vampir.md
index efbc0717fb00e1e889c16bc6ab18e8d7db51836b..fb57bd72f0c03f24864a1cb2f9b8d05fc6287520 100644
--- a/doc.zih.tu-dresden.de/docs/software/vampir.md
+++ b/doc.zih.tu-dresden.de/docs/software/vampir.md
@@ -77,7 +77,7 @@ This way, a job with a timelimit of 30 minutes and default resources is submitte
your needs. If not, please feel free to request a **customized job** running VampirServer, e.g.

```console
-marie@login$ vampirserver start --ntasks=8 -- --time=01:00:00 -- --mem-per-cpu=3000M --partition=romeo
+marie@login$ vampirserver start --ntasks=8 -- --time=01:00:00 -- --mem-per-cpu=3000M
Launching VampirServer...
Submitting slurm 01:00:00 minutes job (this might take a while)...
```
diff --git a/doc.zih.tu-dresden.de/docs/software/virtual_machines.md b/doc.zih.tu-dresden.de/docs/software/virtual_machines.md
index 69b5c3798b0d4f28309ddec24fbea486cfaf2460..0738f4fb4da398091cd373901dc94e8fa3fe5d94 100644
--- a/doc.zih.tu-dresden.de/docs/software/virtual_machines.md
+++ b/doc.zih.tu-dresden.de/docs/software/virtual_machines.md
@@ -7,9 +7,9 @@ The Singularity container setup requires a Linux machine with root privileges, t
and a compatible kernel. If some of these requirements cannot be fulfilled, then there is also the
option of using the provided virtual machines (VM) on ZIH systems.

-Currently, starting VMs is only possible on partitions `ml` and `hpdlf`. The VMs on the ML nodes are
-used to build singularity containers for the Power9 architecture and the HPDLF nodes to build
-Singularity containers for the x86 architecture.
+Currently, starting VMs is only possible on the cluster `power` (and `hpdlf`?). The VMs on the
+`power` nodes are used to build Singularity containers for the Power9 architecture and the HPDLF
+nodes to build Singularity containers for the x86 architecture.

## Create a Virtual Machine

@@ -18,7 +18,7 @@ The Slurm parameter `--cloud=kvm` specifies that a virtual machine should be sta
### On Power9 Architecture

```console
-marie@login$ srun --partition=ml --nodes=1 --cpus-per-task=4 --hint=nomultithread --cloud=kvm --pty /bin/bash
+marie@login.power$ srun --nodes=1 --cpus-per-task=4 --hint=nomultithread --cloud=kvm --pty /bin/bash
srun: job 6969616 queued and waiting for resources
srun: job 6969616 has been allocated resources
bash-4.2$
@@ -26,6 +26,8 @@ bash-4.2$

### On x86 Architecture

+_to be updated...._
+
```console
marie@login$ srun --partition=hpdlf --nodes=1 --cpus-per-task=4 --hint=nomultithread --cloud=kvm --pty /bin/bash
srun: job 2969732 queued and waiting for resources
@@ -48,7 +50,7 @@ bash-4.2$ cat /tmp/marie_2759627/activate
if ! grep -q -- "Key for the VM on the partition ml" "/home/marie/.ssh/authorized_keys" > /dev/null; then
  cat "/tmp/marie_2759627/kvm.pub" >> "/home/marie/.ssh/authorized_keys"
else
-  sed -i "s|.*Key for the VM on the partition ml.*|ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3siZfQ6vQ6PtXPG0RPZwtJXYYFY73TwGYgM6mhKoWHvg+ZzclbBWVU0OoU42B3Ddofld7TFE8sqkHM6M+9jh8u+pYH4rPZte0irw5/27yM73M93q1FyQLQ8Rbi2hurYl5gihCEqomda7NQVQUjdUNVc6fDAvF72giaoOxNYfvqAkw8lFyStpqTHSpcOIL7pm6f76Jx+DJg98sXAXkuf9QK8MurezYVj1qFMho570tY+83ukA04qQSMEY5QeZ+MJDhF0gh8NXjX/6+YQrdh8TklPgOCmcIOI8lwnPTUUieK109ndLsUFB5H0vKL27dA2LZ3ZK+XRCENdUbpdoG2Czz Key for the VM on the partition ml|" "/home/marie/.ssh/authorized_keys"
+  sed -i "s|.*Key for the VM on the cluster power.*|ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3siZfQ6vQ6PtXPG0RPZwtJXYYFY73TwGYgM6mhKoWHvg+ZzclbBWVU0OoU42B3Ddofld7TFE8sqkHM6M+9jh8u+pYH4rPZte0irw5/27yM73M93q1FyQLQ8Rbi2hurYl5gihCEqomda7NQVQUjdUNVc6fDAvF72giaoOxNYfvqAkw8lFyStpqTHSpcOIL7pm6f76Jx+DJg98sXAXkuf9QK8MurezYVj1qFMho570tY+83ukA04qQSMEY5QeZ+MJDhF0gh8NXjX/6+YQrdh8TklPgOCmcIOI8lwnPTUUieK109ndLsUFB5H0vKL27dA2LZ3ZK+XRCENdUbpdoG2Czz Key for the VM on the cluster power|" "/home/marie/.ssh/authorized_keys"
fi

ssh -i /tmp/marie_2759627/kvm root@192.168.0.6
@@ -79,8 +81,8 @@ rm -rf /tmp/sbuild-*
```

If that does not help, e.g., because one build alone needs more than the available disk memory, then
-it will be necessary to use the `tmp` folder on `scratch`. In order to ensure that the files in the
-temporary folder will be owned by root, it is necessary to set up an image inside `/scratch/tmp`
+it will be necessary to use the `tmp` folder on `/data/horse`. In order to ensure that the files in
+the temporary folder will be owned by root, it is necessary to set up an image inside `/data/horse/tmp`
instead of using it directly. E.g., to create a 25 GB of temporary memory image:

```console
diff --git a/doc.zih.tu-dresden.de/docs/software/visualization.md b/doc.zih.tu-dresden.de/docs/software/visualization.md
index 6c68e9a1a5891b92934ae600c57f23bbf1ebd0df..10ddec0b404c02b3a8b2d1c798be89d6cb66eb39 100644
--- a/doc.zih.tu-dresden.de/docs/software/visualization.md
+++ b/doc.zih.tu-dresden.de/docs/software/visualization.md
@@ -2,6 +2,8 @@

## ParaView

+_**-- currently under construction --**_
+
[ParaView](https://paraview.org) is an open-source, multi-platform data analysis and visualization
application. The ParaView package comprises different tools which are designed to meet interactive,
batch and in-situ workflows.
@@ -9,6 +11,10 @@ batch and in-situ workflows.

ParaView is available on ZIH systems from the [modules system](modules.md#module-environments). The
following command lists the available versions

+_The module environments /hiera, /scs5, /classic and /ml, which originate from the Taurus system,
+are currently under construction. This section will be updated accordingly once the redesign is
+complete._
+
```console
marie@login$ module avail ParaView
diff --git a/doc.zih.tu-dresden.de/docs/support/support.md b/doc.zih.tu-dresden.de/docs/support/support.md
index 3582ae264c4ead7f41acdf14e9877af91b8c2d57..13bf47db723831a42684cbf8b4229581aec8c9ec 100644
--- a/doc.zih.tu-dresden.de/docs/support/support.md
+++ b/doc.zih.tu-dresden.de/docs/support/support.md
@@ -3,7 +3,7 @@
## Create a Ticket

-The best way to ask for help send a message to
-[hpcsupport@zih.tu-dresden.de](mailto:hpcsupport@zih.tu-dresden.de) with a
+The best way to ask for help is to send a message to
+[hpc-support@tu-dresden.de](mailto:hpc-support@tu-dresden.de) with a
detailed description of your problem.
It should include:
diff --git a/doc.zih.tu-dresden.de/mkdocs.yml b/doc.zih.tu-dresden.de/mkdocs.yml
index 48a697afe5fede6bdadb7e4d033081e3d5a9f4bb..f21789ce2413bcde0f8a2d13f3525f568a77191c 100644
--- a/doc.zih.tu-dresden.de/mkdocs.yml
+++ b/doc.zih.tu-dresden.de/mkdocs.yml
@@ -26,13 +26,15 @@ nav:
  - Data Transfer:
    - Overview: data_transfer/overview.md
    - Transfer Data Inside ZIH Systems with Datamover: data_transfer/datamover.md
-    - Transfer Data to/from ZIH Systems via Export Nodes: data_transfer/export_nodes.md
+    - Transfer Data to/from ZIH Systems via Dataport Nodes: data_transfer/dataport_nodes.md
+    - Transfer Data to/from old ZIH Systems via Export Nodes: data_transfer/export_nodes.md
    - Transfer Data between ZIH Systems and Object Storage (S3): data_transfer/object_storage.md
  - Data Life Cycle Management:
    - Overview: data_lifecycle/overview.md
    - Filesystems:
      - Overview: data_lifecycle/file_systems.md
      - Permanent Filesystems: data_lifecycle/permanent.md
+      - Working Filesystems: data_lifecycle/working.md
      - Lustre: data_lifecycle/lustre.md
      - BeeGFS: data_lifecycle/beegfs.md
      - Warm Archive: data_lifecycle/warm_archive.md
@@ -54,7 +56,7 @@ nav:
    - Singularity for Power9 Architecture: software/singularity_power9.md
    - Virtual Machines: software/virtual_machines.md
    - GPU-accelerated Containers for Deep Learning (NGC Containers): software/ngc_containers.md
-    - CI/CD: software/cicd.md
+    - CI/CD on HPC: software/cicd.md
    - External Licenses: software/licenses.md
    - Computational Fluid Dynamics (CFD): software/cfd.md
    - Mathematics Applications: software/mathematics.md
@@ -101,20 +103,17 @@ nav:
    - Overview: jobs_and_resources/overview.md
    - HPC Resources:
      - Overview: jobs_and_resources/hardware_overview.md
-      - New Systems 2023:
-        - Architectural Re-Design 2023: jobs_and_resources/architecture_2023.md
-        - Overview 2023: jobs_and_resources/hardware_overview_2023.md
-        - Migration 2023: jobs_and_resources/migration_2023.md
-        - Tests 2023: jobs_and_resources/barnard_test.md
-      - AMD Rome Nodes: jobs_and_resources/rome_nodes.md
-      - NVMe Storage: jobs_and_resources/nvme_storage.md
-      - Alpha Centauri: jobs_and_resources/alpha_centauri.md
-      - HPE Superdome Flex: jobs_and_resources/sd_flex.md
+      - CPU Cluster Barnard: jobs_and_resources/barnard.md
+      - GPU Cluster Alpha Centauri: jobs_and_resources/alpha_centauri.md
+      - SMP Cluster Julia: jobs_and_resources/julia.md
+      - CPU Cluster Romeo: jobs_and_resources/romeo.md
+      - GPU Cluster Power9: jobs_and_resources/power9.md
      - NVIDIA Arm HPC Developer Kit: jobs_and_resources/arm_hpc_devkit.md
+      - NVMe Storage: jobs_and_resources/nvme_storage.md
    - Running Jobs:
      - Batch System Slurm: jobs_and_resources/slurm.md
      - Job Examples: jobs_and_resources/slurm_examples.md
-      - Partitions and Limits: jobs_and_resources/partitions_and_limits.md
+      - Slurm Resource Limits: jobs_and_resources/slurm_limits.md
      - Slurm Job File Generator: jobs_and_resources/slurm_generator.md
      - Checkpoint/Restart: jobs_and_resources/checkpoint_restart.md
      - Binding and Distribution of Tasks: jobs_and_resources/binding_and_distribution_of_tasks.md
@@ -131,8 +130,8 @@ nav:
      - Jupyter Installation: archive/install_jupyter.md
      - Profile Jobs with Slurm: archive/slurm_profiling.md
    - Switched-Off Systems:
-      - Overview 2022: archive/hardware_overview_2022.md
      - Overview: archive/systems_switched_off.md
+      - System Taurus: archive/system_taurus.md
      - Migration From Deimos to Atlas: archive/migrate_to_atlas.md
      - System Altix: archive/system_altix.md
      - System Atlas: archive/system_atlas.md
@@ -219,7 +218,6 @@ markdown_extensions:
  - pymdownx.tabbed:
      alternate_style: True
-  # - mkdocs-video

plugins:
  - search
@@ -252,7 +250,7 @@ extra:
  # second logo
  zih_homepage: https://tu-dresden.de/zih
  zih_name: "center for information services and high performance computing (ZIH)"
-  hpcsupport_mail: hpcsupport@zih.tu-dresden.de
+  hpcsupport_mail: hpc-support@tu-dresden.de

  # links in footer
diff --git a/doc.zih.tu-dresden.de/wordlist.aspell b/doc.zih.tu-dresden.de/wordlist.aspell
index e50cbf260de330ac63a967fa82511888f7cb3871..fa8f0930030fe52601a3992261312fddcdd4a20f 100644
--- a/doc.zih.tu-dresden.de/wordlist.aspell
+++ b/doc.zih.tu-dresden.de/wordlist.aspell
@@ -72,6 +72,7 @@ ddl
DDP
DDR
DFG
+diskless
distr
DistributedDataParallel
dmtcp
@@ -175,7 +176,7 @@ iDataPlex
ifort
ImageNet
img
-Infiniband
+InfiniBand
InfluxDB
init
inode
@@ -245,6 +246,7 @@ Mortem
mountpoint
mpi
Mpi
+MPI
mpicc
mpiCC
mpicxx
@@ -295,8 +297,6 @@ OpenBLAS
OpenCL
OpenGL
OpenMP
-openmpi
-OpenMPI
OpenSSH
Opteron
ORCA
@@ -326,8 +326,6 @@ png
PowerAI
PowerShell
ppc
-rapidgzip
-Rapidgzip
pre
Pre
preload
@@ -354,6 +352,8 @@ queue
quickstart
Quickstart
randint
+rapidgzip
+Rapidgzip
ratarmount
Ratarmount
ratarmountcore
@@ -409,6 +409,7 @@ Slurm
SLURMCluster
SMP
SMT
+sourceable
SparkExample
spawner
Speicherkomplex
@@ -453,6 +454,8 @@ transferability
Trition
TSS
TUD
+uncomment
+uncommenting
und
undistinguishable
unencrypted
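
The words added above only take effect once the spell checker is invoked with `wordlist.aspell` as
its personal dictionary. A minimal local check of a single page against this wordlist could look
like the following sketch, assuming `aspell` is installed and the command is run from within the
`doc.zih.tu-dresden.de` directory (the project's own check script may use additional options):

```console
marie@local$ aspell --lang=en --home-dir=. --personal=wordlist.aspell list < docs/software/tensorflow.md
```

Every word printed by this command is neither in the English dictionary nor in `wordlist.aspell`
and would therefore be reported by the spell check.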