diff --git a/doc.zih.tu-dresden.de/docs/software/virtual_machines.md b/doc.zih.tu-dresden.de/docs/software/virtual_machines.md
index c6c660d3c5ac052f3362ad950f6ad395e4420bdf..2527bbe91cbb735824598cc90311b88df2eab808 100644
--- a/doc.zih.tu-dresden.de/docs/software/virtual_machines.md
+++ b/doc.zih.tu-dresden.de/docs/software/virtual_machines.md
@@ -4,21 +4,21 @@ The following instructions are primarily aimed at users who want to build their
 [Singularity](containers.md) containers on ZIH systems.
 
 The Singularity container setup requires a Linux machine with root privileges, the same architecture
-and a compatible kernel. If some of these requirements can not be fulfilled, then there is
-also the option of using the provided virtual machines (VM) on ZIH systems.
+and a compatible kernel. If some of these requirements cannot be fulfilled, then there is also the
+option of using the provided virtual machines (VM) on ZIH systems.
 
-Currently, starting VMs is only possible on partitions `ml` and HPDLF. The VMs on the ML nodes are
+Currently, starting VMs is only possible on partitions `ml` and `hpdlf`. The VMs on the ML nodes are
 used to build singularity containers for the Power9 architecture and the HPDLF nodes to build
 Singularity containers for the x86 architecture.
 
 ## Create a Virtual Machine
 
-The `--cloud=kvm` Slurm parameter specifies that a virtual machine should be started.
+The Slurm parameter `--cloud=kvm` specifies that a virtual machine should be started.
 
 ### On Power9 Architecture
 
 ```console
-marie@login$ srun -p ml -N 1 -c 4 --hint=nomultithread --cloud=kvm --pty /bin/bash
+marie@login$ srun --partition=ml --nodes=1 --cpus-per-task=4 --hint=nomultithread --cloud=kvm --pty /bin/bash
 srun: job 6969616 queued and waiting for resources
 srun: job 6969616 has been allocated resources
 bash-4.2$
@@ -27,7 +27,7 @@ bash-4.2$
 ### On x86 Architecture
 
 ```console
-marie@login$ srun -p hpdlf -N 1 -c 4 --hint=nomultithread --cloud=kvm --pty /bin/bash
+marie@login$ srun --partition=hpdlf --nodes=1 --cpus-per-task=4 --hint=nomultithread --cloud=kvm --pty /bin/bash
 srun: job 2969732 queued and waiting for resources
 srun: job 2969732 has been allocated resources
 bash-4.2$
@@ -35,17 +35,17 @@ bash-4.2$
 
 ## Access a Virtual Machine
 
-Since the a security issue on ZIH systems, we restricted the filesystem permissions. Now you have to
-wait until the file `/tmp/${SLURM_JOB_USER}\_${SLURM_JOB_ID}/activate` is created, then you can try
+After a security issue on ZIH systems, we restricted the filesystem permissions. Now, you have to
+wait until the file `/tmp/${SLURM_JOB_USER}_${SLURM_JOB_ID}/activate` is created. Then, you can try
 to connect via `ssh` into the virtual machine, but it could be that the virtual machine needs some
-more seconds to boot and start the SSH daemon. So you may need to try the `ssh` command multiple
+more seconds to boot and accept the connection. So you may need to try the `ssh` command multiple
 times till it succeeds.
 
 ```console
 bash-4.2$ cat /tmp/marie_2759627/activate
 #!/bin/bash
 
-if ! grep -q -- "Key for the VM on the partition ml" "/home/rotscher/.ssh/authorized_keys" >& /dev/null; then
+if ! grep -q -- "Key for the VM on the partition ml" "/home/marie/.ssh/authorized_keys" > /dev/null; then
 cat "/tmp/marie_2759627/kvm.pub" >> "/home/marie/.ssh/authorized_keys"
 else
 sed -i "s|.*Key for the VM on the partition ml.*|ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC3siZfQ6vQ6PtXPG0RPZwtJXYYFY73TwGYgM6mhKoWHvg+ZzclbBWVU0OoU42B3Ddofld7TFE8sqkHM6M+9jh8u+pYH4rPZte0irw5/27yM73M93q1FyQLQ8Rbi2hurYl5gihCEqomda7NQVQUjdUNVc6fDAvF72giaoOxNYfvqAkw8lFyStpqTHSpcOIL7pm6f76Jx+DJg98sXAXkuf9QK8MurezYVj1qFMho570tY+83ukA04qQSMEY5QeZ+MJDhF0gh8NXjX/6+YQrdh8TklPgOCmcIOI8lwnPTUUieK109ndLsUFB5H0vKL27dA2LZ3ZK+XRCENdUbpdoG2Czz Key for the VM on the partition ml|" "/home/marie/.ssh/authorized_keys"
@@ -71,7 +71,7 @@ We provide [tools](virtual_machines_tools.md) to automate these steps. You may j
 The available space inside the VM can be queried with `df -h`. Currently the whole VM has 8 GB and
 with the installed operating system, 6.6 GB of available space.
 
-Sometimes the Singularity build might fail because of a disk out-of-memory error. In this case it
+Sometimes, the Singularity build might fail because of a disk out-of-memory error. In this case, it
 might be enough to delete leftover temporary files from Singularity:
 
 ```console
@@ -111,4 +111,4 @@ Bootstraps **shub** and **library** should be avoided.
 ### Transport Endpoint is not Connected
 
 This happens when the SSHFS mount gets unmounted because it is not very stable. It is sufficient to
-run `\~/mount_host_data.sh` again or just the SSHFS command inside that script.
+run `~/mount_host_data.sh` again or just the SSHFS command inside that script.