From 2ca08ea8b65346c469b5fc50841bfc33632a77bd Mon Sep 17 00:00:00 2001
From: Jan Frenzel <jan.frenzel@tu-dresden.de>
Date: Mon, 15 Nov 2021 11:29:04 +0100
Subject: [PATCH] Added short description about how to use Flink. Resolves #218.

---
 doc.zih.tu-dresden.de/docs/software/flink.md | 178 +++++++++++++++++++
 doc.zih.tu-dresden.de/wordlist.aspell        |   1 +
 2 files changed, 179 insertions(+)
 create mode 100644 doc.zih.tu-dresden.de/docs/software/flink.md

diff --git a/doc.zih.tu-dresden.de/docs/software/flink.md b/doc.zih.tu-dresden.de/docs/software/flink.md
new file mode 100644
index 000000000..05fb403c2
--- /dev/null
+++ b/doc.zih.tu-dresden.de/docs/software/flink.md
@@ -0,0 +1,178 @@

# Apache Flink

[Apache Flink](https://flink.apache.org/) is a framework for processing and integrating Big Data.
It offers an API similar to that of [Apache Spark](big_data_frameworks_spark.md), but is more
appropriate for data stream processing. You can check module versions and availability with the
command:

```console
marie@login$ module avail Flink
```

**Prerequisites:** To work with Flink, you need [access](../access/ssh_login.md) to ZIH
systems and basic knowledge about data analysis and the batch system
[Slurm](../jobs_and_resources/slurm.md).

The usage of Big Data frameworks differs from that of other modules due to their master-worker
approach. That means, before an application can be started, additional steps are necessary.

The steps are:

1. Load the Flink software module
1. Configure the Flink cluster
1. Start a Flink cluster
1. Start the Flink application

Apache Flink can be used in [interactive](#interactive-jobs) and [batch](#batch-jobs) jobs as
described below.

## Interactive Jobs

### Default Configuration

Let us assume that two nodes should be used for the computation. Use an `srun` command similar to
the following to start an interactive session on the partition `haswell` with an allocation of
two nodes and 60 GB main memory exclusively for one hour:

```console
marie@login$ srun --partition=haswell --nodes=2 --mem=60g --exclusive --time=01:00:00 --pty bash -l
```

Once you have the shell, load Flink using the command:

```console
marie@compute$ module load Flink
```

Before the application can be started, the Flink cluster needs to be set up. To do this, configure
Flink first using the configuration template at `$FLINK_ROOT_DIR/conf`:

```console
marie@compute$ source framework-configure.sh flink $FLINK_ROOT_DIR/conf
```

This places the configuration in a directory called `cluster-conf-<JOB_ID>` in your `home`
directory, where `<JOB_ID>` stands for the ID of the Slurm job. After that, you can start Flink in
the usual way:

```console
marie@compute$ start-cluster.sh
```

The Flink processes should now be set up and you can start your application, e.g.:

```console
marie@compute$ flink run $FLINK_ROOT_DIR/examples/batch/KMeans.jar
```

!!! warning

    Do not delete the directory `cluster-conf-<JOB_ID>` while the job is still
    running. This leads to errors.

### Custom Configuration

The script `framework-configure.sh` is used to derive a configuration from a template. It takes two
parameters:

- The framework to set up (Spark, Flink, or Hadoop)
- A configuration template

Thus, you can modify the configuration by replacing the default configuration template with a
customized one.
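To see what you can customize, have a look at the files in the default template. The central file
is `flink-conf.yaml`, which holds, among other things, the JobManager and TaskManager memory
sizes. The following is a minimal sketch for inspecting the active (non-comment) settings; the
file and option names match the Flink 1.12 module used below and may differ in other versions:

```console
marie@login$ # Inspect the default template (file names as in Flink 1.12)
marie@login$ ls $FLINK_ROOT_DIR/conf
marie@login$ grep -vE '^[[:space:]]*(#|$)' $FLINK_ROOT_DIR/conf/flink-conf.yaml
```

A typical customization would be, for instance, raising `taskmanager.memory.process.size` in your
copy of the template so that it matches the memory you request from Slurm.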
Such a customized template is reusable for different jobs. You can start with a copy of the
default configuration ahead of your interactive session:

```console
marie@login$ cp -r $FLINK_ROOT_DIR/conf my-config-template
```

After you have changed `my-config-template`, you can use your new template in an interactive job
with:

```console
marie@compute$ source framework-configure.sh flink my-config-template
```

### Using Hadoop Distributed Filesystem (HDFS)

If you want to use Flink and HDFS together (or, in general, more than one framework), a scheme
similar to the following can be used:

```console
marie@compute$ module load Hadoop
marie@compute$ module load Flink
marie@compute$ source framework-configure.sh hadoop $HADOOP_ROOT_DIR/etc/hadoop
marie@compute$ source framework-configure.sh flink $FLINK_ROOT_DIR/conf
marie@compute$ start-dfs.sh
marie@compute$ start-cluster.sh
```

## Batch Jobs

Using `srun` directly on the shell blocks the shell and launches an interactive job. Apart from
short test runs, it is **recommended to launch your jobs in the background using batch jobs**. For
that, you can conveniently put the parameters directly into the job file and submit it via
`sbatch [options] <job file>`.

Please use a [batch job](../jobs_and_resources/slurm.md) with a configuration similar to the
example below:

??? example "flink.sbatch"
    ```bash
    #!/bin/bash -l
    #SBATCH --time=00:05:00
    #SBATCH --partition=haswell
    #SBATCH --nodes=2
    #SBATCH --exclusive
    #SBATCH --mem=60G
    #SBATCH --job-name="example-flink"

    ml Flink/1.12.3-Java-1.8.0_161-OpenJDK-Python-3.7.4-GCCcore-8.3.0

    function myExitHandler () {
        stop-cluster.sh
    }

    # Configuration
    . framework-configure.sh flink $FLINK_ROOT_DIR/conf

    # Register cleanup hook in case something goes wrong
    trap myExitHandler EXIT

    # Start the cluster
    start-cluster.sh

    # Run your application
    flink run $FLINK_ROOT_DIR/examples/batch/KMeans.jar

    # Stop the cluster
    stop-cluster.sh

    exit 0
    ```

!!! note

    You could work with simple examples in your home directory, but, according to the
    [storage concept](../data_lifecycle/overview.md), **please use
    [workspaces](../data_lifecycle/workspaces.md) for your study and work projects**. For this
    reason, when using JupyterHub, you have to use its advanced options and put `/` in the
    "Workspace scope" field.

## FAQ

Q: The command `source framework-configure.sh hadoop $HADOOP_ROOT_DIR/etc/hadoop` gives the
output `bash: framework-configure.sh: No such file or directory`. How can this be resolved?

A: Please try to re-submit or re-run the job, and if that does not help, re-login to the ZIH
system.

Q: There are a lot of errors and warnings during the setup of the session.

A: Please check that the setup works in principle by running a simple example as shown in this
documentation.

!!! help

    If you have questions or need advice, please use the contact form on
    [https://scads.ai/contact/](https://scads.ai/contact/) or contact the HPC support.

diff --git a/doc.zih.tu-dresden.de/wordlist.aspell b/doc.zih.tu-dresden.de/wordlist.aspell
index c4133b92a..443647e74 100644
--- a/doc.zih.tu-dresden.de/wordlist.aspell
+++ b/doc.zih.tu-dresden.de/wordlist.aspell
@@ -79,6 +79,7 @@ FFT
 FFTW
 filesystem
 filesystems
+flink
 Flink
 FMA
 foreach
--
GitLab