Commit ff90571b authored 3 years ago by Jan Frenzel
Added Flink where Spark is mentioned in big_data_frameworks.md.
Parent: a644af4c
2 merge requests: !423 Automated merge from preview to main, !419 Mentioned also Flink where Spark is mentioned
doc.zih.tu-dresden.de/docs/software/big_data_frameworks.md
7 additions, 6 deletions
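As a reading aid for the diff below: the changed text names one configuration template directory per framework (`$SPARK_HOME/conf` for Spark, `$FLINK_ROOT_DIR/conf` for Flink) and a per-job directory `cluster-conf-<JOB_ID>` in the home directory. The following is only a minimal sketch of how those paths could be resolved in a script. The function names `template_dir` and `cluster_conf_dir` are hypothetical and not part of the commit; the sketch assumes that the loaded module exports `SPARK_HOME` or `FLINK_ROOT_DIR` and that the job id is available in Slurm's `SLURM_JOB_ID` variable.

```bash
# Hypothetical helpers, not part of the commit: they only illustrate the
# paths mentioned in the changed documentation text.

# Print the configuration template directory for "spark" or "flink".
# Assumption: the corresponding module exports SPARK_HOME or FLINK_ROOT_DIR.
template_dir() {
    case "$1" in
        spark) _root="${SPARK_HOME:-}" ;;
        flink) _root="${FLINK_ROOT_DIR:-}" ;;
        *) echo "unknown framework: $1" >&2; return 1 ;;
    esac
    if [ -z "$_root" ]; then
        echo "module for $1 does not seem to be loaded" >&2
        return 1
    fi
    printf '%s/conf\n' "$_root"
}

# Print the per-job configuration directory described in the docs.
# Assumption: SLURM_JOB_ID is set, i.e. this runs inside a Slurm job.
cluster_conf_dir() {
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "SLURM_JOB_ID is not set; run this inside a Slurm job" >&2
        return 1
    fi
    printf '%s/cluster-conf-%s\n' "$HOME" "$SLURM_JOB_ID"
}
```

Inside an allocation, something like `ls "$(template_dir flink)"` would then list the Flink templates, provided the assumptions above hold on the system in question.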
...
...
@@ -38,8 +38,8 @@ The usage of Flink with Jupyter notebooks is currently under examination.
 ### Default Configuration
-The Spark module is available in both `scs5` and `ml` environments.
-Thus, Spark can be executed using different CPU architectures, e.g., Haswell and Power9.
+The Spark and Flink modules are available in both `scs5` and `ml` environments.
+Thus, Spark and Flink can be executed using different CPU architectures, e.g., Haswell and Power9.
 Let us assume that two nodes should be used for the computation. Use a `srun` command similar to
 the following to start an interactive session using the partition haswell. The following code
...
...
@@ -61,8 +61,9 @@ Once you have the shell, load desired Big Data framework using the command
 marie@compute$ module load Flink
 ```
-Before the application can be started, the Spark cluster needs to be set up. To do this, configure
-Spark first using configuration template at `$SPARK_HOME/conf`:
+Before the application can be started, the cluster with the allocated nodes needs to be set up. To
+do this, configure the cluster first using the configuration template at `$SPARK_HOME/conf` for
+Spark or `$FLINK_ROOT_DIR/conf` for Flink:
 === "Spark"
 ```console
...
...
@@ -74,7 +75,7 @@ Spark first using configuration template at `$SPARK_HOME/conf`:
 ```
 This places the configuration in a directory called `cluster-conf-<JOB_ID>` in your `home`
-directory, where `<JOB_ID>` stands for the id of the Slurm job. After that, you can start Spark in
+directory, where `<JOB_ID>` stands for the id of the Slurm job. After that, you can start in
 the usual way:
 === "Spark"
...
...
@@ -86,7 +87,7 @@ the usual way:
 marie@compute$ start-cluster.sh
 ```
-The Spark processes should now be set up and you can start your application, e. g.:
+The necessary background processes should now be set up and you can start your application, e. g.:
 === "Spark"
 ```console
...
...