ZIH / hpcsupport / hpc-compendium · Commits

Commit cc8c0ecd, authored 3 years ago by Jan Frenzel

> Replaced conda description with virtualenv description in big_data_frameworks_spark.md.

Parent: f60689bc
No related branches or tags found. Part of 4 merge requests: !333 (Draft: update NGC containers), !322 (Merge preview into main), !319 (Merge preview into main), !258 (Data Analytics restructuring).
Changes: 1 changed file, 17 additions, 23 deletions

`doc.zih.tu-dresden.de/docs/software/big_data_frameworks_spark.md` (+17 −23)
````diff
@@ -10,7 +10,7 @@ Big Data. These frameworks are also offered as software [modules](modules.md) on
 `scs5` partition. You can check module versions and availability with the command
 
 ```console
-marie@login$ module av Spark
+marie@login$ module avail Spark
 ```
 
 The **aim** of this page is to introduce users on how to start working with
````
````diff
@@ -94,7 +94,7 @@ The Spark processes should now be set up and you can start your
 application, e. g.:
 
 ```console
-marie@compute$ spark-submit --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.4.jar 1000
+marie@compute$ spark-submit --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.12-3.0.1.jar 1000
 ```
 
 !!! warning
````
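For orientation, the `SparkPi` example submitted above estimates π by Monte Carlo sampling: draw random points in the unit square and count the fraction that lands inside the quarter circle. A minimal pure-Python sketch of the same computation, no Spark required (the function name and sample count are illustrative, not taken from the Spark sources):

```python
import random


def estimate_pi(num_samples: int, seed: int = 42) -> float:
    """Monte Carlo estimate of pi (same idea as Spark's SparkPi example):
    the fraction of random points in the unit square that fall inside the
    quarter circle is pi/4, so multiply by 4 to recover pi."""
    rng = random.Random(seed)
    inside = sum(
        1
        for _ in range(num_samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / num_samples


print(estimate_pi(100_000))
```

SparkPi distributes exactly this counting loop over the cluster; the `1000` argument in the `spark-submit` line above is the number of partitions the sampling is split into.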
````diff
@@ -161,35 +161,29 @@ run your Jupyter notebook on HPC nodes (the preferable way).
 ### Preparation
 
 If you want to run Spark in Jupyter notebooks, you have to prepare it first. This is comparable
-to the [description for custom environments](../access/jupyterhub.md#conda-environment).
+to [normal python virtual environments](../software/python_virtual_environments.md#python-virtual-environment).
 You start with an allocation:
 
 ```console
 marie@login$ srun --pty -n 1 -c 2 --mem-per-cpu=2500 -t 01:00:00 bash -l
 ```
 
-When a node is allocated, install the required package with Anaconda:
+When a node is allocated, install the required packages:
 
 ```console
-marie@compute$ module load Anaconda3
 marie@compute$ cd
-marie@compute$ mkdir user-kernel
-marie@compute$ conda create --prefix $HOME/user-kernel/haswell-py3.6-spark python=3.6
-Collecting package metadata: done
-Solving environment: done [...]
-marie@compute$ conda activate $HOME/user-kernel/haswell-py3.6-spark
-marie@compute$ conda install ipykernel
-Collecting package metadata: done
-Solving environment: done [...]
-marie@compute$ python -m ipykernel install --user --name haswell-py3.6-spark --display-name="haswell-py3.6-spark"
-Installed kernelspec haswell-py3.6-spark in [...]
-marie@compute$ conda install -c conda-forge findspark
-marie@compute$ conda install pyspark
-marie@compute$ conda deactivate
+marie@compute$ mkdir jupyter-kernel
+marie@compute$ virtualenv --system-site-packages jupyter-kernel/env  #Create virtual environment
+[...]
+marie@compute$ source jupyter-kernel/env/bin/activate    #Activate virtual environment.
+marie@compute$ pip install ipykernel
+[...]
+marie@compute$ python -m ipykernel install --user --name haswell-py3.7-spark --display-name="haswell-py3.7-spark"
+Installed kernelspec haswell-py3.7-spark in [...]
+marie@compute$ pip install findspark
+marie@compute$ deactivate
 ```
 
 You are now ready to spawn a notebook with Spark.
````
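Background on the `python -m ipykernel install --user --name haswell-py3.7-spark` step above: it writes a kernelspec directory under `~/.local/share/jupyter/kernels/`, whose `kernel.json` points Jupyter at the Python interpreter inside the virtualenv. That is why the kernel later appears in the JupyterHub kernel list. A sketch of roughly what such a `kernel.json` contains (the interpreter path shown is a hypothetical example, not taken from the commit):

```python
import json

# Rough shape of the kernel.json that `python -m ipykernel install --user
# --name haswell-py3.7-spark` writes; the interpreter path is hypothetical.
kernel_spec = {
    "argv": [
        "/home/marie/jupyter-kernel/env/bin/python",  # Python inside the virtualenv
        "-m",
        "ipykernel_launcher",
        "-f",
        "{connection_file}",  # placeholder Jupyter fills in at launch time
    ],
    "display_name": "haswell-py3.7-spark",
    "language": "python",
}
print(json.dumps(kernel_spec, indent=2))
```

Because the `argv` interpreter lives in the virtualenv, any packages installed there with `pip` (here `ipykernel` and `findspark`) are importable from notebooks running on that kernel.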
````diff
@@ -203,7 +197,7 @@ to the field "Preload modules" and select one of the Spark modules.
 When your Jupyter instance is started, check whether the kernel that
 you created in the preparation phase (see above) is shown in the top
 right corner of the notebook. If it is not already selected, select the
-kernel `haswell-py3.6-spark`. Then, you can set up Spark. Since the setup
+kernel `haswell-py3.7-spark`. Then, you can set up Spark. Since the setup
 in the notebook requires more steps than in an interactive session, we
 have created an example notebook that you can use as a starting point
 for convenience: [SparkExample.ipynb](misc/SparkExample.ipynb)
````