Commit 2a2ce582, authored 12 years ago by Danny Auble:
backport cray 2.5 docs to 2.4
Parent: 51c4f7c8
1 changed file: doc/html/cray.shtml (+12 additions, −12 deletions)
 <!--#include virtual="header.txt"-->
-<h1>SLURM User and Administrator Guide for Cray systems</h1>
+<h1>SLURM User and Administrator Guide for Cray Systems</h1>
 <h2>User Guide</h2>
@@ -99,7 +99,7 @@ option or add <i>%_with_srun2aprun 1</i> to your <i>~/.rpmmacros</i> file.</p>
 Setting <i>--ntasks-per-node</i> to the number of cores per node yields the default per-CPU share
 minimum value.</p>
-<p>For all cases in between these extremes, set --mem=per_task_memory and</p>
+<p>For all cases in between these extremes, set --mem=per_task_node or --mem-per-cpu=memory_per_cpu (node CPU count and task count may differ) and</p>
 <pre>
 --ntasks-per-node=floor(node_memory / per_task_memory)
 </pre>
@@ -111,7 +111,7 @@ option or add <i>%_with_srun2aprun 1</i> to your <i>~/.rpmmacros</i> file.</p>
 #SBATCH --comment="requesting 7500MB per task on 32000MB/24-core nodes"
 #SBATCH --ntasks=64
 #SBATCH --ntasks-per-node=4
-#SBATCH --mem=7500
+#SBATCH --mem=30000
 </pre>
 <p>If you would like to fine-tune the memory limit of your application, you can set the same parameters in
 a salloc session and then check directly, using</p>
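A quick sanity check on the new value in this hunk, assuming only the figures quoted in the #SBATCH comment line (32000MB per 24-core node, 7500MB per task) and the formula from the previous hunk:

    --ntasks-per-node = floor(32000 / 7500) = 4
    --mem             = 4 tasks/node * 7500MB = 30000MB per node
    nodes used        = 64 tasks / 4 tasks per node = 16

so the example switches --mem from the per-task figure (7500) to the corresponding per-node figure (30000).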
@@ -128,7 +128,7 @@ option or add <i>%_with_srun2aprun 1</i> to your <i>~/.rpmmacros</i> file.</p>
 on CLE 3.x systems for details.</p>
 <h3>Node ordering options</h3>
-<p>SLURM honours the node ordering policy set for Cray's Application Level Placement Scheduler (ALPS). Node
+<p>SLURM honors the node ordering policy set for Cray's Application Level Placement Scheduler (ALPS). Node
 ordering is a configurable system option (ALPS_NIDORDER in /etc/sysconfig/alps). The current
 setting is reported by '<i>apstat -svv</i>' (look for the line starting with "nid ordering option") and
 can not be changed at runtime. The resulting, effective node ordering is revealed by '<i>apstat -no</i>'
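For reference, the two apstat invocations quoted above can be used as follows on a login node; this is a minimal sketch, and the grep pattern is only illustrative:

login: # apstat -svv | grep -i "nid ordering"
        (prints the "nid ordering option" line, i.e. the configured ALPS_NIDORDER policy)
login: # apstat -no
        (lists the resulting, effective node ordering)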
@@ -156,7 +156,7 @@ option to any of the commands used to create a job allocation/reservation.</p>
 nodes (typically used for pre- or post-processing functionality) then submit a
 batch job with a node count specification of zero.</p>
 <pre>
-sbatch -N0 preprocess.bash
+sbatch -N0 pre_process.bash
 </pre>
 <p><b>Note</b>: Support for Cray job allocations with zero compute nodes was
 added to SLURM version 2.4. Earlier versions of SLURM will return an error for
@@ -213,7 +213,7 @@ privileges will be required to install these files.</p>
 <p>The build is done on a normal service node, where you like
 (e.g. <i>/ufs/slurm/build</i> would work).
 Most scripts check for the environment variable LIBROOT.
 You can either edit the scripts or export this variable. Easiest way:</p>
 <pre>
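The body of this <pre> block is not shown in the excerpt above; as a minimal sketch of the "export" route it describes, with /ufs/slurm/build used only because it is the example build path mentioned in the same paragraph:

login: # export LIBROOT=/ufs/slurm/build
        (any build directory works; the variable just needs to be set before the build scripts run)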
@@ -235,7 +235,7 @@ login: # scp ~/slurm/contribs/cray/opt_modulefiles_slurm root@boot:/rr/current/s
 <h3>Build and install Munge</h3>
 <p>Note the Munge installation process on Cray systems differs
 somewhat from that described in the
 <a href="http://code.google.com/p/munge/wiki/InstallationGuide">
 MUNGE Installation Guide</a>.</p>
@@ -252,7 +252,7 @@ login: # curl -O http://munge.googlecode.com/files/munge-0.5.10.tar.bz2
 login: # cp munge-0.5.10.tar.bz2 ${LIBROOT}/munge/zip
 login: # chmod u+x ${LIBROOT}/munge/zip/munge_build_script.sh
 login: # ${LIBROOT}/munge/zip/munge_build_script.sh
-(generates lots of output and enerates a tar-ball called
+(generates lots of output and generates a tar-ball called
 $LIBROOT/munge_build-.*YYYY-MM-DD.tar.gz)
 login: # scp munge_build-2011-07-12.tar.gz root@boot:/rr/current/software
 </pre>
@@ -433,7 +433,7 @@ parameter in ALPS' <i>nodehealth.conf</i> file.</p>
 <p>You need to specify the appropriate resource selection plugin (the
 <i>SelectType</i> option in SLURM's <i>slurm.conf</i> configuration file).
 Configure <i>SelectType</i> to <i>select/cray</i> The <i>select/cray</i>
 plugin provides an interface to ALPS plus issues calls to the
 <i>select/linear</i>, which selects resources for jobs using a best-fit
 algorithm to allocate whole nodes to jobs (rather than individual sockets,
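A minimal slurm.conf fragment for the setting described above; this is a sketch with all other required options omitted:

# slurm.conf
# Cray-specific selector: interfaces to ALPS and calls into select/linear,
# so whole nodes are allocated rather than individual sockets, cores or threads.
SelectType=select/cray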
@@ -465,7 +465,7 @@ TopologyPlugin=topology/none
 SchedulerType=sched/backfill
 # Node selection: use the special-purpose "select/cray" plugin.
-# Internally this uses select/linar, i.e. nodes are always allocated
+# Internally this uses select/linear, i.e. nodes are always allocated
 # in units of nodes (other allocation is currently not possible, since
 # ALPS does not yet allow to run more than 1 executable on the same
 # node, see aprun(1), section LIMITATIONS).
@@ -530,7 +530,7 @@ NodeName=DEFAULT Gres=gpu_mem:2g
 NodeName=nid00[002-013,018-159,162-173,178-189]
 # Frontend nodes: these should not be available to user logins, but
 # have all filesystems mounted that are also
 # available on a login node (/scratch, /home, ...).
 FrontendName=palu[7-9]
@@ -691,6 +691,6 @@ allocation.</p>
 <p class="footer"><a href="#top">top</a></p>
-<p style="text-align:center;">Last modified 27 April 2012</p></td>
+<p style="text-align:center;">Last modified 25 July 2012</p></td>
 <!--#include virtual="footer.txt"-->