Commit 2a2ce582 authored by Danny Auble

backport cray 2.5 docs to 2.4

parent 51c4f7c8
 <!--#include virtual="header.txt"-->
-<h1>SLURM User and Administrator Guide for Cray systems</h1>
+<h1>SLURM User and Administrator Guide for Cray Systems</h1>
 <h2>User Guide</h2>
@@ -99,7 +99,7 @@ option or add <i>%_with_srun2aprun 1</i> to your <i>~/.rpmmacros</i> file.</p>
 Setting <i>--ntasks-per-node</i> to the number of cores per node yields the default per-CPU share
 minimum value.</p>
-<p>For all cases in between these extremes, set --mem=per_task_memory and</p>
+<p>For all cases in between these extremes, set --mem=per_task_node or --mem-per-cpu=memory_per_cpu (node CPU count and task count may differ) and</p>
 <pre>
 --ntasks-per-node=floor(node_memory / per_task_memory)
 </pre>
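With the 32000 MB, 24-core nodes used in the example below and a per-task requirement of roughly 7500 MB, this formula gives floor(32000 / 7500) = 4, which is why the sample batch script requests --ntasks-per-node=4; the 4 x 7500 MB = 30000 MB per node also matches the --mem=30000 figure in the updated example.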
@@ -111,7 +111,7 @@ option or add <i>%_with_srun2aprun 1</i> to your <i>~/.rpmmacros</i> file.</p>
 #SBATCH --comment="requesting 7500MB per task on 32000MB/24-core nodes"
 #SBATCH --ntasks=64
 #SBATCH --ntasks-per-node=4
-#SBATCH --mem=7500
+#SBATCH --mem=30000
 </pre>
 <p>If you would like to fine-tune the memory limit of your application, you can set the same parameters in
 a salloc session and then check directly, using</p>
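A sketch of that salloc workflow, with values mirroring the batch example above (the numbers are illustrative, not a recommended setting):

    salloc --ntasks=64 --ntasks-per-node=4 --mem=30000

The limits granted to the interactive allocation can then be inspected before they are copied into a production batch script.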
@@ -128,7 +128,7 @@ option or add <i>%_with_srun2aprun 1</i> to your <i>~/.rpmmacros</i> file.</p>
 on CLE 3.x systems for details.</p>
 <h3>Node ordering options</h3>
-<p>SLURM honours the node ordering policy set for Cray's Application Level Placement Scheduler (ALPS). Node
+<p>SLURM honors the node ordering policy set for Cray's Application Level Placement Scheduler (ALPS). Node
 ordering is a configurable system option (ALPS_NIDORDER in /etc/sysconfig/alps). The current
 setting is reported by '<i>apstat -svv</i>' (look for the line starting with "nid ordering option") and
 can not be changed at runtime. The resulting, effective node ordering is revealed by '<i>apstat -no</i>'
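For example, assuming a standard grep is available on the login node, the current ordering policy can be picked out with:

    apstat -svv | grep "nid ordering"

while apstat -no, as noted above, lists the resulting effective node order.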
@@ -156,7 +156,7 @@ option to any of the commands used to create a job allocation/reservation.</p>
 nodes (typically used for pre- or post-processing functionality) then submit a
 batch job with a node count specification of zero.</p>
 <pre>
-sbatch -N0 preprocess.bash
+sbatch -N0 pre_process.bash
 </pre>
 <p><b>Note</b>: Support for Cray job allocations with zero compute nodes was
 added to SLURM version 2.4. Earlier versions of SLURM will return an error for
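A minimal sketch of what such a zero-node script could contain (the file name, paths and staging command are purely illustrative):

    #!/bin/bash
    #SBATCH --comment="pre-processing only, no compute nodes requested"
    # Runs on a service/login node: stage input data for a later compute job.
    cp /home/user/input.dat /scratch/user/input.dat

It would then be submitted with zero compute nodes exactly as shown above, i.e. sbatch -N0 pre_process.bash.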
@@ -213,7 +213,7 @@ privileges will be required to install these files.</p>
 <p>The build is done on a normal service node, where you like
 (e.g. <i>/ufs/slurm/build</i> would work).
 Most scripts check for the environment variable LIBROOT.
 You can either edit the scripts or export this variable. Easiest way:</p>
 <pre>
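Presumably the export in question looks like the following, using the build directory suggested above (the path is only an example):

    export LIBROOT=/ufs/slurm/build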
@@ -235,7 +235,7 @@ login: # scp ~/slurm/contribs/cray/opt_modulefiles_slurm root@boot:/rr/current/s
 <h3>Build and install Munge</h3>
 <p>Note the Munge installation process on Cray systems differs
 somewhat from that described in the
 <a href="http://code.google.com/p/munge/wiki/InstallationGuide">
 MUNGE Installation Guide</a>.</p>
@@ -252,7 +252,7 @@ login: # curl -O http://munge.googlecode.com/files/munge-0.5.10.tar.bz2
 login: # cp munge-0.5.10.tar.bz2 ${LIBROOT}/munge/zip
 login: # chmod u+x ${LIBROOT}/munge/zip/munge_build_script.sh
 login: # ${LIBROOT}/munge/zip/munge_build_script.sh
-(generates lots of output and enerates a tar-ball called
+(generates lots of output and generates a tar-ball called
 $LIBROOT/munge_build-.*YYYY-MM-DD.tar.gz)
 login: # scp munge_build-2011-07-12.tar.gz root@boot:/rr/current/software
 </pre>
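Once munged is installed and running, a common sanity check (not Cray-specific) is a local credential round-trip:

    munge -n | unmunge

which should decode the credential and report a successful status.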
@@ -433,7 +433,7 @@ parameter in ALPS' <i>nodehealth.conf</i> file.</p>
 <p>You need to specify the appropriate resource selection plugin (the
 <i>SelectType</i> option in SLURM's <i>slurm.conf</i> configuration file).
 Configure <i>SelectType</i> to <i>select/cray</i> The <i>select/cray</i>
 plugin provides an interface to ALPS plus issues calls to the
 <i>select/linear</i>, which selects resources for jobs using a best-fit
 algorithm to allocate whole nodes to jobs (rather than individual sockets,
@@ -465,7 +465,7 @@ TopologyPlugin=topology/none
 SchedulerType=sched/backfill
 # Node selection: use the special-purpose "select/cray" plugin.
-# Internally this uses select/linar, i.e. nodes are always allocated
+# Internally this uses select/linear, i.e. nodes are always allocated
 # in units of nodes (other allocation is currently not possible, since
 # ALPS does not yet allow to run more than 1 executable on the same
 # node, see aprun(1), section LIMITATIONS).
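Condensed to just the parameters touched on in this section, the relevant slurm.conf lines amount to a sketch like the following (not a complete configuration):

    # Resource selection and scheduling for a Cray/ALPS system
    SelectType=select/cray
    SchedulerType=sched/backfill
    TopologyPlugin=topology/none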
@@ -530,7 +530,7 @@ NodeName=DEFAULT Gres=gpu_mem:2g
 NodeName=nid00[002-013,018-159,162-173,178-189]
 # Frontend nodes: these should not be available to user logins, but
 # have all filesystems mounted that are also
 # available on a login node (/scratch, /home, ...).
 FrontendName=palu[7-9]
@@ -691,6 +691,6 @@ allocation.</p>
 <p class="footer"><a href="#top">top</a></p>
-<p style="text-align:center;">Last modified 27 April 2012</p></td>
+<p style="text-align:center;">Last modified 25 July 2012</p></td>
 <!--#include virtual="footer.txt"-->