Commit 195f0d40 authored by Moe Jette

Minor formatting changes. no change in content.

parent 98c45d4c
@@ -2,7 +2,7 @@
<h1>Large Cluster Administration Guide</h1>
</p>This document contains SLURM administrator information specifically
<p>This document contains SLURM administrator information specifically
for clusters containing 1,024 nodes or more.
Virtually all SLURM components have been validated (through emulation)
for clusters containing up to 16,384 compute nodes.
......
@@ -180,7 +180,7 @@ SelectType=select/cons_res
<h2>Limitation and future work</h2>
We are aware of several limitations with the current consumable
<p>We are aware of several limitations with the current consumable
resource plug-in and plan to enhance the plug-in as time permits and as
requests from users help us prioritize the features.
@@ -252,7 +252,7 @@ NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY
hydra[12-16] 5 allNodes* ... 4 2:2:1 2007
</pre>
Using select/cons_res plug-in with CR_Memory
<p>Using select/cons_res plug-in with CR_Memory</p>
<pre>
Example:
srun -N 5 -n 20 --job-mem=1000 sleep 100 & <-- running
@@ -266,7 +266,7 @@ squeue
1819 allNodes sleep sballe R 0:11 5 hydra[12-16]
</pre>
Using select/cons_res plug-in with CR_Socket_Memory (2 sockets/node)
<p>Using select/cons_res plug-in with CR_Socket_Memory (2 sockets/node)</p>
<pre>
Example 1:
srun -N 5 -n 5 --job-mem=1000 sleep 100 & <-- running
@@ -287,7 +287,7 @@ squeue
1830 allNodes sleep sballe R 0:07 5 hydra[12-16]
</pre>
Using select/cons_res plug-in with CR_CPU_Memory (4 CPUs/node)
<p>Using select/cons_res plug-in with CR_CPU_Memory (4 CPUs/node)</p>
<pre>
Example 1:
srun -N 5 -n 5 --job-mem=1000 sleep 100 & <-- running
......
@@ -58,7 +58,7 @@ use the command: <br>
<h2>System Administration</h2>
<p>Three unique components are required to use SLURM on an IBM system.
<p>Three unique components are required to use SLURM on an IBM system.</p>
<ol>
<li>The Federation switch plugin is required.
This component is packaged with the SLURM distribution.</li>
@@ -79,7 +79,7 @@ not at this time available for distribution.
Interested parties are welcome to pursue the possible distribution
of this library with IBM and SLURM developers.</li>
</ol>
Until this last issue is resolved, use of SLURM on an IBM AIX system
<p>Until this last issue is resolved, use of SLURM on an IBM AIX system
should not be viewed as a supported configuration (at least outside
of LLNL, which established a contract with IBM for this purpose).</p>
......
@@ -294,10 +294,10 @@ to -m block:cyclic with --cpu_bind=thread.</p>
<a name="srun_constraints">
<h3>New Constraints</h3></a>
To complement the existing SLURM job minimum constraints
<p>To complement the existing SLURM job minimum constraints
(CPUs, memory, temp disk),
constraint flags have also been added to allow a user to
specify a minimum number of sockets, cores, or threads:
specify a minimum number of sockets, cores, or threads:</p>
<PRE>
--mincpus=<i>n</i> minimum number of logical cpus per node
@@ -308,18 +308,18 @@ specify a minimum number of sockets, cores, or threads:
--tmp=<i>MB</i> minimum amount of temporary disk
</PRE>
These constraints are separate from the -N or -B allocation minimums.
<p>These constraints are separate from the -N or -B allocation minimums.
Using these constraints allows the user to exclude smaller nodes from
the allocation request.
the allocation request.</p>
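<p>As an illustrative sketch (the node count and thresholds below are
hypothetical, not taken from this document's examples), these minimums can
be combined with a node-count request so that only sufficiently large nodes
are considered for the allocation:</p>
<pre>
# hypothetical example: request 4 nodes, but only nodes providing at least
# 4 logical cpus and 1000 MB of temporary disk are eligible
srun -N 4 --mincpus=4 --tmp=1000 hostname
</pre>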
<p>See also 'srun --help' and 'man srun'</p>
<a name="srun_consres">
<h3>Memory as a Consumable Resource</h3></a>
The --job-mem flag specifies the maximum amount of memory in MB
<p>The --job-mem flag specifies the maximum amount of memory in MB
needed by the job per node. This flag is used to support the memory
as a consumable resource allocation strategy.
as a consumable resource allocation strategy.</p>
<PRE>
--job-mem=<i>MB</i> maximum amount of real memory per node
@@ -327,9 +327,9 @@ as a consumable resource allocation strategy.
--mem >= --job-mem if --mem is specified.
</PRE>
This flag allows the scheduler to co-allocate jobs on specific nodes
<p>This flag allows the scheduler to co-allocate jobs on specific nodes
given that their combined memory requirements do not exceed the amount
of memory on the nodes.
of memory on the nodes.</p>
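<p>As a hypothetical sketch (node counts and memory sizes are illustrative
only), two jobs each requesting --job-mem=1000 could be co-allocated on the
same nodes when those nodes provide roughly 2000 MB of real memory or more:</p>
<pre>
# both jobs can fit side by side on nodes with ~2007 MB of memory
srun -N 2 -n 4 --job-mem=1000 sleep 100 &
srun -N 2 -n 4 --job-mem=1000 sleep 100 &
</pre>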
<p>In order to use memory as a consumable resource, the select/cons_res
@@ -353,7 +353,7 @@ via <a href="configurator.html">configurator.html</a>.
<a name="srun_ntasks">
<h3>Task invocation as a function of logical processors</h3></a>
The <tt>--ntasks-per-{node,socket,core}=<i>ntasks</i></tt> flags
<p>The <tt>--ntasks-per-{node,socket,core}=<i>ntasks</i></tt> flags
allow the user to request that no more than <tt><i>ntasks</i></tt>
be invoked on each node, socket, or core.
This is similar to using <tt>--cpus-per-task=<i>ncpus</i></tt>
@@ -366,7 +366,7 @@ assigned to each node while allowing the OpenMP portion to utilize
all of the parallelism present in the node, or submitting a single
setup/cleanup/monitoring job to each node of a pre-existing
allocation as one step in a larger job script.
This can now be specified via the following flags:
This can now be specified via the following flags:</p>
<PRE>
--ntasks-per-node=<i>n</i> number of tasks to invoke on each node
@@ -374,9 +374,9 @@ This can now be specified via the following flags:
--ntasks-per-core=<i>n</i> number of tasks to invoke on each core
</PRE>
For example, given a cluster with nodes containing two sockets,
<p>For example, given a cluster with nodes containing two sockets,
each containing two cores, the following commands illustrate the
behavior of these flags:
behavior of these flags:</p>
<pre>
% srun -n 4 hostname
hydra12
@@ -410,7 +410,7 @@ hydra12
<a name="srun_hints">
<h3>Application hints</h3></a>
Different applications will have various levels of resource
<p>Different applications will have various levels of resource
requirements. Some applications tend to be computationally intensive
but require little to no inter-process communication. Some applications
will be memory bound, saturating the memory bandwidth of a processor
@@ -418,16 +418,16 @@ before exhausting the computational capabilities. Other applications
will be highly communication intensive causing processes to block
awaiting messages from other processes. Applications with these
different properties tend to run well on a multi-core system given
the right mappings.
the right mappings.</p>
For computationally intensive applications, all cores in a multi-core
<p>For computationally intensive applications, all cores in a multi-core
system would normally be used. For memory bound applications, only
using a single core on each CPU will result in the highest per
core memory bandwidth. For communication intensive applications,
using in-core multi-threading (e.g. hyperthreading, SMT, or TMT)
may also improve performance.
The following command line flags can be used to communicate these
types of application hints to the SLURM multi-core support:
types of application hints to the SLURM multi-core support:</p>
<PRE>
--hint= Bind tasks according to application hints
@@ -437,10 +437,10 @@ types of application hints to the SLURM multi-core support:
help show this help message
</PRE>
For example, given a cluster with nodes containing two sockets,
<p>For example, given a cluster with nodes containing two sockets,
each containing two cores, the following commands illustrate the
behavior of these flags. In the verbose --cpu_bind output, tasks
are described as 'hostname, task Global_ID Local_ID [PID]':
are described as 'hostname, task Global_ID Local_ID [PID]':</p>
<pre>
% srun -n 4 --hint=compute_bound --cpu_bind=verbose sleep 1
cpu_bind=MASK - hydra12, task 0 0 [15425]: mask 0x1 set
@@ -716,8 +716,8 @@ trivial and that it assumes that users are experts.</p>
<a name=utilities>
<h2>Extensions to sinfo/squeue/scontrol</h2></a>
Several extensions have also been made to the other SLURM utilities to
make working with multi-core/multi-threaded systems easier.
<p>Several extensions have also been made to the other SLURM utilities to
make working with multi-core/multi-threaded systems easier.</p>
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
<h3>sinfo</h3>
@@ -744,8 +744,8 @@ hydra[12-14] 3 parts* idle 8 2:4:1 2007 41447 1
hydra15 1 parts* idle 64 8:4:2 2007 41447 1 (null) none
</PRE>
For user specified output formats (-o/--format) and sorting (-S/--sort),
the following identifiers are available:
<p>For user specified output formats (-o/--format) and sorting (-S/--sort),
the following identifiers are available:</p>
<PRE>
%X Number of sockets per node
@@ -755,7 +755,7 @@ the following identifiers are available:
sockets, core, threads (S:C:T) per node
</PRE>
For example:
<p>For example:</p>
<PRE>
% sinfo -o '%9P %4c %8z %8X %8Y %8Z'
@@ -768,8 +768,8 @@ parts* 4 2:2:1 2 2 1
<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
<h3>squeue</h3>
For user specified output formats (-o/--format) and sorting (-S/--sort),
the following identifiers are available:
<p>For user specified output formats (-o/--format) and sorting (-S/--sort),
the following identifiers are available:</p>
<PRE>
%H Minimum number of sockets per node requested by the job.
@@ -786,7 +786,7 @@ the following identifiers are available:
sockets, cores, threads (S:C:T) per node
</PRE>
Below is an example squeue output after running 7 copies of:
<p>Below is an example squeue output after running 7 copies of:</p>
<UL>
<UL>
@@ -876,7 +876,7 @@ Constraints:
MinThreads=&lt;count&gt; Set the job's minimum number of threads per core
</PRE>
For example:
<p>For example:</p>
<PRE>
# scontrol update JobID=18 MinThreads=2
@@ -938,7 +938,7 @@ task/affinity plugin must be first enabled in slurm.conf:
TaskPlugin=task/affinity # enable task affinity
</PRE>
This setting is part of the task launch specific parameters:
<p>This setting is part of the task launch specific parameters:</p>
<PRE>
# o Define task launch specific parameters
......
@@ -185,7 +185,7 @@ $ srun -n4 -A # allocates four processors and spawns shell for job
&gt; mpirun -np 4 a.out
&gt; exit # exits shell spawned by initial srun command
</pre>
Note that any direct use of <span class="commandline">srun</span>
<p>Note that any direct use of <span class="commandline">srun</span>
will only launch one task per node when the LAM/MPI plugin is used.
To launch more than one task per node using the
<span class="commandline">srun</span> command, the <i>--mpi=none</i>
@@ -220,7 +220,7 @@ etc.
&gt; lamhalt
&gt; exit # exits shell spawned by initial srun command
</pre>
Note that any direct use of <span class="commandline">srun</span>
<p>Note that any direct use of <span class="commandline">srun</span>
will only launch one task per node when the LAM/MPI plugin is configured
as the default plugin. To launch more than one task per node using the
<span class="commandline">srun</span> command, the <i>--mpi=none</i>
......
@@ -127,7 +127,7 @@ Some macro definitions that may be used in building SLURM include:
<dt>with_ssl
<dd>Specifies SSL library installation location
</dl>
To build SLURM on our AIX system, the following .rpmmacros file is used:
<p>To build SLURM on our AIX system, the following .rpmmacros file is used:
<pre>
# .rpmmacros
# For AIX at LLNL
......