and since the algorithm fills up nodes in consecutive order
(when not in dedicated mode), it will want to use the
remaining CPUs on hydra12 first. Because the user has requested
a maximum of two nodes, the allocation will put the job on
hold until hydra12 becomes available or, if backfill is enabled,
until hydra12's remaining CPU is allocated to another job,
which will allow the fourth job to get two dedicated nodes.</li>
<li><b>Note!</b> This problem is fixed in SLURM version 1.3.</li>
<li><b>Note!</b> If you want to specify <i>--max_????</i>, this
problem can be solved in the current implementation by requesting
the nodes in dedicated mode using <i>--exclusive</i> (see the sketch below).</li>
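<li>As an illustrative sketch of that workaround (not taken from the original
text), the request that ends up pending in the example below could instead be
submitted with <i>--exclusive</i> added, so that both nodes are allocated in
dedicated mode:
<pre>
# srun -N 2-2 -E 2:2 --exclusive sleep 100 &
</pre>
</li>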
<pre>
# srun sleep 100 &
# srun sleep 100 &
# srun sleep 100 &
# squeue
 JOBID PARTITION  NAME   USER ST  TIME NODES NODELIST(REASON)
  1132  allNodes sleep sballe  R  0:05     1 hydra12
  1133  allNodes sleep sballe  R  0:04     1 hydra12
  1134  allNodes sleep sballe  R  0:02     1 hydra12
# srun -N 2-2 -E 2:2 sleep 100 &
srun: job 1135 queued and waiting for resources
# squeue
 JOBID PARTITION  NAME   USER ST  TIME NODES NODELIST(REASON)
  1135  allNodes sleep sballe PD  0:00     2 (Resources)
  1132  allNodes sleep sballe  R  0:24     1 hydra12
  1133  allNodes sleep sballe  R  0:23     1 hydra12
  1134  allNodes sleep sballe  R  0:21     1 hydra12
</pre>
<li><b>Proposed solution:</b> Enhance the selection mechanism to go through
{node,socket,core,thread}-tuples to find an available match for the specific
request (a bounded knapsack problem).</li>
</ul>
<li><b>Binding of processes in the case when <i>--overcommit</i> is specified.</b></li>
<ul>
<li>In the current implementation (SLURM 1.2) we have chosen not
to bind processes that have been started with the <i>--overcommit</i>
flag. The reasoning behind this decision is that the Linux
scheduler will move non-bound processes to available resources
when jobs with process pinning enabled are started. The
non-bound jobs do not affect the bound jobs, but co-scheduled
non-bound jobs will affect each other's runtime. We have decided
that for now this is an adequate solution.</li>
</ul>
</ul>
</ol>
<ul>
<li>Slurm's default <b>select/linear</b> plugin uses a best-fit algorithm
based on the number of consecutive nodes. The same node allocation approach is
used in <b>select/cons_res</b> for consistency.</li>
<li>The <b>select/cons_res</b> plugin is enabled or disabled cluster-wide
(see the configuration sketch after this list).</li>
<li>In the case where <b>select/cons_res</b> is not enabled, the normal Slurm
behaviors are not disrupted. The only change users see when using the
<b>select/cons_res</b> plugin is that jobs can be co-scheduled on nodes when
resources permit it.
The rest of Slurm, such as srun and its switches (except <i>srun -s ...</i>),
is not affected by this plugin. From a user's point of view, Slurm works the
same way as when using the default node selection scheme.</li>
<li>The <i>--exclusive</i> srun switch allows users to request nodes in
exclusive mode even when consumable resources is enabled. See "man srun"
for details.</li>
<li>srun's <i>-s</i> or <i>--share</i> option is incompatible with the consumable
resource environment and will therefore not be honored. Since in this
environment nodes are shared by default, <i>--exclusive</i> allows users to
obtain dedicated nodes.</li>
</ul>
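<p>As noted in the list above, the plugin choice is a cluster-wide setting made
in <i>slurm.conf</i>. The fragment below is only an illustrative sketch using
current slurm.conf keywords; the exact parameters available depend on the Slurm
version, and all other required configuration is omitted:</p>
<pre>
# Illustrative slurm.conf fragment -- enables CPU-based consumable resources
SelectType=select/cons_res
SelectTypeParameters=CR_CPU    # treat individual CPUs as the consumable resource
</pre>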
<p class="footer"><a href="#top">top</a></p>
<p class="footer"><a href="#top">top</a></p>
...
<h2>Example of Node Allocations Using Consumable Resource Plugin</h2>
<p>The following example illustrates the different ways four jobs
are allocated across a cluster using (1) Slurm's default allocation
(exclusive mode) and (2) a processor as consumable resource
approach.</p>
<p>It is important to understand that the example listed below is a
contrived example and is only given here to illustrate the use of CPUs as
consumable resources. Job 2 and Job 3 call for the node count to equal
the processor count. This would typically be done because
that single task per node requires all of the memory, disk space, etc. The
bottleneck would not be processor count.</p>
<p>Trying to execute more than one job per node will almost certainly severely
impact a parallel job's performance.
The biggest beneficiaries of CPUs as consumable resources will be serial jobs or
jobs with modest parallelism, which can effectively share resources. On many
systems with larger processor counts, jobs typically run one fewer task than
there are processors to minimize interference by the kernel and daemons.</p>
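<p>For concreteness, nodes and their processor counts are declared in
<i>slurm.conf</i>. The fragment below is only a hypothetical sketch of how a
small cluster like the one described next could be defined; only linux01's
processor count is given in the text, and the partition name and the remaining
node definitions are assumptions:</p>
<pre>
# Hypothetical node/partition definitions for a small example cluster
NodeName=linux01 CPUs=2 State=UNKNOWN
# ... linux02 through linux04 defined in the same way, for 10 CPUs in total ...
PartitionName=linux Nodes=linux[01-04] Default=YES State=UP
</pre>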
<p>The example cluster is composed of 4 nodes (10 CPUs in total):</p>
<ul>
<li>linux01 (with 2 processors), </li>
...