Commit 358dbbf1 authored by Moe Jette's avatar Moe Jette

update some web pages for slurm v2.0

parent c547fc53
@@ -54,7 +54,7 @@ Job_priority =
<!-------------------------------------------------------------------------->
<a name=age>
<h2>Age Factor</h2></a>
<P> The age factor represents the length of time a job has been sitting in the queue and eligible to run. In general, the longer a job waits in the queue, the larger its age factor grows. However, the age factor for a dependent job will not change while it waits for the job it depends on to complete. Also, the age factor of a queued job whose node or time limits exceed the cluster's current limits will not change.</P>
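<P> As a rough illustration (a sketch, not the plugin's actual source), the age factor can be modeled as the queue wait time normalized against a configurable cap, so that it grows toward 1.0; the cap name below is a hypothetical stand-in for the relevant configuration parameter:</P>
<pre>
# Hypothetical sketch of the age factor computation (Python); not the
# plugin's actual source.  "max_age_seconds" stands in for whatever
# wait-time cap the site configures.
def age_factor(wait_seconds, max_age_seconds, is_dependent=False):
    if is_dependent or max_age_seconds == 0:
        return 0.0          # dependent jobs accrue no age priority
    return min(wait_seconds / float(max_age_seconds), 1.0)
</pre>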
@@ -62,25 +62,25 @@ Job_priority =
<!-------------------------------------------------------------------------->
<a name=jobsize>
<h2>Job Size Factor</h2></a>
<P> The job size factor correlates to the number of nodes the job has requested. This factor can be configured to favor larger jobs or smaller jobs based on the state of the <i>PriorityFavorSmall</i> boolean in the slurm configuration file. When <i>PriorityFavorSmall</i> is 0, the larger the job, the greater its job size factor will be. A job that requests all the nodes on the machine will get a job size factor of 1.0. When the <i>PriorityFavorSmall</i> Boolean is 1, the single node job will receive the 1.0 job size factor.</P>
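<P> A minimal sketch of this behavior, assuming the factor scales linearly between the endpoints described above (the linear scale is an assumption for illustration):</P>
<pre>
# Hypothetical sketch of the job size factor (Python), following the
# PriorityFavorSmall behavior described above; the linear scaling is
# an assumption.
def job_size_factor(requested_nodes, total_nodes, favor_small):
    if favor_small:
        # a single-node job receives the full 1.0 factor
        return (total_nodes - requested_nodes + 1) / float(total_nodes)
    # otherwise a job requesting every node receives 1.0
    return requested_nodes / float(total_nodes)
</pre>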
<!-------------------------------------------------------------------------->
<a name=partition>
<h2>Partition Factor</h2></a>
<P> Each node partition can be assigned a factor from 0.0 to 1.0. The higher the number, the greater the job priority will be for jobs that are slated to run in this partition.</P>
<!-------------------------------------------------------------------------->
<a name=qos>
<h2>Quality of Service (QOS) Factor</h2></a>
<P> Each QOS can be assigned a factor from 0.0 to 1.0. The higher the number, the greater the job priority will be for jobs that request this QOS.</P>
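<P> The factors are combined into a single job priority as a weighted sum, per the <i>Job_priority</i> formula earlier on this page. A sketch with made-up weight values (they are arbitrary examples, not recommended settings):</P>
<pre>
# Hypothetical sketch of the weighted-sum priority (Python).  The
# weight values are arbitrary examples, not recommended settings.
def job_priority(factors, weights):
    # each factor lies in the range 0.0 to 1.0
    return sum(weights[name] * factors[name] for name in factors)

priority = job_priority(
    factors={"age": 0.5, "jobsize": 1.0, "partition": 1.0,
             "qos": 0.8, "fairshare": 0.3},
    weights={"age": 1000, "jobsize": 1000, "partition": 1000,
             "qos": 2000, "fairshare": 10000})
</pre>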
<!-------------------------------------------------------------------------->
<a name=fairshare>
<h2>Fair-share Factor</h2></a>
<!-------------------------------------------------------------------------->
<p style="text-align:center;">Last modified 9 February 2009</p>
@@ -4,65 +4,9 @@
<h2>Index</h2>
<ul>
<li><a href="#11">SLURM Version 1.1, May 2006</a></li>
<li><a href="#12">SLURM Version 1.2, February 2007</a></li>
<li><a href="#13">SLURM Version 1.3, March 2008</a></li>
<li><a href="#14">SLURM Version 1.4, May 2009</a></li>
<li><a href="#15">SLURM Version 1.5 and beyond</a></li>
</ul>
<h2><a name="11">Major Updates in SLURM Version 1.1</a></h2>
<p>SLURM Version 1.1 became available in May 2006.
Major enhancements include:
<ul>
<li>Communications enhancements, validated up to 16,384 node clusters.</li>
<li>File broadcast support (new <i>sbcast</i> command).</li>
<li>Support for distinct executables and arguments by task ID
(see <i>srun --multi-prog</i> option).</li>
<li>Support for binding tasks to the memory on a processor.</li>
<li>The configuration parameter <i>HeartbeatInterval</i> is defunct.
Half the values of configuration parameters <i>SlurmdTimeout</i> and
<i>SlurmctldTimeout</i> are used as the communication frequency for
the slurmctld and slurmd daemons respectively.</li>
<li>Support for PAM to control resource limits by user on each
compute node used. See <i>UsePAM</i> configuration parameter.</li>
<li>Support added for <i>xcpu</i> job launch.</li>
<li>Add support for 1/16 midplane BlueGene blocks.</li>
<li>Add support for overlapping BlueGene blocks.</li>
<li>Add support for dynamic BlueGene block creation on demand.</li>
<li>BlueGene node count specifications are now c-node counts
rather than base partition counts.</li>
</ul>
<h2><a name="12">Major Updates in SLURM Version 1.2</a></h2>
<p>SLURM Version 1.2 became available in February 2007.
Major enhancements include:
<ul>
<li>More complete support for resource management down to the core level
on a node.</li>
<li>Treat memory as a consumable resource on a compute node.</li>
<li>New graphical user interface provided, <i>sview</i>.</li>
<li>Added support for OS X.</li>
<li>Permit batch jobs to be requeued.</li>
<li>Expanded support of Moab and Maui schedulers.</li>
<li><i>Srun</i> command augmented by new commands for each operation:
<i>salloc</i>, <i>sbatch</i>, and <i>sattach</i>.</li>
<li>Sched/wiki plugin (for Moab and Maui Schedulers) rewritten to
provide vastly improved integration.</li>
<li>BlueGene plugin permits use of different boot images per job
specification.</li>
<li>Event trigger mechanism added with new tool <i>strigger</i>.</li>
<li>Added support for task binding to CPUs or memory via <i>cpuset</i>
mechanism.</li>
<li>Added support for configurable
<a href="power_save.html">power savings</a> on idle nodes.</li>
<li>Support for MPICH-MX, MPICH1/shmem and MPICH1/p4 added with
task launch directly from the <i>srun</i> command.</li>
<li>Wrappers available for common Torque/PBS commands
(<i>psub</i>, <i>pstat</i>, and <i>pbsnodes</i>).</li>
<li>Support for <a href="http://www-unix.globus.org/">Globus</a>
(using Torque/PBS command wrappers).</li>
<li>Wrapper available for <i>mpiexec</i> command.</li>
<li><a href="#20">SLURM Version 2.0, May 2009</a></li>
<li><a href="#21">SLURM Version 2.1 and beyond</a></li>
</ul>
<h2><a name="13">Major Updates in SLURM Version 1.3</a></h2>
@@ -85,34 +29,47 @@ option of using OpenSSL (default) or Munge (GPL).</li>
spawned tasks.</li>
<li>Support added for a much richer job dependency specification
including testing of exit codes and multiple dependencies (an example
follows this list).</li>
<li>Support added for BlueGene/P systems and HTC (High Throughput
Computing) mode.</li>
</ul>
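<p>For instance, a job can be made to start only after other jobs complete
successfully; the job IDs below are placeholders:</p>
<pre>
# Start only if jobs 1234 and 1235 both finish with exit code 0
sbatch --dependency=afterok:1234:1235 my_script.sh
</pre>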
<h2><a name="14">Major Updates in SLURM Version 1.4</a></h2>
<p>SLURM Version 1.4 is scheduled for released in May 2009.
<h2><a name="20">Major Updates in SLURM Version 2.0</a></h2>
<p>SLURM Version 2.0 is scheduled for released in May 2009.
Major enhancements include:
<ul>
<li>Sophisticated scheduling algorithms are available in a new plugin. Jobs
can be prioritized based upon their age, size and/or fair-share resource
allocation using hierarchical bank accounts.</li>
<li>An assortment of resource limits can be imposed upon individual users
and/or hierarchical bank accounts such as maximum job time limit, maximum
job size, and maximum number of running jobs.</li>
<li>Advanced reservations can be made to ensure resources will be available
when needed (see the example following this list).</li>
<li>Idle nodes can now be completely powered down and automatically
restarted when there is work available.</li>
<li>Jobs in higher priority partitions (queues) can automatically preempt jobs
in lower priority queues. The preempted jobs will automatically resume
execution upon completion of the higher priority job.</li>
<li>Specific cores are allocated to jobs and job steps in order to effectively
preempt or gang schedule jobs.</li>
<li>A new configuration parameter, <i>PrologSlurmctld</i>, can be used to
support the booting of different operating systems for each job.</li>
</ul>
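<p>As an example of the new reservation support, an advanced reservation
might be created with the <i>scontrol</i> command; all values below are
illustrative:</p>
<pre>
# Reserve 16 nodes for user alice for two hours (values are examples)
scontrol create reservation ReservationName=alice_res \
    StartTime=2009-06-01T08:00:00 Duration=120 \
    Users=alice NodeCnt=16
</pre>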
<h2><a name="15">Major Updates in SLURM Version 1.5 and beyond</a></h2>
<h2><a name="21">Major Updates in SLURM Version 2.1 and beyond</a></h2>
<p> Detailed plans for release dates and contents of future SLURM releases have
not been finalized. Anyone desiring to perform SLURM development should notify
<a href="mailto:slurm-dev@lists.llnl.gov">slurm-dev@lists.llnl.gov</a>
to coordinate activities. Future development plans include:
<ul>
<li>Optimized resource allocation based upon network topology (e.g.
hierarchical switches).</li>
<li>Support for BlueGene/Q systems.</li>
<li>Permit resource allocations (jobs) to change size.</li>
<li>Add Kerberos credential support including credential forwarding
and refresh.</li>
</ul>
<p style="text-align:center;">Last modified 13 November 2008</p>
<p style="text-align:center;">Last modified 9 February 2009</p>
<!--#include virtual="footer.txt"-->
@@ -85,6 +85,9 @@ FIFO (First In First Out, default), backfill, gang (time-slicing for parallel jobs)
The Maui Scheduler</a>, and
<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
Moab Cluster Suite</a>.
There is also a <a href="job_priority.html">job prioritization</a> plugin
available for use with the FIFO, backfill and gang schedulers. Jobs can be
prioritized by age, size, fair-share allocation, etc.</li>
<li><a href="switchplugins.html">Switch or interconnect</a>:
<a href="http://www.quadrics.com/">Quadrics</a>
@@ -170,6 +173,6 @@ PartitionName=DEFAULT MaxTime=UNLIMITED MaxNodes=4096
PartitionName=batch Nodes=lx[0041-9999]
</pre>
<p style="text-align:center;">Last modified 13 November 2008</p>
<p style="text-align:center;">Last modified 9 February 2009</p>
<!--#include virtual="footer.txt"-->
@@ -13,6 +13,9 @@ distributions using i386, ia64, and x86_64 architectures.</li>
<ul>
<li><b>BlueGene</b>&#151;SLURM support for IBM's BlueGene/L and BlueGene/P
systems has been thoroughly tested.</li>
<li><b>Cray XT</b>&#151;Much of the infrastructure needed to support a Cray XT
system is currently in SLURM. The interface to ALPS/BASIL remains to be done.
Please contact us if you would be interested in this work.</li>
<li><b>Ethernet</b>&#151;Ethernet requires no special support from SLURM and has
been thoroughly tested.</li>
<li><b>IBM Federation</b>&#151;SLURM support for IBM's Federation Switch
@@ -21,10 +24,11 @@ has been thoroughly tested.</li>
<li><b>Myrinet</b>&#151;Myrinet, MPICH-GM and MPICH-MX are supported.</li>
<li><b>Quadrics Elan</b>&#151;SLURM support for Quadrics Elan 3 and Elan 4 switches
is available in all versions of SLURM and has been thoroughly tested.</li>
<li><b>Sun Constellation</b>&#151;Resource allocation has been optimized
for the three-dimensional torus interconnect.</li>
<li><b>Other</b>&#151;SLURM ports to other systems will be gratefully accepted.</li>
</ul>
<p style="text-align:center;">Last modified 22 December 2008</p>
<p style="text-align:center;">Last modified 9 February 2009</p>
<!--#include virtual="footer.txt"-->
<!--#include virtual="header.txt"-->
<h1>SLURM: A Highly Scalable Resource Manager</h1>
<p>SLURM is an open-source resource manager designed for Linux clusters of
all sizes.
It provides three key functions.
First it allocates exclusive and/or non-exclusive access to resources
(computer nodes) to users for some duration of time so they can perform work.
Second, it provides a framework for starting, executing, and monitoring work
(typically a parallel job) on a set of allocated nodes.
Finally, it arbitrates contention for resources by managing a queue of
pending work. </p>
<p>SLURM's design is very modular with dozens of optional plugins.
In its simplest configuration, it can be installed and configured in a
couple of minutes (see <a href="http://www.linux-mag.com/id/7239/1/">
Caos NSA and Perceus: All-in-one Cluster Software Stack</a>
by Jeffrey B. Layton).
More complex configurations rely upon a
<a href="http://www.mysql.com/">MySQL</a> database for archiving
<a href="accounting.html">accounting</a> records, managing
<a href="resource_limits.html">resource limits</a> by user or bank account,
or supporting sophisticated <a href="job_priority.html">job prioritization</a>
algorithms (a sample accounting excerpt follows this paragraph).
SLURM also provides an Applications Programming Interface (API) for
integration with external schedulers such as
<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
The Maui Scheduler</a> or
<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
Moab Cluster Suite</a>.</p>
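<p>A minimal accounting excerpt from <i>slurm.conf</i> might look like the
following; the host name and the choice of storage plugin are assumptions
for illustration:</p>
<pre>
# Hypothetical accounting configuration (illustrative values)
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=dbhost
</pre>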
<p>While other resource managers do exist, SLURM is unique in several
respects:
<ul>
<li>Its source code is freely available under the
<a href="http://www.gnu.org/licenses/gpl.html">GNU General Public License</a>.</li>
<li>It is designed to operate in a heterogeneous cluster with up to 65,536 nodes.</li>
<li>It is portable; written in C with a GNU autoconf configuration engine.
While initially written for Linux, other UNIX-like operating systems should
be easy porting targets.</li>
<li>SLURM is highly tolerant of system failures, including failure of the node
executing its control functions.</li>
<li>A plugin mechanism exists to support various interconnects, authentication
mechanisms, schedulers, etc. These plugins are documented and simple enough
for the motivated end user to understand the source and add functionality.</li>
</ul></p>
<p>SLURM provides resource management on about 1000 computers worldwide,
@@ -49,6 +65,6 @@ with 10,240 PowerPC processors and a Myrinet switch</li>
<a href="http://www.clusterresources.com">Cluster Resources</a> and
<a href="http://www.sicortex.com">SiCortex</a>.</p>
<p style="text-align:center;">Last modified 29 November 2007</p>
<p style="text-align:center;">Last modified 9 February 2009</p>
<!--#include virtual="footer.txt"-->