<p>While other resource managers do exist, SLURM is unique in several
respects:
<ul>
<li><b>Scalability</b>: It is designed to operate in a heterogeneous cluster
with up to tens of millions of processors.</li>
<li><b>Performance</b>: It can accept 1,000 job submissions per second and
fully execute 500 simple jobs per second (depending upon hardware and system
configuration).</li>
<li><b>Free and Open Source</b>: Its source code is freely available under the
<a href="http://www.gnu.org/licenses/gpl.html">GNU General Public License</a>.</li>
<li><b>Portability</b>: Written in C with a GNU autoconf configuration engine.
While initially written for Linux, other UNIX-like operating systems have
proven easy porting targets.</li>
<li><b>Power Management</b>: Jobs can specify their desired CPU frequency, and
the power used by each job is recorded. Idle resources can be powered down until needed.</li>
<li><b>Fault Tolerant</b>: It is highly tolerant of system failures, including
failure of the node executing its control functions.</li>
<li><b>Flexibility</b>: A plugin mechanism exists to support various
interconnects, authentication mechanisms, schedulers, etc. These plugins are
documented and simple enough for the motivated end user to understand the
source and add functionality.</li>
<li><b>Resizable Jobs</b>: Jobs can grow and shrink on demand. Job submissions
can specify size and time limit ranges.</li>
<li><b>Status Jobs</b>: The status of running jobs can be queried at the level
of individual tasks to help identify load imbalances and other anomalies.</li>
</ul></p>
<p>SLURM provides resource management on many of the most powerful computers in
...
...
A 20-petaflop IBM BlueGene/Q system with 98,304 compute nodes and 1.6 million