Commit ff451574 authored by Mark Grondona

o many updates to quickstart guide for admins

parent e6f199c5
......@@ -55,6 +55,52 @@ structure:Laboratories and Other Field Facilities">
<h3>Overview</h3>
Please see the <a href="quickstart.html">Quick Start User Guide</a> for a general
overview.
<h3>Building and Installing</h3>
<p>Basic instructions to build and install SLURM from source are shown below.
See the README and INSTALL files in the source distribution for more details.
</p>
<ol>
<li><span class="commandline">cd</span> to the directory containing the SLURM
source and type <span class="commandline">./configure</span> with appropriate
options.</li>
<li>Type <span class="commandline">make</span> to compile SLURM.</li>
<li> Type <span class="commandline">make install</span> to install the programs,
documentation, libraries, header files, etc.</li>
</ol>
<p>The most commonly used arguments to the <span class="commandline">configure</span>
command include: </p>
<p style="margin-left:.2in"><span class="commandline">--enable-debug</span><br>
Enable debugging of individual modules.</p>
<p style="margin-left:.2in"><span class="commandline">--prefix=<i>PREFIX</i></span><br>
Install architecture-independent files in PREFIX; default value is /usr/local.</p>
<p style="margin-left:.2in"><span class="commandline">--sysconfdir=<i>DIR</i></span><br>
Specify location of SLURM configuration file.</p>
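<p>As a rough sketch of the steps and options above, the commands can be collected
in a small shell script. The <span class="commandline">run</span> helper and the
/etc/slurm directory are assumptions made for illustration and are not part of SLURM.
By default the script only prints the commands; setting RUN=1 executes them from the
top of the source tree.</p>

```shell
#!/bin/sh
# Hypothetical install locations; adjust both for your site.
PREFIX="${PREFIX:-/usr/local}"          # default value used by configure
SYSCONFDIR="${SYSCONFDIR:-/etc/slurm}"  # assumed location for the SLURM configuration file

# run CMD...: print the command, or execute it when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$@"
    fi
}

# The three build steps, in order.
run ./configure --enable-debug --prefix="$PREFIX" --sysconfdir="$SYSCONFDIR"
run make
run make install
```

<p>Printing first makes it easy to review the chosen options before anything is
written under PREFIX.</p>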
<p>Optional SLURM plugins will be built automatically when the
<span class="commandline">configure</span> script detects that their
build requirements are present. Build dependencies for various plugins
are listed below.
</p>
<ul>
<li> <b>Munge</b>: The auth/munge plugin will be built if Chris Dunlap's Munge
library is installed. </li>
<li> <b>Authd</b>: The auth/authd plugin will be built and installed if
the libauth library and its dependency libe are installed.
</li>
<li> <b>QsNet</b>: QsNet support in the form of the switch/elan plugin requires
that the qsnetlibs package (from Quadrics) be installed along
with its development counterpart (i.e., the qsnetheaders
package). The switch/elan plugin also requires the
libelanhosts library and the /etc/elanhosts
configuration file. (See the elanhosts(5) man page in that
package for more details.)</li>
</ul>
Please see the <a href="download.html">Download</a> page for references to
the software required to build these plugins.
<p class="footer"><a href="#top">top</a></p>
<h3>Daemons</h3>
<p><b>slurmctld</b> is sometimes called the &quot;controller&quot; daemon. It
orchestrates SLURM activities, including queuing of jobs, monitoring node state,
......@@ -71,15 +117,27 @@ shell daemon to export control to SLURM. Because slurmd initiates and manages
user jobs, it must execute as the user root.</p>
<p><b>slurmctld</b> and/or <b>slurmd</b> should be initiated at node startup time
per the SLURM configuration.</p>
<h3>Infrastructure</h3>
<p>All communications between SLURM components are authenticated. The authentication
infrastructure used is specified in the SLURM configuration file and options include:
<h4>Authentication of SLURM communications</h4>
<p>All communications between SLURM components are authenticated. The
authentication infrastructure is provided by a dynamically loaded
plugin chosen at runtime via the <b>AuthType</b> keyword in the SLURM
configuration file. Currently available authentication types include
<a href="http://www.theether.org/authd/">authd</a>,
<a href="ftp://ftp.llnl.gov/pub/linux/munge/">munge</a>, and none.
The default authentication infrastructure is "none". This permits any user to execute
any job as another user. This may be fine for testing purposes, but certainly not for production
use. <b>Configure some AuthType value other than "none" if you want any security.</b>
We recommend the use of munge unless you are experience with authd.</p>
We recommend the use of Munge unless you are experienced with authd.
</p>
<p>While SLURM itself does not rely upon synchronized clocks on all nodes
of a cluster for proper operation, its underlying authentication mechanism
may have this requirement. For instance, if SLURM is making use of the
auth/munge plugin for communication, the clocks on all nodes will need to
be synchronized.
</p>
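<p>A minimal sketch of a pre-flight check for the <b>AuthType</b> setting follows.
The <span class="commandline">check_authtype</span> function is hypothetical and is
not shipped with SLURM. It assumes the keyword is written as
<i>AuthType=auth/munge</i> (or auth/authd, auth/none) on a line of its own, matches
the keyword case-sensitively, and looks only at the first occurrence.</p>

```shell
#!/bin/sh
# check_authtype FILE: report the AuthType found in a SLURM configuration file.
# Returns 1 when the keyword is missing (the default is "none") or set to auth/none.
check_authtype() {
    conf="$1"
    # Extract the value of the first uncommented AuthType line, if any.
    value=$(sed -n 's/^[[:space:]]*AuthType[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$conf" | head -n 1)
    case "$value" in
        ""|auth/none)
            echo "WARNING: AuthType is unset or auth/none; any user can run jobs as another user"
            return 1
            ;;
        *)
            echo "AuthType=$value"
            return 0
            ;;
    esac
}
```

<p>Calling the function with the path to your configuration file prints either the
configured value or the warning. If the result is auth/munge, remember that the
clocks on all nodes must also be synchronized.</p>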
<h4>MPI support</h4>
<p>Quadrics MPI works directly with SLURM on systems having Quadrics interconnects.
For non-Quadrics interconnect systems, <a href="http://www.lam-mpi.org/">LAM/MPI</a>
is the preferred MPI infrastructure. LAM/MPI uses the command <i>lamboot</i> to
......@@ -87,6 +145,7 @@ initiate job-specific daemons on each node using SLURM's <span class="commandlin
command. This places all MPI processes in a process-tree under the control of
the <b>slurmd</b> daemon. LAM/MPI version 7.1 or higher contains support for
SLURM.</p>
<h4>Scheduler support</h4>
<p>SLURM's default scheduler is FIFO (First-In First-Out). A backfill scheduler
plugin is also available. Backfill scheduling will initiate a lower-priority job
if doing so does not delay the expected initiation time of higher priority jobs;
......@@ -96,43 +155,23 @@ scheduling algorithms to control SLURM's workload. Motivated users can even deve
their own scheduler plugin if so desired. </p>
<p>SLURM uses the syslog function to record events. It uses a range of importance
levels for these messages. Be certain that your system's syslog functionality
is operational. </p>
<p>There is no necessity for synchronized clocks on the nodes. Events occur either
in real-time or based upon message traffic. However, synchronized clocks will
permit easier analysis of SLURM logs from multiple nodes.</p>
<p class="footer"><a href="#top">top</a></p>
is operational.
</p>
<h4>Corefile format</h4>
<p>SLURM is designed to support generating a variety of core file formats for
application codes that fail (see the <i>--core</i> option of the <i>srun</i>n
command). Of particular interest, LLNL has developed a light-weight core file
library to log traceback information. We expect to make this library available
to others at some point in the future.</p>
<h3>Building and Installing</h3>
<p>Basic instructions to build and install SLURM are shown below. See the INSTALL
file for more details. </p>
<ol>
<li><span class="commandline">cd</span> to the directory containing the SLURM
source and type <span class="commandline">./configure</span> with appropriate
options.</li>
<li>Type <span class="commandline">make</span> to compile SLURM.</li>
<li> Type <span class="commandline">make install</span> to install the programs,
documentation, libraries, header files, etc.</li>
</ol>
<p>The most commonly used arguments to the <span class="commandline">configure</span>
command include: </p>
<p style="margin-left:.2in"><span class="commandline">--enable-debug</span><br>
Enable debugging of individual modules.</p>
<p style="margin-left:.2in"><span class="commandline">--prefix=<i>PREFIX</i></span><br>
Install architecture-independent files in PREFIX; default value is /usr/local.</p>
<p style="margin-left:.2in"><span class="commandline">--sysconfdir=<i>DIR</i></span><br>
Specify location of SLURM configuration file.</p>
<p style="margin-left:.2in"><span class="commandline">--with-totalview</span><br>
Compile with support for the TotalView debugger
(see <a href="http://www.etnus.com/">http://www.etnus.com</a>).
The kernel patch in <b>etc/ptrace.patch</b> may also be required.</p>
application codes that fail (see the <i>--core</i> option of the <i>srun</i>
command). As of now, SLURM only supports a locally developed lightweight
corefile library which has not yet been released to the public. It is
expected that this library will be available in the near future.
</p>
<h4>Parallel debugger support</h4>
<p>SLURM exports information for parallel debuggers using the specification
detailed <a href="http://www-unix.mcs.anl.gov/mpi/mpi-debug/mpich-attach.txt">here</a>.
This is meant to be exploited by any parallel debugger (notably, TotalView),
and support is unconditionally compiled into SLURM code.
</p>
<p class="footer"><a href="#top">top</a></p>
<h3>Configuration</h3>
<p>The SLURM configuration file includes a wide variety of parameters.
This configuration file must be available on each node of the cluster. A full
......