Commit 60627207
authored 22 years ago by Moe Jette
Moved remaining information from admin.guide into quick.start.guide.
parent 4dabf832
Showing 2 changed files with 13 additions and 697 deletions:
doc/html/admin.guide.html (0 additions, 696 deletions)
doc/html/quickstart.html (13 additions, 1 deletion)
doc/html/admin.guide.html deleted 100644 → 0 (696 deletions)
<html>
<head>
<title>
SLURM Administrator's Guide
</title>
</head>
<body>
<h1>
SLURM Administrator's Guide
</h1>
<h2>
Overview
</h2>
Simple Linux Utility for Resource Management (SLURM) is an open source,
fault-tolerant, and highly scalable cluster management and job
scheduling system for Linux clusters having
thousands of nodes. Components include machine status, partition
management, job management, scheduling and stream copy modules.
<h2>
Build Information
</h2>
TBD
Include PKI build instructions.
<h2>
Configuration
</h2>
There is a single SLURM configuration file containing:
overall SLURM options, node configurations, and partition configuration.
This file is located at "/etc/slurm.conf" by default.
The file location can be modified at system build time using the
DEFAULT_SLURM_CONF parameter.
The overall SLURM configuration options specify the control and backup
control machines.
The locations of daemons, state information storage, and other details
are specified at build time.
See the
<a
href=
"#Build"
>
Build Parameters
</a>
section for details.
The node configuration tells SLURM which nodes it is to manage as well as
their expected configuration.
The partition configuration permits you to define sets (or partitions)
of nodes and establish distinct job limits or access control for them.
Configuration information may be read or updated using SLURM APIs.
This configuration file or a copy of it must be accessible on every computer under
SLURM management.
<p>
The following parameters may be specified:
<dl>
<dt>
BackupController
<dd>
The name of the machine where SLURM control functions are to be
executed in the event that ControlMachine fails. This node
may also be used as a compute server if so desired. It will come into service
as a controller only upon the failure of ControlMachine and will revert
to a "standby" mode when the ControlMachine becomes available once again.
This should be a node name without the full domain name (e.g. "lx0002").
While not essential, it is highly recommended that you specify a backup controller.
<dt>
ControlMachine
<dd>
The name of the machine where SLURM control functions are executed.
This should be a node name without the full domain name (e.g. "lx0001").
This value must be specified.
<dt>
Epilog
<dd>
Fully qualified pathname of a program to execute as user root on every
node when a user's job completes (e.g. "/usr/local/slurm/epilog"). This may
be used to purge files, disable user login, etc. By default there is no epilog.
<dt>
FastSchedule
<dd>
If set to 1, then consider the configuration of each node to be that
specified in the configuration file. If set to 0, then base scheduling
decisions upon the actual configuration of each node. If the number of
node configuration entries in the configuration file is significantly
lower than the number of nodes, setting FastSchedule to 1 will permit
much faster scheduling decisions to be made. The default value is 1.
<dt>
FirstJobId
<dd>
The job id to be used for the first job submitted to SLURM without a
specific requested value. Job id values generated will be incremented by 1
for each subsequent job. This may be used to provide a meta-scheduler
with a job id space which is disjoint from the interactive jobs.
The use of node names containing a numeric suffix will provide faster
operation for larger clusters. The default value is 10.
<dt>
HashBase
<dd>
If the node names include a sequence number, this value defines the
base to be used in building a hash table based upon node name. Values of 8
and 10 are recognized for octal and decimal sequence numbers respectively.
The value of zero is also recognized for node names lacking a sequence number.
The default value is 10.
<dt>
HeartbeatInterval
<dd>
The interval, in seconds, at which the SLURM controller tests the
status of other daemons. The default value is 30 seconds.
<dt>
InactiveLimit
<dd>
The interval, in seconds, a job is permitted to be inactive (with
no active job steps) before it is terminated. This permits forgotten
jobs to be purged in a timely fashion without waiting for their time
limit to be reached. The default value is unlimited (zero).
<dt>
JobCredentialPrivateKey
<dd>
Fully qualified pathname of a file containing a private key used for
authentication by Slurm daemons.
<dt>
JobCredentialPublicCertificate
<dd>
Fully qualified pathname of a file containing a public key used for
authentication by Slurm daemons.
<dt>
KillWait
<dd>
The interval, in seconds, given to a job's processes between the
SIGTERM and SIGKILL signals upon reaching its time limit.
If the job fails to terminate gracefully
in the interval specified, it will be forcibly terminated. The default
value is 30 seconds.
<dt>
Prioritize
<dd>
Fully qualified pathname of a program to execute in order to establish
the initial priority of a newly submitted job. By default there is no
prioritization program and each job gets a priority lower than that of
any existing jobs.
<dt>
Prolog
<dd>
Fully qualified pathname of a program to execute as user root on every
node when a user's job begins execution (e.g. "/usr/local/slurm/prolog").
This may be used to purge files, enable user login, etc. By default there
is no prolog.
<dt>
ReturnToService
<dd>
If set to 1, then a DOWN node will become available for use
upon registration. The default value is 0, which
means that a node will remain DOWN until a system administrator explicitly
makes it available for use.
<dt>
SlurmctldPort
<dd>
The port number that the SLURM controller,
<i>
slurmctld
</i>
, listens
to for work. The default value is SLURMCTLD_PORT as established at system
build time.
<dt>
SlurmctldTimeout
<dd>
The interval, in seconds, that the backup controller waits for the
primary controller to respond before assuming control. The default value
is 300 seconds.
<dt>
SlurmdPort
<dd>
The port number that the SLURM compute node daemon,
<i>
slurmd
</i>
, listens
to for work. The default value is SLURMD_PORT as established at system
build time.
<dt>
SlurmdTimeout
<dd>
The interval, in seconds, that the SLURM controller waits for
<i>
slurmd
</i>
to respond before configuring that node's state to DOWN. The default value
is 300 seconds.
<dt>
StateSaveLocation
<dd>
Fully qualified pathname of a directory into which the slurm controller,
<i>
slurmctld
</i>
, saves its state (e.g. "/usr/local/slurm/checkpoint"). SLURM
state will be saved here to recover from system failures. The default value is "/tmp".
If any slurm daemons terminate abnormally, their core files will also be written
into this directory.
<dt>
TmpFS
<dd>
Fully qualified pathname of the file system available to user jobs for
temporary storage. This parameter is used in establishing a node's
<i>
TmpDisk
</i>
space. The default value is "/tmp".
</dl>
Any text after "#" until the end of the line in the configuration file
will be considered a comment.
If you need to use "#" in a value within the configuration file, precede
it with a backslash ("\").
The configuration file should contain a keyword followed by an
equal sign, followed by the value.
Keyword value pairs should be separated from each other by white space.
The field descriptor keywords are case sensitive.
The size of each line in the file is limited to 1024 characters.
A sample SLURM configuration file (without node or partition information)
follows.
<pre>
ControlMachine=lx0001 BackupController=lx0002
Epilog=/usr/local/slurm/epilog Prolog=/usr/local/slurm/prolog
FastSchedule=1
FirstJobId=65536
HashBase=10
HeartbeatInterval=60
InactiveLimit=120
KillWait=30
Prioritize=/usr/local/maui/priority
SlurmctldPort=7002 SlurmdPort=7003
SlurmctldTimeout=300 SlurmdTimeout=300
StateSaveLocation=/tmp/slurm.state
TmpFS=/tmp
</pre>
<p>
The node configuration permits you to identify the nodes (or machines)
to be managed by Slurm. You may also identify the characteristics of the
nodes in the configuration file. Slurm operates in a heterogeneous environment
and users are able to specify resource requirements for each job.
Slurm is optimized for scheduling systems in which the number of
unique configurations is small. It is recommended that the system
node configuration be provided in a minimal number of entries.
In many systems, this may be accomplished in only a few lines.
The node configuration specifies the following information:
<dl>
<dt>
NodeName
<dd>
Name of a node as returned by hostname (e.g. "lx0012").
<a
name=
"NodeExp"
>
A simple regular expression may optionally
be used to specify ranges
of nodes to avoid building a configuration file with large numbers
of entries. The regular expression can contain one
pair of square brackets with a sequence of comma separated
numbers and/or ranges of numbers separated by a "-"
(e.g. "linux[0-64,128]", or "lx[15,18,32-33]").
</a>
If the NodeName is "DEFAULT", the values specified
with that record will apply to subsequent node specifications
unless explicitly set to other values in that node record or
replaced with a different set of default values.
For architectures in which the node order is significant,
nodes will be considered consecutive in the order defined.
For example, if the configuration for NodeName=charlie immediately
follows the configuration for NodeName=baker they will be
considered adjacent in the computer.
<dt>
Feature
<dd>
A comma delimited list of arbitrary strings indicative of some
characteristic associated with the node.
There is no value associated with a feature at this time; a node
either has a feature or it does not.
If desired a feature may contain a numeric component indicating,
for example, processor speed.
By default a node has no features.
<dt>
RealMemory
<dd>
Size of real memory on the node in MegaBytes (e.g. "2048").
The default value is 1.
<dt>
Procs
<dd>
Number of processors on the node (e.g. "2").
The default value is 1.
<dt>
State
<dd>
State of the node with respect to the initiation of user jobs.
Acceptable values are "DOWN", "UNKNOWN", "IDLE", and "DRAINING".
The
<a
href=
"#NodeStates"
>
node states
</a>
are fully described below.
The default value is "UNKNOWN".
<dt>
TmpDisk
<dd>
Total size of temporary disk storage in TmpFS in MegaBytes
(e.g. "16384"). TmpFS (for "Temporary File System")
identifies the location which jobs should use for temporary storage.
Note this does not indicate the amount of free
space available to the user on the node, only the total file
system size. The system administrator should ensure this file
system is purged as needed so that user jobs have access to
most of this space.
The Prolog and/or Epilog programs (specified in the configuration file)
might be used to ensure the file system is kept clean.
The default value is 1.
<dt>
Weight
<dd>
The priority of the node for scheduling purposes.
All things being equal, jobs will be allocated the nodes with
the lowest weight which satisfies their requirements.
For example, a heterogeneous collection of nodes might
be placed into a single partition for greater system
utilization, responsiveness and capability. It would be
preferable to allocate smaller memory nodes rather than larger
memory nodes if either will satisfy a job's requirements.
The units of weight are arbitrary, but larger weights
should be assigned to nodes with more processors, memory,
disk space, higher processor speed, etc.
Weight is an integer value with a default value of 1.
</dl>
<p>
Only the NodeName must be supplied in the configuration file; all other
items are optional.
It is advisable to establish baseline node configurations in
the configuration file, especially if the cluster is heterogeneous.
Nodes which register to the system with less than the configured resources
(e.g. too little memory) will be placed in the "DOWN" state to
avoid scheduling jobs on them.
Establishing baseline configurations will also speed SLURM's
scheduling process by permitting it to compare job requirements
against these (relatively few) configuration parameters and
possibly avoid having to check job requirements
against every individual node's configuration.
The resources checked at node registration time are: Procs,
RealMemory and TmpDisk.
While baseline values for each of these can be established
in the configuration file, the actual values upon node
registration are recorded and these actual values may be
used for scheduling purposes (depending upon the value of
<i>
FastSchedule
</i>
in the configuration file).
Default values can be specified with a record in which
"NodeName" is "DEFAULT".
The default entry values will apply only to lines following it in the
configuration file and the default values can be reset multiple times
in the configuration file with multiple entries where "NodeName=DEFAULT".
The "NodeName=" specification must be placed on every line
describing the configuration of nodes.
In fact, it is generally possible and desirable to define the
configurations of all nodes in only a few lines.
This convention permits significant optimization in the scheduling
of larger clusters.
The field descriptors above are case sensitive.
In order to support the concept of jobs requiring consecutive nodes
on some architectures,
node specifications should be placed in this file in consecutive order.
The size of each line in the file is limited to 1024 characters.
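<p>
For illustration, the "NodeName=DEFAULT" record described above may be reset
part way through the node list; the node names and values below are
hypothetical, in the style of the sample that follows later:
<pre>
NodeName=DEFAULT Procs=16 RealMemory=2048 TmpDisk=16384
NodeName=lx[0001-4000] Weight=16
NodeName=DEFAULT Procs=32 RealMemory=4096 TmpDisk=16384
NodeName=lx[4001-8000] Weight=40
</pre>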
<p>
<a
name=
"NodeStates"
>
The node states have the following meanings:
</a>
<dl>
<dt>
BUSY
<dd>
The node has been allocated work (one or more user jobs).
<dt>
DOWN
<dd>
The node is unavailable for use. It has been explicitly configured
DOWN or failed to respond to system state inquiries or has
explicitly removed itself from service due to a failure. This state
typically indicates some problem requiring administrator intervention.
<dt>
DRAINED
<dd>
The node is idle, but not available for use. The state of a node
will automatically change from DRAINING to DRAINED when user job(s) executing
on that node terminate. Since this state is entered by explicit
administrator request, additional SLURM administrator intervention is typically
not required.
<dt>
DRAINING
<dd>
The node has been made unavailable for new work by explicit administrator
intervention. It is processing some work at present and will enter state
"DRAINED" when that work has been completed. This might be used to
prepare some nodes for maintenance work.
<dt>
IDLE
<dd>
The node is idle and available for use.
<dt>
UNKNOWN
<dd>
Default initial node state upon startup of SLURM.
An attempt will be made to contact the node and acquire current state information.
</dl>
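<p>
For illustration, an administrator might use the scontrol tool (described
below) to explicitly change a node's state, for example returning a DOWN
node to service after repair (the node name here is hypothetical):
<pre>
scontrol: update NodeName=lx0123 State=IDLE
</pre>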
<p>
SLURM uses a hash table in order to locate table entries rapidly.
Each table entry can be directly accessed without any searching
if the name contains a sequence number suffix. The value of
<i>
HashBase
</i>
in the configuration file specifies the hashing algorithm.
Possible values are "10" and "8" for names containing
decimal and octal sequence numbers respectively
or "0" which processes mixed alpha-numeric without sequence numbers.
The default value of
<i>
HashBase
</i>
is "10".
If you use a naming convention lacking a sequence number, it may be
desirable to review the hashing function
<i>
hash_index
</i>
in the
node_mgr.c module. This is especially important in clusters having
large numbers of nodes. The sequence numbers can start at any
desired number, but should contain consecutive numbers. The
sequence number portion may contain leading zeros for a consistent
name length, if so desired. Note that correct operation
will be provided with any node names, but performance will suffer
without this optimization.
A sample SLURM configuration file (node information only) follows.
<pre>
#
# Node Configurations
#
NodeName=DEFAULT TmpDisk=16384 State=IDLE
NodeName=lx[0001-0002] State=DRAINED
NodeName=lx[0003-8000] Procs=16 RealMemory=2048 Weight=16
NodeName=lx[8001-9999] Procs=32 RealMemory=4096 Weight=40 Feature=1200MHz,VizTools
</pre>
<p>
The partition configuration permits you to establish different job
limits or access controls for various groups (or partitions) of nodes.
Nodes may be in only one partition. The partition configuration
file contains the following information:
<dl>
<dt>
AllowGroups
<dd>
Comma separated list of group IDs which may use the partition.
If at least one group associated with the user submitting the
job is in AllowGroups, he will be permitted to use this partition.
The default value is "ALL".
<dt>
Default
<dd>
If this keyword is set, jobs submitted without a partition
specification will utilize this partition.
Possible values are "YES" and "NO".
The default value is "NO".
<dt>
RootOnly
<dd>
Specifies if only user ID zero (or user
<i>
root
</i>
) may
initiate jobs in this partition.
Possible values are "YES" and "NO".
The default value is "NO".
<dt>
MaxNodes
<dd>
Maximum count of nodes which may be allocated to any single job.
The default value is "UNLIMITED", which is represented internally as -1.
<dt>
MaxTime
<dd>
Maximum wall-time limit for any job in minutes. The default
value is "UNLIMITED", which is represented internally as -1.
<dt>
Nodes
<dd>
Comma separated list of nodes which are associated with this
partition. Node names may be specified using the
<a
href=
"#NodeExp"
>
regular expression syntax
</a>
described above. A blank list of nodes
(i.e. "Nodes= ") can be used if one wants a partition to exist,
but have no resources (possibly on a temporary basis).
<dt>
PartitionName
<dd>
Name by which the partition may be referenced (e.g. "Interactive").
This name can be specified by users when submitting jobs.
<dt>
Shared
<dd>
Ability of the partition to execute more than one job at a
time on each node. Shared nodes will offer unpredictable performance
for application programs, but can provide higher system utilization
and responsiveness than otherwise possible.
Possible values are "FORCE", "YES", and "NO".
The default value is "NO".
<dt>
State
<dd>
State of partition or availability for use. Possible values
are "UP" or "DOWN". The default value is "UP".
</dl>
<p>
Only the PartitionName must be supplied in the configuration file.
Other parameters will assume default values if not specified.
The default values can be specified with a record in which
"PartitionName" is "DEFAULT" if non-standard default values are desired.
The default entry values will apply only to lines following it in the
configuration file and the default values can be reset multiple times
in the configuration file with multiple entries where "PartitionName=DEFAULT".
The configuration of each partition should be specified on a single line.
The field descriptors above are case sensitive.
The size of each line in the file is limited to 1024 characters.
A sample SLURM configuration file (partition information only) follows.
<p>
A single job may be allocated nodes from only one partition and
must satisfy the configuration specifications for that partition.
The job may specify a particular PartitionName, if so desired,
or use the system's default partition.
<pre>
#
# Partition Configurations
#
PartitionName=DEFAULT MaxTime=30 MaxNodes=2
PartitionName=login Nodes=lx[0001-0002] State=DOWN
PartitionName=debug Nodes=lx[0003-0030] State=UP Default=YES
PartitionName=class Nodes=lx[0031-0040] AllowGroups=students
PartitionName=batch Nodes=lx[0041-9999] MaxTime=UNLIMITED MaxNodes=4096 RootOnly=YES
</pre>
<p>
APIs and an administrative tool can be used to alter the SLURM
configuration in real time.
When the SLURM controller restarts, its state will be restored
to that at the time it terminated unless the SLURM configuration
file is newer, in which case the configuration will be rebuilt
from that file.
State information not incorporated in the configuration file,
such as job state, will be preserved.
A
<a
href=
"#SampleConfig"
>
SLURM configuration file
</a>
is included
at the end of this document.
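<p>
For example, after a change to partition limits in the configuration file,
the running controller can be told to re-read that file with the scontrol
reconfigure command described below (an illustrative session; the edited
file contents are not shown):
<pre>
# scontrol
scontrol: reconfigure
scontrol: quit
</pre>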
<h3>
Job Configuration
</h3>
The job configuration format specified below is used by the
scontrol administration tool to modify job state information:
<dl>
<dt>
Contiguous
<dd>
Determines whether the nodes allocated to the job must be contiguous.
Acceptable values are "YES" and "NO" with the default being "NO".
<dt>
Features
<dd>
Required features of nodes to be allocated to this job.
Features may be combined using "|" for OR, "&" for AND,
and square brackets.
For example, "Features=1000MHz|1200MHz&CoolTool".
The feature list is processed left to right except for
the grouping by brackets.
Square brackets are used to identify alternate features,
but ones that must apply to every node allocated to the job.
For example, some clusters are configured with more than
one parallel file system. These parallel file systems
may be accessible only to a subset of the nodes in a cluster.
The application may not care which parallel file system
is used, but all nodes allocated to it must be in the
subset of nodes accessing a single parallel file system.
This might be specified with a specification of
"Features=[PFS1|PFS2|PFS3|PFS4]".
<dt>
JobName
<dd>
Name to be associated with the job.
<dt>
JobId
<dd>
Identification for the job, a sequence number.
<dt>
MinMemory
<dd>
Minimum number of megabytes of real memory per node.
<dt>
MinProcs
<dd>
Minimum number of processors per node.
<dt>
MinTmpDisk
<dd>
Minimum number of megabytes of temporary disk storage per node.
<dt>
ReqNodes
<dd>
The total number of nodes required to execute this job.
<dt>
ReqNodeList
<dd>
A comma separated list of nodes to be allocated to the job.
The nodes may be specified using regular expressions (e.g.
"lx[0010-0020,0033-0040]" or "baker,charlie,delta").
<dt>
ReqProcs
<dd>
The total number of processors required to execute this job.
<dt>
Partition
<dd>
Name of the partition in which this job should execute.
<dt>
Priority
<dd>
Integer priority of the pending job. The value may
be specified by user root initiated jobs, otherwise SLURM will
select a value. Generally, higher priority jobs will be initiated
before lower priority jobs.
<dt>
Shared
<dd>
Job can share nodes with other jobs. Possible values are YES and NO.
<dt>
State
<dd>
State of the job. Possible values are "PENDING", "STARTING",
"RUNNING", and "ENDING".
<dt>
TimeLimit
<dd>
Maximum wall-time limit for the job in minutes. An "UNLIMITED"
value is represented internally as -1.
</dl>
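<p>
As an illustration of this format, a pending job's limits might be adjusted
through the scontrol update command described below (the job id and values
here are hypothetical):
<pre>
scontrol: update job ID=1236 TimeLimit=120 Priority=200
scontrol: show job 1236
</pre>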
<a
name=
"Build"
><h2>
Build Parameters
</h2></a>
The following configuration parameters are established at SLURM build time.
State and configuration information may be read or updated using SLURM APIs.
<dl>
<dt>
SLURMCTLD_PATH
<dd>
The fully qualified pathname of the file containing the SLURM daemon
to execute on the ControlMachine,
<i>
slurmctld
</i>
. The default value is "/usr/local/slurm/bin/slurmctld".
This file must be accessible to the ControlMachine and BackupController.
<dt>
SLURMD_PATH
<dd>
The fully qualified pathname of the file containing the SLURM daemon
to execute on every compute server node. The default value is "/usr/local/slurm/bin/slurmd".
This file must be accessible to every SLURM compute server.
<dt>
SLURM_CONF
<dd>
The fully qualified pathname of the SLURM configuration
file. The default value is "/etc/SLURM.conf".
<dt>
SLURMCTLD_PORT
<dd>
The port number that the SLURM controller,
<i>
slurmctld
</i>
, listens
to for work.
<dt>
SLURMD_PORT
<dd>
The port number that the SLURM compute node daemon,
<i>
slurmd
</i>
, listens
to for work.
</dl>
<h2>
scontrol Administration Tool
</h2>
The tool you will primarily use in the administration of SLURM is scontrol.
It provides the means of viewing and updating node and partition
configurations. It can also be used to update some job state
information. You can execute scontrol with a single keyword on
the command line, or it will query you for input and process those
keywords on an interactive basis. The scontrol keywords are shown below.
A
<a
href=
"#SampleAdmin"
>
sample scontrol session
</a>
with examples is appended.
<p>
Usage: scontrol [-q | -v] [&lt;keyword&gt;]
<br>
-q is equivalent to the "quiet" keyword
<br>
-v is equivalent to the "verbose" keyword
<br>
<dl>
<dt>
abort
<dd>
Cause
<i>
slurmctld
</i>
to terminate immediately and generate a core file.
<dt>
exit
<dd>
Terminate scontrol.
<dt>
help
<dd>
Display this list of scontrol commands and options.
<dt>
quiet
<dd>
Print no messages other than error messages.
<dt>
quit
<dd>
Terminate scontrol.
<dt>
reconfigure
<dd>
The SLURM control daemon re-reads its configuration files.
<dt>
show &lt;entity&gt; [&lt;ID&gt;]
<dd>
Show the configuration for a given entity. Entity must
be "config", "job", "node", "partition" or "step" for SLURM
configuration parameters, job, node, partition, and job step
information respectively.
By default, state information for all records is reported.
If you only wish to see the state of one entity record,
specify either its ID number (assumed if entirely numeric)
or its name.
<a
href=
"#NodeExp"
>
Regular expressions
</a>
may
be used to identify node names.
<dt>
shutdown
<dd>
Cause
<i>
slurmctld
</i>
to save state and terminate.
<dt>
update &lt;options&gt;
<dd>
Update the configuration information.
Options are of the same format as the configuration file
and the output of the
<i>
scontrol show
</i>
command.
Not all configuration information can be modified using
this mechanism; for example, the configuration of a node
cannot be changed after it has registered (only a node's state can be modified).
One can always modify the SLURM configuration file and
use the reconfigure command to rebuild all controller
information if required.
This command can only be issued by user
<b>
root
</b>
.
<dt>
verbose
<dd>
Enable detailed logging of scontrol execution state information.
<dt>
version
<dd>
Display the scontrol tool version number.
</dl>
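<p>
For illustration, the show keyword can be combined with the regular
expression syntax described earlier to report on a range of nodes
(the names below are hypothetical):
<pre>
scontrol: show node lx[0010-0020]
scontrol: show partition debug
</pre>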
<h2>
Miscellaneous
</h2>
There is no necessity for synchronized clocks on the nodes.
Events occur either in real-time based upon message traffic
or based upon changes in the time on a node. However, synchronized
clocks will permit easier analysis of SLURM logs from multiple
nodes.
<p>
SLURM uses the syslog function to record events. It uses a
range of importance levels for these messages. Be certain
that your system's syslog functionality is operational.
<a
name=
"SampleConfig"
><h2>
Sample Configuration File
</h2></a>
<pre>
#
# Sample /etc/slurm.conf
# Author: John Doe
# Date: 11/06/2001
#
ControlMachine=lx0001 BackupController=lx0002
Epilog="" Prolog=""
FastSchedule=1
FirstJobId=65536
HashBase=10
HeartbeatInterval=60
InactiveLimit=120
KillWait=30
Prioritize=/usr/local/maui/priority
SlurmctldPort=7002 SlurmdPort=7003
SlurmctldTimeout=300 SlurmdTimeout=300
StateSaveLocation=/tmp/slurm.state
TmpFS=/tmp
#
# Node Configurations
#
NodeName=DEFAULT TmpDisk=16384 State=IDLE
NodeName=lx[0001-0002] State=DRAINED
NodeName=lx[0003-8000] Procs=16 RealMemory=2048 Weight=16
NodeName=lx[8001-9999] Procs=32 RealMemory=4096 Weight=40 Feature=1200MHz
#
# Partition Configurations
#
PartitionName=DEFAULT MaxTime=30 MaxNodes=2
PartitionName=login Nodes=lx[0001-0002] State=DOWN
PartitionName=debug Nodes=lx[0003-0030] State=UP Default=YES
PartitionName=class Nodes=lx[0031-0040] AllowGroups=students
PartitionName=batch Nodes=lx[0041-9999] MaxTime=UNLIMITED MaxNodes=4096 RootOnly=YES
</pre>
<a
name=
"SampleAdmin"
><h2>
Sample scontrol Execution
</h2></a>
<pre>
Remove node lx0030 from service, removing jobs as needed:
# scontrol
scontrol: update NodeName=lx0030 State=DRAINING
scontrol: show job
ID=1234 Name=Simulation MaxTime=100 Nodes=lx[0029-0030] State=RUNNING User=smith
ID=1235 Name=MyBigTest MaxTime=100 Nodes=lx0020,lx0023 State=RUNNING User=smith
scontrol: update job ID=1234 State=ENDING
scontrol: show job 1234
Job 1234 not found
scontrol: show node lx0030
Name=lx0030 Partition=class State=DRAINED Procs=16 RealMemory=2048 TmpDisk=16384
scontrol: quit
</pre>
<hr>
URL = http://www-lc.llnl.gov/dctg-lc/slurm/admin.guide.html
<p>
Last Modified October 22, 2002
</p>
<address>
Maintained by
<a
href=
"mailto:slurm-dev@lists.llnl.gov"
>
slurm-dev@lists.llnl.gov
</a></address>
</body>
</html>
doc/html/quickstart.html (13 additions, 1 deletion)
...
...
@@ -100,6 +100,8 @@ it is restored to service. The controller saves its state to disk
whenever there is a change. This state can be recovered by the controller
at startup time.
<b>
slurmctld
</b>
would typically execute as a
special user specifically for this purpose (not user root).
State changes are saved so that jobs and other state can be
preserved when slurmctld moves or is restarted.
<p>
The
<b>
slurmd
</b>
daemon executes on every compute node.
It resembles a remote shell daemon to export control to SLURM.
...
...
@@ -199,12 +201,22 @@ The remaining information provides basic SLURM administration information.
Individuals only interested in making use of SLURM need not read
further.
<h3>
Authentication
</h3>
<h3>
Infrastructure
</h3>
All communications between SLURM components are authenticated.
The authentication infrastructure used is specified in the SLURM
configuration file and options include:
<a
href=
"http://www.theether.org/authd/"
>
authd
</a>
, munged and none.
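<p>
A minimal sketch of how such a selection might appear in the SLURM
configuration file; the parameter name AuthType and the plugin string below
are assumptions, not taken from this document:
<pre>
# hypothetical authentication selection in /etc/slurm.conf
AuthType=auth/authd
</pre>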
<p>
SLURM uses the syslog function to record events. It uses a
range of importance levels for these messages. Be certain
that your system's syslog functionality is operational.
<p>
There is no necessity for synchronized clocks on the nodes.
Events occur in real-time based upon message traffic.
However, synchronized clocks will permit easier analysis of
SLURM logs from multiple nodes.
<h3>
Configuration
</h3>
...
...