From fd8a5cc5ad6d0d3d9a9b110e25553b5440ac55cb Mon Sep 17 00:00:00 2001
From: Moe Jette <jette1@llnl.gov>
Date: Tue, 11 Mar 2008 20:27:01 +0000
Subject: [PATCH] more updates for slurm v1.3

---
 doc/html/big_sys.shtml          | 26 ++++++++++-
 doc/html/download.shtml         | 30 +++++++++----
 doc/html/news.shtml             | 14 +++---
 doc/html/overview.shtml         |  2 +-
 doc/html/quickstart_admin.shtml | 80 ++++++++++++---------------------
 doc/man/man5/slurm.conf.5       | 10 ++---
 6 files changed, 88 insertions(+), 74 deletions(-)

diff --git a/doc/html/big_sys.shtml b/doc/html/big_sys.shtml
index 64aa9a43b8f..f92c4684a6c 100644
--- a/doc/html/big_sys.shtml
+++ b/doc/html/big_sys.shtml
@@ -74,13 +74,35 @@ The default value of PMI_TIME is 500 and this is the number of
 microseconds alloted to transmit each key-pair. 
 We have executed up to 16,000 tasks with a value of PMI_TIME=4000.</p>
 
-<h2>Limits</h2>
+<p>The individual slurmd daemons on compute nodes initiate messages
+to the slurmctld daemon only when they start up or when the epilog 
+completes for a job. When a job that was allocated a large number of
+nodes completes, the slurmd daemons on all of those nodes can send
+messages to the slurmctld daemon at the same time. To spread this
+message traffic out over time and avoid message loss, the
+<i>EpilogMsgTime</i> parameter may be used. Even if messages are lost,
+they will be retransmitted, but this will delay the reallocation of
+resources to new jobs.</p>
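As a sketch, the tuning described above would be a single slurm.conf line; the value shown here is purely illustrative, not a recommendation:

```
# Hypothetical slurm.conf fragment: microseconds required to process
# each epilog completion message (example value only)
EpilogMsgTime=2000
```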
 
+<h2>Other</h2>
+
+<p>SLURM uses hierarchical communications between the slurmd daemons
+in order to increase parallelism and improve performance. The 
+<i>TreeWidth</i> configuration parameter controls the fanout of messages.
+The default value is 50, meaning each slurmd daemon can communicate
+with up to 50 other slurmd daemons and over 2500 nodes can be contacted
+with two message hops.
+The default value will work well for most clusters.
+Optimal system performance can typically be achieved if <i>TreeWidth</i>
+is set to the square root of the number of nodes in the cluster for
+systems having no more than 2500 nodes or the cube root for larger
+systems.</p>
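The sizing rule above can be sketched as a small helper script (hypothetical, not part of SLURM; <i>nodes</i> would be your cluster's node count):

```shell
#!/bin/sh
# Pick a TreeWidth per the rule above: square root of the node count
# for clusters of up to 2500 nodes, cube root for larger systems.
nodes=900
if [ "$nodes" -le 2500 ]; then
  treewidth=$(awk -v n="$nodes" 'BEGIN { printf "%d", sqrt(n) }')
else
  # awk has no cbrt(); exp(log(n)/3) computes the cube root
  treewidth=$(awk -v n="$nodes" 'BEGIN { printf "%d", exp(log(n)/3) }')
fi
echo "TreeWidth=$treewidth"
```

For a 900-node cluster this suggests TreeWidth=30; the result would then be placed in slurm.conf.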
+ 
 <p>The srun command automatically increases its open file limit to 
 the hard limit in order to process all of the standard input and output
 connections to the launched tasks. It is recommended that you set the
 open file hard limit to 8192 across the cluster.</p>
 
-<p style="text-align:center;">Last modified 29 January 2008</p>
+<p style="text-align:center;">Last modified 11 March 2008</p>
 
 <!--#include virtual="footer.txt"-->
diff --git a/doc/html/download.shtml b/doc/html/download.shtml
index 063de9a93f2..bbd20180d7a 100644
--- a/doc/html/download.shtml
+++ b/doc/html/download.shtml
@@ -7,7 +7,7 @@ SLURM source can be downloaded from <br>
 https://sourceforge.net/projects/slurm/</a><br>
 There is also a Debian package named <i>slurm-llnl</i> available at <br>
 <a href="http://www.debian.org/">http://www.debian.org/</a><br>
-The latest stable release of SLURM is version 1.2.</p>
+The latest stable release of SLURM is version 1.3.</p>
 
 <p> Other software available for download includes
 <ul>
@@ -26,12 +26,19 @@ The latest stable release is version 1.4.</p>
 <h1>Related Software</h1>
 <ul>
 
-<li><b>OpenSSL</b> is recommended for secure communications between SLURM 
-components. Download it from 
-<a href="http://www.openssl.org/">http://www.openssl.org/</a>.
-</li>
+<li>Digital signatures (Crypto plugin) are used to ensure messages are not altered.</li>
+<ul>
+<li><b>OpenSSL</b><br>
+OpenSSL is recommended for generation of digital signatures.
+Download it from <a href="http://www.openssl.org/">http://www.openssl.org/</a>.</li>
+<li><b>Munge</b><br>
+Munge can be used as an alternative to OpenSSL. 
+Munge is available under the GNU General Public License, but is slower than 
+OpenSSL for the generation of digital signatures. Munge is available from 
+<a href="http://home.gna.org/munge/">http://home.gna.org/munge/</a>.</li>
+</ul> 
 
-<li>Authentication plugins</li>
+<li>Authentication plugins identify the user originating a message.</li>
 <ul>
 <li><b>Munge</b><br>
 In order to compile the "auth/munge" authentication plugin for SLURM, you will need
@@ -68,7 +75,7 @@ https://sourceforge.net/projects/slurm/</a>.
 <li><a href="http://www.quadrics.com/">Quadrics MPI</a></li>
 </ul>
 
-<li>Schedulers</li>
+<li>Schedulers offering greater control over the workload</li>
 <ul>
 <li><a href="http://www.platform.com/">Load Sharing Facility (LSF)</a></li>
 <li><a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
@@ -77,6 +84,13 @@ Maui Scheduler</a></li>
 Moab Cluster Suite</a></li>
 </ul>
 
+<li>Database tools available for managing accounting data</li>
+<ul>
+<li><a href="http://www.clusterresources.com/pages/products/gold-allocation-manager.php">Gold</a></li>
+<li><a href="http://www.mysql.com/">MySQL</a></li>
+<li><a href="http://www.postgresql.org/">PostgreSQL</a></li>
+</ul>
+
 <li>Task Affinity plugins</li>
 <ul>
 <li><a href="http://www.open-mpi.org/software/plpa/">
@@ -85,6 +99,6 @@ Portable Linux Processor Affinity (PLPA)</a></li>
 
 </ul>
 
-<p style="text-align:center;">Last modified 6 December 2007</p>
+<p style="text-align:center;">Last modified 11 March 2008</p>
 
 <!--#include virtual="footer.txt"-->
diff --git a/doc/html/news.shtml b/doc/html/news.shtml
index 9d3a675c198..432b09786f1 100644
--- a/doc/html/news.shtml
+++ b/doc/html/news.shtml
@@ -6,7 +6,7 @@
 <ul>
 <li><a href="#11">SLURM Version 1.1, May 2006</a></li>
 <li><a href="#12">SLURM Version 1.2, February 2007</a></li>
-<li><a href="#13">SLURM Version 1.3, Winter 2007</a></li>
+<li><a href="#13">SLURM Version 1.3, March 2008</a></li>
 <li><a href="#14">SLURM Version 1.4 and beyond</a></li>
 </ul>
 
@@ -65,16 +65,18 @@ task launch directly from the <i>srun</i> command.</li>
 </ul>
 
 <h2><a name="13">Major Updates in SLURM Version 1.3</a></h2>
-<p>SLURM Version 1.3 is scheduled for release in the Winter of 2007.
+<p>SLURM Version 1.3 was released in March 2008.
 Major enhancements include:
 <ul>
-<li>Job accounting and completion data stored in a database 
+<li>Job accounting and completion data can be stored in a database 
 (MySQL, PGSQL or simple text file).</li>
+<li>SlurmDBD (Slurm Database Daemon) introduced to provide secure
+database support across multiple clusters.</li>
+<li>Gang scheduler plugin added (time-slicing of parallel jobs
+without an external scheduler).</li>
 <li>Cryptography logic moved to a separate plugin with the 
 option of using OpenSSL (default) or Munge (GPL).</li>
 <li>Improved scheduling of multple job steps within a job's allocation.</li>
-<li>Gang scheduling of jobs (time-slicing of parallel jobs
-without an external scheduler).</li>
 <li>Support for job specification of node features with node counts.</li> 
 <li><i>srun</i>'s --alloc, --attach, and --batch options removed (use 
 <i>salloc</i>, <i>sattach</i> or <i>sbatch</i> commands instead).</li>
@@ -95,6 +97,6 @@ to coordinate activies. Future development plans includes:
 and refresh.</li>
 </ul>
 
-<p style="text-align:center;">Last modified 27 November 2007</p>
+<p style="text-align:center;">Last modified 11 March 2008</p>
 
 <!--#include virtual="footer.txt"-->
diff --git a/doc/html/overview.shtml b/doc/html/overview.shtml
index 01df666632a..5184d84641e 100644
--- a/doc/html/overview.shtml
+++ b/doc/html/overview.shtml
@@ -56,7 +56,7 @@ building block approach. These plugins presently include:
 
 <li><a href="jobacct_storageplugins.html">Job Accounting Storage</a>: 
 text file (default if jobacct_gather != none), 
-<a href="http://www.clusterresources.com/pages/products/gold-allocation-manager.ph">Gold</a>
+<a href="http://www.clusterresources.com/pages/products/gold-allocation-manager.php">Gold</a>
 MySQL, PGSQL, SlurmDBD (Slurm Database Daemon) or none</li>
 
 <li><a href="jobcompplugins.html">Job completion logging</a>: 
diff --git a/doc/html/quickstart_admin.shtml b/doc/html/quickstart_admin.shtml
index 12681ff7e8c..6644ba236a8 100644
--- a/doc/html/quickstart_admin.shtml
+++ b/doc/html/quickstart_admin.shtml
@@ -134,7 +134,7 @@ Some macro definitions that may be used in building SLURM include:
 # .rpmmacros
 # For AIX at LLNL
 # Override some RPM macros from /usr/lib/rpm/macros
-# Set other SLURM-specific macros for unconventional file locations
+# Set SLURM-specific macros for unconventional file locations
 #
 %_enable_debug     "--with-debug"
 %_prefix           /admin/llnl
@@ -208,17 +208,24 @@ For more information, see <a href="quickstart.html#mpi">MPI</a>.
 
 <h3>Scheduler support</h3>
 <p>The scheduler used by SLURM is controlled by the <b>SchedType</b> configuration 
-parameter. This is meant to control the relative importance of pending jobs.
-SLURM's default scheduler is FIFO (First-In First-Out). A backfill scheduler 
-plugin is also available. Backfill scheduling will initiate a lower-priority job 
+parameter. This is meant to control the relative importance of pending jobs, and 
+several options are available.
+SLURM's default scheduler is FIFO (First-In First-Out). 
+SLURM also offers a backfill scheduling plugin.
+Backfill scheduling will initiate a lower-priority job
 if doing so does not delay the expected initiation time of higher priority jobs; 
 essentially using smaller jobs to fill holes in the resource allocation plan. 
+Effective backfill scheduling requires users to specify job time limits.
+SLURM offers a gang scheduler plugin, which provides time slicing of parallel 
+jobs sharing the same nodes.
 SLURM also supports a plugin for use of 
 <a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
 The Maui Scheduler</a> or 
 <a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
-Moab Cluster Suite</a> which offer sophisticated scheduling algorithms. 
-Motivated users can even develop their own scheduler plugin if so desired. </p>
+Moab Cluster Suite</a> which offer sophisticated scheduling algorithms.
+For more information about these options see
+<a href="gang_scheduling.html">Gang Scheduling</a> and
+<a href="cons_res_share.html">Sharing Consumable Resources</a>.</p> 
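For example, selecting the backfill plugin is a one-line slurm.conf setting (a minimal sketch; the parameter name matches the sample <i>scontrol show config</i> output later in this guide):

```
# Hypothetical slurm.conf fragment: use backfill instead of the
# default FIFO scheduler
SchedulerType=sched/backfill
```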
 
 <h3>Node selection</h3>
 <p>The node selection mechanism used by SLURM is controlled by the 
@@ -238,6 +245,12 @@ aware and interacts with the BlueGene bridge API).</p>
 levels for these messages. Be certain that your system's syslog functionality
 is operational. </p>
 
+<h3>Accounting</h3>
+<p>SLURM supports accounting records being written to a simple text file,
+directly to a database (MySQL or PostgreSQL), or to a daemon securely 
+managing accounting data for multiple clusters. For more information 
+see <a href="accounting.html">Accounting</a>.</p>
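A minimal slurm.conf sketch of the database option described above (the storage type and database name are illustrative assumptions, not recommendations):

```
# Hypothetical accounting fragment: gather job data on Linux nodes
# and store records in a MySQL database
JobAcctGatherType=jobacct_gather/linux
AccountingStorageType=accounting_storage/mysql
AccountingStorageLoc=slurm_acct_db
```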
+
 <h3>Corefile format</h3>
 <p>SLURM is designed to support generating a variety of core file formats for 
 application codes that fail (see the <i>--core</i> option of the <i>srun</i>
@@ -341,7 +354,8 @@ minimum configuration values will be considered DOWN and not scheduled.
 Note that a more extensive sample configuration file is provided in
 <b>etc/slurm.conf.example</b>. We also have a web-based 
 <a href="configurator.html">configuration tool</a> which can 
-be used to build a simple configuration file.</p>
+be used to build a simple configuration file, which can then be
+manually edited for more complex configurations.</p>
 <pre>
 # 
 # Sample /etc/slurm.conf for mcr.llnl.gov
@@ -545,50 +559,12 @@ Configuration data as of 03/19-13:04:12
 AuthType          = auth/munge
 BackupAddr        = eadevj
 BackupController  = adevj
+BOOT_TIME         = 01/10-09:19:21
+CacheGroups       = 0
+CheckpointType    = checkpoint/none
 ControlAddr       = eadevi
 ControlMachine    = adevi
-DatabaseType      = database/flatfile
-DatabaseHost      = (null)
-DatabasePort      = (null)
-DatabaseUser      = (null)
-Epilog            = (null)
-FastSchedule      = 1
-FirstJobId        = 1
-InactiveLimit     = 0
-JobCompLoc        = /var/tmp/jette/slurm.job.log
-JobCompType       = jobcomp/filetxt
-JobCredPrivateKey = /etc/slurm/slurm.key
-JobCredPublicKey  = /etc/slurm/slurm.cert
-KillWait          = 30
-MaxJobCnt         = 2000
-MinJobAge         = 300
-PluginDir         = /usr/lib/slurm
-Prolog            = (null)
-ReturnToService   = 1
-SchedulerAuth     = (null)
-SchedulerPort     = 65534
-SchedulerType     = sched/backfill
-SlurmUser         = slurm(97)
-SlurmctldDebug    = 4
-SlurmctldLogFile  = /tmp/slurmctld.log
-SlurmctldPidFile  = /tmp/slurmctld.pid
-SlurmctldPort     = 7002 
-SlurmctldTimeout  = 300
-SlurmdDebug       = 65534
-SlurmdLogFile     = /tmp/slurmd.log
-SlurmdPidFile     = /tmp/slurmd.pid
-SlurmdPort        = 7003
-SlurmdSpoolDir    = /tmp/slurmd
-SlurmdTimeout     = 300
-TreeWidth         = 50
-JobAcctStorageType      = jobacct_storage/filetxt
-JobAcctStorageLoc       = /tmp/jobacct.log
-JobAcctGatherFrequncy   = 5
-JobAcctGatherType       = jobacct_gather/linux
-SLURM_CONFIG_FILE = /etc/slurm/slurm.conf
-StateSaveLocation = /usr/local/tmp/slurm/adev
-SwitchType        = switch/elan
-TmpFS             = /tmp
+...
 WaitTime          = 0
 
 Slurmctld(primary/backup) at adevi/adevj are UP/UP
@@ -601,9 +577,9 @@ adev0: scontrol shutdown
 <p>An extensive test suite is available within the SLURM distribution 
 in <i>testsuite/expect</i>. 
 There are about 250 tests which will execute on the order of 2000 jobs 
-and 4000 job steps. 
+and 5000 job steps. 
 Depending upon your system configuration and performance, this test 
-suite will take roughly 40 minutes to complete.
+suite will take roughly 80 minutes to complete.
 The file <i>testsuite/expect/globals</i> contains default paths and
 procedures for all of the individual tests.  You will need to edit this
 file to specify where SLURM and other tools are installed.
@@ -628,6 +604,6 @@ in the NEWS file.
 
 </pre> <p class="footer"><a href="#top">top</a></p>
 
-<p style="text-align:center;">Last modified 1 October 2007</p>
+<p style="text-align:center;">Last modified 11 March 2008</p>
 
 <!--#include virtual="footer.txt"-->
diff --git a/doc/man/man5/slurm.conf.5 b/doc/man/man5/slurm.conf.5
index c2f9d7fb2c3..3c711d6bb64 100644
--- a/doc/man/man5/slurm.conf.5
+++ b/doc/man/man5/slurm.conf.5
@@ -80,19 +80,19 @@ will be written to a the file specified by the
 \fBAccountingStorageLoc\fR parameter.
 The value "accounting_storage/gold" indicates that account records
 will be written to Gold
-(http://www.clusterresources.com/pages/products/gold-allocation-manager.ph),
+(http://www.clusterresources.com/pages/products/gold-allocation-manager.php),
 which maintains its own database.
 The value "accounting_storage/mysql" indicates that accounting records
-should be written to a mysql database specified by the 
+should be written to a MySQL database specified by the 
 \fBAccountingStorageLoc\fR parameter.
 The default value is "accounting_storage/none", which means that
 account records are not maintained. 
 The value "accounting_storage/pgsql" indicates that accounting records
-should be written to a postresql database specified by the 
+should be written to a PostgreSQL database specified by the 
 \fBAccountingStorageLoc\fR parameter.
 The value "accounting_storage/slurmdbd" indicates that accounting records
-will be written to SlurmDbd, which maintains its own database. See 
-"man slurmdbd" for more information.
+will be written to SlurmDBD, which manages an underlying MySQL or 
+PostgreSQL database. See "man slurmdbd" for more information.
 Also see \fBDefaultStorageType\fR.
 
 .TP
-- 
GitLab