From f5f13a68576c2e83e7140454aa071092ffc005f6 Mon Sep 17 00:00:00 2001
From: Moe Jette <jette1@llnl.gov>
Date: Wed, 25 Jun 2008 00:15:31 +0000
Subject: [PATCH] add draft scheduling policy document

---
 doc/html/sched_policy.shtml | 109 ++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)
 create mode 100644 doc/html/sched_policy.shtml

diff --git a/doc/html/sched_policy.shtml b/doc/html/sched_policy.shtml
new file mode 100644
index 00000000000..b7a23b530b7
--- /dev/null
+++ b/doc/html/sched_policy.shtml
@@ -0,0 +1,109 @@
+<!--#include virtual="header.txt"-->
+
+<h1>Scheduling Policy</h1>
+
+<p>SLURM scheduling policy support was significantly changed
+in version 1.3 in order to take advantage of the database
+integration used for storing accounting information.
+This document describes the capabilities available in
+SLURM version 1.3.4.
+New features are under active development.
+Familiarity with SLURM's <a href="accounting">Accounting</a> web page
+is strongly recommended before using this document.</p>
+
+<h2>Configuration</h2>
+
+<p>Scheduling policy information must be stored in a database
+as specified by the <b>AccountingStorageType</b> configuration parameter
+in the <b>slurm.conf</b> configuration file.
+Information can be recorded in either <a href="http://www.mysql.com/">MySQL</a>
+or <a href="http://www.postgresql.org/">PostgreSQL</a>.
+For security and performance reasons, the use of
+SlurmDBD (SLURM Database Daemon) as a front-end to the
+database is strongly recommended.
+SlurmDBD uses a SLURM authentication plugin (e.g. Munge) and
+an existing SLURM accounting storage plugin to maximize code reuse.
+SlurmDBD also uses data caching and prioritization of pending requests
+in order to optimize performance.
+While SlurmDBD relies upon existing SLURM plugins for authentication
+and database use, the other SLURM commands and daemons are not required
+on the host where SlurmDBD is installed.
+Only the <i>slurmdbd</i> and <i>slurm-plugins</i> RPMs are required
+for SlurmDBD execution.</p>
+
+<p>Both accounting and scheduling policy are configured based upon
+an <i>association</i>. An <i>association</i> is a 4-tuple consisting
+of the cluster name, bank account, user and (optionally) the SLURM
+partition.
+In order to enforce scheduling policy, set the value of
+<b>AccountingStorageEnforce</b> to "1" in <b>slurm.conf</b>.
+This prevents users from running any jobs without a valid
+<i>association</i> record in the database and enforces the scheduling
+policy limits that have been configured.</p>
+
+<h2>Tools</h2>
+
+<p>The tool used to manage accounting policy is <i>sacctmgr</i>.
+It can be used to create and delete cluster, user, bank account,
+and partition records plus their combined <i>association</i> record.
+See <i>man sacctmgr</i> for details on this tool and examples of
+its use.</p>
+
+<p>A web interface with graphical output is currently under development.</p>
+
+<p>Changes made to the scheduling policy are uploaded to
+the SLURM control daemons on the various clusters and take effect
+immediately. When an association is deleted, all running and pending
+jobs belonging to that association are immediately cancelled.
+When limits are lowered, running jobs will not be cancelled to
+satisfy the new limits, but the new lower limits will be enforced.</p>
+
+<h2>Policies supported</h2>
+
+<p>A limited subset of scheduling policy options is currently
+supported.
+The available options are expected to increase as development
+continues.
+Most of these scheduling policy options are available not only
+for an association, but also for each cluster and account.
+If a new association is created for some user and a scheduling
+policy option is not specified, the default will be the value set
+for the cluster plus account pair; if that is not specified, the
+value set for the cluster; and if that is not specified, no limit
+will apply.</p>
+
+<p>Currently available scheduling policy options:</p>
+<ul>
+<li><b>MaxJobs</b> Maximum number of running jobs for this association</li>
+<li><b>MaxNodes</b> Maximum number of nodes for any single job in this association</li>
+<li><b>MaxWall</b> Maximum wall clock time limit for any single job in this association</li>
+</ul>
+
+<p>The <b>MaxNodes</b> and <b>MaxWall</b> options already exist in
+SLURM's configuration on a per-partition basis, but these options
+provide the ability to establish limits on a per-user basis.
+The <b>MaxJobs</b> option provides an entirely new mechanism
+for SLURM to control the workload any individual may place on
+a cluster in order to achieve some balance between users.</p>
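+
+<p>As a rough sketch of how these pieces fit together (the cluster,
+account and user names below are hypothetical, and the exact syntax
+should be verified against the man pages for the installed version),
+<b>slurm.conf</b> might contain:</p>
+<pre>
+# Store accounting and scheduling policy information through SlurmDBD
+AccountingStorageType=accounting_storage/slurmdbd
+# Require a valid association for every job and enforce its limits
+AccountingStorageEnforce=1
+</pre>
+<p>and per-association limits might then be set with <i>sacctmgr</i>:</p>
+<pre>
+# Limit user brian, in account physics on cluster snowflake, to 20
+# running jobs, 32 nodes per job and a 12 hour wall clock limit per job
+sacctmgr modify user where name=brian cluster=snowflake account=physics \
+         set MaxJobs=20 MaxNodes=32 MaxWall=12:00:00
+</pre>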
+
+<p>The next scheduling policy feature expected to be added is
+fair-share scheduling based upon the hierarchical bank account
+data that is already maintained in the SLURM database.
+The priorities of pending jobs will be adjusted in order to
+deliver resources in proportion to each association's fair-share.
+Consider the trivial example of a single bank account with
+two users named Alice and Brian.
+We might allocate Alice 60 percent of the resources and Brian the
+remaining 40 percent.
+If Alice has actually used 80 percent of the available resources in
+the recent past, then Brian's pending jobs will automatically be
+given a higher priority than Alice's in order to deliver resources
+in proportion to the fair-share target.
+The time window considered in fair-share scheduling will be
+configurable, as will the relative importance of job age (time
+waiting to run), but this example illustrates the concepts involved.</p>
+
+<p style="text-align: center;">Last modified 24 June 2008</p>
+
+</ul></body></html>
--
GitLab