From f5f13a68576c2e83e7140454aa071092ffc005f6 Mon Sep 17 00:00:00 2001
From: Moe Jette <jette1@llnl.gov>
Date: Wed, 25 Jun 2008 00:15:31 +0000
Subject: [PATCH] add draft scheduling policy document

---
 doc/html/sched_policy.shtml | 109 ++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)
 create mode 100644 doc/html/sched_policy.shtml

diff --git a/doc/html/sched_policy.shtml b/doc/html/sched_policy.shtml
new file mode 100644
index 00000000000..b7a23b530b7
--- /dev/null
+++ b/doc/html/sched_policy.shtml
@@ -0,0 +1,109 @@
+<!--#include virtual="header.txt"-->
+
+<h1>Scheduling Policy</h1>
+
+<p>SLURM scheduling policy support was significantly changed
+in version 1.3 in order to take advantage of the database
+integration used for storing accounting information.
+This document describes the capabilities available in
+SLURM version 1.3.4.
+New features are under active development.
+Familiarity with SLURM's <a href="accounting">Accounting</a> web page
+is strongly recommended before using this document.</p>
+
+<h2>Configuration</h2>
+
+<p>Scheduling policy information must be stored in a database
+as specified by the <b>AccountingStorageType</b> configuration parameter
+in the <b>slurm.conf</b> configuration file.
+Information can be recorded in either <a href="http://www.mysql.com/">MySQL</a>
+or <a href="http://www.postgresql.org/">PostgreSQL</a>.
+For security and performance reasons, the use of
+SlurmDBD (SLURM Database Daemon) as a front-end to the
+database is strongly recommended.
+SlurmDBD uses a SLURM authentication plugin (e.g. Munge) and
+an existing SLURM accounting storage plugin to maximize code reuse.
+SlurmDBD also uses data caching and prioritization of pending requests
+in order to optimize performance.
+While SlurmDBD relies upon existing SLURM plugins for authentication
+and database use, the other SLURM commands and daemons are not required
+on the host where SlurmDBD is installed.
+Only the <i>slurmdbd</i> and <i>slurm-plugins</i> RPMs are required
+for SlurmDBD execution.</p>
+
+<p>Both accounting and scheduling policy are configured based upon
+an <i>association</i>. An <i>association</i> is a 4-tuple consisting
+of the cluster name, bank account, user and (optionally) the SLURM
+partition.
+In order to enforce scheduling policy, set the value of
+<b>AccountingStorageEnforce</b> to "1" in <b>slurm.conf</b>.
+This prevents users from running any jobs without a valid
+<i>association</i> record in the database and enforces the scheduling
+policy limits that have been configured.</p>
+
+<h2>Tools</h2>
+
+<p>The tool used to manage accounting policy is <i>sacctmgr</i>.
+It can be used to create and delete cluster, user, bank account,
+and partition records plus their combined <i>association</i> record.
+See <i>man sacctmgr</i> for details on this tool and examples of
+its use.</p>
+
+<p>A web interface with graphical output is currently under development.</p>
+
+<p>Changes made to the scheduling policy are uploaded to
+the SLURM control daemons on the various clusters and take effect
+immediately. When an association is deleted, all running and pending
+jobs belonging to that association are immediately cancelled.
+When limits are lowered, running jobs will not be cancelled to
+satisfy the new limits, but the new lower limits will be enforced.</p>
+
+<h2>Policies supported</h2>
+
+<p>A limited subset of scheduling policy options is currently
+supported.
+The available options are expected to increase as development
+continues.
+Most of these scheduling policy options are available not only
+for an association, but also for each cluster and account.
+If a new association is created for some user and a scheduling
+policy option is not specified, the default will be the value set
+for the cluster plus account pair; if that is not specified, the
+value set for the cluster; and if that is not specified, no limit
+will apply.</p>
+
+<p>Currently available scheduling policy options:</p>
+<ul>
+<li><b>MaxJobs</b> Maximum number of running jobs for this association</li>
+<li><b>MaxNodes</b> Maximum number of nodes for any single job in this association</li>
+<li><b>MaxWall</b> Maximum wall clock time limit for any single job in this association</li>
+</ul>
+
+<p>The <b>MaxNodes</b> and <b>MaxWall</b> options already exist in
+SLURM's configuration on a per-partition basis, but these options
+provide the ability to establish limits on a per-user basis.
+The <b>MaxJobs</b> option provides an entirely new mechanism
+for SLURM to control the workload any individual may place on
+a cluster in order to achieve some balance between users.</p>
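+
+<p>As a rough sketch of how these pieces fit together (the cluster,
+account and user names below are hypothetical, and the exact syntax
+should be verified against the man pages for the installed version),
+<b>slurm.conf</b> might contain:</p>
+<pre>
+# Store accounting and scheduling policy information through SlurmDBD
+AccountingStorageType=accounting_storage/slurmdbd
+# Require a valid association for every job and enforce its limits
+AccountingStorageEnforce=1
+</pre>
+<p>and per-association limits might then be set with <i>sacctmgr</i>:</p>
+<pre>
+# Limit user brian, in account physics on cluster snowflake, to 20
+# running jobs, 32 nodes per job and a 12 hour wall clock limit per job
+sacctmgr modify user where name=brian cluster=snowflake account=physics \
+         set MaxJobs=20 MaxNodes=32 MaxWall=12:00:00
+</pre>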
+
+<p>The next scheduling policy feature expected to be added is
+fair-share scheduling based upon the hierarchical bank account
+data that is already maintained in the SLURM database.
+The priorities of pending jobs will be adjusted in order to
+deliver resources in proportion to each association's fair-share.
+Consider the trivial example of a single bank account with
+two users named Alice and Brian.
+We might allocate Alice 60 percent of the resources and Brian the
+remaining 40 percent.
+If Alice has actually used 80 percent of the available resources in
+the recent past, then Brian's pending jobs will automatically be
+given a higher priority than Alice's in order to deliver resources
+in proportion to the fair-share target.
+The time window considered in fair-share scheduling will be
+configurable, as will the relative importance of job age (time
+waiting to run), but this example illustrates the concepts involved.</p>
+
+<p style="text-align: center;">Last modified 24 June 2008</p>
+
+</ul></body></html>
--
GitLab