From dbf81c476e7d0d52f64a3eeb06db14c929e309c6 Mon Sep 17 00:00:00 2001
From: Moe Jette <jette1@llnl.gov>
Date: Wed, 27 May 2009 21:24:58 +0000
Subject: [PATCH] svn merge -r17609:17615 https://eris.llnl.gov/svn/slurm/branches/slurm-2.0

---
 NEWS                        |  1 +
 auxdir/x_ac_readline.m4     |  2 +-
 configure                   |  2 +-
 doc/html/power_save.shtml   | 30 ++++++++++++++++++++++++++++--
 doc/html/publications.shtml | 11 +++++++++--
 5 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/NEWS b/NEWS
index 8be03611fd6..840848c24a2 100644
--- a/NEWS
+++ b/NEWS
@@ -349,6 +349,7 @@ documents those changes that are of interest to users and admins.
 
 * Changes in SLURM 1.3.17
 =========================
+ -- Fix bug in configure script that can clear user specified LIBS.
 
 * Changes in SLURM 1.3.16
 =========================
diff --git a/auxdir/x_ac_readline.m4 b/auxdir/x_ac_readline.m4
index fc6b6389209..d9dd393c486 100644
--- a/auxdir/x_ac_readline.m4
+++ b/auxdir/x_ac_readline.m4
@@ -39,7 +39,7 @@ AC_DEFUN([X_AC_READLINE],
 #include <readline/history.h>]], [[
 char *line = readline("in:");]])],[AC_DEFINE([HAVE_READLINE], [1],
 [Define if you are compiling with readline.])],[READLINE_LIBS=""])
- LIBS="$savedLIBS"
+ LIBS="$saved_LIBS"
 fi
 AC_SUBST(READLINE_LIBS)
 ])
diff --git a/configure b/configure
index 512a534695b..0a25a7fcebf 100755
--- a/configure
+++ b/configure
@@ -27102,7 +27102,7 @@ fi
 
 rm -f core conftest.err conftest.$ac_objext conftest_ipa8_conftest.oo \
 conftest$ac_exeext conftest.$ac_ext
- LIBS="$savedLIBS"
+ LIBS="$saved_LIBS"
 fi
 
 
diff --git a/doc/html/power_save.shtml b/doc/html/power_save.shtml
index 27ba012f0f2..aa29216ce51 100644
--- a/doc/html/power_save.shtml
+++ b/doc/html/power_save.shtml
@@ -1,6 +1,7 @@
 <!--#include virtual="header.txt"-->
 
 <h1>Power Saving Guide</h1>
+
 <p>SLURM provides an integrated power saving mechanism for idle nodes.
 Nodes that remain idle for an configurable period of time can be placed
 in a power saving mode.
@@ -19,6 +20,7 @@ SLURM's support to increase power demands in a gradual fashion.</p>
 
 
 <h2>Configuration</h2>
+
 <p>A great deal of flexibility is offered in terms of when and how
 idle nodes are put into or removed from power save mode.
 Note that the SLURM control daemon, <i>slurmctld</i>, must be
@@ -168,7 +170,31 @@ support. You can also configure SLURM with programs that perform no aciton
 as <b>SuspendProgram</b> and <b>ResumeProgram</b> to assess the potential
 impact of power saving mode before enabling it.</p>
 
-<h2>Fault tolerance</h2>
+<h2>Use of Allocations</h2>
+
+<p>A resource allocation request will be granted as soon as resources
+are selected for use, possibly before the nodes are all available
+for use.
+The launching of job steps will be delayed until the required nodes
+have been restored to service (it prints a warning about waiting for
+nodes to become available and periodically retries until they are
+available).</p>
+
+<p>In the case of an <i>sbatch</i> command, the batch program will start
+when node zero of the allocation is ready for use and pre-processing can
+be performed as needed before using <i>srun</i> to launch job steps.
+The operation of <i>salloc</i> and <i>srun</i> follow a similar pattern
+of getting an job allocation at one time, but possibly being unable to
+launch job steps until later.
+If <i>ssh</i> or some other tools is used by <i>salloc</i> it may be
+desirable to execute "<i>srun /bin/true</i>" or some other command
+first to insure that all nodes are booted and ready for use.
+We plan to add a job and node state of <i>CONFIGURING</i> in SLURM
+version 2.1, which could be used to prevent salloc from executing
+any processes (including <i>ssh</i>) until all of the nodes are
+ready for use.</p>
+
+<h2>Fault Tolerance</h2>
 
 <p>If the <i>slurmctld</i> daemon is terminated gracefully, it will
 wait up to <b>SuspendTimeout</b> or <b>ResumeTimeout</b> (whichever
@@ -193,6 +219,6 @@ In order to minimize this risk, when the <i>slurmctld</i> daemon is started and
 node which should be allocated to a job fails to respond, the
 <b>ResumeProgram</b> will be executed (possibly for a second time).</p>
 
-<p style="text-align:center;">Last modified 26 May 2009</p>
+<p style="text-align:center;">Last modified 27 May 2009</p>
 
 <!--#include virtual="footer.txt"-->
diff --git a/doc/html/publications.shtml b/doc/html/publications.shtml
index 1b0d40bf937..49c34f8f758 100644
--- a/doc/html/publications.shtml
+++ b/doc/html/publications.shtml
@@ -40,13 +40,20 @@ M. Jette and M. Grondona,
 <i>Proceedings of ClusterWorld Conference and Expo</i>,
 San Jose, California, June 2003.</p>
 
-<b>SLURM: Simple Linux Utility for Resource Management</b>,
+<p><b>SLURM: Simple Linux Utility for Resource Management</b>,
 A. Yoo, M. Jette, and M. Grondona,
 <i>Job Scheduling Strategies for Parallel Processing</i>,
 volume 2862 of <i>Lecture Notes in Computer Science</i>,
 pages 44-60,
 Springer-Verlag, 2003.</p>
 
-<p style="text-align:center;">Last modified 19 March 2009</p>
+<h2>Interview</h2>
+
+<p><a href="http://www.rce-cast.com/index.php/Podcast/rce-10-slurm.html">
+RCE 10: SLURM (podcast)</a>:
+Brock Palen and Jeff Squyres speak with Morris Jette and
+Danny Auble of LLNL about SLURM.</p>
+
+<p style="text-align:center;">Last modified 27 May 2009</p>
 
 <!--#include virtual="footer.txt"-->
-- 
GitLab
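
Note on the LIBS fix: the auxdir/x_ac_readline.m4 and configure hunks in this patch change only the variable name used to restore LIBS, from savedLIBS to saved_LIBS. The sketch below illustrates why that could clear a user-specified LIBS, as the NEWS entry describes. It is not the macro itself. It assumes that the macro saves LIBS under the name saved_LIBS before the readline link test (that assignment is outside the hunks shown here), and the LIBS value is a hypothetical placeholder.

```shell
#!/bin/sh
# Hypothetical user-specified value, standing in for LIBS passed to configure.
LIBS="-L/opt/example/lib -lexample"

# Make sure the misspelled name is not inherited from the environment.
unset savedLIBS

# Assumption: the macro saves LIBS under this name before the link test.
saved_LIBS="$LIBS"

# The link test temporarily extends LIBS.
LIBS="-lreadline $LIBS"

# Old restore: savedLIBS was never assigned, so it expands to an empty string
# and the user's LIBS would be lost.
restored_old="$savedLIBS"

# Fixed restore: reads the variable that was actually saved.
restored_new="$saved_LIBS"

printf 'old restore: [%s]\n' "$restored_old"
printf 'new restore: [%s]\n' "$restored_new"
```

Running the sketch with sh prints an empty value for the old restore and the original value for the new one.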