diff --git a/doc/html/Makefile.am b/doc/html/Makefile.am
index df26b5f1429901ede39f688b6888c978eac6f6de..84a9a8373635a33974ce34b8b223ca8abb5deb0c 100644
--- a/doc/html/Makefile.am
+++ b/doc/html/Makefile.am
@@ -53,12 +53,10 @@ generated_html = \
 	fair_tree.html \
 	mail.html \
 	man_index.html \
-	maui.html \
 	mc_support.html \
 	mcs.html \
 	mcs_plugins.html \
 	meetings.html \
-	moab.html \
 	mpi_guide.html \
 	mpiplugins.html \
 	multi_cluster.html \
diff --git a/doc/html/Makefile.in b/doc/html/Makefile.in
index fb8f5c1d7b8fa6ac21bf866b2c1829396d0dad40..3d78f8682c8064a8b373c4c5b3a0fb0fa01cc722 100644
--- a/doc/html/Makefile.in
+++ b/doc/html/Makefile.in
@@ -507,12 +507,10 @@ generated_html = \
 	fair_tree.html \
 	mail.html \
 	man_index.html \
-	maui.html \
 	mc_support.html \
 	mcs.html \
 	mcs_plugins.html \
 	meetings.html \
-	moab.html \
 	mpi_guide.html \
 	mpiplugins.html \
 	multi_cluster.html \
diff --git a/doc/html/documentation.shtml b/doc/html/documentation.shtml
index ed25eb9561178538f7acc7db4f4cd3baa7ec3906..abe7cf89249ae3f1baca01e635e8a513246a08d0 100644
--- a/doc/html/documentation.shtml
+++ b/doc/html/documentation.shtml
@@ -81,8 +81,6 @@ may be found in the <a href="http://slurm.schedmd.com/archive/">archive</a>.
 </ul>
 <li>External Schedulers</li>
 <ul>
-<li><a href="maui.html">Maui Scheduler Integration Guide</a></li>
-<li><a href="moab.html">Moab Cluster Suite Integration Guide</a></li>
 <li><a href="http://docs.hp.com/en/5991-4847/ch09s02.html">Submitting Jobs through LSF</a></li>
 </ul>
 <li>Specific Systems</li>
diff --git a/doc/html/maui.shtml b/doc/html/maui.shtml
deleted file mode 100644
index 2d7c8cd59f8370b6f5e24d7d33edf5694d6acd0d..0000000000000000000000000000000000000000
--- a/doc/html/maui.shtml
+++ /dev/null
@@ -1,185 +0,0 @@
-<!--#include virtual="header.txt"-->
-
-<h1>Maui Scheduler Integration Guide</h1>
-<h2>Overview</h2>
-<p>Maui configuration is quite complicated and is beyond the scope
-of any documents we could supply with Slurm.
-The best resource for Maui configuration information is the
-online documentation at Cluster Resources Inc.:
-<a href="http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml">
-http://www.clusterresources.com/products/maui/docs/mauiadmin.shtml</a>.</p>
-
-<p>Maui uses Slurm commands and a wiki interface to communicate. See the
-<a href="http://www.clusterresources.com/products/mwm/docs/wiki/wikiinterface.shtml">
-Wiki Interface Specification</a> and
-<a href="http://www.clusterresources.com/products/mwm/docs/wiki/socket.shtml">
-Wiki Socket Protocol Description</a> for more information.</p>
-
-<h2>Configuration</h2>
-<p>First, download the Maui scheduler kit from their web site
-<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
-http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php</a>.
-Note: maui-3.2.6p9 has been validated with Slurm; other versions
-should also work properly.
-We anticipate the Maui Scheduler to be upgraded to utilize a more
-extensive interface to Slurm in early 2007.
-The newer Maui Scheduler will be able to utilize a more full-featured
-interface to Slurm as described in the
-<a href="moab.html">Moab Cluster Suite Integration Guide</a>.
-This guide will be updated at that time.</p>
-
-<h3>Maui configuration</h3>
-<p>Make sure that Slurm is installed and running before building Maui.
-Then build Maui from its source distribution. This is a two-step process:</p>
-<ol>
-<li>./configure --with-key=42 --with-wiki
-<li>gmake
-</ol>
-<p>The key of 42 is arbitrary. You can use any value, but it must be
-a number no larger than 4,294,967,295 (2^32 - 1), and you must specify
-the same value as a Slurm configuration parameter described below.
-Maui developers have assured us the authentication key will eventually be
-set in a configuration file rather than at build time.</p>
-
-<p>Update the Maui configuration file <i>maui.conf</i> (copy the file
-maui-3.2.6p9/maui.cfg.dist to maui.conf). Add the following configuration
-parameters to maui.conf:</p>
-<pre>
-RMCFG[host] TYPE=WIKI
-RMPORT 7321 # selected port
-RMHOST host
-RMAUTHTYPE[host] CHECKSUM
-</pre>
-<p><i>host</i> is the hostname where the Slurm controller is running.
-This must match the value of <i>ControlMachine</i> configured in
-slurm.conf. Note that <i>localhost</i> does not work; if you run Maui
-and Slurm on the same machine, you must specify the actual host name.
-The above example uses a TCP port number of 7321 for
-communications between Slurm and Maui, but you can pick any port that
-is available and accessible. You can also set a polling interval with</p>
-<pre>
-RMPOLLINTERVAL 00:00:20
-</pre>
-<p>It may be desirable to have Maui poll Slurm quite often --
-in this case, every 20 seconds.
-Note that a job submitted to an idle cluster will not be initiated until
-the Maui daemon polls Slurm and decides to run it, so the value of
-RMPOLLINTERVAL should be set to a value appropriate for your site,
-balancing the desired system responsiveness against the overhead of
-executing the Maui daemons too frequently.</p>
-
-<p>In order for Maui to be able to access your Slurm partitions, you will
-need to define in maui.conf partitions with the same names as the Slurm
-partition(s). For example, if nodes "linux[0-3]" are in Slurm partition
-"PartA", slurm.conf includes a line of this sort:</p>
-<pre>
-PartitionName=PartA Default=yes Nodes=linux[0-3]
-</pre>
-<p>Then add the corresponding lines to maui.cfg:</p>
-<pre>
-PARTITIONMODE ON
-NODECFG[linux0] PARTITION=PartA
-NODECFG[linux1] PARTITION=PartA
-NODECFG[linux2] PARTITION=PartA
-NODECFG[linux3] PARTITION=PartA
-</pre>
-
-<p>Set the following environment variables and search path (csh syntax
-shown; a Bourne-shell sketch follows below):</p>
-<pre>
-set path=(/root/MAUI/maui-3.2.6p9/bin $path)
-setenv MAUIHOMEDIR /root/MAUI/maui-3.2.6p9
-</pre>
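For Bourne-style shells, an equivalent sketch, assuming the same
illustrative install prefix used in the csh example above:
<pre>
# Bourne-shell equivalent of the csh example above (paths are illustrative)
export PATH=/root/MAUI/maui-3.2.6p9/bin:$PATH
export MAUIHOMEDIR=/root/MAUI/maui-3.2.6p9
</pre>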
-
-<p class="footer"><a href="#top">top</a></p>
-
-<h3>Slurm configuration</h3>
-<p>Set the slurm.conf scheduler parameters as follows:</p>
-<pre>
-SchedulerType=sched/wiki
-SchedulerPort=7321
-SchedulerAuth=42 (for Slurm version 1.1 and earlier only)
-</pre>
-<p>In this case, "SchedulerAuth" has been set to 42, which was the
-authentication key specified when Maui was configured above.
-Just make sure the numbers match.</p>
-
-<p>For Slurm version 1.2 or higher, the authentication key
-is stored in a file specific to the wiki plugin named
-<i>wiki.conf</i>.
-This file should be protected from reading by users.
-It only needs to be readable by <i>SlurmUser</i> (as configured
-in <i>slurm.conf</i>) and only needs to exist on computers
-where the <i>slurmctld</i> daemon executes.
-More information about wiki.conf is available in
-a man page distributed with Slurm, although that
-includes a description of keywords presently only
-supported by the sched/wiki2 plugin for use with the
-Moab Scheduler.</p>
-
-<p>Slurm has some internal scheduling capabilities which are not compatible
-with Maui.
-<ol>
-<li>Do not configure Slurm to use the "priority/multifactor" plugin,
-as it would set job priorities which conflict with those set by Maui.</li>
-<li>Do not use Slurm's <a href="reservations.html">reservation</a>
-mechanism; use that offered by Maui.</li>
-<li>Do not use Slurm's <a href="resource_limits.html">resource limits</a>,
-as they may conflict with those managed by Maui.</li>
-</ol></p>
-
-<p>The wiki.conf keywords currently supported by Maui include:</p>
-
-<p><b>AuthKey</b> is a DES-based encryption key used to sign
-communications between Slurm and Maui or Moab.
-The use of this key is essential to ensure that a user cannot
-build his own program to cancel other users' jobs in
-Slurm.
-The key should be no larger than a 32-bit unsigned integer and must match
-the encryption key in Maui (<i>--with-key</i> on the
-configure line) or Moab (<i>KEY</i> parameter in the
-<i>moab-private.cfg</i> file).
-Note that Slurm's wiki plugin does not include a mechanism
-to submit new jobs, so even without this key, nobody could
-run jobs as another user.</p>
-
-<p><b>ExcludePartitions</b> is used to identify partitions
-whose jobs are to be scheduled directly by Slurm rather
-than Maui.
-These jobs will be scheduled on a First-Come-First-Served
-basis.
-This may provide faster response times than Maui scheduling.
-Maui will account for and report the jobs, but their initiation
-will be outside of Maui's control.
-Note that Maui's controls for resource reservations, fair share
-scheduling, etc. will not apply to the initiation of these jobs.
-If more than one partition is to be scheduled directly by
-Slurm, use a comma separator between their names.</p>
-
-<p><b>HidePartitionJobs</b> identifies partitions whose jobs are not
-to be reported to Maui.
-These jobs will not be accounted for or otherwise visible to Maui.
-Any partitions listed here must also be listed in <b>ExcludePartitions</b>.
-If more than one partition is to have its jobs hidden, use a comma
-separator between their names.</p>
-
-<p>Here is a sample <i>wiki.conf</i> file:</p>
-<pre>
-# wiki.conf - Slurm's wiki plugin configuration file
-#
-# Matches Maui's --with-key configuration parameter
-AuthKey=42
-#
-# Slurm to directly schedule "debug" partition
-# and hide the jobs from Maui
-ExcludePartitions=debug
-HidePartitionJobs=debug
-</pre>
-
-<p class="footer"><a href="#top">top</a></p>
-
-<p style="text-align:center;">Last modified 15 April 2015</p>
-
-<!--#include virtual="footer.txt"-->
diff --git a/doc/html/moab.shtml b/doc/html/moab.shtml
deleted file mode 100644
index 5091676b9c0923e382a51f2e160f6d3f6238c2db..0000000000000000000000000000000000000000
--- a/doc/html/moab.shtml
+++ /dev/null
@@ -1,286 +0,0 @@
-<!--#include virtual="header.txt"-->
-
-<h1>Moab Cluster Suite Integration Guide</h1>
-<h2>Overview</h2>
-<p>Moab Cluster Suite configuration is quite complicated and is
-beyond the scope of any documents we could supply with Slurm.
-The best resource for Moab configuration information is the
-online documentation at Cluster Resources Inc.:
-<a href="http://www.clusterresources.com/products/mwm/docs/slurmintegration.shtml">
-http://www.clusterresources.com/products/mwm/docs/slurmintegration.shtml</a>.</p>
-
-<p>Moab uses Slurm commands and a wiki interface to communicate.
-See the
-<a href="http://www.clusterresources.com/products/mwm/docs/wiki/wikiinterface.shtml">
-Wiki Interface Specification</a> and
-<a href="http://www.clusterresources.com/products/mwm/docs/wiki/socket.shtml">
-Wiki Socket Protocol Description</a> for more information.</p>
-
-<p>Somewhat more current information about Slurm's implementation of the
-wiki interface was developed by Michal Novotny (Masaryk University, Czech
-Republic) and can be found <a href="http://www.fi.muni.cz/~xnovot19/wiki2.html">here</a>.</p>
-
-<h2>Configuration</h2>
-<p>First, download the Moab scheduler kit from their web site
-<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
-http://www.clusterresources.com/pages/products/moab-cluster-suite.php</a>.<br>
-<b>Note:</b> Use Moab version 5.0.0 or higher and Slurm version 1.1.28
-or higher.</p>
-
-<h3>Slurm configuration</h3>
-
-<h4>slurm.conf</h4>
-<p>Set the <i>slurm.conf</i> scheduler parameters as follows:</p>
-<pre>
-SchedulerType=sched/wiki2
-SchedulerPort=7321
-</pre>
-<p>Running multiple jobs per node can be accomplished in two different
-ways.
-The <i>SelectType=select/cons_res</i> parameter can be used to let
-Slurm allocate the individual processors, memory, and other
-consumable resources.
-Alternately, <i>SelectType=select/linear</i> or
-<i>SelectType=select/bluegene</i> can be used with the
-<i>OverSubscribe=yes</i> or <i>OverSubscribe=force</i> parameter in
-partition configuration specifications, as sketched below.</p>
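A minimal illustrative sketch of the two approaches; the node and
partition names here are assumptions, not taken from this guide:
<pre>
# Option 1: let Slurm allocate individual processors and memory
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory

# Option 2: whole-node allocation, oversubscribed per partition
#SelectType=select/linear
#PartitionName=batch Nodes=tux[0-31] OverSubscribe=FORCE
</pre>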
-
-<p>The default value of <i>SchedulerPort</i> is 7321.</p>
-
-<p>Slurm has some internal scheduling capabilities which are not compatible
-with Moab.
-<ol>
-<li>Do not configure Slurm to use the "priority/multifactor" plugin,
-as it would set job priorities which conflict with those set by Moab.</li>
-<li>Do not use Slurm's <a href="reservations.html">reservation</a>
-mechanism; use that offered by Moab.</li>
-<li>Do not use Slurm's <a href="resource_limits.html">resource limits</a>,
-as they may conflict with those managed by Moab.</li>
-</ol></p>
-
-<h4>Slurm commands</h4>
-<p>Note that the <i>srun --immediate</i> option is not compatible
-with Moab.
-All jobs must wait for Moab to schedule them rather than being
-scheduled immediately by Slurm.</p>
-
-<a name="wiki.conf"><h4>wiki.conf</h4></a>
-<p>Slurm's wiki configuration is stored in a file
-specific to the wiki plugin named <i>wiki.conf</i>.
-This file should be protected from reading by users.
-It only needs to be readable by <i>SlurmUser</i> (as configured
-in <i>slurm.conf</i>) and only needs to exist on computers
-where the <i>slurmctld</i> daemon executes.
-More information about wiki.conf is available in
-a man page distributed with Slurm.</p>
-
-<p>The currently supported wiki.conf keywords include:</p>
-
-<p><b>AuthKey</b> is a DES-based encryption key used to sign
-communications between Slurm and Maui or Moab.
-The use of this key is essential to ensure that a user cannot
-build his own program to cancel other users' jobs in
-Slurm.
-The key should be no larger than a 32-bit unsigned integer and must match
-the encryption key in Maui (<i>--with-key</i> on the
-configure line) or Moab (<i>KEY</i> parameter in the
-<i>moab-private.cfg</i> file).
-Note that Slurm's wiki plugin does not include a mechanism
-to submit new jobs, so even without this key, nobody can
-run jobs as another user.</p>
-
-<p><b>EPort</b> is an event notification port in Moab.
-When a job is submitted to or terminates in Slurm,
-Moab is sent a message on this port to begin an attempt
-to schedule the computer.
-This numeric value should match <i>EPORT</i> configured
-in the <i>moab.cfg</i> file.</p>
-
-<p><b>EHost</b> is the event notification host for Moab.
-This identifies the computer on which the Moab daemon
-executes and which should be notified of events.
-By default, EHost will be identical in value to the
-ControlAddr configured in slurm.conf.</p>
-
-<p><b>EHostBackup</b> is the event notification backup host for Moab.
-It names the computer on which the backup Moab server executes
-and is used in establishing a communications path for event notification.
-By default, EHostBackup will be identical in value to the
-BackupAddr configured in slurm.conf.</p>
-
-<p><b>ExcludePartitions</b> is used to identify partitions
-whose jobs are to be scheduled directly by Slurm rather
-than Moab.
-This only affects jobs which are submitted using Slurm
-commands (i.e. srun, salloc or sbatch, NOT msub from Moab).
-These jobs will be scheduled on a First-Come-First-Served
-basis.
-This may provide faster response times than Moab scheduling.
-Moab will account for and report the jobs, but their initiation
-will be outside of Moab's control.
-Note that Moab's controls for resource reservations, fair share
-scheduling, etc. will not apply to the initiation of these jobs.
-If more than one partition is to be scheduled directly by
-Slurm, use a comma separator between their names.</p>
-
-<p><b>HidePartitionJobs</b> identifies partitions whose jobs are not
-to be reported to Moab.
-These jobs will not be accounted for or otherwise visible to Moab.
-Any partitions listed here must also be listed in <b>ExcludePartitions</b>.
-If more than one partition is to have its jobs hidden, use a comma
-separator between their names.</p>
-
-<p><b>HostFormat</b> controls the format of job task lists built
-by Slurm and reported to Moab.
-The default value is "0", for which each host name is listed
-individually, once per processor (e.g. "tux0:tux0:tux1:tux1:...").
-A value of "1" uses Slurm hostlist expressions with processor
-counts (e.g. "tux[0-16]*2").
-This is currently experimental.</p>
-
-<p><b>JobAggregationTime</b> is used to avoid notifying Moab
-of large numbers of events occurring at about the same time.
-If an event occurs within this number of seconds since Moab was
-last notified of an event, another notification is not sent.
-This should be an integer number of seconds.
-The default value is 10 seconds.
-The value should match <i>JOBAGGREGATIONTIME</i> configured
-in the <i>moab.cfg</i> file.</p>
-
-<p><b>JobPriority</b> controls the scheduling of newly arriving jobs
-in Slurm. Possible values are "hold" and "run", with "hold" being the
-default. When <i>JobPriority=hold</i>, Slurm places all newly arriving
-jobs in a HELD state (priority = 0) and lets Moab decide when and
-where to run the jobs. When <i>JobPriority=run</i>, Slurm controls
-when and where to run jobs.
-<b>Note:</b> The "run" option implementation has yet to be completed.
-Once the "run" option is available, Moab will be able to modify the
-priorities of pending jobs to re-order the job queue.</p>
-
-<h4>Sample <i>wiki.conf</i> file</h4>
-<pre>
-# wiki.conf - Slurm's wiki plugin configuration file
-#
-# Matches KEY in moab-private.cfg
-AuthKey=123456789
-#
-# Slurm to directly schedule "debug" partition
-# and hide the jobs from Moab
-ExcludePartitions=debug
-HidePartitionJobs=debug
-#
-# Have Moab control job scheduling
-JobPriority=hold
-#
-# Moab event notification port, matches EPORT in moab.cfg
-EPort=15017
-# Moab event notification host, where the Moab daemon runs
-#EHost=tux0
-#
-# Moab event notification throttle,
-# matches JOBAGGREGATIONTIME in moab.cfg (seconds)
-JobAggregationTime=15
-</pre>
-
-<h3>Moab Configuration</h3>
-
-<p>Moab has support for Slurm's WIKI interface by default.
-Specify this interface in the <i>moab.cfg</i> file as follows:</p>
-<pre>
-SCHEDCFG[base] MODE=NORMAL
-RMCFG[slurm] TYPE=WIKI:Slurm AUTHTYPE=CHECKSUM
-</pre>
-<p>In <i>moab-private.cfg</i>, specify the private key as follows:</p>
-<pre>
-CLIENTCFG[RM:slurm] KEY=123456789
-</pre>
-<p>Ensure that this file is protected from viewing by users.</p>
-
-<h3>Job Submission</h3>
-
-<p>Jobs can either be submitted to Moab or directly to Slurm.
-Moab's <i>msub</i> command has a <i>--slurm</i> option that can
-be placed at the <b>end</b> of the command line; the options
-that follow it will be passed to Slurm. This can be used to invoke
-Slurm options which are not directly supported by Moab (e.g.
-system images to boot, task distribution specification across
-sockets, cores, and hyperthreads, etc.).
-For example:</p>
-<pre>
-msub my.script -l walltime=600,nodes=2 \
-  --slurm --linux-image=/bgl/linux_image2
-</pre>
-
-<h3>User Environment</h3>
-
-<p>When a user submits a job to Moab, that job could potentially
-execute on a variety of computers, so it is typically necessary
-that the user's environment on the execution host be loaded.
-Moab relies upon Slurm to perform this action, using the
-<i>--get-user-env</i> option for the salloc, sbatch and srun commands.
-The Slurm command then executes, as user root, a command of this sort:</p>
-<pre>
-/bin/su - <user> -c \
-  "/bin/echo BEGIN; /bin/env; /bin/echo FINI"
-</pre>
-<p>For typical batch jobs, the job transfer from Moab to
-Slurm is performed using <i>sbatch</i> and occurs instantaneously.
-The environment is loaded by a Slurm daemon (slurmd) when the
-batch job begins execution.
-For interactive jobs (<i>msub -I ...</i>), the job transfer
-from Moab to Slurm cannot be completed until the environment
-variables are loaded, during which time the Moab daemon is
-completely non-responsive.
-To ensure that Moab remains operational, Slurm will abort the above
-command within a configurable period of time, look for a cache
-file with the user's environment, and use that if found.
-Otherwise an error is reported to Moab.
-The time permitted for loading the current environment
-before searching for a cache file is configurable using
-the <i>GetEnvTimeout</i> parameter in Slurm's configuration
-file, slurm.conf. A value of zero results in immediately
-using the cache file. The default value is 2 seconds.</p>
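For example, a sketch of the relevant slurm.conf line; the value of
10 seconds is an assumption chosen for illustration:
<pre>
# Wait up to 10 seconds for the user environment to load before
# falling back to the cache file (0 = use the cache immediately)
GetEnvTimeout=10
</pre>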
-
-<p>We have provided a simple program that can be used to build
-cache files for users. The program can be found in the Slurm
-distribution at <i>contribs/env_cache_builder.c</i>.
-This program can support a longer timeout than Moab, but
-will report errors for users whose environment file
-cannot be automatically built (typically because the user's
-"dot" files spawn another shell, so the desired command
-never executes).
-For such users, you can manually build a cache file,
-as sketched below.</p>
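A hypothetical example of a manual build; the user name is illustrative,
and the cache directory must match the one reported by the program, such
as the /usr/local/tmp/slurm/atlas/env_cache directory shown below:
<pre>
# Run 'env' as the user and write the output to a file named
# after that user in the environment cache directory
/bin/su - alice -c "/bin/env" > /usr/local/tmp/slurm/atlas/env_cache/alice
</pre>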
-<p>You may want to execute this program periodically to capture
-information for new users or changes in existing users'
-environments.
-A sample execution is shown below.
-Run this on the same host as the Moab daemon and execute it as user root.</p>
-
-<pre>
-bash-3.00# make -f /dev/null env_cache_builder
-cc env_cache_builder.c -o env_cache_builder
-bash-3.00# ./env_cache_builder
-Building user environment cache files for Moab/Slurm.
-This will take a while.
-
-Processed 100 users...
-***ERROR: Failed to get current user environment variables for alice
-***ERROR: Failed to get current user environment variables for brian
-Processed 200 users...
-Processed 300 users...
-***ERROR: Failed to get current user environment variables for christine
-***ERROR: Failed to get current user environment variables for david
-
-Some user environments could not be loaded.
-Manually run 'env' for those 4 users.
-Write the output to a file with the same name as the user in the
-  /usr/local/tmp/slurm/atlas/env_cache directory
-</pre>
-
-<p class="footer"><a href="#top">top</a></p>
-
-<p style="text-align:center;">Last modified 31 March 2016</p>
-
-<!--#include virtual="footer.txt"-->