Commit d27e244b authored by Moe Jette

Move mpi_guide to its own web page.

Add description of future Open MPI mode of operation.
parent 7892601c
@@ -28,6 +28,7 @@ generated_html = \
maui.html \
mc_support.html \
moab.html \
+mpi_guide.html \
mpiplugins.html \
news.html \
overview.html \
...
@@ -277,6 +277,7 @@ generated_html = \
maui.html \
mc_support.html \
moab.html \
+mpi_guide.html \
mpiplugins.html \
news.html \
overview.html \
...
@@ -8,7 +8,7 @@ Also see <a href="publications.html">Publications and Presentations</a>.
<ul>
<li><a href="quickstart.shtml">Quick Start User Guide</a></li>
<li><a href="mc_support.shtml">Support for Multi-core/Multi-threaded Architectures</a></li>
-<li><a href="quickstart.shtml#mpi">Guide to MPI Use</a></li>
+<li><a href="mpi_guide.shtml">MPI Use Guide</a></li>
<li>Specific Systems</li>
<ul>
<li><a href="bluegene.shtml">Blue Gene User and Administrator Guide</a></li>
...
<!--#include virtual="header.txt"-->
<h1>MPI Use Guide</h1>
<p>MPI use depends upon the type of MPI being used.
There are three fundamentally different modes of operation used
by these various MPI implementations.
<ol>
<li>SLURM directly launches the tasks and performs initialization
of communications (Quadrics MPI, MPICH2, MPICH-GM, MPICH-MX,
MVAPICH, MVAPICH2 and some MPICH1 modes).</li>
<li>SLURM creates a resource allocation for the job and then
mpirun launches tasks using SLURM's infrastructure (OpenMPI,
LAM/MPI and HP-MPI).</li>
<li>SLURM creates a resource allocation for the job and then
mpirun launches tasks using some mechanism other than SLURM,
such as SSH or RSH (BlueGene MPI and some MPICH1 modes).
These tasks are initiated outside of SLURM's monitoring
or control. SLURM's epilog should be configured to purge
these tasks when the job's allocation is relinquished. </li>
</ol>
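<p>The plugin SLURM uses for a job step is selected by the <i>MpiDefault</i>
parameter in <i>slurm.conf</i> and can be overridden with srun's
<i>--mpi</i> option. As a quick check of what is available on a system
(a sketch; this assumes your SLURM version supports the <i>list</i> keyword):
<pre>
$ srun --mpi=list     # list the MPI plugin types this SLURM installation supports
</pre>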
<p>Links to instructions for using several varieties of MPI
with SLURM are provided below.
<ul>
<li><a href="#bluegene_mpi">BlueGene MPI</a></li>
<li><a href="#hp_mpi">HP-MPI</a></li>
<li><a href="#lam_mpi">LAM/MPI</a></li>
<li><a href="#mpich1">MPICH1</a></li>
<li><a href="#mpich2">MPICH2</a></li>
<li><a href="#mpich_gm">MPICH-GM</a></li>
<li><a href="#mpich_mx">MPICH-MX</a></li>
<li><a href="#mvapich">MVAPICH</a></li>
<li><a href="#mvapich2">MVAPICH2</a></li>
<li><a href="#open_mpi">Open MPI</a></li>
<li><a href="#quadrics_mpi">Quadrics MPI</a></li>
</ul></p>
<hr size=4 width="100%">
<h2><a name="open_mpi" href="http://www.open-mpi.org/"><b>Open MPI</b></a></h2>
<p>Open MPI relies upon
SLURM to allocate resources for the job and then mpirun to initiate the
tasks. When using the <span class="commandline">salloc</span> command,
<span class="commandline">mpirun</span>'s -nolocal option is recommended.
For example:
<pre>
$ salloc -n4 sh # allocates 4 processors
# and spawns shell for job
&gt; mpirun -np 4 -nolocal a.out
&gt; exit # exits shell spawned by
# initial salloc command
</pre>
<p>Note that any direct use of <span class="commandline">srun</span>
will only launch one task per node when the LAM/MPI plugin is used.
To launch more than one task per node using the
<span class="commandline">srun</span> command, the <i>--mpi=none</i>
option will be required to explicitly disable the LAM/MPI plugin.</p>
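<p>For example, to run a non-MPI command across an Open MPI allocation
when LAM/MPI is the default plugin (the task count is illustrative):
<pre>
$ salloc -n4 sh
&gt; srun -n4 --mpi=none hostname   # without --mpi=none only one task
                                 # per node would be launched
&gt; exit
</pre>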
<h2>Future Use</h2>
<p>There is work underway in both SLURM and Open MPI to support task launch
using the <span class="commandline">srun</span> command.
We expect this mode of operation to be supported late in 2009,
and it may differ slightly from the description below.
It relies upon SLURM version 2.0 (or higher) managing
reservations of communication ports for Open MPI's use.
Specify the range of ports to be reserved in the <i>slurm.conf</i>
file using the <i>MpiParams</i> parameter.
For example: <i>MpiParams=ports:12000-12999</i>.
Then launch tasks using the <span class="commandline">srun</span> command
plus the option <i>--resv-ports</i>.
The ports reserved on every allocated node will be identified in an
environment variable as shown here:
<i>SLURM_STEP_RESV_PORTS=12000-12015</i></p>
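<p>Putting these pieces together, the anticipated usage might look like
the following (a sketch only; this mode is not yet released, and the
port range, task count and program name are illustrative):
<pre>
# In slurm.conf: reserve a block of ports for Open MPI's use
MpiParams=ports:12000-12999

# Launch the tasks with ports reserved on every allocated node;
# the reserved ports are reported to the step in SLURM_STEP_RESV_PORTS
$ salloc -n4 sh
&gt; srun -n4 --resv-ports a.out
&gt; exit
</pre>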
<hr size=4 width="100%">
<h2><a name="quadrics_mpi" href="http://www.quadrics.com/"><b>Quadrics MPI</b></a></h2>
<p>Quadrics MPI relies upon SLURM to
allocate resources for the job and <span class="commandline">srun</span>
to initiate the tasks. Build the MPI program in the normal manner,
then initiate it using a command line of this sort:</p>
<pre>
$ srun [options] &lt;program&gt; [program args]
</pre>
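<p>For instance, with an illustrative task count and program name:
<pre>
$ srun -n16 a.out arg1
</pre>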
<hr size=4 width="100%">
<h2><a name="lam_mpi" href="http://www.lam-mpi.org/"><b>LAM/MPI</b></a></h2>
<p>LAM/MPI relies upon the SLURM
<span class="commandline">salloc</span> or <span class="commandline">sbatch</span>
command to allocate resources. In either case, specify
the maximum number of tasks required for the job. Then execute the
<span class="commandline">lamboot</span> command to start lamd daemons.
<span class="commandline">lamboot</span> utilizes SLURM's
<span class="commandline">srun</span> command to launch these daemons.
Do not directly execute the <span class="commandline">srun</span> command
to launch LAM/MPI tasks. For example:
<pre>
$ salloc -n16 sh # allocates 16 processors
# and spawns shell for job
&gt; lamboot
&gt; mpirun -np 16 foo args
1234 foo running on adev0 (o)
2345 foo running on adev1
etc.
&gt; lamclean
&gt; lamhalt
&gt; exit # exits shell spawned by
# initial salloc command
</pre>
<p>Note that any direct use of <span class="commandline">srun</span>
will only launch one task per node when the LAM/MPI plugin is configured
as the default plugin. To launch more than one task per node using the
<span class="commandline">srun</span> command, the <i>--mpi=none</i>
option would be required to explicitly disable the LAM/MPI plugin
if that is the system default.</p>
<hr size=4 width="100%">
<h2><a name="hp_mpi" href="http://www.hp.com/go/mpi"><b>HP-MPI</b></a></h2>
<p>HP-MPI uses the
<span class="commandline">mpirun</span> command with the <b>-srun</b>
option to launch jobs. For example:
<pre>
$MPI_ROOT/bin/mpirun -TCP -srun -N8 ./a.out
</pre></p>
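<p>The same command can also be issued from within an existing SLURM
allocation, in which case the options following <b>-srun</b> are handed
to <span class="commandline">srun</span> inside that allocation
(a sketch; the node count and TCP interconnect selection are illustrative):
<pre>
$ salloc -N8 sh
&gt; $MPI_ROOT/bin/mpirun -TCP -srun ./a.out
&gt; exit
</pre>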
<hr size=4 width="100%">
<h2><a name="mpich2" href="http://www.mcs.anl.gov/research/projects/mpich2/"><b>MPICH2</b></a></h2>
<p>MPICH2 jobs are launched using the <b>srun</b> command. Just link your program with
SLURM's implementation of the PMI library so that tasks can communicate
host and port information at startup. (The system administrator can add
these options to the mpicc and mpif77 commands directly, so users will not
need to add them.) For example:
<pre>
$ mpicc -L&lt;path_to_slurm_lib&gt; -lpmi ...
$ srun -n20 a.out
</pre>
<b>NOTES:</b>
<ul>
<li>Some MPICH2 functions are not currently supported by the PMI
library integrated with SLURM</li>
<li>Set the environment variable <b>PMI_DEBUG</b> to a numeric value
of 1 or higher for the PMI library to print debugging information
(see the example after this list)</li>
</ul></p>
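<p>For example, to enable PMI debugging output for the run shown above
(the debug level of 1 is illustrative):
<pre>
$ PMI_DEBUG=1 srun -n20 a.out
</pre>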
<hr size=4 width="100%">
<h2><a name="mpich_gm" href="http://www.myri.com/scs/download-mpichgm.html"><b>MPICH-GM</b></a></h2>
<p>MPICH-GM jobs can be launched directly by the <b>srun</b> command.
SLURM's <i>mpichgm</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=mpichgm</i> in <b>slurm.conf</b>
or srun's <i>--mpi=mpichgm</i> option.
<pre>
$ mpicc ...
$ srun -n16 --mpi=mpichgm a.out
</pre>
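<p>Alternatively, a site can make this plugin the default so that the
<i>--mpi</i> option is not needed on every command line (a slurm.conf sketch):
<pre>
# slurm.conf
MpiDefault=mpichgm
</pre>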
<hr size=4 width="100%">
<h2><a name="mpich_mx" href="http://www.myri.com/scs/download-mpichmx.html"><b>MPICH-MX</b></a></h2>
<p>MPICH-MX jobs can be launched directly by the <b>srun</b> command.
SLURM's <i>mpichmx</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=mpichmx</i> in <b>slurm.conf</b>
or srun's <i>--mpi=mpichmx</i> option.
<pre>
$ mpicc ...
$ srun -n16 --mpi=mpichmx a.out
</pre>
<hr size=4 width="100%">
<h2><a name="mvapich" href="http://mvapich.cse.ohio-state.edu/"><b>MVAPICH</b></a></h2>
<p>MVAPICH jobs can be launched directly by the <b>srun</b> command.
SLURM's <i>mvapich</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=mvapich</i> in <b>slurm.conf</b>
or srun's <i>--mpi=mvapich</i> option.
<pre>
$ mpicc ...
$ srun -n16 --mpi=mvapich a.out
</pre>
<b>NOTE:</b> If MVAPICH is used in the shared memory model, with all tasks
running on a single node, then use the <i>mpich1_shmem</i> MPI plugin instead.<br>
<b>NOTE (for system administrators):</b> Configure
<i>PropagateResourceLimitsExcept=MEMLOCK</i> in <b>slurm.conf</b> and
start the <i>slurmd</i> daemons with an unlimited locked memory limit.
For more details, see
<a href="http://mvapich.cse.ohio-state.edu/support/mvapich_user_guide.html#x1-420007.2.3">MVAPICH</a>
documentation for "CQ or QP Creation failure".</p>
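<p>A sketch of that administrator configuration (how <i>slurmd</i> is
started varies by site, so the init-script fragment is illustrative):
<pre>
# slurm.conf
PropagateResourceLimitsExcept=MEMLOCK

# in the script that starts slurmd, before the daemon is launched
ulimit -l unlimited
</pre>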
<hr size=4 width="100%">
<h2><a name="mvapich2" href="http://nowlab.cse.ohio-state.edu/projects/mpi-iba"><b>MVAPICH2</b></a></h2>
<p>MVAPICH2 jobs can be launched directly by the <b>srun</b> command.
SLURM's <i>none</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=none</i> in <b>slurm.conf</b>
or srun's <i>--mpi=none</i> option. The program must also be linked with
SLURM's implementation of the PMI library so that tasks can communicate
host and port information at startup. (The system administrator can add
these options to the mpicc and mpif77 commands directly, so users will not
need to add them.) <b>Do not use SLURM's MVAPICH plugin for MVAPICH2.</b>
<pre>
$ mpicc -L&lt;path_to_slurm_lib&gt; -lpmi ...
$ srun -n16 --mpi=none a.out
</pre>
<hr size=4 width="100%">
<h2><a name="bluegene_mpi" href="http://www.research.ibm.com/bluegene/"><b>BlueGene MPI</b></a></h2>
<p>BlueGene MPI relies upon SLURM to create the resource allocation and then
uses the native <span class="commandline">mpirun</span> command to launch tasks.
Build a job script containing one or more invocations of the
<span class="commandline">mpirun</span> command. Then submit
the script to SLURM using <span class="commandline">sbatch</span>.
For example:</p>
<pre>
$ sbatch -N512 my.script
</pre>
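<p>The job script itself simply wraps the mpirun invocation(s).
A minimal sketch (the mpirun arguments are placeholders; consult the
BlueGene mpirun documentation for the options required at your site):
<pre>
$ cat my.script
#!/bin/bash
# one or more invocations of the native BlueGene mpirun command
mpirun -np 512 /home/user/a.out
</pre>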
<p>Note that the node count specified with the <i>-N</i> option indicates
the base partition count.
See <a href="bluegene.html">BlueGene User and Administrator Guide</a>
for more information.</p>
<hr size=4 width="100%">
<h2><a name="mpich1" href="http://www-unix.mcs.anl.gov/mpi/mpich1/"><b>MPICH1</b></a></h2>
<p>MPICH1 development ceased in 2005. It is recommended that you convert to
MPICH2 or some other MPI implementation.
If you still want to use MPICH1, note that it has several different
programming models. If you are using the shared memory model
(<i>DEFAULT_DEVICE=ch_shmem</i> in the mpirun script), then initiate
the tasks using the <span class="commandline">srun</span> command
with the <i>--mpi=mpich1_shmem</i> option.</p>
<pre>
$ srun -n16 --mpi=mpich1_shmem a.out
</pre>
<p>If you are using MPICH P4 (<i>DEFAULT_DEVICE=ch_p4</i> in
the mpirun script) and SLURM version 1.2.11 or newer,
then it is recommended that you apply the patch in the SLURM
distribution's file <i>contribs/mpich1.slurm.patch</i>.
Follow directions within the file to rebuild MPICH.
Applications must be relinked with the new library.
Initiate tasks using the
<span class="commandline">srun</span> command with the
<i>--mpi=mpich1_p4</i> option.</p>
<pre>
$ srun -n16 --mpi=mpich1_p4 a.out
</pre>
<p>Note that SLURM launches one task per node and the MPICH
library linked within your applications launches the other
tasks with shared memory used for communications between them.
The only real anomaly is that all output from all spawned tasks
on a node appears to SLURM as coming from the one task that it
launched. If the srun --label option is used, the task ID labels
will be misleading.</p>
<p>Other MPICH1 programming models currently rely upon the SLURM
<span class="commandline">salloc</span> or
<span class="commandline">sbatch</span> command to allocate resources.
In either case, specify the maximum number of tasks required for the job.
You may then need to build a list of the allocated hosts and pass
it to the mpirun command.
For example:
<pre>
$ cat mpich.sh
#!/bin/bash
# Build a machine file listing each allocated node once
srun hostname -s | sort -u >slurm.hosts
# Launch via the MPICH1 mpirun using that machine file, then clean up
mpirun [options] -machinefile slurm.hosts a.out
rm -f slurm.hosts
$ sbatch -n16 mpich.sh
sbatch: Submitted batch job 1234
</pre>
<p>Note that in this example, mpirun uses the rsh command to launch
tasks. These tasks are not managed by SLURM since they are launched
outside of its control.</p>
<p style="text-align:center;">Last modified 26 February 2009</p>
<!--#include virtual="footer.txt"-->
@@ -346,11 +346,11 @@ adev0: scancel 473
adev0: squeue
  JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
</pre>
<p class="footer"><a href="#top">top</a></p>
<p class="footer"><a href="#top">top</a></p>
<h2><a name="mpi">MPI</a></h2>
<p>MPI use depends upon the type of MPI being used.
There are three fundamentally different modes of operation used
by these various MPI implementation.
@@ -368,215 +368,22 @@ These tasks initiated outside of SLURM's monitoring
or control. SLURM's epilog should be configured to purge
these tasks when the job's allocation is relinquished. </li>
</ol>
-<p>Instructions for using several varieties of MPI with SLURM are
-provided below.</p>
+<p>Links to instructions for using several varieties of MPI
+with SLURM are provided below.
<p> <a href="http://www.open-mpi.org/"><b>Open MPI</b></a> relies upon
SLURM to allocate resources for the job and then mpirun to initiate the
tasks. When using <span class="commandline">salloc</span> command,
<span class="commandline">mpirun</span>'s -nolocal option is recommended.
For example:
<pre>
$ salloc -n4 sh # allocates 4 processors
# and spawns shell for job
&gt; mpirun -np 4 -nolocal a.out
&gt; exit # exits shell spawned by
# initial srun command
</pre>
<p>Note that any direct use of <span class="commandline">srun</span>
will only launch one task per node when the LAM/MPI plugin is used.
To launch more than one task per node using the
<span class="commandline">srun</span> command, the <i>--mpi=none</i>
option will be required to explicitly disable the LAM/MPI plugin.</p>
<p> <a href="http://www.quadrics.com/"><b>Quadrics MPI</b></a> relies upon SLURM to
allocate resources for the job and <span class="commandline">srun</span>
to initiate the tasks. One would build the MPI program in the normal manner
then initiate it using a command line of this sort:</p>
<pre>
$ srun [options] &lt;program&gt; [program args]
</pre>
<p> <a href="http://www.lam-mpi.org/"><b>LAM/MPI</b></a> relies upon the SLURM
<span class="commandline">salloc</span> or <span class="commandline">sbatch</span>
command to allocate. In either case, specify
the maximum number of tasks required for the job. Then execute the
<span class="commandline">lamboot</span> command to start lamd daemons.
<span class="commandline">lamboot</span> utilizes SLURM's
<span class="commandline">srun</span> command to launch these daemons.
Do not directly execute the <span class="commandline">srun</span> command
to launch LAM/MPI tasks. For example:
<pre>
$ salloc -n16 sh # allocates 16 processors
# and spawns shell for job
&gt; lamboot
&gt; mpirun -np 16 foo args
1234 foo running on adev0 (o)
2345 foo running on adev1
etc.
&gt; lamclean
&gt; lamhalt
&gt; exit # exits shell spawned by
# initial srun command
</pre>
<p>Note that any direct use of <span class="commandline">srun</span>
will only launch one task per node when the LAM/MPI plugin is configured
as the default plugin. To launch more than one task per node using the
<span class="commandline">srun</span> command, the <i>--mpi=none</i>
option would be required to explicitly disable the LAM/MPI plugin
if that is the system default.</p>
<p class="footer"><a href="#top">top</a></p>
<p><a href="http://www.hp.com/go/mpi"><b>HP-MPI</b></a> uses the
<span class="commandline">mpirun</span> command with the <b>-srun</b>
option to launch jobs. For example:
<pre>
$MPI_ROOT/bin/mpirun -TCP -srun -N8 ./a.out
</pre></p>
<p><a href="http://www.mcs.anl.gov/research/projects/mpich2/"><b>
MPICH2</b></a> jobs
are launched using the <b>srun</b> command. Just link your program with
SLURM's implementation of the PMI library so that tasks can communicate
host and port information at startup. (The system administrator can add
these option to the mpicc and mpif77 commands directly, so the user will not
need to bother). For example:
<pre>
$ mpicc -L&lt;path_to_slurm_lib&gt; -lpmi ...
$ srun -n20 a.out
</pre>
<b>NOTES:</b>
<ul>
-<li>Some MPICH2 functions are not currently supported by the PMI
-library integrated with SLURM</li>
-<li>Set the environment variable <b>PMI_DEBUG</b> to a numeric value
-of 1 or higher for the PMI library to print debugging information</li>
+<li><a href="mpi_guide.shtml#bluegene_mpi">BlueGene MPI</a></li>
+<li><a href="mpi_guide.shtml#hp_mpi">HP-MPI</a></li>
+<li><a href="mpi_guide.shtml#lam_mpi">LAM/MPI</a></li>
+<li><a href="mpi_guide.shtml#mpich1">MPICH1</a></li>
+<li><a href="mpi_guide.shtml#mpich2">MPICH2</a></li>
+<li><a href="mpi_guide.shtml#mpich_gm">MPICH-GM</a></li>
+<li><a href="mpi_guide.shtml#mpich_mx">MPICH-MX</a></li>
+<li><a href="mpi_guide.shtml#mvapich">MVAPICH</a></li>
+<li><a href="mpi_guide.shtml#mvapich2">MVAPICH2</a></li>
+<li><a href="mpi_guide.shtml#open_mpi">Open MPI</a></li>
+<li><a href="mpi_guide.shtml#quadrics_mpi">Quadrics MPI</a></li>
</ul></p>
+<p style="text-align:center;">Last modified 26 February 2009</p>
-<p><a href="http://www.myri.com/scs/download-mpichgm.html"><b>MPICH-GM</b></a>
jobs can be launched directly by <b>srun</b> command.
SLURM's <i>mpichgm</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=mpichgm</i> in <b>slurm.conf</b>
or srun's <i>--mpi=mpichgm</i> option.
<pre>
$ mpicc ...
$ srun -n16 --mpi=mpichgm a.out
</pre>
<p><a href="http://www.myri.com/scs/download-mpichmx.html"><b>MPICH-MX</b></a>
jobs can be launched directly by <b>srun</b> command.
SLURM's <i>mpichmx</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=mpichmx</i> in <b>slurm.conf</b>
or srun's <i>--mpi=mpichmx</i> option.
<pre>
$ mpicc ...
$ srun -n16 --mpi=mpichmx a.out
</pre>
<p><a href="http://mvapich.cse.ohio-state.edu/"><b>MVAPICH</b></a>
jobs can be launched directly by <b>srun</b> command.
SLURM's <i>mvapich</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=mvapich</i> in <b>slurm.conf</b>
or srun's <i>--mpi=mvapich</i> option.
<pre>
$ mpicc ...
$ srun -n16 --mpi=mvapich a.out
</pre>
<b>NOTE:</b> If MVAPICH is used in the shared memory model, with all tasks
running on a single node, then use the <i>mpich1_shmem</i> MPI plugin instead.<br>
<b>NOTE (for system administrators):</b> Configure
<i>PropagateResourceLimitsExcept=MEMLOCK</i> in <b>slurm.conf</b> and
start the <i>slurmd</i> daemons with an unlimited locked memory limit.
For more details, see
<a href="http://mvapich.cse.ohio-state.edu/support/mvapich_user_guide.html#x1-420007.2.3">MVAPICH</a>
documentation for "CQ or QP Creation failure".</p>
<p><a href="http://nowlab.cse.ohio-state.edu/projects/mpi-iba"><b>MVAPICH2</b></a>
jobs can be launched directly by <b>srun</b> command.
SLURM's <i>none</i> MPI plugin must be used to establish communications
between the launched tasks. This can be accomplished either using the SLURM
configuration parameter <i>MpiDefault=none</i> in <b>slurm.conf</b>
or srun's <i>--mpi=none</i> option. The program must also be linked with
SLURM's implementation of the PMI library so that tasks can communicate
host and port information at startup. (The system administrator can add
these option to the mpicc and mpif77 commands directly, so the user will not
need to bother). <b>Do not use SLURM's MVAPICH plugin for MVAPICH2.</b>
<pre>
$ mpicc -L&lt;path_to_slurm_lib&gt; -lpmi ...
$ srun -n16 --mpi=none a.out
</pre>
<p><a href="http://www.research.ibm.com/bluegene/"><b>BlueGene MPI</b></a> relies
upon SLURM to create the resource allocation and then uses the native
<span class="commandline">mpirun</span> command to launch tasks.
Build a job script containing one or more invocations of the
<span class="commandline">mpirun</span> command. Then submit
the script to SLURM using <span class="commandline">sbatch</span>.
For example:</p>
<pre>
$ sbatch -N512 my.script
</pre>
<p>Note that the node count specified with the <i>-N</i> option indicates
the base partition count.
See <a href="bluegene.html">BlueGene User and Administrator Guide</a>
for more information.</p>
<p><a href="http://www-unix.mcs.anl.gov/mpi/mpich1/"><b>MPICH1</b></a>
development ceased in 2005. It is recommended that you convert to
MPICH2 or some other MPI implementation.
If you still want to use MPICH1, note that it has several different
programming models. If you are using the shared memory model
(<i>DEFAULT_DEVICE=ch_shmem</i> in the mpirun script), then initiate
the tasks using the <span class="commandline">srun</span> command
with the <i>--mpi=mpich1_shmem</i> option.</p>
<pre>
$ srun -n16 --mpi=mpich1_shmem a.out
</pre>
<p>If you are using MPICH P4 (<i>DEFAULT_DEVICE=ch_p4</i> in
the mpirun script) and SLURM version 1.2.11 or newer,
then it is recommended that you apply the patch in the SLURM
distribution's file <i>contribs/mpich1.slurm.patch</i>.
Follow directions within the file to rebuild MPICH.
Applications must be relinked with the new library.
Initiate tasks using the
<span class="commandline">srun</span> command with the
<i>--mpi=mpich1_p4</i> option.</p>
<pre>
$ srun -n16 --mpi=mpich1_p4 a.out
</pre>
<p>Note that SLURM launches one task per node and the MPICH
library linked within your applications launches the other
tasks with shared memory used for communications between them.
The only real anomaly is that all output from all spawned tasks
on a node appear to SLURM as coming from the one task that it
launched. If the srun --label option is used, the task ID labels
will be misleading.</p>
<p>Other MPICH1 programming models current rely upon the SLURM
<span class="commandline">salloc</span> or
<span class="commandline">sbatch</span> command to allocate resources.
In either case, specify the maximum number of tasks required for the job.
You may then need to build a list of hosts to be used and use that
as an argument to the mpirun command.
For example:
<pre>
$ cat mpich.sh
#!/bin/bash
srun hostname -s | sort -u >slurm.hosts
mpirun [options] -machinefile slurm.hosts a.out
rm -f slurm.hosts
$ sbatch -n16 mpich.sh
sbatch: Submitted batch job 1234
</pre>
<p>Note that in this example, mpirun uses the rsh command to launch
tasks. These tasks are not managed by SLURM since they are launched
outside of its control.</p>
<p style="text-align:center;">Last modified 16 July 2008</p>
<!--#include virtual="footer.txt"-->
@@ -2,7 +2,7 @@
<head>
<title>SLURM Web pages for Review and Release</title>
-<!-- Updated 25 February 2009 -->
+<!-- Updated 26 February 2009 -->
</head>
<body>
@@ -34,6 +34,7 @@
<li><a href="https://computing-pre.llnl.gov/linux/slurm/maui.html">maui.html</a></li>
<li><a href="https://computing-pre.llnl.gov/linux/slurm/mc_support.html">mc_support.html</a></li>
<li><a href="https://computing-pre.llnl.gov/linux/slurm/moab.html">moab.html</a></li>
+<li><a href="https://computing-pre.llnl.gov/linux/slurm/mpi_guide.html">mpi_guide.html</a></li>
<li><a href="https://computing-pre.llnl.gov/linux/slurm/mpiplugins.html">mpiplugins.html</a></li>
<li><a href="https://computing-pre.llnl.gov/linux/slurm/news.html">news.html</a></li>
<li><a href="https://computing-pre.llnl.gov/linux/slurm/overview.html">overview.html</a></li>
...