Commit 6d57b632 authored by Morris Jette

Merge branch 'slurm-2.6'

parents 47c92666 3c4892ad
...@@ -9,78 +9,78 @@ ...@@ -9,78 +9,78 @@
<a href="#Administration">Administration</a><br> <a href="#Administration">Administration</a><br>
<a href="#Profiling">Profiling Jobs</a><br> <a href="#Profiling">Profiling Jobs</a><br>
<a href="#HDF5">HDF5</a><br> <a href="#HDF5">HDF5</a><br>
<a href="#DataSeries">Data Series</a><br> <a href="#DataSeries">Data Structure</a><br>
<a id="Overview"></a> <a id="Overview"></a>
<h2>Overview</h2> <h2>Overview</h2>
The AcctGatherProfileType/hdf5 plugin allows SLURM to coordinate collecting <p>The AcctGatherProfileType/hdf5 plugin allows SLURM to coordinate collecting
data on jobs it runs on a cluster that is more detailed than is practical to data on jobs it runs on a cluster that is more detailed than is practical to
include in its database. The data comes from periodically sampling various include in its database. The data comes from periodically sampling various
performance data either collected by SLURM, the operating system, or performance data either collected by SLURM, the operating system, or
component software. The plugin will record the data from each source component software. The plugin will record the data from each source
as a <b>Time Series</b> and also accumulate totals for each statistic for as a <b>Time Series</b> and also accumulate totals for each statistic for
the job. the job.</p>
<p>Time Series are energy data collected by an acct_gather_energy plugin, <p>Time Series are energy data collected by an acct_gather_energy plugin,
I/O data from a network interface collected by an acct_gather_infiniband plugin, I/O data from a network interface collected by an acct_gather_infiniband plugin,
I/O data from parallel file systems such as Lustre collected by an I/O data from parallel file systems such as Lustre collected by an
acct_gather_filesystem plugin, and task performance data such as local disk I/O, acct_gather_filesystem plugin, and task performance data such as local disk I/O,
cpu consumption, and memory use from a jobacct_gather plugin. cpu consumption, and memory use from a jobacct_gather plugin.
Data from other sources may be added in the future. Data from other sources may be added in the future.</p>
<p>The data is collected into a file on a shared file system for each step on <p>The data is collected into a file on a shared file system for each step on
each allocated node of a job and then merged into a HDF5 file. each allocated node of a job and then merged into a HDF5 file.
Individual files on a shared file system was chosen because it is possible Individual files on a shared file system was chosen because it is possible
that the data is voluminous so solutions that pass data to the SLURM control that the data is voluminous so solutions that pass data to the SLURM control
daemon via RPC may not scale to very large clusters or jobs with daemon via RPC may not scale to very large clusters or jobs with
many allocated nodes. many allocated nodes.</p>
<p>A separate <a href="acct_gather_profile_plugins.html"> <p>A separate <a href="acct_gather_profile_plugins.html">
SLURM Profile Accounting Plugin API (AcctGatherProfileType)</a> documents how SLURM Profile Accounting Plugin API (AcctGatherProfileType)</a> documents how
write other Profile Accounting plugins. write other Profile Accounting plugins.</P>
<a id="Administration"></a> <a id="Administration"></a>
<h2>Administration</h2> <h2>Administration</h2>
<h3>Shared File System</h3> <h3>Shared File System</h3>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
The HDF5 Profile Plugin requires a common shared file system on all the compute <p>The HDF5 Profile Plugin requires a common shared file system on all
nodes. While a job is running, the plugin writes a file into this file the compute nodes. While a job is running, the plugin writes a
system for each step of the job on each node. When the job ends, file into this file system for each step of the job on each node. When
the merge process is launched and the node-step files are combined into one the job ends, the merge process is launched and the node-step files
HDF5 file for the job. are combined into one HDF5 file for the job.</p>
<p>
The root of the directory structure is declared in the <b>ProfileHDF5Dir</b> <p>The root of the directory structure is declared in the <b>ProfileHDF5Dir</b>
option in the acct_gather.conf file. The directory will be created by SLURM option in the acct_gather.conf file. The directory will be created by SLURM
if it doesn't exist. if it doesn't exist.</p>
<p>
Each user that creates a profile will have a subdirector to the profile <p>Each user that creates a profile will have a subdirector to the profile
directory that has read/write permission only for the user. directory that has read/write permission only for the user.</p>
</span> </span>
</div> </div>
<h3>Configuration parameters</h3> <h3>Configuration parameters</h3>
<div style="margin-left: 20px;"> <p><div style="margin-left: 20px;">
The profile plugin is enabled in the <p>The profile plugin is enabled in the
<a href="slurm.conf.html">slurm.conf</a> file, but is internally <a href="slurm.conf.html">slurm.conf</a> file, but is internally
configured in the configured in the
<a href="acct_gather.conf.html">acct_gather.conf</a> file. <a href="acct_gather.conf.html">acct_gather.conf</a> file.</p>
</div> </div>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
<h4>slurm.conf parameters</h4> <h4>slurm.conf parameters</h4>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
<br><b>AcctGatherProfileType=acct_gather_profile/hdf5</b> enables the HDF5 <p><b>AcctGatherProfileType=acct_gather_profile/hdf5</b> enables the HDF5
plugin. plugin.</p>
<br><b>JobAcctGatherFrequency=[energy=freq[,lustre=freq[,network=freq[task=freq]]]]</b> <p><b>JobAcctGatherFrequency=[energy=freq[,lustre=freq[,network=freq[task=freq]]]]</b>
sets default sample frequencies for data types. sets default sample frequencies for data types.</p>
</div> </div>
</div> </div>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
<h4>act_gather.conf parameters</h4> <h4>act_gather.conf parameters</h4>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
These parameters are directly used by the HDF5 Profile Plugin. <p>These parameters are directly used by the HDF5 Profile Plugin.</p>
<dl> <dl>
<dt><B>ProfileHDF5Dir</B>=&lt;path&gt;</dt> <dt><B>ProfileHDF5Dir</B>=&lt;path&gt;</dt>
<dd>This parameter is the path to the shared folder into which the <dd>This parameter is the path to the shared folder into which the
...@@ -104,54 +104,52 @@ add the --profile option to the launch scripts.</dd> ...@@ -104,54 +104,52 @@ add the --profile option to the launch scripts.</dd>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
<h4>Time Series Control Parameters</h4> <h4>Time Series Control Parameters</h4>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
Other plugins add time series data to the HDF5 collection. They typically <p>Other plugins add time series data to the HDF5 collection. They typically
have a default polling frequency specified in slurm.conf in the have a default polling frequency specified in slurm.conf in the
JobAcctGatherFrequency parameter. The polling frequency can be overridden JobAcctGatherFrequency parameter. The polling frequency can be overridden
using the --acctg-freq using the --acctg-freq
<a href="srun.html">srun</a> parameter. <a href="srun.html">srun</a> parameter.
They are both of the form task=sec,energy=sec,luster=sec,network=sec. They are both of the form task=sec,energy=sec,luster=sec,network=sec.<p>
<p>
The IPMI energy plugin also needs the EnergyIPMIFrequency value set <p>The IPMI energy plugin also needs the EnergyIPMIFrequency value set
in the acct_gather.conf file. This sets the rate at which the plugin samples in the acct_gather.conf file. This sets the rate at which the plugin samples
the external sensors. This value should be the same as the energy=sec in the external sensors. This value should be the same as the energy=sec in
either JobAcctGatherFrequency or --acctg-freq. either JobAcctGatherFrequency or --acctg-freq.</p>
<p>
Note that the IPMI and profile sampling are not synchronous. <p>Note that the IPMI and profile sampling are not synchronous.
The profile sample simply takes the last available IPMI sample value. The profile sample simply takes the last available IPMI sample value.
If the profile energy sample is more frequent than the IPMI sample rate, If the profile energy sample is more frequent than the IPMI sample rate,
the IPMI value will be repeated. If the profile energy sample is greater the IPMI value will be repeated. If the profile energy sample is greater
than the IPMI rate, IPMI values will be lost. than the IPMI rate, IPMI values will be lost.</p>
<p>
Also note that smallest effective IPMI (EnergyIPMIFrequency) sample rate <p>Also note that smallest effective IPMI (EnergyIPMIFrequency) sample rate
for 2013 era Intel processors is 3 seconds. for 2013 era Intel processors is 3 seconds.</p>
<p>
</div> </div>
</div> </div>
<a id="Profiling"></a> <a id="Profiling"></a>
<h2>Profiling Jobs</h2> <h2>Profiling Jobs</h2>
<h3>Data Collection</h3> <h3>Data Collection</h3>
The --profile option on salloc|sbatch|srun controls whether data is <p>The --profile option on salloc|sbatch|srun controls whether data is
collected and what type of data is collected. If --profile is not specified collected and what type of data is collected. If --profile is not specified
no data collected unless the <B>ProfileHDF5CollectDefault</B> no data collected unless the <B>ProfileHDF5CollectDefault</B>
option is used in acct_gather.conf. --profile on the command line overrides option is used in acct_gather.conf. --profile on the command line overrides
any value specified in the configuration file.<p> any value specified in the configuration file.</p>
<DT><B>--profile</B>=&lt;all|none|[energy[,|task[,|lustre[,|network]]]]&gt; <DT><B>--profile</B>=&lt;all|none|[energy[,|task[,|lustre[,|network]]]]&gt;
<DD> <DD>
enables detailed data collection by the acct_gather_profile plugin. <p>enables detailed data collection by the acct_gather_profile plugin.
Detailed data are typically time-series that are stored in a HDF5 file for Detailed data are typically time-series that are stored in a HDF5 file for
the job.</DD> the job.</p></DD>
</DT> </DT>
<P>
<div style="margin-left: 20px;"> <div style="margin-left: 20px;">
<DL> <DL>
<DT><B>All</B> <DT><B>All</B>
<DD>All data types are collected. (Cannot be combined with other values.) <DD>All data types are collected. (Cannot be combined with other values.)
</DD></DT> </DD></DT>
<P>
<DT><B>None</B> <DT><B>None</B>
<DD>No data types are collected. This is the default. (Cannot be combined with <DD>No data types are collected. This is the default. (Cannot be
other values.) combined with other values.)
</DD></DT> </DD></DT>
<DT><B>Energy</B> <DT><B>Energy</B>
...@@ -170,37 +168,37 @@ other values.) ...@@ -170,37 +168,37 @@ other values.)
</div> </div>
<h3>Data Consolidation</h3> <h3>Data Consolidation</h3>
The node-step files are merged into one HDF5 file for the job using the <p>The node-step files are merged into one HDF5 file for the job using the
<a href="sh5util.html">sh5util</a>. <a href="sh5util.html">sh5util</a>.</p>
<p>If the job is started with sbatch, the command line may added to the normal <p>If the job is started with sbatch, the command line may added to the normal
launch script, For example; launch script, For example:</p>
<pre> <pre>
sbatch -n1 -d$SLURM_JOB_ID --wrap="sh5util -j $SLURM_JOB_ID" sbatch -n1 -d$SLURM_JOB_ID --wrap="sh5util -j $SLURM_JOB_ID"
</pre> </pre>
<h3>Data Extraction</h3> <h3>Data Extraction</h3>
The <a href="sh5util.html">sh5util</a> program can also be used to extract <p>The <a href="sh5util.html">sh5util</a> program can also be used to extract
specific data from the HDF5 file and write it in <i>comma separated value (csv)</i> specific data from the HDF5 file and write it in <i>comma separated value (csv)</i>
form for importation into other analysis tools such as spreadsheets. form for importation into other analysis tools such as spreadsheets.</p>
<a id="HDF5"></a> <a id="HDF5"></a>
<h2>HDF5</h2> <h2>HDF5</h2>
HDF5 is a well known structured data set that allows heterogeneous but <p>HDF5 is a well known structured data set that allows heterogeneous but
related data to be stored in one file. related data to be stored in one file.
(.i.e. sections for energy statistics, network I/O, Task data, ) (.i.e. sections for energy statistics, network I/O, Task data, etc.)
Its internal structure resembles a Its internal structure resembles a
file system with <b>groups</b> being similar to <i>directories</i> and file system with <b>groups</b> being similar to <i>directories</i> and
<b>data sets</b> being similar to <i>files</i>. It also allows <b>attributes</b> <b>data sets</b> being similar to <i>files</i>. It also allows <b>attributes</b>
to be attached to groups to store application defined properties. to be attached to groups to store application defined properties.</p>
<p>There are commodity programs, notably <p>There are commodity programs, notably
<a href="http://www.hdfgroup.org/hdf-java-html/hdfview/index.html"> <a href="http://www.hdfgroup.org/hdf-java-html/hdfview/index.html">
HDFView</a> for viewing and manipulating these files. HDFView</a> for viewing and manipulating these files.
<p>Below is a screen shot from HDFView expanding the job tree and showing the <p>Below is a screen shot from HDFView expanding the job tree and showing the
attributes for a specific task. attributes for a specific task.</p>
<p> <br>
<img src="hdf5_task_attr.png" width="275" height="275" > <img src="hdf5_task_attr.png" width="275" height="275" >
...@@ -212,8 +210,8 @@ attributes for a specific task. ...@@ -212,8 +210,8 @@ attributes for a specific task.
<td><img src="hdf5_job_outline.png" width="205" height="570"></td> <td><img src="hdf5_job_outline.png" width="205" height="570"></td>
<td style="vertical-align: top;"> <td style="vertical-align: top;">
<div style="margin-left: 5px;"> <div style="margin-left: 5px;">
In the job file, there will be a group for each <b>step</b> of the job. <p>In the job file, there will be a group for each <b>step</b> of the job.
Within each step, there will be a group for nodes, and a group for tasks. Within each step, there will be a group for nodes, and a group for tasks.</p>
</div> </div>
<ul> <ul>
<li> <li>
...@@ -240,13 +238,13 @@ executed. This set of groups is essentially a cross reference table. ...@@ -240,13 +238,13 @@ executed. This set of groups is essentially a cross reference table.
</table> </table>
<h3>Energy Data</h3> <h3>Energy Data</h3>
<b>AcctGatherEnergyType=acct_gather_energy/ipmi</b> <p><b>AcctGatherEnergyType=acct_gather_energy/ipmi</b>
is required in slurm.conf to collect energy data. is required in slurm.conf to collect energy data.
Appropriately set energy=freq in either JobAcctGatherFrequency in slurm.conf Appropriately set energy=freq in either JobAcctGatherFrequency in slurm.conf
or in --acctg-freq on the command line. or in --acctg-freq on the command line.
Also appropriately set EnergyIPMIFrequency in acct_gather.conf. Also appropriately set EnergyIPMIFrequency in acct_gather.conf.</p>
<p>Each data sample in the Energe Time Series contains the following data items. <p>Each data sample in the Energe Time Series contains the following data items.
<DL> </p><DL>
<DT><B>Date Time</B> <DT><B>Date Time</B>
<DD>Time of day at which the data sample was taken. This can be used to <DD>Time of day at which the data sample was taken. This can be used to
correlate activity with other sources such as logs.</DD></DT> correlate activity with other sources such as logs.</DD></DT>
...@@ -259,13 +257,13 @@ correlate activity with other sources such as logs.</DD></DT> ...@@ -259,13 +257,13 @@ correlate activity with other sources such as logs.</DD></DT>
</DL> </DL>
<h3>Luster Data</h3> <h3>Luster Data</h3>
<b>AcctGatherFilesystemType=acct_gather_filesystem/lustre</b> <p><b>AcctGatherFilesystemType=acct_gather_filesystem/lustre</b>
is required in slurm.conf to collect task data. is required in slurm.conf to collect task data.
Appropriately set luster=freq in either JobAcctGatherFrequency in slurm.conf Appropriately set luster=freq in either JobAcctGatherFrequency in slurm.conf
or in --acctg-freq on the command line. or in --acctg-freq on the command line.</p>
<p>
Each data sample in the Lustre Time Series contains the following data items. <p>Each data sample in the Lustre Time Series contains the following data items.
<DL> </p><DL>
<DT><B>Date Time</B> <DT><B>Date Time</B>
<DD>Time of day at which the data sample was taken. This can be used to <DD>Time of day at which the data sample was taken. This can be used to
correlate activity with other sources such as logs.</DD></DT> correlate activity with other sources such as logs.</DD></DT>
...@@ -282,11 +280,12 @@ correlate activity with other sources such as logs.</DD></DT> ...@@ -282,11 +280,12 @@ correlate activity with other sources such as logs.</DD></DT>
</DL> </DL>
<h3>Network (Infiniband Data)</h3> <h3>Network (Infiniband Data)</h3>
<b>JobAcctInfinibandType=acct_gather_infiniband/ofed</b> <p><b>JobAcctInfinibandType=acct_gather_infiniband/ofed</b>
is required in slurm.conf to collect task data. is required in slurm.conf to collect task data.
Appropriately set network=freq in either JobAcctGatherFrequency in slurm.conf Appropriately set network=freq in either JobAcctGatherFrequency in slurm.conf
or in --acctg-freq on the command line. or in --acctg-freq on the command line.</p>
<p>Each data sample in the Network Time Series contains the following data items. <p>Each data sample in the Network Time Series contains the following
data items.</p>
<DL> <DL>
<DT><B>Date Time</B> <DT><B>Date Time</B>
<DD>Time of day at which the data sample was taken. This can be used to <DD>Time of day at which the data sample was taken. This can be used to
...@@ -304,11 +303,12 @@ correlate activity with other sources such as logs.</DD></DT> ...@@ -304,11 +303,12 @@ correlate activity with other sources such as logs.</DD></DT>
</DL> </DL>
<h3>Task Data</h3> <h3>Task Data</h3>
<b>JobAcctGatherType=jobacct_gather/linux</b> <p><b>JobAcctGatherType=jobacct_gather/linux</b>
is required in slurm.conf to collect task data. is required in slurm.conf to collect task data.
Appropriately set task=freq in either JobAcctGatherFrequency in slurm.conf Appropriately set task=freq in either JobAcctGatherFrequency in slurm.conf
or in --acctg-freq on the command line. or in --acctg-freq on the command line.</p>
<p>Each data sample in the Task Time Series contains the following data items. <p>Each data sample in the Task Time Series contains the following data
items.</p>
<DL> <DL>
<DT><B>Date Time</B> <DT><B>Date Time</B>
<DD>Time of day at which the data sample was taken. This can be used to <DD>Time of day at which the data sample was taken. This can be used to
...@@ -336,6 +336,6 @@ correlate activity with other sources such as logs.</DD></DT> ...@@ -336,6 +336,6 @@ correlate activity with other sources such as logs.</DD></DT>
<p class="footer"><a href="#top">top</a></p> <p class="footer"><a href="#top">top</a></p>
<p style="text-align:center;">Last modified 12 June 2013</p> <p style="text-align:center;">Last modified 1 July 2013</p>
<!--#include virtual="footer.txt"--> <!--#include virtual="footer.txt"-->
...@@ -53,7 +53,10 @@ help identify load imbalances and other anomalies.</li> ...@@ -53,7 +53,10 @@ help identify load imbalances and other anomalies.</li>
</ul></p> </ul></p>
<p>Slurm provides workload management on many of the most powerful computers in <p>Slurm provides workload management on many of the most powerful computers in
the world including: the world. On the June 2013 <a href="http://www.top500.org">Top500</a> list,
five of the ten top systems use Slurm including the number one system.
These five systems alone contain over 5.7 million cores.
A few of the systems using Slurm are listed below:
<ul> <ul>
<li><a href="http://www.top500.org/blog/lists/2013/06/press-release/"> <li><a href="http://www.top500.org/blog/lists/2013/06/press-release/">
Tianhe-2</a> designed by Tianhe-2</a> designed by
...@@ -74,7 +77,7 @@ is a <a herf="http://www.dell.com">Dell</a> with over ...@@ -74,7 +77,7 @@ is a <a herf="http://www.dell.com">Dell</a> with over
80,000 <a href="http://www.intel.com">Intel</a> Xeon cores, 80,000 <a href="http://www.intel.com">Intel</a> Xeon cores,
Intel Phi co-processors, plus Intel Phi co-processors, plus
128 <a href="http://www.nvidia.com">NVIDIA</a> GPUs 128 <a href="http://www.nvidia.com">NVIDIA</a> GPUs
delivering 2.66 Petaflops.</li> delivering 5.17 Petaflops.</li>
<li><a href="http://www-hpc.cea.fr/en/complexe/tgcc-curie.htm">TGCC Curie</a>, <li><a href="http://www-hpc.cea.fr/en/complexe/tgcc-curie.htm">TGCC Curie</a>,
owned by <a href="http://www.genci.fr">GENCI</a> and operated in the TGCC by owned by <a href="http://www.genci.fr">GENCI</a> and operated in the TGCC by
...@@ -110,6 +113,6 @@ named after Monte Rosa in the Swiss-Italian Alps, elevation 4,634m. ...@@ -110,6 +113,6 @@ named after Monte Rosa in the Swiss-Italian Alps, elevation 4,634m.
</ul> </ul>
<p style="text-align:center;">Last modified 24 June 2013</p> <p style="text-align:center;">Last modified 1 July 2013</p>
<!--#include virtual="footer.txt"--> <!--#include virtual="footer.txt"-->