<!--#include virtual="header.txt"-->
<h1><a name="top">SLURM Process Tracking Plugin API</a></h1>
<h2> Overview</h2>
<p> This document describes SLURM process tracking plugins and the API
that defines them.
It is intended as a resource to programmers wishing to write their
own SLURM process tracking plugins.
This is version 0 of the API.</p>
<p>SLURM process tracking plugins are SLURM plugins that implement
the SLURM process tracking API described herein.
They must conform to the SLURM Plugin API with the following
specifications:</p>
<p><span class="commandline">const char plugin_type[]</span><br>
The major type must be "proctrack".
The minor type can be any recognizable abbreviation for the type
of proctrack. We recommend, for example:</p>
<ul>
<li><b>aix</b>—Perform process tracking on an AIX platform.
NOTE: This requires a kernel extension that records
every process creation and termination.</li>
<li><b>linuxproc</b>—Perform process tracking based upon a scan
of the Linux process table and use the parent process ID to determine
what processes are members of a SLURM job. NOTE: This mechanism is
not entirely reliable for process tracking.</li>
<li><b>pgid</b>—Use process group ID to determine
what processes are members of a SLURM job. NOTE: This mechanism is
not entirely reliable for process tracking.</li>
<li><b>rms</b>—Use a Quadrics RMS kernel patch to
establish what processes are members of a SLURM job.
NOTE: This requires a kernel patch that records
every process creation and termination.</li>
<li><b>sgi_job</b>—Use <a href="http://oss.sgi.com/projects/pagg/">
SGI's Process Aggregates (PAGG) kernel module</a>.
NOTE: This kernel module records every process creation
and termination.</li>
</ul>
<p>The <span class="commandline">plugin_name</span> and
<span class="commandline">plugin_version</span> symbols required
by the SLURM Plugin API require no specialization for process tracking.
Note carefully, however, the versioning discussion below.</p>
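<p>As a rough illustration of the identification symbols discussed above, the
following C sketch declares them for a hypothetical plugin. The "major/minor"
string layout, the minor type <i>example</i>, the name string, the integer type
and value of <span class="commandline">plugin_version</span>, and the helper
<span class="commandline">has_proctrack_major()</span> are all assumptions made
for this sketch; the general SLURM Plugin API documentation is authoritative.</p>

```c
#include <stdint.h>
#include <string.h>

/* Identification symbols read by the plugin loader.
 * All values below are placeholders chosen for this sketch. */
const char plugin_name[] = "Example process tracking plugin";
const char plugin_type[] = "proctrack/example";  /* assumed major/minor layout */
const uint32_t plugin_version = 1;               /* type and value assumed */

/* Hypothetical helper: returns 1 when a type string carries the
 * major type "proctrack", 0 otherwise. */
int has_proctrack_major(const char *type)
{
    static const char prefix[] = "proctrack/";
    return strncmp(type, prefix, sizeof(prefix) - 1) == 0;
}
```

<p>A check such as <span class="commandline">has_proctrack_major(plugin_type)</span>
can be handy in a unit test to catch a mistyped major type before the plugin is
installed.</p>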
<p>The programmer is urged to study
<span class="commandline">src/plugins/proctrack/pgid/proctrack_pgid.c</span>
for an example implementation of a SLURM proctrack plugin.</p>
<p class="footer"><a href="#top">top</a></p>
<h2>Data Objects</h2>
<p>The implementation must support a container ID of type uint32_t.
This container ID is maintained by the plugin directly in the slurmd
job structure using the field named <i>cont_id</i>.</p>
<p>The implementation must maintain (though not necessarily directly export) an
enumerated <b>errno</b> so that SLURM can determine, as far as is practical,
the reason for any failed API call.
These values must not be used as return values in integer-valued functions
in the API.
The proper error return value from integer-valued functions is SLURM_ERROR.
The implementation should endeavor to provide useful and pertinent information
by whatever means is practical.
Successful API calls are not required to reset errno to a known value.</p>
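<p>The error convention described above might be sketched in C as follows.
This sketch assumes the C library's <span class="commandline">errno</span> serves
as the plugin's error indicator, which is only one possible choice. The
<span class="commandline">SLURM_SUCCESS</span> and
<span class="commandline">SLURM_ERROR</span> definitions are stand-ins for those
in SLURM's headers, and <span class="commandline">example_check_container()</span>
is a hypothetical helper, not part of the API.</p>

```c
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the values normally supplied by SLURM's headers. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR   (-1)

/* Hypothetical helper: reject the "no container" ID of zero. */
int example_check_container(uint32_t cont_id)
{
    if (cont_id == 0) {
        errno = EINVAL;        /* record the reason for the failure ... */
        return SLURM_ERROR;    /* ... but never return the errno value itself */
    }
    return SLURM_SUCCESS;      /* errno is deliberately left untouched */
}
```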
<p class="footer"><a href="#top">top</a></p>
<h2>API Functions</h2>
<p>The following functions must appear. Functions which are not implemented should
be stubbed.</p>
<p class="commandline">int slurm_container_create (slurmd_job_t *job);</p>
<p style="margin-left:.2in"><b>Description</b>: Create a container.
The container should remain valid until
<span class="commandline">slurm_container_destroy()</span> is called.
This function must put the container ID directly in the job structure's
variable <i>cont_id</i>.</p>
<p style="margin-left:.2in"><b>Argument</b>:
<span class="commandline"> job</span> (input/output)
Pointer to a slurmd job structure.</p>
<p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure,
the plugin should return SLURM_ERROR and set the errno to an appropriate value
to indicate the reason for failure.</p>
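<p style="margin-left:.2in">A minimal sketch of this call, assuming a mechanism
that uses the job's process group ID as the container ID. The
<span class="commandline">slurmd_job_t</span> shown here is a cut-down stand-in
for slurmd's real job structure, its <i>pgid</i> field is an assumption of this
sketch, and the return code definitions are stand-ins for SLURM's headers.</p>

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Stand-ins for the values normally supplied by SLURM's headers. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR   (-1)

/* Cut-down stand-in for slurmd's job structure (assumption). */
typedef struct {
    pid_t    pgid;      /* hypothetical: process group of the job's tasks */
    uint32_t cont_id;   /* container ID, written by the plugin */
} slurmd_job_t;

int slurm_container_create(slurmd_job_t *job)
{
    if (job == NULL || job->pgid <= 0) {
        errno = EINVAL;
        return SLURM_ERROR;
    }
    /* Store the container ID directly in the job structure. */
    job->cont_id = (uint32_t) job->pgid;
    return SLURM_SUCCESS;
}
```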
<p class="commandline">int slurm_container_add (slurmd_job_t *job, pid_t pid);</p>
<p style="margin-left:.2in"><b>Description</b>: Add a specific process ID
to a given job's container.</p>
<p style="margin-left:.2in"><b>Arguments</b>:<br>
<span class="commandline"> job</span> (input)
Pointer to a slurmd job structure.<br>
<span class="commandline"> pid</span> (input)
The ID of the process to add to this job's container.</p>
<p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure,
the plugin should return SLURM_ERROR and set the errno to an appropriate value
to indicate the reason for failure.</p>
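<p style="margin-left:.2in">One conceivable way to implement this call for a
process-group-based container is to move the process into the group named by
<i>cont_id</i> with POSIX <span class="commandline">setpgid()</span>. This is a
sketch under that assumption, not a description of any shipped plugin; the
<span class="commandline">slurmd_job_t</span> type and return code definitions
are stand-ins.</p>

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-ins for the values normally supplied by SLURM's headers. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR   (-1)

/* Cut-down stand-in for slurmd's job structure (assumption). */
typedef struct {
    uint32_t cont_id;   /* container ID set earlier by slurm_container_create() */
} slurmd_job_t;

int slurm_container_add(slurmd_job_t *job, pid_t pid)
{
    if (job == NULL || job->cont_id == 0 || pid <= 0) {
        errno = EINVAL;
        return SLURM_ERROR;
    }
    /* On failure setpgid() sets errno itself (for example ESRCH or EPERM). */
    if (setpgid(pid, (pid_t) job->cont_id) < 0)
        return SLURM_ERROR;
    return SLURM_SUCCESS;
}
```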
<p class="commandline">int slurm_container_signal (uint32_t id, int signal);</p>
<p style="margin-left:.2in"><b>Description</b>: Signal all processes in a given
job's container.</p>
<p style="margin-left:.2in"><b>Arguments</b>:<br>
<span class="commandline"> id</span> (input)
Job container's ID.<br>
<span class="commandline"> signal</span> (input)
Signal to be sent to processes. Note that a signal of zero
just tests for the existence of processes in a given job container.</p>
<p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if the signal
was sent.
If the signal cannot be sent, the function should return SLURM_ERROR and set
its errno to an appropriate value to indicate the reason for failure.</p>
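<p style="margin-left:.2in">For a container that is simply a process group, the
call could be sketched with POSIX <span class="commandline">killpg()</span> as
below. The mapping of container ID to process group ID is an assumption of this
sketch, the return code definitions are stand-ins, and the parameter is named
<i>sig</i> here only to avoid shadowing the C library's
<span class="commandline">signal()</span>.</p>

```c
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>

/* Stand-ins for the values normally supplied by SLURM's headers. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR   (-1)

int slurm_container_signal(uint32_t id, int sig)
{
    pid_t pgid = (pid_t) id;

    /* POSIX leaves killpg() undefined for groups <= 1, so reject them. */
    if (pgid <= 1) {
        errno = EINVAL;
        return SLURM_ERROR;
    }
    /* With sig == 0 nothing is delivered; the call only tests for existence. */
    if (killpg(pgid, sig) < 0)
        return SLURM_ERROR;    /* errno set by killpg(), e.g. ESRCH */
    return SLURM_SUCCESS;
}
```

<p style="margin-left:.2in">A caller can therefore use a signal of zero to poll
whether any process remains in the container before tearing it down.</p>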
<p class="footer"><a href="#top">top</a></p>
<p class="commandline">int slurm_container_destroy (uint32_t id);</p>
<p style="margin-left:.2in"><b>Description</b>: Destroy or otherwise
invalidate a job container.
This does not imply the container is empty, just that it is no longer
needed.</p>
<p style="margin-left:.2in"><b>Argument</b>:
<span class="commandline"> id</span> (input)
Job container's ID.</p>
<p style="margin-left:.2in"><b>Returns</b>: SLURM_SUCCESS if successful. On failure,
the plugin should return SLURM_ERROR and set the errno to an appropriate value
to indicate the reason for failure.</p>
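<p style="margin-left:.2in">When the tracking mechanism keeps no per-container
state (for instance, one keyed purely on process group ID), there may be nothing
to release. A minimal stub for that case might look like the following; returning
success from a no-op is an assumption of this sketch, and the return code
definition is a stand-in.</p>

```c
#include <stdint.h>

/* Stand-in for the value normally supplied by SLURM's headers. */
#define SLURM_SUCCESS 0

int slurm_container_destroy(uint32_t id)
{
    (void) id;               /* no per-container resources to free */
    return SLURM_SUCCESS;    /* processes may still exist in the container */
}
```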
<p class="commandline">uint32_t slurm_container_find (pid_t pid);</p>
<p style="margin-left:.2in"><b>Description</b>:
Given a process ID, return its job container ID.</p>
<p style="margin-left:.2in"><b>Argument</b>:
<span class="commandline"> pid</span> (input)
A process ID.</p>
<p style="margin-left:.2in"><b>Returns</b>: The ID of the job container
that holds this process, or zero if none is found.</p>
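<p style="margin-left:.2in">Under the assumption that the container ID is the
process group ID, the lookup reduces to a call to POSIX
<span class="commandline">getpgid()</span>. The sketch below treats any failure,
including a non-positive <i>pid</i>, as "not found".</p>

```c
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

uint32_t slurm_container_find(pid_t pid)
{
    pid_t pgid;

    if (pid <= 0)
        return 0;            /* not a real process ID */
    pgid = getpgid(pid);
    if (pgid < 0)
        return 0;            /* e.g. ESRCH: no such process */
    return (uint32_t) pgid;  /* container ID == process group ID (assumed) */
}
```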
<h2>Versioning</h2>
<p> This document describes version 0 of the SLURM Process Tracking API.
Future releases of SLURM may revise this API. A process tracking plugin
conveys its ability to implement a particular API version using the
mechanism outlined for SLURM plugins.</p>
<p class="footer"><a href="#top">top</a></p>
<p style="text-align:center;">Last modified 6 June 2006</p>
<!--#include virtual="footer.txt"-->