Commit 5b2f74b9 authored by Morris Jette

scancel of pack job leader signals all pack job components

parent dffaeaad
......@@ -41,7 +41,7 @@ unique <i>job_id</i>.</li>
<li><i>pack_job_id</i>: This identification number applies to all components
of the heterogeneous job. All components of the same job will have the same
<i>pack_job_id</i> value and it will be equal to the <i>job_id</i> of the
-first component.</li>
+first component. We refer to this as the "pack leader".</li>
<li><i>pack_job_id_set</i>: Regular expression identifying all <i>job_id</i>
values associated with the job.</li>
<li><i>pack_job_offset</i>: A unique sequence number applied to each component
......@@ -72,8 +72,8 @@ For example "123+4" would represent heterogeneous job id 123 and it's fifth
component (note: the first component has a <i>pack_job_offset</i>value of 0).</p>
<p>A request for a specific job ID that identifes a ID of the first component
-of a heterogenous job will return information about all pack job components.
-For example:</p>
+of a heterogenous job (i.e. the "pack leader") will return information about
+all components of that job. For example:</p>
<pre>
$ squeue --job=93
JOBID PARTITION NAME USER ST TIME NODES NODELIST
......@@ -82,6 +82,25 @@ JOBID PARTITION NAME USER ST TIME NODES NODELIST
93+2 debug bash adam R 18:18 1 nid00021
</pre>
+<p>A request to cancel or otherwise signal a pack leader will be applied to
+all components of that pack job. A request to cancel a specific component of
+the pack job using the "#+#" notation will apply only to that specific component.
+For example:</p>
+<pre>
+$ squeue --job=93
+JOBID PARTITION NAME USER ST TIME NODES NODELIST
+93+0 debug bash adam R 19:18 1 nid00001
+93+1 debug bash adam R 19:18 1 nid00011
+93+2 debug bash adam R 19:18 1 nid00021
+$ scancel 93+1
+$ squeue --job=93
+JOBID PARTITION NAME USER ST TIME NODES NODELIST
+93+0 debug bash adam R 19:38 1 nid00001
+93+2 debug bash adam R 19:38 1 nid00021
+$ scancel 93
+$ squeue --job=93
+JOBID PARTITION NAME USER ST TIME NODES NODELIST
+</pre>
<h2><a name="limitations">Limitations</a></h2>
<p>In a federation of clusters, a heterogeneous job will execute entirely on
......
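The "#+#" notation in the documentation hunk above can be illustrated with a small parser. This is a minimal sketch and not Slurm's own parsing code: `pack_id_t` and `parse_pack_id` are hypothetical names introduced here. The rule that a plain "#" addresses the whole pack job, while "#+#" addresses a single component, is taken from the text of the commit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result type for illustration; not a Slurm structure. */
typedef struct {
	uint32_t pack_job_id;     /* job_id of the pack leader */
	uint32_t pack_job_offset; /* component index, meaningful only if has_offset */
	bool has_offset;          /* true for "#+#", false for a plain "#" */
} pack_id_t;

/* Parse "#" or "#+#". Returns 0 on success, -1 on malformed input. */
static int parse_pack_id(const char *str, pack_id_t *out)
{
	char *end = NULL;
	unsigned long long id, off;

	assert(out != NULL);
	if (!str || !isdigit((unsigned char) str[0]))
		return -1;
	id = strtoull(str, &end, 10);
	if (id > UINT32_MAX)
		return -1;
	out->pack_job_id = (uint32_t) id;
	out->pack_job_offset = 0;
	out->has_offset = false;
	if (*end == '\0')
		return 0;	/* plain "#": addresses every component */

	/* Anything after the job id must be '+' followed by digits only */
	if ((*end != '+') || !isdigit((unsigned char) end[1]))
		return -1;
	off = strtoull(end + 1, &end, 10);
	if ((*end != '\0') || (off > UINT32_MAX))
		return -1;
	out->pack_job_offset = (uint32_t) off;
	out->has_offset = true;	/* "#+#": addresses one component */
	return 0;
}
```

With this sketch, "123+4" yields `pack_job_id` 123 and `pack_job_offset` 4 (the fifth component, since offsets start at 0), while "93" leaves `has_offset` false, which a caller could treat as "apply to all components".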
......@@ -4785,6 +4785,24 @@ extern int job_signal(uint32_t job_id, uint16_t signal, uint16_t flags,
return _job_signal(job_ptr, signal, flags, uid, preempt);
}
 
+/* Signal all components of a pack job */
+static int _pack_job_signal(struct job_record *job_ptr, uint16_t signal,
+			    uint16_t flags, uid_t uid, bool preempt)
+{
+	ListIterator iter;
+	int rc = SLURM_SUCCESS, rc1;
+	struct job_record *pack_job_ptr;
+
+	iter = list_iterator_create(job_ptr->pack_job_list);
+	while ((pack_job_ptr = (struct job_record *) list_next(iter))) {
+		rc1 = _job_signal(pack_job_ptr, signal, flags, uid, preempt);
+		rc = MAX(rc, rc1);
+	}
+	list_iterator_destroy(iter);
+
+	return rc;
+}
+
/*
* job_str_signal - signal the specified job
* IN job_id_str - id of the job to be signaled, valid formats include "#"
......@@ -4830,6 +4848,10 @@ extern int job_str_signal(char *job_id_str, uint16_t signal, uint16_t flags,
 int jobs_done = 0, jobs_signalled = 0;
 struct job_record *job_ptr_done = NULL;
 job_ptr = find_job_record(job_id);
+if (job_ptr && job_ptr->pack_job_list) {
+	return _pack_job_signal(job_ptr, signal, flags, uid,
+				preempt);
+}
 if (job_ptr && (job_ptr->array_task_id == NO_VAL) &&
     (job_ptr->array_recs == NULL)) {
 	/* This is a regular job, not a job array */
......
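The loop added in `_pack_job_signal` signals every component and keeps the largest return code. The standalone sketch below shows that aggregation pattern without any Slurm headers. Everything in it is a stand-in: `signal_all`, `fake_signal`, `fake_calls`, the plain array used in place of Slurm's list type, and the error value 7 are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX(a, b) (((a) > (b)) ? (a) : (b))

/* Hypothetical callback type standing in for a per-job signal routine. */
typedef int (*signal_fn)(uint32_t job_id, int sig);

/* Counts how often the stub below was invoked (illustration only). */
static int fake_calls = 0;

/* Hypothetical stub: job 94 fails with the made-up code 7, others succeed. */
static int fake_signal(uint32_t job_id, int sig)
{
	(void) sig;
	fake_calls++;
	return (job_id == 94) ? 7 : 0;
}

/*
 * Signal every job in ids[] and return the largest return code seen.
 * A failure on one component does not stop the remaining ones from
 * being signaled.
 */
static int signal_all(const uint32_t *ids, size_t n, int sig, signal_fn fn)
{
	int rc = 0;	/* 0 plays the role of "success" in this sketch */

	assert(fn != NULL);
	for (size_t i = 0; i < n; i++) {
		int rc1 = fn(ids[i], sig);
		rc = MAX(rc, rc1);
	}
	return rc;
}
```

One property of this pattern worth noting: because the accumulator starts at 0 and is combined with `MAX`, only positive codes propagate to the caller. A negative return from the per-job routine would be masked by the initial 0.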