Avoid eternal requeue if the SlurmctldProlog fails

Reset the prep_prolog_failed flag before requeuing. Otherwise, once SlurmctldProlog has failed for a job, the flag stays set, so the job is requeued again after every later prolog run and never launches. Bug 11574.
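
The failure mode can be illustrated with a small standalone model. This is only a sketch and not slurmctld code: struct toy_job, toy_attempt() and toy_run() are hypothetical names invented here, and only the prep_prolog_failed field name is taken from the patch. It assumes the first prolog run fails and every later run succeeds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the job record; only the flag matters here. */
struct toy_job {
	bool prep_prolog_failed;	/* set when a prolog run fails */
	int requeue_count;		/* how many times the job was requeued */
	bool launched;			/* true once the job gets past the prolog */
};

/*
 * Model one prolog run followed by the completion callback.
 * reset_flag == true models the patched behaviour (flag cleared
 * before requeuing); false models the behaviour before the patch.
 */
static void toy_attempt(struct toy_job *job, bool prolog_ok, bool reset_flag)
{
	if (!prolog_ok)
		job->prep_prolog_failed = true;

	if (job->prep_prolog_failed) {
		if (reset_flag)
			job->prep_prolog_failed = false;
		job->requeue_count++;
		return;
	}
	job->launched = true;
}

/*
 * Run up to max_attempts attempts: the first prolog fails, all later
 * ones succeed. Returns the number of requeues and reports through
 * *launched whether the job ever started.
 */
static int toy_run(bool reset_flag, int max_attempts, bool *launched)
{
	struct toy_job job = { false, 0, false };

	assert(max_attempts > 0);
	for (int i = 0; i < max_attempts && !job.launched; i++)
		toy_attempt(&job, i > 0, reset_flag);

	*launched = job.launched;
	return job.requeue_count;
}
```

With reset_flag set, the model requeues once and launches on the second attempt. Without it, the stale flag forces a requeue on every attempt, however many are allowed.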

Commit f442de77, authored 3 years ago by Carlos Tripiana Montes, committed by Tim Wickberg 3 years ago
--- a/NEWS
+++ b/NEWS
@@ -21,6 +21,7 @@ documents those changes that are of interest to users and administrators.
 -- srun - fix broken node step allocation in a heterogeneous allocation.
 -- Fail step creation if -n is not multiple of --ntasks-per-gpu.
 -- job_container/tmpfs - Fix slowdown on teardown.
+ -- Fix problem with SlurmctldProlog where requeued jobs would never launch.
  
 * Changes in Slurm 20.11.7
 ==========================

--- a/src/slurmctld/prep_slurmctld.c
+++ b/src/slurmctld/prep_slurmctld.c
@@ -74,6 +74,7 @@ extern void prep_prolog_slurmctld_callback(int rc, uint32_t job_id)

 	/* all async prologs have completed, continue on now */
 	if (job_ptr->prep_prolog_failed) {
+		job_ptr->prep_prolog_failed = false;
 		if ((rc = job_requeue(0, job_id, NULL, false, 0))) {
 			info("unable to requeue JobId=%u: %s", job_id,
 			     slurm_strerror(rc));