From b414712e4407c98ec231b5c0c3d956c5a732f74c Mon Sep 17 00:00:00 2001
From: Moe Jette <jette1@llnl.gov>
Date: Sat, 9 Apr 2011 23:57:09 +0000
Subject: [PATCH] slurmctld: keep original nice value when putting job on hold

The current code erases the old nice value (both negative and positive) when a job is
put on hold so that the job has a 0 nice component upon release.

This causes problems when the nice value was set at submission time for a reason,
for instance when
 * a system administrator has allowed a negative nice value to be set;
 * the user wants to keep this as a low-priority job so that his/her other jobs
   go first (independent of the hold option);
 * the nice value carries other semantics - at our site, for instance, we use it
   for "base priority values" computed from how much of its quota a given group
   has already (over)used.

Here is an example which illustrates the loss of original nice values:

  [2011-03-31T09:47:53] sched: update_job: setting priority to 0 for job_id 55
  [2011-03-31T09:47:53] sched: update_job: setting priority to 0 for job_id 66
  [2011-03-31T09:47:53] sched: update_job: setting priority to 0 for job_id 77
  [2011-03-31T09:47:54] sched: update_job: setting priority to 0 for job_id 88
  [2011-03-31T09:47:54] sched: update_job: setting priority to 0 for job_id 99
  [2011-03-31T09:47:54] sched: update_job: setting priority to 0 for job_id 110

This is from user 'kraused' whose project 's310' is within the allocated quota and thus
has an initial nice value of -542 (set via the job_submit/lua plugin).

However, by putting his jobs on hold, he has lost this advantage:

  JOBID     USER   PRIORITY        AGE  FAIRSHARE    JOBSIZE  PARTITION        QOS   NICE
     55  kraused      15181        153          0       5028      10000          0      0
     66  kraused      15181        153          0       5028      10000          0      0
     77  kraused      15181        153          0       5028      10000          0      0
     88  kraused      15178        150          0       5028      10000          0      0
     99  kraused      15178        150          0       5028      10000          0      0
    110  kraused      15178        150          0       5028      10000          0      0

I believe that the nice value is reset for a reason, so the patch keeps the reset and
skips it only when the operation is a user/administrator hold (requested priority 0).
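
The intended behaviour can be sketched in isolation as follows. This is a simplified
model, not the slurmctld code: the struct, the function name and HOLD_PRIORITY are
hypothetical, the NICE_OFFSET value is assumed, the nice value is assumed to be stored
as NICE_OFFSET plus the user-visible value, and the INFINITE (release) case is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the real definitions live in the Slurm headers. */
#define NICE_OFFSET   10000u  /* stored nice value that corresponds to "nice 0" */
#define HOLD_PRIORITY 0u      /* hypothetical name: a requested priority of 0 is a hold */

/* Hypothetical, stripped-down stand-in for the job record. */
struct sketch_job {
	uint32_t priority;
	uint32_t nice;        /* assumed to be stored as NICE_OFFSET + user-visible nice */
};

/* Models only the nice handling of the patched branch in update_job(). */
void sketch_set_priority(struct sketch_job *job, uint32_t requested)
{
	assert(job != NULL);

	/* Reset the nice component on an explicit priority change, but not on a hold. */
	if (requested != HOLD_PRIORITY)
		job->nice = NICE_OFFSET;

	job->priority = requested;
}
```

With this model, holding a job whose stored nice is NICE_OFFSET - 542 leaves the nice
value untouched, while setting any non-zero priority resets it to NICE_OFFSET.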
---
 src/slurmctld/job_mgr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/slurmctld/job_mgr.c b/src/slurmctld/job_mgr.c
index ec564dc97cd..c2bbd945299 100644
--- a/src/slurmctld/job_mgr.c
+++ b/src/slurmctld/job_mgr.c
@@ -7045,7 +7045,8 @@ int update_job(job_desc_msg_t * job_specs, uid_t uid)
 			xfree(job_ptr->state_desc);
 		} else if (authorized ||
 			 (job_ptr->priority > job_specs->priority)) {
-			job_ptr->details->nice = NICE_OFFSET;
+			if (job_specs->priority != 0)
+				job_ptr->details->nice = NICE_OFFSET;
 			if (job_specs->priority == INFINITE) {
 				job_ptr->direct_set_prio = 0;
 				_set_job_prio(job_ptr);
-- 
GitLab