This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.2.0.rc4
-- Correction in logic to spread out over time highly parallel messages to
minimize lost messages. Affects slurmd epilog complete messages and PMI
key-pair transmissions. Patch from Gerrit Renker, CSCS.
-- Fixed issue where a system had unsent messages to the dbd in 2.1 and then
upgraded to 2.2. Messages are now processed correctly.
-- Fixed issue where assoc_mgr cache wasn't always loaded correctly if the
slurmdbd wasn't running when the slurmctld was started.
-- Make sure on a pthread create in step launch that the error code is looked
-- Fix setting up default acct/wckey when upgrading from 2.1 to 2.2.
-- Fix issue with associations attached to a specific partition with no other
association, and requesting a different partition.
-- Added perlapi for the slurmdb to the slurm.spec.
-- In sched/backfill, correct handling of CompleteWait parameter to avoid
backfill scheduling while a job is completing. Patch from Gerrit Renker,
-- Send message back to user when trying to launch job on a compute node
lacking that user ID. Patch from Hongjia Cao, NUDT.
-- BLUEGENE - Fix it so 1 midplane clusters will run small block jobs.
-- Add Command and WorkDir to the output of "scontrol show job" for job
allocations created using srun (not just sbatch).
-- Fixed sacctmgr to not add blank defaultqos' when doing a cluster dump.
-- Correct processing of memory and disk space specifications in the salloc,
sbatch, and srun commands to work properly with a suffix of "MB", "GB",
etc. and not only with a single letter (e.g. "M", "G", etc.).
-- Prevent nodes with suspended jobs from being powered down by SLURM.
-- Normalized the way pidfiles are created by the slurm daemons.
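
The suffix handling described in the entry on memory and disk space
specifications ("MB", "GB" as well as "M", "G") can be illustrated with a
minimal sketch. This is not SLURM's parser: the helper name parse_size_mb,
the default unit of megabytes, and the 1024-based multipliers are assumptions
made for illustration only.

```python
import re

# Multipliers relative to megabytes (assumed base unit for this sketch).
_UNITS = {"M": 1, "G": 1024, "T": 1024 * 1024}

# A number, then an optional unit letter that may be followed by "B".
_SIZE_RE = re.compile(r"\s*(\d+)\s*(?:([MGT])B?)?\s*", re.IGNORECASE)


def parse_size_mb(spec: str) -> int:
    """Convert '512', '4G' or '4GB' to megabytes (hypothetical helper)."""
    match = _SIZE_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"invalid size specification: {spec!r}")
    value = int(match.group(1))
    # A bare number is taken to be megabytes in this sketch.
    unit = (match.group(2) or "M").upper()
    return value * _UNITS[unit]
```

With this sketch, parse_size_mb("4G") and parse_size_mb("4GB") give the same
result, which is the behavior the entry describes.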
* Changes in SLURM 2.2.0.rc3
-- Modify sacctmgr command to accept plural versions of options (e.g. "Users"
in addition to "User"). Patch from Don Albert, BULL.
-- BLUEGENE - make it so reset of boot counter happens only on state change
and not when a new job comes along.
-- Modify srun and salloc signal handling so they can be interrupted while
waiting for an allocation. This was broken in version 2.2.0.rc2.
-- Fix NULL pointer reference in sview. Patch from Gerrit Renker, CSCS.
-- Fix file descriptor leak in slurmstepd on spank_task_post_fork() failure.
Patch from Gerrit Renker, CSCS.
-- Fix bug in preserving job state information when upgrading from SLURM
version 2.1. Bug introduced in version 2.2.0-pre10. Patch from Par
Andersson, NSC.
-- Fix bug where, when using the slurmdbd, some accounting information could
be lost if a job wasn't able to start right away.
-- BLUEGENE - when a prolog failure happens the offending block is put in
an error state.
-- Changed the last column heading of the sshare output from "FS Usage" to
"FairShare" and added more detail to the sshare man page.
-- Fix bug in enforcement of reservation by account name. Used wrong index
into an array. Patch from Gerrit Renker, CSCS.
-- Modify job_submit/lua plugin to treat any non-zero return code from the
job_submit and job_modify functions as an error and the user request should
-- Fix bug which would permit pending job to be started on completing node
when job preemption is configured.
* Changes in SLURM 2.2.0.rc2
-- Fix memory leak in job step allocation logic. Patch from Hongjia Cao, NUDT.
-- If a preempted job was submitted with the --no-requeue option then cancel
rather than requeue it.
-- Fix for problems when adding a user for the first time to a new cluster
with a 2.1 sacctmgr without specifying a default account.
-- Resend TERMINATE_JOB message only to nodes that the job still has not
terminated on. Patch from Hongjia Cao, NUDT.
-- Treat time limit specification of "0:300" as a request for 300 seconds
(5 minutes) instead of one minute.
-- Modify sched/backfill plugin logic to continue working its way down the
queue of jobs rather than restarting at the top if there are no changes in
job, node, or partition state between runs. Patch from Hongjia Cao, NUDT.
-- Improve scalability of select/cons_res logic. Patch from Matthieu Hautreux,
-- Fix for possible deadlock in the slurmstepd when cancelling a job that is
also writing a large amount of data to stderr.
-- Fix in select/cons_res to eliminate "mem underflow" error when the
slurmctld is reconfigured while a job is in completing state.
-- Send a message to a user's job when its real or virtual memory limit
is exceeded.
-- Apply rlimits right before execing the user's task to lower the risk of
the task exiting because the slurmstepd ran over a limit (log file size,
-- Add scontrol command of "uhold <job_id>" so that an administrator can hold
a job and let the job's owner release it. The scontrol command of
"hold <job_id>" when executed by a SLURM administrator can only be released
by a SLURM administrator and not the job owner.
-- Change atoi to slurm_atoul in mysql plugin, needed for running on 32-bit
systems in some cases.
-- If a batch job is found to be missing from a node, make its termination
state be NODE_FAIL rather than CANCELLED.
-- Fatal error put back if running a bluegene or cray plugin from a controller
not of that type.
-- Make sure jobacct_gather plugin is not shut down before messing with the
process list.
-- Modify signal handling in srun and salloc commands to avoid deadlock if the
malloc function is interrupted and called again. The malloc function is
thread safe, but not reentrant, which is a problem when signal handling if
the malloc function itself has a lock. Problem fixed by moving signal
handling in those commands to a new pthread.
-- In srun set job abort flag on completion to handle the case when a user
cancels a job while the node is not responding but slurmctld has not yet
marked the node down. Patch from Hongjia Cao, NUDT.
-- Streamline the PMI logic if no duplicate keys are included in the key-pairs
managed. Substantially improves performance for large numbers of tasks.
Adds support for SLURM_PMI_KVS_NO_DUP_KEYS environment variable. Patch
from Hongjia Cao, NUDT.
-- Fix issues with sview dealing with older versions of sview and saving
-- Remove references to --mincores, --minsockets, and --minthreads from the
salloc, sbatch and srun man pages. These options are defunct. Patch from
Rod Schultz, Bull.
-- Made openssl no longer required to build RPMs, since munge is the default
crypto plugin.
-- sacctmgr now has smarts to figure out if a qos is a default qos when
modifying a user/acct or removing a qos.
-- For reservations on BlueGene systems, set and report c-node counts rather
than midplane counts.
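
The time limit change in this release (treating "0:300" as 300 seconds rather
than one minute) amounts to not capping the seconds field. A minimal sketch of
that interpretation follows. It is not SLURM's parser: the helper name
time_limit_seconds is hypothetical, and only the "minutes" and
"minutes:seconds" forms are handled here.

```python
def time_limit_seconds(spec: str) -> int:
    """Convert 'minutes' or 'minutes:seconds' to seconds (hypothetical helper).

    The seconds field is deliberately not limited to 0-59, so "0:300"
    yields 300 seconds (5 minutes).
    """
    minutes_text, sep, seconds_text = spec.strip().partition(":")
    minutes = int(minutes_text)
    seconds = int(seconds_text) if sep else 0
    if minutes < 0 or seconds < 0:
        raise ValueError(f"negative time limit: {spec!r}")
    return minutes * 60 + seconds
```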
* Changes in SLURM 2.2.0.rc1
-- Add show_flags parameter to the slurm_load_block_info() function.
-- perlapi has been brought up to speed courtesy of Hongjia Cao. (make sure to
run 'make clean' if building in a different dir than source)
-- Fixed regression in pre12 in crypto/munge when running with
--enable-multiple-slurmd which would cause the slurmd's to core.
-- Fixed regression where cpu count wasn't figured out correctly for steps.
-- Fixed issue when using old mysql that can't handle a '.' in the table
-- Mysql plugin works correctly without the SlurmDBD
-- Added ability to query batch step with sstat. Currently no accounting data
is stored for the batch step, but the internals are in place if we decide to
do that in the future.
-- Fixed some backwards compatibility issues with 2.2 talking to 2.1.
-- Fixed regression where modifying associations didn't get sent to the
-- Made sshare sort things the same way sacctmgr list assoc does.
-- Fixed issue with default accounts not being set up correctly.
-- Changed sorting in the slurmctld so sshare output is similar to that of
sacctmgr list assoc.
-- Modify reservation logic so that daily and weekly reservations maintain
the same time when daylight savings time starts or ends in the interim.
-- Edit to make reservations handle updates to associations.
-- Added the derived exit code to the slurmctld job record and the derived
exit code and string to the job record in the SLURM db.
-- Added slurm-sjobexit RPM for SLURM job exit code management tools.
-- Added ability to use sstat/sacct against the batch step.
-- Added OnlyDefaults option to sacctmgr list associations.
-- Modified the fairshare priority formula to F = 2**(-Ue/S)
-- Modify the PMI functions key-pair exchange function to support a 32-bit
counter for larger job sizes. Patch from Hongjia Cao, NUDT.
-- In sched/builtin - Make the estimated job start time logic faster (borrowed
new logic from sched/backfill and added pthread) and more accurate.
-- In select/cons_res fix bug that could result in a job being allocated zero
CPUs on some nodes. Patch from Hongjia Cao, NUDT.
-- Fix bug in sched/backfill that could set expected start time of a job too
far in the future.
-- Added ability to enforce new limits given to associations/qos on
pending jobs.
-- Increase max message size for the slurmdbd from 1000000 to 16*1024*1024
-- Increase number of active threads in the slurmdbd from 50 to 100
-- Fixed small bug in src/common/slurmdb_defs.c reported by Bjorn-Helge Mevik
-- Fixed sacctmgr's ability to query associations against qos again.
-- Fixed sview show config on non-bluegene systems.
-- Fixed bug in selecting jobs based on sacct -N option
-- Fix bug that prevented job Epilog from running more than once on a node if
a job was requeued and started no job steps.
-- Fixed issue where node index wasn't stored correctly when using DBD.
-- Enable srun's use of the --nodes option with --exclusive (previously the
--nodes option was ignored).
-- Added UsageThreshold and Flags to the QOS object.
-- Patch to improve threadsafeness in the mysql plugins.
-- Add support for fair-share scheduling to be based upon resource use at
the level of bank accounts and ignore use of individual users. Patch by
Par Andersson, National Supercomputer Centre, Sweden.
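
The fairshare priority formula listed in this release, F = 2**(-Ue/S), can be
evaluated with a short sketch. The entry does not define its symbols, so this
example assumes Ue is an association's effective usage and S is its normalized
share, both expressed as fractions. The function name fairshare_factor is
hypothetical.

```python
def fairshare_factor(effective_usage: float, norm_shares: float) -> float:
    """Evaluate F = 2**(-Ue/S) under the assumed meaning of Ue and S."""
    if norm_shares <= 0:
        raise ValueError("normalized shares must be positive")
    return 2.0 ** (-effective_usage / norm_shares)
```

Under these assumptions, zero usage gives a factor of 1.0, usage equal to the
share gives 0.5, and each further multiple of the share halves the factor.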
* Changes in SLURM 2.2.0.pre12
-- Log if Prolog or Epilog run for longer than MessageTimeout / 2.
-- Log the RPC number associated with messages from slurmctld that timeout.
-- Fix bug in select/cons_res logic when job allocation includes --overcommit
and --ntasks-per-node options and the node has fewer CPUs than the count
specified by --ntasks-per-node.
-- Fix bug in gang scheduling and job preemption logic so that preempted jobs
get resumed properly after a slurmctld hot-start.
-- Fix bug in select/linear handling of gang scheduled jobs that could result
in run_job_cnt underflow error message.
-- Fix bug in gang scheduling logic to properly support partitions added
using the scontrol command.
-- Fix a segmentation fault in sview where the 'excluded_partitions' field
was set to NULL, caused by the absence of ~/.slurm/sviewrc.
-- Rewrote some calls to is_user_any_coord() in src/plugins/accounting_storage
modules to make use of is_user_any_coord()'s return value.
-- Add configure option of --with-dimensions=#.
-- Modify srun ping logic so that srun would only be considered not responsive
if three ping messages were not responded to. Patch from Hongjia Cao (NUDT).
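
The srun ping change in this release can be pictured as a simple missed-ping
counter. The sketch below is not the srun implementation: the class name
PingTracker is hypothetical, and it assumes the three unanswered pings must be
consecutive, which the entry does not state.

```python
class PingTracker:
    """Count unanswered pings (hypothetical helper, not srun code)."""

    def __init__(self, max_missed: int = 3) -> None:
        self.max_missed = max_missed
        self.missed = 0

    def record(self, responded: bool) -> bool:
        """Record one ping result; return True while considered responsive."""
        # Assumption: any reply resets the count of missed pings.
        self.missed = 0 if responded else self.missed + 1
        return self.missed < self.max_missed
```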