NEWS

This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.

* Changes in SLURM 2.2.0.rc3
============================
 -- Modify sacctmgr command to accept plural versions of options (e.g. "Users"
    in addition to "User"). Patch from Don Albert, BULL.
 -- BLUEGENE - make it so reset of boot counter happens only on state change
    and not when a new job comes along.
 -- Modify srun and salloc signal handling so they can be interrupted while
    waiting for an allocation. This was broken in version 2.2.0.rc2.
 -- Fix NULL pointer reference in sview. Patch from Gerrit Renker, CSCS.
 -- Fix file descriptor leak in slurmstepd on spank_task_post_fork() failure.
    Patch from Gerrit Renker, CSCS.
 -- Fix bug in preserving job state information when upgrading from SLURM
    version 2.1. Bug introduced in version 2.2.0-pre10. Patch from Par
    Andersson, NSC.

* Changes in SLURM 2.2.0.rc2
============================
 -- Fix memory leak in job step allocation logic. Patch from Hongjia Cao, NUDT.
 -- If a preempted job was submitted with the --no-requeue option then cancel
    rather than requeue it.
 -- Fix for problems when adding a user for the first time to a new cluster
    with a 2.1 sacctmgr without specifying a default account.
 -- Resend TERMINATE_JOB message only to nodes that the job still has not
    terminated on. Patch from Hongjia Cao, NUDT.
 -- Treat time limit specification of "0:300" as a request for 300 seconds
    (5 minutes) instead of one minute.
 -- Modify sched/backfill plugin logic to continue working its way down the
    queue of jobs rather than restarting at the top if there are no changes in
    job, node, or partition state between runs. Patch from Hongjia Cao, NUDT.
 -- Improve scalability of select/cons_res logic. Patch from Matthieu Hautreux,
    CEA.
 -- Fix for possible deadlock in the slurmstepd when cancelling a job that is
    also writing a large amount of data to stderr.
 -- Fix in select/cons_res to eliminate "mem underflow" error when the
    slurmctld is reconfigured while a job is in completing state.
 -- Send a message to a user's job when its real or virtual memory limit is
    exceeded.
 -- Apply rlimits right before exec'ing the user's task so as to lower the risk
    of the task exiting because the slurmstepd ran over a limit (log file
    size, etc.)
 -- Add scontrol command of "uhold <job_id>" so that an administrator can hold
    a job and let the job's owner release it. The scontrol command of
    "hold <job_id>" when executed by a SLURM administrator can only be released
    by a SLURM administrator and not the job owner.
 -- Change atoi to slurm_atoul in mysql plugin, needed for running on 32-bit
    systems in some cases.
 -- If a batch job is found to be missing from a node, make its termination
    state be NODE_FAIL rather than CANCELLED.
 -- Fatal error put back if running a bluegene or cray plugin from a controller
    not of that type.
 -- Make sure jobacct_gather plugin is not shut down before messing with the
    process list.
 -- Modify signal handling in srun and salloc commands to avoid deadlock if the
    malloc function is interrupted and called again. The malloc function is
    thread safe, but not reentrant, which is a problem for signal handling if
    the malloc function itself has a lock. Problem fixed by moving signal
    handling in those commands to a new pthread.
 -- In srun set job abort flag on completion to handle the case when a user
    cancels a job while the node is not responding but slurmctld has not yet
    set the node down. Patch from Hongjia Cao, NUDT.
 -- Streamline the PMI logic if no duplicate keys are included in the key-pairs
    managed. Substantially improves performance for large numbers of tasks.
    Adds support for SLURM_PMI_KVS_NO_DUP_KEYS environment variable. Patch
    from Hongjia Cao, NUDT.
 -- Fix issues with sview dealing with older versions of sview and saving
    defaults.
 -- Remove references to --mincores, --minsockets, and --minthreads from the
    salloc, sbatch and srun man pages. These options are defunct. Patch from
    Rod Schultz, Bull.
 -- Made openssl not be required to build RPMs; it is no longer required
    since munge is the default crypto plugin.
 -- sacctmgr now has smarts to figure out if a qos is a default qos when
    modifying a user/acct or removing a qos.
 -- For reservations on BlueGene systems, set and report c-node counts rather
    than midplane counts.
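
The "0:300" time limit entry above can be illustrated with a minimal sketch.
This is not SLURM's parser; it assumes only the "minutes:seconds" form named
in that entry, and parse_min_sec is a hypothetical helper name.

```python
def parse_min_sec(spec: str) -> int:
    """Convert a "minutes:seconds" string to total seconds.

    Hypothetical helper, not part of SLURM. The seconds field is not
    clamped to 0-59, so "0:300" yields 300 seconds.
    """
    minutes_text, seconds_text = spec.split(":")
    minutes = int(minutes_text)
    seconds = int(seconds_text)
    if minutes < 0 or seconds < 0:
        raise ValueError("time fields must be non-negative")
    return minutes * 60 + seconds


print(parse_min_sec("0:300"))  # 300 seconds, i.e. 5 minutes
```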

* Changes in SLURM 2.2.0.rc1
============================
 -- Add show_flags parameter to the slurm_load_block_info() function.
 -- perlapi has been brought up to speed courtesy of Hongjia Cao. (make sure to
    run 'make clean' if building in a different dir than source)
 -- Fixed regression in pre12 in crypto/munge when running with
    --enable-multiple-slurmd which would cause the slurmd's to core.
 -- Fixed regression where cpu count wasn't figured out correctly for steps.
 -- Fixed issue when using old mysql that can't handle a '.' in the table
    name.
 -- Mysql plugin works correctly without the SlurmDBD
 -- Added ability to query batch step with sstat. Currently no accounting data
    is stored for the batch step, but the internals are in place if we decide
    to do that in the future.
 -- Fixed some backwards compatibility issues with 2.2 talking to 2.1.
 -- Fixed regression where modifying associations didn't get sent to the
    slurmctld.
 -- Made sshare sort things the same way sacctmgr list assoc does
    (alphabetically).
 -- Fixed issue with default accounts being set up correctly.
 -- Changed sorting in the slurmctld so sshare output is similar to that of
    sacctmgr list assoc.
 -- Modify reservation logic so that daily and weekly reservations maintain
    the same time when daylight saving time starts or ends in the interim.
 -- Edit to make reservations handle updates to associations.
 -- Added the derived exit code to the slurmctld job record and the derived
    exit code and string to the job record in the SLURM db.
 -- Added slurm-sjobexit RPM for SLURM job exit code management tools.
 -- Added ability to use sstat/sacct against the batch step.
 -- Added OnlyDefaults option to sacctmgr list associations.
 -- Modified the fairshare priority formula to F = 2**(-Ue/S)
 -- Modify the PMI functions key-pair exchange function to support a 32-bit
    counter for larger job sizes. Patch from Hongjia Cao, NUDT.
 -- In sched/builtin - Make the estimated job start time logic faster (borrowed
    new logic from sched/backfill and added pthread) and more accurate.
 -- In select/cons_res fix bug that could result in a job being allocated zero
    CPUs on some nodes. Patch from Hongjia Cao, NUDT.
 -- Fix bug in sched/backfill that could set expected start time of a job too
    far in the future.
 -- Added ability to enforce new limits given to associations/qos on
    pending jobs.
 -- Increase max message size for the slurmdbd from 1000000 to 16*1024*1024
 -- Increase number of active threads in the slurmdbd from 50 to 100
 -- Fixed small bug in src/common/slurmdb_defs.c reported by Bjorn-Helge Mevik
 -- Fixed sacctmgr's ability to query associations against qos again.
 -- Fixed sview show config on non-bluegene systems.
 -- Fixed bug in selecting jobs based on sacct -N option
 -- Fix bug that prevented job Epilog from running more than once on a node if
    a job was requeued and started no job steps.
 -- Fixed issue where node index wasn't stored correctly when using DBD.
 -- Enable srun's use of the --nodes option with --exclusive (previously the
    --nodes option was ignored).
 -- Added UsageThreshold and Flags to the QOS object.
 -- Patch to improve thread safety in the mysql plugins.
 -- Add support for fair-share scheduling to be based upon resource use at
    the level of bank accounts and ignore use of individual users. Patch by
    Par Andersson, National Supercomputer Centre, Sweden.
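
The fairshare formula F = 2**(-Ue/S) listed above can be evaluated with a
short sketch. Reading Ue as an association's effective usage and S as its
shares is an assumption here, since the entry does not define the symbols;
fairshare_factor is a hypothetical helper name, not a SLURM function.

```python
def fairshare_factor(effective_usage: float, shares: float) -> float:
    """Evaluate F = 2**(-Ue/S).

    Assumption: Ue is the effective usage and S is the shares value,
    both expressed as fractions on the same scale.
    """
    if shares <= 0:
        raise ValueError("shares must be positive")
    return 2.0 ** (-effective_usage / shares)


# No usage gives the maximum factor; usage equal to shares halves it.
print(fairshare_factor(0.0, 0.25))
print(fairshare_factor(0.25, 0.25))
```

Under this reading the factor falls by half each time usage grows by one
multiple of the shares value.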

* Changes in SLURM 2.2.0.pre12
==============================
 -- Log if Prolog or Epilog run for longer than MessageTimeout / 2.
 -- Log the RPC number associated with messages from slurmctld that timeout.
 -- Fix bug in select/cons_res logic when job allocation includes --overcommit
    and --ntasks-per-node options and the node has fewer CPUs than the count
    specified by --ntasks-per-node.
 -- Fix bug in gang scheduling and job preemption logic so that preempted jobs
    get resumed properly after a slurmctld hot-start.
 -- Fix bug in select/linear handling of gang scheduled jobs that could result
    in run_job_cnt underflow error message.
 -- Fix bug in gang scheduling logic to properly support partitions added
    using the scontrol command.
 -- Fix a segmentation fault in sview where the 'excluded_partitions' field
    was set to NULL, caused by the absence of ~/.slurm/sviewrc.
 -- Rewrote some calls to is_user_any_coord() in src/plugins/accounting_storage
    modules to make use of is_user_any_coord()'s return value.
 -- Add configure option of --with-dimensions=#.
 -- Modify srun ping logic so that srun would only be considered not responsive
    if three ping messages were not responded to. Patch from Hongjia Cao (NUDT).
 -- Preserve a node's ReasonTime field after scontrol reconfig command. Patch
    from Hongjia Cao (NUDT).
 -- Added the authority for users with AdminLevel's defined in the SLURM db
    (Operators and Admins) and account coordinators to invoke commands that
    affect jobs, reservations, nodes, etc.
 -- Fix for slurmd restart on completing node with no tasks to get the correct
    state, completing. Patch from Hongjia Cao (NUDT).
 -- Prevent scontrol setting a node's Reason="". Patch from Hongjia Cao (NUDT).
 -- Add new functions hostlist_ranged_string_malloc,
    hostlist_ranged_string_xmalloc, hostlist_deranged_string_malloc, and
    hostlist_deranged_string_xmalloc which will allocate memory as needed.
 -- Make the slurm commands support both the --cluster and --clusters options.
    Previously, some commands supported one of those options, but not the
    other.
 -- Fix bug when resizing a job that has steps running on some of those nodes.
    Avoid killing the job step on remaining nodes. Patch from Rod Schultz
    (BULL). Also fix bug related to tracking the CPUs allocated to job steps
    on each node after releasing some nodes from the job's allocation.
 -- Applied patch from Rod Schultz / Matthieu Hautreux to keep the Node-to-Host
    cache from becoming corrupted when a hostname cannot be resolved.
 -- Export more symbols in libslurm for job and node state information
    translation (numbers to strings). Patch from Hongjia Cao, NUDT.
 -- Add logic to retry sending RESPONSE_LAUNCH_TASKS messages from slurmd to
    srun. Patch from Hongjia Cao, NUDT.
 -- Modify bit_unfmt_hexmask() and bit_unfmt_binmask() functions to clear the
    bitmap input before setting the bits indicated in the input string.
 -- Add SchedulerParameters option of bf_window to control how far into the
    future the backfill scheduler will look when considering jobs to start.
    The default value is one day. See "man slurm.conf" for details.
 -- Fix bug that can result in duplicate job termination records in accounting
    when slurmctld restarts or reconfigures.
 -- Modify plugin and library logic as needed to support use of the function
    slurm_job_step_stat() from user commands.
 -- Fix race condition in which PrologSlurmctld failure could cause slurmctld
    to abort.
 -- Fix bug preventing users in secondary user groups from being granted access
    to partitions configured with AllowGroups.
 -- Added support for a default account and wckey per cluster within accounting.
 -- Modified select/cons_res plugin so that if MaxMemPerCPU is configured and a
    job specifies its memory requirement, then more CPUs than requested will
    automatically be allocated to a job to honor the MaxMemPerCPU parameter.
 -- Added the derived_ec (exit_code) member to job_info_t.  exit_code captures
    the exit code of the job script (or salloc) while derived_ec contains the
    highest exit code of all the job steps.
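
The relationship between exit_code and derived_ec described above can be
sketched as follows. This is an illustration only: exit codes are modelled
as plain integers rather than wait-status values, derived_exit_code is a
hypothetical helper name, and returning 0 for a job with no steps is an
assumption.

```python
from typing import Iterable


def derived_exit_code(step_exit_codes: Iterable[int]) -> int:
    """Return the highest exit code among a job's steps.

    Assumption: a job that ran no steps gets a derived code of 0.
    """
    return max(step_exit_codes, default=0)


# The job script's own exit code is tracked separately from the steps.
script_exit_code = 0
steps = [0, 2, 1]
print(script_exit_code, derived_exit_code(steps))
```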