NEWS

This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.

* Changes in SLURM 2.3.1
========================
 -- Do not remove the backup slurmctld's pid file when it assumes control, only
    when it actually shuts down. Patch from Andriy Grytsenko (Massive Solutions
    Limited).
 -- Avoid clearing a job's reason from JobHeldAdmin or JobHeldUser when it is
    otherwise updated using scontrol or sview commands. Patch based upon work
    by Phil Eckert (LLNL).
 -- BLUEGENE - Fix for if changing the defined blocks in the bluegene.conf and
    jobs happen to be running on blocks not in the new config.
 -- Many cosmetic modifications to eliminate warning message from GCC version
    4.6 compiler.
 -- Fix for sview reservation tab when finding correct reservation.
 -- Fix for handling QOS limits per user on a reconfig of the slurmctld.
 -- Do not treat the absence of a gres.conf file as a fatal error on systems
    configured with GRES, but set GRES counts to zero.
 -- BLUEGENE - Update correctly the state in the reason of a block if an
    admin sets the state to error.
 -- BLUEGENE - handle reason of blocks in error more correctly between
    restarts of the slurmctld.
 -- BLUEGENE - Fix minor potential memory leak when setting block error reason.
 -- BLUEGENE - Fix if running in Static/Overlap mode and full system block
    is in an error state, won't deny jobs.
 -- Fix for accounting where your cluster isn't numbered in counting order
    (i.e. 1-9,0 instead of 0-9).  The bug would cause 'sacct -N nodename' to
    not give correct results on these systems.
 -- Fix to GRES allocation logic when resources are associated with specific
    CPUs on a node. Patch from Steve Trofinoff, CSCS.
 -- Fix bugs in sched/backfill with respect to QOS reservation support and job
    time limits. Patch from Alejandro Lucero Palau (Barcelona Supercomputer
    Center).
 -- BGQ - fix to set up corner correctly for sub block jobs.
 -- Major re-write of the CPU Management User and Administrator Guide (web
    page) by Martin Perry, Bull.
 -- BLUEGENE - If removing blocks from system that once existed cleanup of old
    block happens correctly now.
 -- Prevent slurmctld crashing with configuration of MaxMemPerCPU=0.
 -- Prevent job hold by operator or account coordinator of his own job from
    being an Administrator Hold rather than User Hold by default.
 -- Cray - Fix for srun.pl parsing to avoid adding spaces between option and
    argument (e.g. "-N2" parsed properly without changing to "-N 2").

* Changes in SLURM 2.3.0-2
==========================
 -- Fix for memory issue inside sview.
 -- Fix issue where if a job was pending and the slurmctld was restarted a
    variable wasn't initialized in the job structure making it so that job
    wouldn't run.

* Changes in SLURM 2.3.0
========================
 -- BLUEGENE - make sure we only set the jobinfo_select start_loc on a job
    when we are on a small block, not a regular one.
 -- BGQ - fix issue where not copying the correct amount of memory.
 -- BLUEGENE - fix clean start if jobs were running when the slurmctld was
    shutdown and then the system size changed.  This would probably only happen
    if you were emulating a system.
 -- Fix sview for calling a cray system from a non-cray system to get the
    correct geometry of the system.
 -- BLUEGENE - fix to correctly import pervious version of block state file.
 -- BLUEGENE - handle loading better when doing a clean start with static
    blocks.
 -- Add sinfo format and sort option "%n" for NodeHostName and "%o" for
    NodeAddr.
 -- If a job is deferred due to partition limits, then re-test those limits
    after a partition is modified. Patch from Don Lipari.
 -- Fix bug which would crash slurmcld if job's owner (not root) tries to clear
    a job's licenses by setting value to "".
 -- Cosmetic fix for printing out debug info in the priority plugin.
 -- In sview when switching from a bluegene machine to a regular linux cluster
    and vice versa the node->base partition lists will be displayed if setup
    in your .slurm/sviewrc file.
 -- BLUEGENE - Fix for creating full system static block on a BGQ system.
 -- BLUEGENE - Fix deadlock issue if toggling between Dynamic and Static block
    allocation with jobs running on blocks that don't exist in the static
    setup.
 -- BLUEGENE - Modify code to only give HTC states to BGP systems and not
    allow them on Q systems.
 -- BLUEGENE - Make it possible for an admin to define multiple dimension
    conn_types in a block definition.
 -- BGQ - Alter tools to output multiple dimensional conn_type.

* Changes in SLURM 2.3.0.rc2
============================
 -- With sched/wiki or sched/wiki2 (Maui or Moab scheduler), insure that a
    requeued job's priority is reset to zero.
 -- BLUEGENE - fix to run steps correctly in a BGL/P emulated system.
 -- Fixed issue where if there was a network issue between the slurmctld and
    the DBD where both remained up but were disconnected the slurmctld would
    get registered again with the DBD.
 -- Fixed issue where if the DBD connection from the ctld goes away because of
    a POLLERR the dbd_fail callback is called.
 -- BLUEGENE - Fix to smap command-line mode display.
 -- Change in GRES behavior for job steps: A job step's default generic
    resource allocation will be set to that of the job. If a job step's --gres
    value is set to "none" then none of the generic resources which have been
    allocated to the job will be allocated to the job step.
 -- Add srun environment value of SLURM_STEP_GRES to set default --gres value
    for a job step.
 -- Require SchedulerTimeSlice configuration parameter to be at least 5 seconds
    to avoid thrashing slurmd daemon.
 -- Cray - Fix to make nodes state in accounting consistent with state set by
    ALPS.
 -- Cray - A node DOWN to ALPS will be marked DOWN to SLURM only after reaching
    SlurmdTimeout. In the interim, the node state will be NO_RESPOND. This
    change makes behavior makes SLURM handling of the node DOWN state more
    consistent with ALPS. This change effects only Cray systems.
 -- Cray - Fix to work with 4.0.* instead of just 4.0.0
 -- Cray - Modify srun/aprun wrapper to map --exclusive to -F exclusive and
    --share to -F share. Note this does not consider the partition's Shared
    configuration, so it is an imperfect mapping of options.
 -- BLUEGENE - Added notice in the print config to tell if you are emulated
    or not.
 -- BLUEGENE - Fix job step scalability issue with large task count.
 -- BGQ - Improved c-node selection when asked for a sub-block job that
    cannot fit into the available shape.
 -- BLUEGENE - Modify "scontrol show step" to show  I/O nodes (BGL and BGP) or
    c-nodes (BGQ) allocated to each step. Change field name from "Nodes=" to
    "BP_List=".
 -- Code cleanup on step request to get the correct select_jobinfo.
 -- Memory leak fixed for rolling up accounting with down clusters.
 -- BGQ - fix issue where if first job step is the entire block and then the
    next parallel step is ran on a sub block, SLURM won't over subscribe cnodes.
 -- Treat duplicate switch name in topology.conf as fatal error. Patch from Rod
    Schultz, Bull
 -- Minor update to documentation describing the AllowGroups option for a
    partition in the slurm.conf.
 -- Fix problem with _job_create() when not using qos's.  It makes
    _job_create() consistent with similar logic in select_nodes().
 -- GrpCPURunMins in a QOS flushed out.
 -- Fix for squeue -t "CONFIGURING" to actually work.
 -- CRAY - Add cray.conf parameter of SyncTimeout, maximum time to defer job
    scheduling if SLURM node or job state are out of synchronization with ALPS.
 -- If salloc was run as interactive, with job control, reset the foreground
    process group of the terminal to the process group of the parent pid before
    exiting. Patch from Don Albert, Bull.
 -- BGQ - set up the corner of a sub block correctly based on a relative
    position in the block instead of absolute.
 -- BGQ - make sure the recently added select_jobinfo of a step launch request
    isn't sent to the slurmd where environment variables would be overwritten
    incorrectly.

* Changes in SLURM 2.3.0.rc1
============================
 -- NOTE THERE HAVE BEEN NEW FIELDS ADDED TO THE JOB AND PARTITION STATE SAVE
    FILES AND RPCS. PENDING AND RUNNING JOBS WILL BE LOST WHEN UPGRADING FROM
    EARLIER VERSION 2.3 PRE-RELEASES AND RPCS WILL NOT WORK WITH EARLIER
    VERSIONS.
 -- select/cray: Add support for Accelerator information including model and
    memory options.
 -- Cray systems: Add support to suspend/resume salloc command to insure that
    aprun does not get initiated when the job is suspended. Processes suspended
    and resumed are determined by using process group ID and parent process ID,
    so some processes may be missed. Since salloc runs as a normal user, it's
    ability to identify processes associated with a job is limited.
 -- Cray systems: Modify smap and sview to display all nodes even if multiple
    nodes exist at each coordinate.
 -- Improve efficiency of select/linear plugin with topology/tree plugin
    configured, Patch by Andriy Grytsenko (Massive Solutions Limited).
 -- For front-end architectures on which job steps are run (emulated Cray and
    BlueGene systems only), fix bug that would free memory still in use.
 -- Add squeue support to display a job's license information. Patch by Andy
    Roosen (University of Deleware).
 -- Add flag to the select APIs for job suspend/resume indicating if the action
    is for gang scheduling or an explicit job suspend/resume by the user. Only
    an explicit job suspend/resume will reset the job's priority and make
    resources exclusively held by the job available to other jobs.
 -- Fix possible invalid memory reference in sched/backfill. Patch by Andriy
    Grytsenko (Massive Solutions Limited).
 -- Add select_jobinfo to the task launch RPC. Based upon patch by Andriy
    Grytsenko (Massive Solutions Limited).
 -- Add DefMemPerCPU/Node and MaxMemPerCPU/Node to partition configuration.
    This improves flexibility when gang scheduling only specific partitions.
 -- Added new enums to print out when a job is held by a QOS instead of an
    association limit.
 -- Enhancements to sched/backfill performance with select/cons_res plugin.
    Patch from Bjørn-Helge Mevik, University of Oslo.
 -- Correct job run time reported by smap for suspended jobs.
 -- Improve job preemption logic to avoid preempting more jobs than needed.
 -- Add contribs/arrayrun tool providing support for job arrays. Contributed by
    Bjørn-Helge Mevik, University of Oslo. NOTE: Not currently packaged as RPM
    and manual file editing is required.
 -- When suspending a job, wait 2 seconds instead of 1 second between sending
    SIGTSTP and SIGSTOP. Some MPI implementation were not stopping within the
    1 second delay.
 -- Add support for managing devices based upon Linux cgroup container. Based
    upon patch by Yiannis Georgiou, Bull.
 -- Fix memory buffering bug if a AllowGroups parameter of a partition has 100
    or more users. Patch by Andriy Grytsenko (Massive Solutions Limited).
 -- Fix bug in generic resource tracking of gres associated with specific CPUs.
    Resources were being over-allocated.
 -- On systems with front-end nodes (IBM BlueGene and Cray) limit batch jobs to
    only one CPU of these shared resources.
 -- Set SLURM_MEM_PER_CPU or SLURM_MEM_PER_NODE environment variables for both
    interactive (salloc) and batch jobs if the job has a memory limit. For Cray
    systems also set CRAY_AUTO_APRUN_OPTIONS environment variable with the
    memory limit.