NEWS

This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.

* Changes in SLURM 2.3.0.pre3
=============================
 -- BGQ - Initial port added, very limited functionality.
 -- Minor typos fixed.
 -- Various bug fixes for Cray systems.
 -- Fix bug in which setting a compute node to idle state failed to set the
    system's up_node_bitmap.
 -- BLUEGENE - Code reordered.
 -- BLUEGENE - Now only one select plugin for all Bluegene systems.
 -- Modify srun to set the SLURM_JOB_NAME environment variable when srun is
    used to create a new job allocation. Not set when srun is used to create a
    job step within an existing job allocation.
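
    A minimal sketch of how a launched program might read the variable
    described above. The helper name job_label and its return strings are
    hypothetical; only the SLURM_JOB_NAME variable name comes from this file.

```python
import os


def job_label(env=None):
    """Return a short label based on SLURM_JOB_NAME (hypothetical helper)."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    name = env.get("SLURM_JOB_NAME")
    if name is None:
        # The variable may legitimately be absent, e.g. for a job step
        # launched within an existing allocation.
        return "SLURM_JOB_NAME not set"
    return "job name: " + name


if __name__ == "__main__":
    print(job_label())
```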

* Changes in SLURM 2.3.0.pre2
=============================
 -- Log a job's requeue or cancellation due to preemption to that job's stderr:
    "*** JOB 65547 CANCELLED AT 2011-01-21T12:59:33 DUE TO PREEMPTION ***".
 -- Added new job termination state of JOB_PREEMPTED, "PR" or "PREEMPTED" to
    indicate job termination was due to preemption.
 -- Optimize advanced reservation resource selection for computer topology.
    The logic has been added to select/linear and select/cons_res, but will
    not be enabled until the other select plugins are modified.
 -- Remove checkpoint/xlch plugin.
 -- Disable deletion of partitions that have unfinished jobs (pending,
    running or suspended states). Patch from Martin Perry, BULL.
 -- In sview, disable the sorting of node records by name at startup for
    clusters over 1000 nodes. Users can enable this by selecting the "Name"
    tab. This change dramatically improves scalability of sview.
 -- Report error when trying to change a node's state from scontrol for Cray
    systems.
 -- Do not attempt to read the batch script for non-batch jobs. This patch
    eliminates some inappropriate error messages.
 -- Preserve NodeHostName when reordering nodes due to system topology.
 -- On Cray/ALPS systems, do node inventory before scheduling jobs.
 -- Disable some salloc options on Cray systems.
 -- Disable scontrol's wait_job command on Cray systems.
 -- Disable srun command on native Cray/ALPS systems.
 -- Updated configure option "--enable-cray-emulation" (still under
    development) to emulate a Cray XT/XE system, and auto-detect real
    Cray XT/XE systems (removed the no longer needed --enable-cray configure
    option).  Building on native Cray systems requires the
    cray-MySQL-devel-enterprise rpm and expat XML parser library/headers.
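
    The preemption message logged to a job's stderr (first item in this
    section) has a regular layout, so a site script could pick it out of a
    job's output. The sketch below is only an illustration: the helper name
    parse_preemption_line is hypothetical, and it assumes that a requeue
    message uses the same layout as the cancellation example with a different
    single upper-case word in place of CANCELLED, which this file does not
    confirm.

```python
import re
from datetime import datetime

# Pattern modeled on the example line quoted in this file. The single
# upper-case action word is an assumption for anything other than CANCELLED.
PREEMPT_RE = re.compile(
    r"\*\*\* JOB (?P<job_id>\d+) (?P<action>[A-Z]+) AT "
    r"(?P<when>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) DUE TO PREEMPTION \*\*\*"
)


def parse_preemption_line(line):
    """Return a dict describing a preemption message, or None if no match."""
    match = PREEMPT_RE.search(line)
    if match is None:
        return None
    return {
        "job_id": int(match.group("job_id")),
        "action": match.group("action"),
        "when": datetime.strptime(match.group("when"), "%Y-%m-%dT%H:%M:%S"),
    }


if __name__ == "__main__":
    sample = "*** JOB 65547 CANCELLED AT 2011-01-21T12:59:33 DUE TO PREEMPTION ***"
    print(parse_preemption_line(sample))
```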

* Changes in SLURM 2.3.0.pre1
=============================
 -- When a slurmctld closes its connection to the database, its registered
    host and port are now removed.
 -- Added flag TrackSlurmctldDown to slurmdbd.conf. If set, idle resources on
    a cluster are marked as down when its slurmctld disconnects or is no
    longer reachable.
 -- Added support for more than one front-end node to run slurmd on
    architectures where the slurmd does not execute on the compute nodes
    (e.g. BlueGene). New configuration parameters FrontendNode and FrontendAddr
    added. See "man slurm.conf" for more information.
 -- When the --details option is used with the scontrol show job command,
    show a batch job's script.
 -- Add ability to create reservations or partitions and submit batch jobs
    using sview. Also add the ability to delete reservations and partitions.
 -- Added new configuration parameter MaxJobId. Once reached, restart job ID
    values at FirstJobId.
 -- When restarting slurmctld with priority/basic, increment all job priorities
    so the highest job priority becomes TOP_PRIORITY.
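
    The MaxJobId wrap-around described above can be pictured with a small
    sketch. The function next_job_id is hypothetical and is not the slurmctld
    implementation. It assumes MaxJobId is an inclusive upper bound, which
    this file does not state, and it ignores whether a candidate ID is still
    in use by an existing job.

```python
def next_job_id(last_job_id, first_job_id, max_job_id):
    """Return the job ID following last_job_id, wrapping at max_job_id.

    Hypothetical helper: treats max_job_id as the largest usable ID
    (an assumption) and does not check for IDs still in use.
    """
    candidate = last_job_id + 1
    if candidate > max_job_id:
        # Limit reached: restart numbering at FirstJobId.
        return first_job_id
    return candidate


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(next_job_id(last_job_id=9, first_job_id=1, max_job_id=10))
    print(next_job_id(last_job_id=10, first_job_id=1, max_job_id=10))
```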

* Changes in SLURM 2.2.3
========================
 -- Update srun, salloc, and sbatch man page description of --distribution
    option. Patches from Rod Schultz, Bull.
 -- Applied patch from Martin Perry to fix "Incorrect results for task/affinity
    block second distribution and cpus-per-task > 1" bug.
 -- Avoid setting a job's eligible time while held (priority == 0).
 -- Substantial performance improvement to backfill scheduling. Patch from
    Bjorn-Helge Mevik, University of Oslo.
 -- Make timeout for communications to the slurmctld be based upon the
    MessageTimeout configuration parameter rather than always 3 seconds.
    Patch from Matthieu Hautreux, CEA.
 -- Add new scontrol option of "show aliases" to report every NodeName that is
    associated with a given NodeHostName when running multiple slurmd daemons
    per compute node (typically used for testing purposes). Patch from
    Matthieu Hautreux, CEA.

* Changes in SLURM 2.2.2
========================
 -- Correct logic to set correct job hold state (admin or user) when setting
    the job's priority using scontrol's "update jobid=..." rather than its
    "hold" or "holdu" commands.
 -- Modify squeue to report unset --mincores, --minthreads or --extra-node-info
    values as "*" rather than 65534. Patch from Rod Schultz, BULL.
 -- Report the StartTime of a job as "Unknown" rather than the year 2106 if its
    expected start time was too far in the future for the backfill scheduler
    to compute.
 -- Prevent a pending job reason field from inappropriately being set to
    "Priority".
 -- In sched/backfill, for jobs having QOS_FLAG_NO_RESERVE set, don't
    consider the job's time limit when attempting to backfill schedule. The job
    will just be preempted as needed at any time.
 -- Eliminated a bug in sbatch when no valid target clusters are specified.
 -- When explicitly sending a signal to a job with the scancel command and that
    job is in a pending state, then send the request directly to the slurmctld
    daemon and do not attempt to send the request to slurmd daemons, which are
    not running the job anyway.
 -- In slurmctld, properly set the up_node_bitmap when setting a node's state
    to IDLE (in case the previous node state was DOWN).
 -- Fix smap to process block midplane names correctly when on a bluegene
    system.
 -- Fix smap to once again print out the letter 'ID' for each line of a block/
    partition view.
 -- Corrected the NOTES section of the scancel man page.
 -- Fix for accounting_storage/mysql plugin to correctly query cluster based
    transactions.
 -- Fix issue when updating database for clusters that were previously deleted
    before upgrade to 2.2 database.
 -- BLUEGENE - Handle mesh torus check better in dynamic mode.
 -- BLUEGENE - Fixed race condition when freeing a block. It would most
    likely only happen in emulation.
 -- Fix for calculating used QOS limits correctly on a slurmctld reconfig.
 -- BLUEGENE - Fix for bad conn-type set when running small blocks in HTC mode.
 -- If salloc's --no-shell option is used, then do not attempt to preserve the
    terminal's state.
 -- Add new SLURM configure time parameter of --disable-salloc-background. If
    set, then salloc can only execute in the foreground. If started in the
    background, then a message will be printed and the job allocation halted
    until brought into the foreground.
    NOTE: THIS IS A CHANGE IN DEFAULT SALLOC BEHAVIOR FROM V2.2.1, BUT IS
    CONSISTENT WITH V2.1 AND EARLIER.
 -- Added the Multi-Cluster Operation web page.
 -- Removed remnant code for enforcing max sockets/cores/threads in the
    cons_res plugin (see last item in 2.1.0-pre5).  This was responsible
    for a bug reported by Rod Schultz.
 -- BLUEGENE - Set correct env vars for HTC mode on a P system to get correct
    block.
 -- Correct RunTime reported by "scontrol show job" for pending jobs.
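
    Two items in this section replace odd-looking numbers with readable text:
    65534 becomes "*" in squeue, and a start time in the year 2106 becomes
    "Unknown". Both numbers sit at the top of an unsigned integer range:
    65534 is 0xFFFE, and a 32-bit unsigned count of seconds since 1970 runs
    out in 2106. The sketch below shows that arithmetic. Treating these
    values as "unset" markers, and the helper names show_count and
    show_start_time, are assumptions made for illustration and are not taken
    from the SLURM source.

```python
from datetime import datetime, timedelta, timezone

UINT16_UNSET = 0xFFFE      # 65534, assumed marker for an unset 16-bit value
UINT32_MAX = 0xFFFFFFFF    # largest 32-bit unsigned timestamp
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def show_count(value):
    """Render an assumed-unset 16-bit count as "*" (hypothetical helper)."""
    return "*" if value == UINT16_UNSET else str(value)


def show_start_time(epoch_seconds):
    """Render a start time, mapping near-maximum values to "Unknown"."""
    # Assumption: the two largest 32-bit values mean "no usable time".
    if epoch_seconds >= UINT32_MAX - 1:
        return "Unknown"
    when = EPOCH + timedelta(seconds=epoch_seconds)
    return when.strftime("%Y-%m-%dT%H:%M:%S")


if __name__ == "__main__":
    print(show_count(65534), show_count(4))
    # Year in which a 32-bit unsigned seconds counter reaches its maximum.
    print((EPOCH + timedelta(seconds=UINT32_MAX)).year)
    print(show_start_time(UINT32_MAX))
```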

* Changes in SLURM 2.2.1
========================
 -- Fix setting of derived exit code for jobs that happen to have the same
    jobid.
 -- Better checking for time overflow when rolling up in accounting.
 -- Add scancel --reservation option to cancel all jobs associated with a
    specific reservation.
 -- Treat reservation with no nodes like one that starts later (let jobs of any
    size get queued and do not block any pending jobs).
 -- Fix bug in gang scheduling logic that would temporarily resume too many
    jobs after a job completed.
 -- Change srun message about job step being deferred due to SlurmctldProlog
    running to be more clear and only print when --verbose option is used.
 -- Allow the hold on a job to be removed with sview by setting its priority
    to infinite.
 -- BLUEGENE - Better checking of small blocks in dynamic mode to determine
    whether a full midplane job could run.
 -- Decrease the maximum sleep time between srun job step creation retry
    attempts from 60 seconds to 29 seconds. This should eliminate a possible
    synchronization problem with gang scheduling that could result in job
    step creation requests only occurring when a job is suspended.
 -- Fix to prevent changing a held job's state from HELD to DEPENDENCY
    until the job is released. Patch from Rod Schultz, Bull.
 -- Fixed sprio -M to reflect PriorityWeight values from remote cluster.
 -- Fix bug in sview when trying to update arbitrary field on more than one
    job. Formerly would display information about one job, but update next
    selected job.
 -- Jobs running under a QOS with UsageFactor set to 0 no longer add time to
    fairshare or association/qos limits.
 -- Fixed issue where QOS priority wasn't re-normalized until a slurmctld
    restart when a QOS priority was changed.
 -- Fix sprio to use calculated numbers from slurmctld instead of
    calculating its own numbers.
 -- BLUEGENE - fixed race condition with preemption where if the wind blows the
    right way the slurmctld could lock up when preempting jobs to run others.
 -- BLUEGENE - fixed epilog to wait until MMCS job is totally complete before
    finishing.
 -- BLUEGENE - more robust checking for states when freeing blocks.
 -- Added correct files to the slurm.spec file for correct perl api rpm
    creation.
 -- Added flag "NoReserve" to a QOS to make it so all jobs are created equal
    within a QOS.  So if larger, higher priority jobs are unable to run they
    don't prevent smaller jobs from running even if running the smaller
    jobs delays the start of the larger, higher priority jobs.
 -- BLUEGENE - Check preemptees one by one to preempt lower priority jobs first
    instead of first fit.
 -- In select/cons_res, correct handling of the option
    SelectTypeParameters=CR_ONE_TASK_PER_CORE.
 -- Fix for checking QOS to override partition limits. Previously, if not
    using QOS, some limits would be overlooked.
 -- Fix bug which would terminate a job step if any of the nodes allocated to
    it were removed from the job's allocation. Now only the tasks on those
    nodes are terminated.
 -- Fixed issue where, when using an accounting_storage plugin directly
    without the slurmDBD, updates weren't always sent correctly to the
    slurmctld. Appears to be OS dependent. Reported by Fredrik Tegenfeldt.

* Changes in SLURM 2.2.0
========================
 -- Change format of Duration field in "scontrol show reservation" output from
    an integer number of minutes to "[days-]hours:minutes:seconds".
 -- Add support for changing the reservation of pending or running jobs.
 -- On Cray systems only, salloc sends SIGKILL to spawned process group when
    job allocation is revoked. Patch from Gerrit Renker, CSCS.