This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.6.0-pre1
=============================
-- Add "state" field to job step information reported by scontrol.
-- Notify srun to retry step creation upon completion of other job steps
rather than polling. This results in much faster throughput for job step
execution with the --exclusive option.
-- Added "ResvEpilog" and "ResvProlog" configuration parameters to execute a
program at the beginning and end of each reservation.
-- Added "slurm_load_job_user" function. This is a variation of
"slurm_load_jobs", but accepts a user ID argument, potentially resulting
in substantial performance improvement for "squeue --user=ID"
-- Added "slurm_load_node_single" function. This is a variation of
"slurm_load_nodes", but accepts a node name argument, potentially resulting
in substantial performance improvement for "sinfo --nodes=NAME".
-- Added "HealthCheckNodeState" configuration parameter identify node states
on which HealthCheckProgram should be executed.
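
The two load calls above can be exercised roughly as shown below. This is a
minimal sketch only: the prototypes are assumed to mirror "slurm_load_jobs"
and "slurm_load_nodes" with an extra filter argument, the node name
"node001" is a placeholder, and slurm.h remains the authoritative reference.

    /*
     * Minimal sketch of the user- and node-filtered load calls, assuming
     * prototypes of the form:
     *   int slurm_load_job_user(job_info_msg_t **msg, uint32_t uid,
     *                           uint16_t show_flags);
     *   int slurm_load_node_single(node_info_msg_t **msg, char *name,
     *                              uint16_t show_flags);
     * Build roughly as: gcc example.c -lslurm
     */
    #include <stdio.h>
    #include <unistd.h>
    #include <slurm/slurm.h>
    #include <slurm/slurm_errno.h>

    int main(void)
    {
        job_info_msg_t  *jobs  = NULL;
        node_info_msg_t *nodes = NULL;
        char node_name[] = "node001";           /* placeholder node name */

        /* Load only this user's jobs (what "squeue --user=ID" can use). */
        if (slurm_load_job_user(&jobs, (uint32_t) getuid(), SHOW_ALL)
            != SLURM_SUCCESS) {
            slurm_perror("slurm_load_job_user");
            return 1;
        }
        printf("job records for this user: %u\n", jobs->record_count);
        slurm_free_job_info_msg(jobs);

        /* Load a single node's record (what "sinfo --nodes=NAME" can use). */
        if (slurm_load_node_single(&nodes, node_name, SHOW_ALL)
            != SLURM_SUCCESS) {
            slurm_perror("slurm_load_node_single");
            return 1;
        }
        printf("node records returned: %u\n", nodes->record_count);
        slurm_free_node_info_msg(nodes);
        return 0;
    }
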
* Changes in SLURM 2.5.1
========================
-- Correction to hostlist sorting for hostnames that contain two numeric
components where the first numeric component varies in length (e.g.
"rack9blade1" should come before "rack10blade1").
-- BGQ - Only poll on initialized blocks instead of calling getBlocks on
each block independently.
-- Fix of task/affinity plugin logic for Power7 processors having hyper-
threading disabled (cpu mask has gaps).
-- Fix of job priority ordering with sched/builtin and priority/multifactor.
-- CRAY - Fix for setting up aprun for a large job (more than 2000 nodes).

-- Fix for race condition related to compute node boot resulting in node being
set down with reason of "Node <name> unexpectedly rebooted".
-- RAPL - Fix for handling errors when opening msr files.
-- BGQ - Fix for salloc/sbatch to do the correct allocation when asking for
-N1 -n#.
-- BGQ - In emulation mode, make it possible to pretend to run large jobs
(>64k nodes).
-- BLUEGENE - Correct method to update conn_type of a job.
-- BLUEGENE - Fix issue with preemption when needing to preempt multiple jobs
to make one job run.
-- Fixed issue where an srun dying abnormally inside of an allocation would
also kill the allocation.
-- FRONTEND - Fixed issue where, if a system's nodes weren't defined in the
slurm.conf with NodeAddr values, signals going to a step could be handled
incorrectly.
* Changes in SLURM 2.5.0
========================
-- Add DenyOnLimit flag for QOS to deny jobs at submission time if they
request resources that reach a 'Max' limit.
-- Permit SlurmUser or operator to change QOS of non-pending jobs (e.g.
running jobs).
-- BGQ - move initial poll to beginning of realtime interaction, which will
also cause it to run if the realtime server ever goes away.
-- Modify sbcast logic to survive slurmd daemon restart while a file
transmission is in progress.
-- Add retry logic to munge encode/decode calls (see the sketch following
this list). This is needed if the munge daemon is under very heavy load
(e.g. with 1000 slurmd daemons per compute node).
-- Add launch and acct_gather_energy plugins to RPMs.
-- Restore support for srun "--mpi=list" option.
-- CRAY - Introduce step accounting for a Cray.
-- Modify srun to abandon I/O 60 seconds after the last task ends. Otherwise
an aborted slurmstepd can cause the srun process to hang indefinitely.
-- ENERGY - RAPL - Alter code to close open files (and only open them once
where needed).
-- If the PrologSlurmctld fails, then requeue the job an indefinite number
of times instead of only one time.
-- Added Prolog and Epilog Guide (web page). Based upon work by Jason Sollom,
Cray Inc. and used by permission.
-- Restore gang scheduling functionality. Preemptor was not being scheduled.
Fix for bugzilla #3.
-- Add "cpu_load" to node information. Populate CPULOAD in node information
reported to Moab cluster manager.
-- Preempt jobs only when insufficient idle resources exist to start job,
regardless of the node weight.
-- Added priority/multifactor2 plugin based upon ticket distribution system.
Work by Janne Blomqvist, Aalto University.
-- Add SLURM_NODELIST to environment variables available to Prolog and Epilog.
-- Permit reservations to allow or deny access by account and/or user.
-- Add ReconfigFlags value of KeepPartState. See "man slurm.conf" for details.
-- Modify the task/cgroup plugin adding a task_pre_launch_priv function and
move slurmstepd outside of the step's cgroup. Work by Matthieu Hautreux.
-- Intel MIC processor support added using gres/mic plugin. BIG thanks to
Olli-Pekka Lehto, CSC-IT Center for Science Ltd.
-- Accounting - Change empty jobacctinfo structs to not actually be used:
instead of putting 0's into the database, put NO_VALs and have sacct
figure out that jobacct_gather wasn't used.
-- Cray - Prevent calling basil_confirm more than once per job using a flag.
-- Fix bug with topology/tree and job with min-max node count. Now try to
get max node count rather than minimizing leaf switches used.
-- Add AccountingStorageEnforce=safe option to provide method to avoid jobs
launching that wouldn't be able to run to completion because of a
GrpCPUMins limit.
-- Add support for RFC 5424 timestamps in logfiles. Disable with configuration
option of "--disable-rfc5424time". By Janne Blomqvist, Aalto University.
-- CRAY - Replace srun.pl with launch/aprun plugin to use srun to wrap the
aprun process instead of a perl script.
-- srun - Rename --runjob-opts to --launcher-opts to be used on systems other
than BGQ.
-- Added new DebugFlags - Energy for AcctGatherEnergy plugins.
-- Start deprecation of sacct --dump and --fdump options.
-- BGQ - Added --verbose=OFF when srun --quiet is used.
-- Added acct_gather_energy/rapl plugin to record power consumption by job.
Work by Yiannis Georgiou, Martin Perry, et al., Bull.
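
As a rough illustration of the munge retry logic noted in the list above,
the sketch below wraps munge_encode() in a retry loop. It shows the general
technique only, not SLURM's actual implementation; the retried error code
(EMUNGE_SOCKET), the retry count, and the back-off are assumptions.

    /* Illustrative retry wrapper around munge_encode(); build with -lmunge. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <munge.h>

    static munge_err_t encode_with_retry(char **cred, munge_ctx_t ctx,
                                         const void *buf, int len, int tries)
    {
        munge_err_t err = EMUNGE_SNAFU;
        int i;

        for (i = 0; i < tries; i++) {
            err = munge_encode(cred, ctx, buf, len);
            if (err == EMUNGE_SUCCESS)
                return err;
            if (err != EMUNGE_SOCKET)   /* assume only socket errors are  */
                break;                  /* transient (daemon overloaded)  */
            sleep(1);                   /* back off before retrying       */
        }
        return err;
    }

    int main(void)
    {
        char *cred = NULL;
        munge_err_t err = encode_with_retry(&cred, NULL, NULL, 0, 3);

        if (err != EMUNGE_SUCCESS) {
            fprintf(stderr, "munge_encode: %s\n", munge_strerror(err));
            return 1;
        }
        printf("%s\n", cred);
        free(cred);
        return 0;
    }
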
* Changes in SLURM 2.5.0.pre3
=============================
-- Add Google search to all web pages.
-- Add sinfo -T option to print reservation information. Work by Bill Brophy,
Bull.
-- Force slurmd exit after 2 minute wait, even if threads are hung.
-- Change node_req field in struct job_resources from 8 to 32 bits so we can
run more than 256 jobs per node.
-- sched/backfill: Improve accuracy of expected job start with respect to
reservations.
-- sinfo partition field size will be set to the length of the longest
partition name by default.
-- Make parse_time() return a valid 0 if given an epoch time and set
errno == ESLURM_INVALID_TIME_VALUE on error instead (see the sketch
following this list).
-- Correct srun --no-alloc logic when node count exceeds node list or task
count is not a multiple of the node count. Work by Hongjia Cao, NUDT.
-- Completed integration with IBM Parallel Environment including POE and IBM's
NRT switch library.
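
The parse_time() change above alters the error convention of an internal
helper (declared in SLURM's src/common/parse_time.h, not the public API).
The sketch below only illustrates the new calling pattern; the prototype
shown is an assumption based on that header, and linking it requires
SLURM's common library.

    /* Sketch of the errno-based error check for the internal parse_time(). */
    #include <errno.h>
    #include <stdio.h>
    #include <time.h>
    #include <slurm/slurm_errno.h>

    /* Assumed prototype of the internal helper. */
    extern time_t parse_time(char *time_str, int past);

    int main(void)
    {
        char spec[] = "2012-11-01T12:30:00";    /* example time string */
        time_t t;

        errno = 0;
        t = parse_time(spec, 0);

        /* A return of 0 may now be a legitimate epoch value; errors are
         * reported through errno instead. */
        if (t == 0 && errno == ESLURM_INVALID_TIME_VALUE) {
            fprintf(stderr, "invalid time specification: %s\n", spec);
            return 1;
        }
        printf("%s -> %ld\n", spec, (long) t);
        return 0;
    }
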
* Changes in SLURM 2.5.0.pre2
=============================
-- When running with multiple slurmd daemons per node, enable specifying a
range of ports on a single line of the node configuration in slurm.conf.
-- Add reservation flag of Part_Nodes to allocate all nodes in a partition to
a reservation and automatically change the reservation when nodes are
added to or removed from the partition. Based upon work by
Bill Brophy, Bull.
-- Add support for advanced reservation for specific cores rather than whole
nodes. Current limitations: homogeneous cluster, nodes idle when reservation
created, and no more than one reservation per node. Code is still under
development. Work by Alejandro Lucero Palau, et al., BSC.
-- Add DebugFlag of Switch to log switch plugin details.
-- Correct job node_cnt value in job completion plugin when job fails due to
down node. Previously was too low by one.
-- Add new srun option --cpu-freq to enable user control over the job's CPU
frequency and thus its power consumption. NOTE: cpu frequency is not
currently preserved for jobs being suspended and later resumed. Work by
Don Albert, Bull.
-- Add node configuration information about "boards" and optimize task
placement on minimum number of boards. Work by Rod Schultz, Bull.
* Changes in SLURM 2.5.0.pre1
=============================
-- Add new output to "scontrol show configuration" of LicensesUsed. Output is
"name:used/total".
-- Changed jobacct_gather plugin infrastructure to be cleaner and easier to
maintain.
-- Change license option count separator from "*" to ":" for consistency with
the gres option (e.g. "--licenses=foo:2 --gres=gpu:2"). The "*" will still
be accepted, but is no longer documented.
-- Permit more than 100 jobs to be scheduled per node (new limit is 250).
-- Restructure of srun code to allow outside programs to utilize existing
logic.
* Changes in SLURM 2.4.6
========================
-- Correct WillRun authentication logic when issued for non-job owner.
-- BGQ - Fix to check block for action 'D' if it also has nodes in error.
* Changes in SLURM 2.4.5
========================
-- Cray - On job kill request, send SIGCONT, SIGTERM, wait KillWait seconds,
and send SIGKILL. Previously just sent SIGKILL to tasks.
-- BGQ - Fix issue when running srun outside of an allocation and only
specifying the number of tasks and not the number of nodes.
-- BGQ - Validate correct ntasks_per_node.
-- BGQ - When srun -Q is given, make runjob be quiet.

-- Modify use of OOM (out of memory protection) for Linux 2.6.36 kernel
or later (see the sketch at the end of this list). NOTE: If you were
setting the environment variable SLURMSTEPD_OOM_ADJ=-17, it should be set
to -1000 for Linux 2.6.36 kernel or later.
-- BGQ - Fix job step timeout to actually happen when run from within an
allocation.
-- Reset node MAINT state flag when a reservation's nodes or flags change.
-- Accounting - Fix issue where QOS usage was being zeroed out on a
slurmctld restart.
-- BGQ - Add 64 tasks per node as a valid option for srun when used with
overcommit.
-- BLUEGENE - With Dynamic layout mode - Fix issue where, if a larger block
was already in error and not deallocating and the underlying hardware goes
bad, one could get overlapping blocks in error, making the code assert when
a new job request comes in.
-- BGQ - Handle pending actions on a block better when trying to deallocate it.
-- Accounting - Fixed issue where if nodenames have changed on a system and
you query against that with -N and -E you will get all jobs during that
time period.
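
Regarding the OOM note earlier in this list: Linux 2.6.36 replaced
/proc/<pid>/oom_adj (range -17..15, where -17 disables OOM killing) with
/proc/<pid>/oom_score_adj (range -1000..1000, where -1000 disables it),
which is why the recommended SLURMSTEPD_OOM_ADJ value changes. The sketch
below illustrates the kernel interface only; it is not slurmstepd's code.

    /* Prefer the 2.6.36+ oom_score_adj file, fall back to oom_adj. */
    #include <stdio.h>

    static int set_oom_protection(void)
    {
        FILE *fp = fopen("/proc/self/oom_score_adj", "w");

        if (fp) {                               /* Linux 2.6.36 or later */
            fprintf(fp, "-1000\n");
            return fclose(fp);
        }
        fp = fopen("/proc/self/oom_adj", "w");  /* older kernels */
        if (fp) {
            fprintf(fp, "-17\n");
            return fclose(fp);
        }
        return -1;
    }

    int main(void)
    {
        return set_oom_protection() ? 1 : 0;
    }
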