This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in Slurm 2.6.6
========================
-- sched/backfill - Fix bug that could result in failing to reserve resources
for high priority jobs.
-- Correct job RunTime if requeued from suspended state.
-- Reset job priority from zero (held) on manual resume from suspend state.
-- If FastSchedule=0 then do not DOWN a node with low memory or disk size.
-- Update sshare.1 man page making it consistent with sacctmgr.1.
-- Do not reset a job's priority when the slurmctld restarts if previously
set to some specific value.
-- sview - Fix regression where the Node tab wasn't able to add/remove columns.
-- Fix slurmstepd lock when job terminates inside the infiniband
network traffic accounting plugin.
-- Correct the documentation to read filesystem instead of Lustre. Update
the srun help.
 -- Fix acct_gather_filesystem_lustre.c to compute the Lustre accounting data
    correctly, accumulating differences between sampling intervals.
Fix the data structure mismatch between acct_gather_filesystem_lustre.c
and slurm_jobacct_gather.h which caused the hdf5 plugin to log incorrect
data.
* Changes in Slurm 2.6.5
========================
-- Correction to hostlist parsing bug introduced in v2.6.4 for hostlists with
    more than one numeric range in brackets (e.g. "rack[0-3]_blade[0-63]").
 -- Add notification when the OOM killer is triggered if using proctrack/cgroup
    and task/cgroup.
-- Corrections to advanced reservation logic with overlapping jobs.
-- job_submit/lua - add cpus_per_task field to those available.
-- Add cpu_load to the node information available using the Perl API.
-- Correct a job's GRES allocation data in accounting records for non-Cray
systems.
-- Substantial performance improvement for systems with Shared=YES or FORCE
and large numbers of running jobs (replace bubble sort with quick sort).
-- proctrack/cgroup - Add locking to prevent race condition where one job step
    is ending for a user or job at the same time another job step is starting
and the user or job container is deleted from under the starting job step.
-- Fixed sh5util loop when there are no node-step files.
-- Fix race condition on batch job termination that could result in a job exit
code of 0xfffffffe if the slurmd on node zero registers its active jobs at
the same time that slurmstepd is recording the job's exit code.
-- Correct logic returning remaining job dependencies in job information
reported by scontrol and squeue. Eliminates vestigial descriptors with
no job ID values (e.g. "afterany").
 -- Improve performance of the REQUEST_JOB_INFO_SINGLE RPC by removing
    unnecessary locks and using a hash function to find the desired job.
-- jobcomp/filetxt - Reopen the file when slurmctld daemon is reconfigured
or gets SIGHUP.
 -- Remove notice of a CVE affecting very old/deprecated versions of Slurm from
    news.html.
 -- Fix handling of the case where hwloc_get_nbobjs_by_type() returns a core
    count of zero (treat the count as 1).
-- Added ApbasilTimeout parameter to the cray.conf configuration file.
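    Illustrative cray.conf line; the timeout value shown is an arbitrary
    example, and its exact semantics should be confirmed in the cray.conf
    man page:
      ApbasilTimeout=10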
 -- Handle NULL fields of the node structure in the API.
-- Fix srun hang when IO fails to start at launch.
-- Fix for GRES bitmap not matching the GRES count resulting in abort
(requires manual resetting of GRES count, changes to gres.conf file,
and slurmd restarts).
-- Modify sview to better support job arrays.
-- Modify squeue to support longer job ID values (for many job array tasks).
-- Fix race condition in authentication credential creation that could corrupt
memory. (NOTE: This race condition has existed since 2003 and would be
exceedingly rare.)
-- Slurmstepd variable initialization - Without this patch, free() is called
on a random memory location (i.e. whatever is on the stack), which can
result in slurmstepd dying and a completed job not being purged in a
timely fashion.
-- Fix slurmstepd race condition when separate threads are reading and
modifying the job's environment, which can result in the slurmstepd failing
with an invalid memory reference.
-- Fix erroneous error messages when running gang scheduling.
-- Fix minor memory leak.
-- scontrol modified to suspend, resume, hold, uhold, or release multiple
jobs in a space separated list.
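    Illustrative usage with arbitrary job IDs:
      scontrol hold 10001 10002 10003
      scontrol release 10001 10002 10003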
 -- Fix minor debug error reported when a connection goes away at the end of
    a job.
 -- Validate the return code from calls to slurm_get_peer_addr().
 -- BGQ - Fix issues with making sure all cnodes are accounted for when multiple
steps cause multiple cnodes in one allocation to go into error at the
same time.
-- scontrol show job - Correct NumNodes value calculated based upon job
specifications.
-- BGQ - Fix issue if user runs multiple sub-block jobs inside a multiple
    midplane block that starts on a higher coordinate than it ends (i.e. if a
block has midplanes [0010,0013] 0013 is the start even though it is
listed second in the hostlist).
-- BGQ - Add midplane to the total_cnodes used in the runjob_mux plugin
for better debug.
-- Update AllocNodes paragraph in slurm.conf.5.
* Changes in Slurm 2.6.4
========================
-- Honor ntasks-per-node option with exclusive node allocations.
-- sched/backfill - Prevent invalid memory reference if bf_continue option is
configured and slurm is reconfigured during one of the sleep cycles or if
there are any changes to the partition configuration or if the normal
scheduler runs and starts a job that the backfill scheduler is actively
working on.
-- Update man pages information about acct-freq and JobAcctGatherFrequency
to reflect only the latest supported format.
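    Illustrative sketch of the newer key=value format (the intervals, in
    seconds, are arbitrary examples; see slurm.conf(5) and srun(1) for the
    supported data types):
      JobAcctGatherFrequency=task=30,energy=30,network=30,filesystem=30
      srun --acctg-freq=task=30 ...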
-- Minor document update to include note about PrivateData=Usage for the
slurm.conf when using the DBD.
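    Illustrative slurm.conf line (additional flags such as "jobs" may be
    combined as appropriate for a given site):
      PrivateData=usage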
-- Expand information reported with DebugFlags=backfill.
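    Illustrative ways to enable the flag, either in slurm.conf or at run time:
      DebugFlags=Backfill
      scontrol setdebugflags +backfill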
-- Initiate jobs pending to run in a reservation as soon as the reservation
becomes active.
 -- Purge expired reservations even if they have pending jobs.
-- Corrections to calculation of a pending job's expected start time.
-- Remove some vestigial logic treating job priority of 1 as a special case.
 -- Free memory when the daemons shut down to avoid minor memory leaks.
-- Updated documentation to give correct units being displayed.
-- Report AccountingStorageBackupHost with "scontrol show config".
 -- Init scripts ignore quotes around Pid file name specifications.
-- Fixed typo about command case in quickstart.html.
-- task/cgroup - handle new cpuset files, similar to commit c4223940.
-- Replace the tempname() function call with mkstemp().
-- Fix for --cpu_bind=map_cpu/mask_cpu/map_ldom/mask_ldom plus
--mem_bind=map_mem/mask_mem options, broken in 2.6.2.
-- Restore default behavior of allocating cores to jobs on a cyclic basis
across the sockets unless SelectTypeParameters=CR_CORE_DEFAULT_DIST_BLOCK
or user specifies other distribution options.
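    Illustrative slurm.conf sketch; combining CR_CORE_DEFAULT_DIST_BLOCK with
    CR_Core_Memory is only one possible choice of allocation options:
      SelectType=select/cons_res
      SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK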
 -- Enforce the JobRequeue configuration parameter on node failure. Previously
    the job was always requeued.
-- acct_gather_energy/ipmi - Add delay before retry on read error.
-- select/cons_res with GRES and multiple threads per core, fix possible
infinite loop.
-- proctrack/cgroup - Add cgroup create retry logic in case one step is
starting at the same time as another step is ending and the logic to create
and delete cgroups overlaps.
-- Improve setting of job wait "Reason" field.
 -- Correct sbatch documentation and job_submit/pbs plugin: "%j" is the job ID,
    not "%J" (which is job_id.step_id).
-- Improvements to sinfo performance, especially for large numbers of
partitions.
 -- SlurmdDebug - Permit changes to slurmd debug level with "scontrol reconfig".
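    Illustrative sequence (the debug level value is an arbitrary example):
      # in slurm.conf
      SlurmdDebug=5
      # then apply the change without restarting the daemons
      scontrol reconfig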
-- smap - Avoid invalid memory reference with hidden nodes.
-- Fix sacctmgr modify qos set preempt+/-=.
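    Illustrative commands with arbitrary QOS names:
      sacctmgr modify qos where name=normal set preempt+=high
      sacctmgr modify qos where name=normal set preempt-=low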
-- BLUEGENE - fix issue where node count wasn't set up correctly when srun
    performs the allocation (regression introduced in 2.6.3).
-- Add support for dependencies of job array elements (e.g.
"sbatch --depend=afterok:123_4 ...") or all elements of a job array (e.g.
"sbatch --depend=afterok:123 ...").
-- Add support for new options in sbatch qsub wrapper:
-W block=true (wait for job completion)
Clear PBS_NODEFILE environment variable
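    Illustrative invocation of the qsub wrapper (the script name is an
    arbitrary example):
      qsub -W block=true batch_script.sh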
 -- Fixed the MaxSubmitJobsPerUser limit in QOS, which was rejecting job
    submissions too early.
 -- sched/wiki, sched/wiki2 - Fix to work with the logic change introduced in
    version 2.6.3 that prevented Maui/Moab from starting jobs.
-- Updated the QOS limits documentation and man page.
* Changes in Slurm 2.6.3
========================
-- Add support for some new #PBS options in sbatch scripts and qsub wrapper:
-l accelerator=true|false (GPU use)
-l mpiprocs=# (processors per node)
-l naccelerators=# (GPU count)
-l select=# (node count)
-l ncpus=# (task count)
-v key=value (environment variable)
-W depend=opts (job dependencies, including "on" and "before" options)
-W umask=# (set job's umask)
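    Illustrative batch script using a few of these options (all values and the
    application name are arbitrary examples):
      #!/bin/bash
      #PBS -l select=2
      #PBS -l mpiprocs=8
      #PBS -l naccelerators=1
      #PBS -v MYVAR=example
      #PBS -W umask=022
      srun ./my_app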
-- Added qalter and qrerun commands to torque package.
-- Corrections to qstat logic: job CPU count and partition time format.
 -- Add job_submit/pbs plugin to translate PBS job dependency options to the
    extent possible (no support for PBS "before" options) and set some PBS
environment variables.
-- Add spank/pbs plugin to set a bunch of PBS environment variables.
-- Backported sh5util from master to 2.6 as there are some important
bugfixes and the new item extraction feature.
 -- select/cons_res - Correct MaxCPUsPerNode partition constraint for CR_Socket.
-- scontrol - for setdebugflags command, avoid parsing "-flagname" as an
scontrol command line option.
-- Fix issue with step accounting if a job is requeued.
-- Close file descriptors on exec of prolog, epilog, etc.
-- Fix issue when a user has held a job and then sets the begin time
into the future.
-- Scontrol - Enable changing a job's stdout file.
-- Fix issues where memory or node count of a srun job is altered while the
srun is pending. The step creation would use the old values and possibly
hang srun since the step wouldn't be able to be created in the modified
allocation.
-- Add support for new SchedulerParameters value of "bf_max_job_part", the
maximum depth the backfill scheduler should go in any single partition.
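    Illustrative slurm.conf lines (the depth value is an arbitrary example):
      SchedulerType=sched/backfill
      SchedulerParameters=bf_continue,bf_max_job_part=100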
-- acct_gather/infiniband plugin - Correct packets_in/out values.
-- BLUEGENE - Don't ignore a conn-type request from the user.
-- BGQ - Force a request on a Q for a MESH to be a TORUS in a dimension that
can only be a TORUS (1).
-- Change max message length from 100MB to 1GB before generating "Insane
message length" error.
-- sched/backfill - Prevent possible memory corruption due to use of
bf_continue option and long running scheduling cycle (pending jobs could
have been cancelled and purged).
 -- CRAY - Set AcceleratorAllocation depth correctly for Basil 1.3.
-- Created the environment variable SLURM_JOB_NUM_NODES for srun jobs and
updated the srun man page.
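    Illustrative check (the node count is an arbitrary example):
      srun -N4 /bin/sh -c 'echo $SLURM_JOB_NUM_NODES'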
-- BLUEGENE/CRAY - Don't set env variables that pertain to a node when Slurm