NEWS

This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.

* Changes in Slurm 2.6.7
========================
 -- Properly enforce a job's cpus-per-task option when a job's allocation is
    constrained on some nodes by the mem-per-cpu option.
 -- Correct the slurm.conf man pages and checkpoint_blcr.html page
    describing that jobs must be drained from cluster before deploying
    any checkpoint plugin.
 -- Fix slurmd core dump that occurred when munge was configured but not
    running and the slurmd needed to forward a message.
 -- Update srun.1 man page documenting the PMI2 support.
 -- Fix slurmctld core dump when a job gets its QOS updated but there
    is not a corresponding association.

* Changes in Slurm 2.6.6
========================
 -- sched/backfill - Fix bug that could result in failing to reserve resources
    for high priority jobs.
 -- Correct job RunTime if requeued from suspended state.
 -- Reset job priority from zero (held) on manual resume from suspend state.
 -- If FastSchedule=0 then do not DOWN a node with low memory or disk size.
 -- Remove vestigial note.
 -- Update sshare.1 man page making it consistent with sacctmgr.1.
 -- Do not reset a job's priority when the slurmctld restarts if previously
    set to some specific value.
 -- sview - Fix regression where the Node tab wasn't able to add/remove columns.
 -- Fix slurmstepd lock when job terminates inside the InfiniBand
    network traffic accounting plugin.
 -- Correct the documentation to read filesystem instead of Lustre. Update
    the srun help.
 -- Fix acct_gather_filesystem_lustre.c to compute the Lustre accounting
    data correctly by accumulating differences between sampling intervals.
    Fix the data structure mismatch between acct_gather_filesystem_lustre.c
    and slurm_jobacct_gather.h which caused the hdf5 plugin to log incorrect
    data.
 -- Don't allow PMI_TIME to be zero, which would cause a floating point
    exception.
 -- Fix purging of old reservation errors in database.
 -- MYSQL - If starting the plugin and the database isn't up attempt to
    connect in a loop instead of producing a fatal error.
 -- BLUEGENE - If IONodesPerMP changes in bluegene.conf recalculate bitmaps
    based on ionode count correctly on slurmctld restart.
 -- Fix step allocation when some CPUs are not available due to memory limits.
    This happens when one step is active and using memory that blocks the
    scheduling of another step on a portion of the CPUs needed. The new step
    is now delayed rather than aborting with "Requested node configuration is
    not available".
 -- Make sure node limits get assessed if no node count was given in request.
 -- Removed obsolete slurm_terminate_job() API.
 -- Update documentation about QOS limits.
 -- Retry task exit message from slurmstepd to srun on message timeout.
 -- Correction to logic reserving all nodes in a specified partition.
 -- Added support for selecting AMD GPU by setting GPU_DEVICE_ORDINAL env var.
 -- Properly enforce GrpSubmit limit for job arrays.
 -- CRAY - Fix issue with using CR_ONE_TASK_PER_CORE.
 -- CRAY - Fix memory leak when using accelerators.

* Changes in Slurm 2.6.5
========================
 -- Correction to hostlist parsing bug introduced in v2.6.4 for hostlists with
    more than one numeric range in brackets (e.g. "rack[0-3]_blade[0-63]").
 -- Add notification if using proctrack/cgroup and task/cgroup when oom hits.
 -- Corrections to advanced reservation logic with overlapping jobs.
 -- job_submit/lua - add cpus_per_task field to those available.
 -- Add cpu_load to the node information available using the Perl API.
 -- Correct a job's GRES allocation data in accounting records for non-Cray
    systems.
 -- Substantial performance improvement for systems with Shared=YES or FORCE
    and large numbers of running jobs (replace bubble sort with quick sort).
 -- proctrack/cgroup - Add locking to prevent race condition where one job step
    is ending for a user or job at the same time another job step is starting
    and the user or job container is deleted from under the starting job step.
 -- Fixed sh5util loop when there are no node-step files.
 -- Fix race condition on batch job termination that could result in a job exit
    code of 0xfffffffe if the slurmd on node zero registers its active jobs at
    the same time that slurmstepd is recording the job's exit code.
 -- Correct logic returning remaining job dependencies in job information
    reported by scontrol and squeue. Eliminates vestigial descriptors with
    no job ID values (e.g. "afterany").
 -- Improve performance of REQUEST_JOB_INFO_SINGLE RPC by removing unnecessary
    locks and using a hash function to find the desired job.
 -- jobcomp/filetxt - Reopen the file when slurmctld daemon is reconfigured
    or gets SIGHUP.
 -- Remove notice of CVE with very old/deprecated versions of Slurm in
    news.html.
 -- Fix handling of hwloc_get_nbobjs_by_type() returning a core count of
    zero (the count is now set to 1).
 -- Added ApbasilTimeout parameter to the cray.conf configuration file.
 -- Handle in the API if parts of the node structure are NULL.
 -- Fix srun hang when IO fails to start at launch.
 -- Fix for GRES bitmap not matching the GRES count resulting in abort
    (requires manual resetting of GRES count, changes to gres.conf file,
    and slurmd restarts).
 -- Modify sview to better support job arrays.
 -- Modify squeue to support longer job ID values (for many job array tasks).
 -- Fix race condition in authentication credential creation that could corrupt
    memory. (NOTE: This race condition has existed since 2003 and would be
    exceedingly rare.)
 -- HDF5 - Fix minor memory leak.
 -- Slurmstepd variable initialization - Without this patch, free() is called
    on a random memory location (i.e. whatever is on the stack), which can
    result in slurmstepd dying and a completed job not being purged in a
    timely fashion.
 -- Fix slurmstepd race condition when separate threads are reading and
    modifying the job's environment, which can result in the slurmstepd failing
    with an invalid memory reference.
 -- Fix erroneous error messages when running gang scheduling.
 -- Fix minor memory leak.
 -- scontrol modified to suspend, resume, hold, uhold, or release multiple
    jobs in a space separated list.
 -- Minor debug error when a connection goes away at the end of a job.
 -- Validate return code from calls to slurm_get_peer_addr.
 -- BGQ - Fix issues with making sure all cnodes are accounted for when multiple
    steps cause multiple cnodes in one allocation to go into error at the
    same time.
 -- scontrol show job - Correct NumNodes value calculated based upon job
    specifications.
 -- BGQ - Fix issue if user runs multiple sub-block jobs inside a multiple
    midplane block that starts on a higher coordinate than it ends (i.e. if a
    block has midplanes [0010,0013] 0013 is the start even though it is
    listed second in the hostlist).
 -- BGQ - Add midplane to the total_cnodes used in the runjob_mux plugin
    for better debug.
 -- Update AllocNodes paragraph in slurm.conf.5.

* Changes in Slurm 2.6.4
========================
 -- Fixed sh5util to print its usage.
 -- Corrected commit f9a3c7e4e8ec.
 -- Honor ntasks-per-node option with exclusive node allocations.
 -- sched/backfill - Prevent invalid memory reference if bf_continue option is
    configured and Slurm is reconfigured during one of the sleep cycles or if
    there are any changes to the partition configuration or if the normal
    scheduler runs and starts a job that the backfill scheduler is actively
    working on.
 -- Update man pages information about acct-freq and JobAcctGatherFrequency
    to reflect only the latest supported format.
 -- Minor document update to include note about PrivateData=Usage for the
    slurm.conf when using the DBD.
 -- Expand information reported with DebugFlags=backfill.
 -- Initiate jobs pending to run in a reservation as soon as the reservation
    becomes active.
 -- Purge expired reservation even if it has pending jobs.
 -- Corrections to calculation of a pending job's expected start time.
 -- Remove some vestigial logic treating job priority of 1 as a special case.
 -- Free memory at daemon shutdown to avoid minor memory leaks.
 -- Updated documentation to give correct units being displayed.
 -- Report AccountingStorageBackupHost with "scontrol show config".
 -- init scripts ignore quotes around Pid file name specifications.
 -- Fixed typo about command case in quickstart.html.
 -- task/cgroup - handle new cpuset files, similar to commit c4223940.
 -- Replace the tempname() function call with mkstemp().
 -- Fix for --cpu_bind=map_cpu/mask_cpu/map_ldom/mask_ldom plus
    --mem_bind=map_mem/mask_mem options, broken in 2.6.2.
 -- Restore default behavior of allocating cores to jobs on a cyclic basis
    across the sockets unless SelectTypeParameters=CR_CORE_DEFAULT_DIST_BLOCK
    or user specifies other distribution options.
 -- Enforce JobRequeue configuration parameter on node failure. Previously
    always requeued the job.
 -- acct_gather_energy/ipmi - Add delay before retry on read error.
 -- select/cons_res with GRES and multiple threads per core, fix possible
    infinite loop.
 -- proctrack/cgroup - Add cgroup create retry logic in case one step is
    starting at the same time as another step is ending and the logic to create
    and delete cgroups overlaps.
 -- Improve setting of job wait "Reason" field.
 -- Correct sbatch documentation and job_submit/pbs plugin: "%j" is the job
    ID, not "%J" (which is job_id.step_id).
 -- Improvements to sinfo performance, especially for large numbers of
    partitions.
 -- SlurmdDebug - Permit changes to slurmd debug level with "scontrol reconfig".
 -- smap - Avoid invalid memory reference with hidden nodes.
 -- Fix sacctmgr modify qos set preempt+/-=.
 -- BLUEGENE - fix issue where node count wasn't set up correctly when srun
    performs the allocation, regression in 2.6.3.
 -- Add support for dependencies of job array elements (e.g.
    "sbatch --depend=afterok:123_4 ...") or all elements of a job array (e.g.
    "sbatch --depend=afterok:123 ...").
 -- Add support for new options in sbatch qsub wrapper:
    -W block=true	(wait for job completion)
    Clear PBS_NODEFILE environment variable
 -- Fixed the MaxSubmitJobsPerUser limit in QOS which limited submissions
    one job too early.
 -- sched/wiki, sched/wiki2 - Fix to work with logic change introduced in
    version 2.6.3 that prevented Maui/Moab from starting jobs.
 -- Updated the QOS limits documentation and man page.

* Changes in Slurm 2.6.3
========================
 -- Add support for some new #PBS options in sbatch scripts and qsub wrapper:
    -l accelerator=true|false	(GPU use)
    -l mpiprocs=#	(processors per node)
    -l naccelerators=#	(GPU count)
    -l select=#		(node count)
    -l ncpus=#		(task count)
    -v key=value	(environment variable)
    -W depend=opts	(job dependencies, including "on" and "before" options)
    -W umask=#		(set job's umask)
 -- Added qalter and qrerun commands to torque package.
 -- Corrections to qstat logic: job CPU count and partition time format.