NEWS

This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.

* Changes in Slurm 16.05.0pre1
===============================
 -- Add sbatch "--wait" option that waits for job completion before exiting.
    Exit code will match that of spawned job.
 -- Modify advanced reservation save/restore logic for core reservations to
    support configuration changes (changes in configured nodes or cores counts).
 -- Allow ControlMachine, BackupController, DbdHost and DbdBackupHost to be
    either short or long hostname.
 -- Job output and error files can now contain a "%" character by specifying
    a file name with two consecutive "%" characters. For example,
    "sbatch -o slurm.%%.%j" for job ID 123 will generate an output file named
    "slurm.%.123".
 -- Pass user name in Prolog RPC from controller to slurmd when using
    PrologFlags=Alloc. Allows SLURM_JOB_USER env variable to be set when using
    Native Slurm on a Cray.
 -- Add "NumTasks" to job information visible to Slurm commands.
 -- Add mail wrapper script "smail" that will include job statistics in email
    notification messages.
 -- Remove vestigial "SICP" job option (inter-cluster job option). Completely
    different logic will be forthcoming.
 -- Fix case where the primary and backup dbds would both be performing rollup.
 -- Add an ack reply from slurmd to slurmstepd when job setup is done and the
    job is ready to be executed.
 -- Removed support for authd. authd has not been developed or supported for
    several years.
 -- Introduce a new parameter requeue_setup_env_fail in SchedulerParameters.
    A job that fails to set up the environment will be requeued and the node
    drained.
 -- Add ValidateTimeout and OtherTimeout to "scontrol show burst" output.
 -- Increase default sbcast buffer size from 512KB to 8MB.
 -- Enable the hdf5 profiling of the batch step.
 -- Eliminate redundant environment and script files for job arrays.
 -- Implemented configuration checking using the new -C option of slurmctld.
    To check for configuration errors in slurm.conf run: 'slurmctld -C'.
 -- Stop searching sbatch scripts for #PBS directives after 100 lines of
    non-comments. Stop parsing #PBS or #SLURM directives after 1024 characters
    into a line. Required for decent performance with huge scripts.
 -- Add debug flag for timing Cray portions of the code.
 -- Remove all *.la files from RPMs.
 -- Add Multi-Category Security (MCS) infrastructure to permit nodes to be bound
    to specific users or groups.
 -- Install the pmi2 unix sockets in slurmd spool directory instead of /tmp.
 -- Use getaddrinfo and getnameinfo instead of gethostbyaddr and
    gethostbyname.
 -- Finished PMIx implementation.
 -- Implemented the --without=package option for configure.
 -- Fix sshare to show each individual cluster with -M,--clusters option.
 -- Added --deadline option to salloc, sbatch and srun. Jobs which can not be
    completed by the user specified deadline will be terminated with a state of
    "Deadline" or "DL".
 -- Implemented and documented PMIX protocol which is used to bootstrap an
    MPI job. PMIX is an alternative to PMI and PMI2.
 -- Change default CgroupMountpoint (in cgroup.conf) from "/cgroup" to
    "/sys/fs/cgroup" to match current standard.
 -- Add #BSUB options to sbatch to read in from the batch script.
 -- HDF: Change group name of node from nodename to nodeid.
 -- The partition-specific SelectTypeParameters parameter can now be used to
    change the memory allocation tracking specification in the global
    SelectTypeParameters configuration parameter. Supported partition-specific
    values are CR_Core, CR_Core_Memory, CR_Socket and CR_Socket_Memory. If the
    global SelectTypeParameters value includes memory allocation management and
    the partition-specific value does not, then memory allocation management for
    that partition will NOT be supported (i.e. memory can be over-allocated).
    Likewise the global SelectTypeParameters might not include memory management
    while the partition-specific value does.
 -- Burst buffer/cray - Add support for multiple buffer pools including support
    for different resource granularity by pool.
 -- Burst buffer advanced reservation units treated as bytes (per documentation)
    rather than GB.
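
The "%%" escape for job output and error file names noted in this release can
be pictured with the minimal Python sketch below. It is an illustration only,
not Slurm's implementation: the function name expand_output_name is
hypothetical, and only the "%%" and "%j" specifiers named in the entry above
are handled.

```python
import re


def expand_output_name(pattern, job_id):
    """Expand "%%" to a literal "%" and "%j" to the job ID (sketch only)."""

    def substitute(match):
        token = match.group(0)
        if token == "%%":
            return "%"          # escaped percent sign
        if token == "%j":
            return str(job_id)  # job ID specifier
        return token            # any other specifier is left untouched here

    # Scanning left to right consumes "%%" as one unit, so the second "%"
    # is never mistaken for the start of another specifier.
    return re.sub(r"%(.)", substitute, pattern)


print(expand_output_name("slurm.%%.%j", 123))
```

With the pattern from the entry above and job ID 123, the sketch yields the
same file name the entry describes.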

* Changes in Slurm 15.08.7
==========================
 -- sched/backfill: If a job can not be started within the configured
    backfill_window, set its start time to 0 (unknown) rather than the end
    of the backfill_window.
 -- Remove the 1024-character limit on lines in batch scripts.
 -- burst_buffer/cray: Round up swap size by configured granularity.
 -- select/cray: Log repeated aeld reconnects.
 -- task/affinity: Disable core-level task binding if more CPUs required than
    available cores.
 -- Preemption/gang scheduling: If a job is suspended at slurmctld restart or
    reconfiguration time, then leave it suspended rather than resume+suspend.
 -- Don't use lower weight nodes for job allocation when topology/tree used.
 -- BGQ - If a cable goes into error state remove the underlying block on
    a dynamic system and mark the block in error on a static/overlap system.
 -- BGQ - Fix regression in 9cc4ae8add7f where blocks would be deleted on
    static/overlap systems if a hardware issue occurred while restarting
    the slurmctld.
 -- Log if CLOUD node configured without a resume/suspend program or suspend
    time.
 -- MYSQL - Better locking around g_qos_count which was previously unprotected.
 -- Correct size of buffer used for jobid2str to avoid truncation.
 -- Fix allocation/distribution of tasks across multiple nodes when
    --hint=nomultithread is requested.
 -- If a reservation's nodes value is "all" then track the current nodes in the
    system, even if those nodes change.
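
The rounding described in the burst_buffer/cray entry of this release (round
up swap size by configured granularity) amounts to rounding a size up to the
next multiple of the granularity. The Python sketch below shows that
arithmetic only. It is not Slurm's code: the function name and the assumption
that both values share the same unit are illustrative.

```python
def round_up_to_granularity(size, granularity):
    """Round size up to the nearest multiple of granularity (sketch only)."""
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    # Integer ceiling division, then scale back to a multiple of granularity.
    return ((size + granularity - 1) // granularity) * granularity


print(round_up_to_granularity(10, 4))
```

A size that is already a multiple of the granularity is returned unchanged.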

* Changes in Slurm 15.08.6
==========================
 -- In slurmctld log file, log duplicate job ID found by slurmd. Previously was
    being logged as prolog/epilog failure.
 -- If a job is requeued while in the process of being launched, remove its
    job ID from slurmd's record of active jobs in order to avoid generating a
    duplicate job ID error when launched for the second time (which would
    drain the node).
 -- Cleanup messages when handling job script and environment variables in
    older directory structure formats.
 -- Prevent triggering gang scheduling within a partition if configured with
    PreemptType=partition_prio and PreemptMode=suspend,gang.
 -- Decrease parallelism in job cancel request to prevent denial of service
    when cancelling huge numbers of jobs.
 -- If all ephemeral ports are in use, try using other port numbers.
 -- Revert way lib lua is handled when doing a dlopen, fixing a regression in
    15.08.5.
 -- Set the debug level of the rmdir message in xcgroup_delete() to debug2.
 -- Fix the qstat wrapper when user is removed from the system but still
    has running jobs.
 -- Log the request to terminate a job at info level if DebugFlags includes
    the Steps keyword.
 -- Fix potential memory corruption in _slurm_rpc_epilog_complete as well as
    _slurm_rpc_complete_job_allocation.
 -- Fix cosmetic display of AccountingStorageEnforce option "nosteps" when
    in use.
 -- If a job can never be started due to unsatisfied job dependencies, report
    the full original job dependency specification rather than the dependencies
    remaining to be satisfied (typically NULL).
 -- Refactor logic to synchronize active batch jobs and their script/environment
    files, reducing overhead dramatically for large numbers of active jobs.
 -- Avoid hard-link/copy of script/environment files for job arrays. Use the
    master job record file for all tasks of the job array.
    NOTE: Job arrays submitted to Slurm version 15.08.6 or later will fail if
    the slurmctld daemon is downgraded to an earlier version of Slurm.
 -- Move slurmctld mail handler to separate thread for improved performance.
 -- Fix containment of adopted processes from pam_slurm_adopt.
 -- If a pending job array has multiple reasons for being in a pending state,
    then print all reasons in a comma separated list.

* Changes in Slurm 15.08.5
==========================
 -- Prevent "scontrol update job" from updating jobs that have already finished.
 -- Show requested TRES in "squeue -O tres" when job is pending.
 -- Backfill scheduler: Test association and QOS node limits before reserving
    resources for pending job.
 -- burst_buffer/cray: If teardown operation fails, sleep and retry.
 -- Clean up the external pids when using the PrologFlags=Contain feature
    and the job finishes.
 -- burst_buffer/cray: Support file staging when job lacks job-specific buffer
    (i.e. only persistent burst buffers).
 -- Added srun option of --bcast to copy executable file to compute nodes.
 -- Fix for advanced reservation of burst buffer space.
 -- BurstBuffer/cray: Add logic to terminate dw_wlm_cli child processes at
    shutdown.
 -- If job can't be launched or requeued, then terminate it.
 -- BurstBuffer/cray: Enable clearing of burst buffer string on completed job
    as a means of recovering from a failure mode.
 -- Fix wrong memory free when parsing SrunPortRange=0-0 configuration.
 -- BurstBuffer/cray: Fix job record purging if cancelled from pending state.
 -- BGQ - Handle database throw correctly when syncing users on blocks.
 -- MySQL - Make sure we don't have a NULL string returned when not
    requesting any specific association.
 -- sched/backfill: If max_rpc_cnt is configured and the backlog of RPCs has
    not cleared after yielding locks, then continue to sleep.
 -- Preserve the job dependency description displayed in 'scontrol show job'
    even if the dependee job was terminated and cleaned, causing the
    dependent to never run because of DependencyNeverSatisfied.
 -- Correct job task count calculation if only node count and ntasks-per-node
    options supplied.
 -- Make sure the association manager converts any string to be lower case
    as all the associations from the database will be lower case.
 -- Sanity check for xcgroup_delete() to verify incoming parameter is valid.
 -- Fix formatting for sacct with variables that switched from uint32_t to
    uint64_t.
 -- Fix a typo in sacct man page.
 -- Set up extern step to track any children of an ssh if it leaves anything
    else behind.
 -- Prevent slurmdbd divide by zero if no associations defined at rollup time.
 -- Multifactor - Add sanity check to make sure pending jobs are handled
    correctly when PriorityFlags=CALCULATE_RUNNING is set.
 -- Add slurmdb_find_tres_count_in_string() to slurm db perl api.
 -- Make lua dlopen() conditional on version found at build.
 -- sched/backfill - Delay backfill scheduler for completing jobs only if
    CompleteWait configuration parameter is set (make code match documentation).
 -- Release a job's allocated licenses only after epilog runs on all nodes
    rather than at start of termination process.
 -- Cray job NHC delayed until after burst buffer released and epilog completes
    on all allocated nodes.
 -- Fix abort of srun if using PrologFlags=NoHold.
 -- Let devices step_extern cgroup inherit attributes of job cgroup.
 -- Add new hook to Task plugin to be able to put adopted processes in the
    step_extern cgroups.
 -- Fix AllowUsers documentation in burst_buffer.conf man page. Usernames are
    comma separated, not colon delimited.
 -- Fix issue with time limit not being set correctly from a QOS when a job
    requests no time limit.
 -- Various CLANG fixes.
 -- In both sched/basic and backfill: If a job can not be started due to some