This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 1.4.0-pre5
=============================
-- Correction in setting of SLURM_CPU_BIND environment variable.
-- Rebuild slurmctld's job select_jobinfo->node_bitmap on restart/reconfigure
of the daemon rather than restoring the bitmap since the nodes in a system
can change (be added or removed).
-- Add configuration option "--with-cpusetdir=PATH" for non-standard
locations.
-- Get new multi-core data structures working on BlueGene systems.
-- Modify PMI_Get_clique_ranks() to return an array of integers rather
than a char * to satisfy the PMI standard. Correct logic in
PMI_Get_clique_size() for the case where the srun --overcommit option is
used.
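A minimal C sketch of the corrected interface (illustrative only; these
are the standard PMI-1 calls, with error handling omitted):
    #include <stdio.h>
    #include <stdlib.h>
    #include <pmi.h>

    int main(void)
    {
        int spawned, size, i, *ranks;

        PMI_Init(&spawned);
        PMI_Get_clique_size(&size);     /* tasks sharing this node */
        ranks = malloc(size * sizeof(int));
        /* Ranks now come back as an int array, per the PMI standard,
         * rather than as a char * string. */
        PMI_Get_clique_ranks(ranks, size);
        for (i = 0; i < size; i++)
            printf("clique rank: %d\n", ranks[i]);
        free(ranks);
        PMI_Finalize();
        return 0;
    }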
-- Fix bug in select/cons_res: allocate a job all of the processors on a
node when the --exclusive option is specified as a job submit option.
-- Add NUMA cpu_bind support to the task affinity plugin. Binds tasks to
a set of CPUs belonging to a NUMA locality domain with the appropriate
--cpu_bind option (ldoms, rank_ldom, map_ldom and mask_ldom); see
"man srun" for more information.
* Changes in SLURM 1.4.0-pre4
=============================
-- For task/affinity, force jobs to use a particular task binding by setting
the TaskPluginParam configuration parameter rather than slurmd's
SLURM_ENFORCED_CPU_BIND environment variable.
-- Enable full preemption of jobs by partition with select/cons_res
(cons_res_preempt.patch from Chris Holmes, HP).
-- Add configuration parameter DebugFlags to provide detailed logging for
specific subsystems (steps and triggers so far).
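For example, to enable the detailed logging available so far (a slurm.conf
sketch using flag names for the subsystems noted above):
    DebugFlags=Steps,Triggers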
-- srun's --no-kill option is now passed to slurmctld so that a job step
is killed when the node where srun executes goes down, unless the
--no-kill option is set (the previous termination logic would fail if
srun was not responding).
-- Transfer a job step's core bitmap from the slurmctld to the slurmd
within the job step credential.
-- Add cpu_bind, cpu_bind_type, mem_bind and mem_bind_type to the job
allocation request and the job_details structure in slurmctld. Add support
for the --cpu_bind and --mem_bind options in the salloc and sbatch
commands.
* Changes in SLURM 1.4.0-pre3
=============================
-- Internal changes: CPUs per node changed from 32-bit to 16-bit size.
Node count fields changed from 16-bit to 32-bit size in some structures.
-- Remove select plugin functions select_p_get_extra_jobinfo(),
select_p_step_begin() and select_p_step_fini().
-- Remove the following slurmctld job structure fields: num_cpu_groups,
cpus_per_node, cpu_count_reps, alloc_lps_cnt, alloc_lps, and used_lps.
Use equivalent fields in new "select_job" structure, which is filled
in by the select plugins.
-- Modify mem_per_task in job step request from 16-bit to 32-bit size.
Use new "select_job" structure for the job step's memory management.
-- Add core_bitmap_job to slurmctld's job step structure to identify
which cores are allocated to each job step.
-- Add new configuration option OverTimeLimit to permit jobs to exceed
their (soft) time limit by a configurable amount. Backfill scheduling
will be based upon the soft time limit.
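For example, to let jobs run up to 10 minutes past their nominal limit
(a slurm.conf sketch; the value is in minutes):
    OverTimeLimit=10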
-- Remove select_g_get_job_cores(). That data is now within the slurmctld's
job structure.
* Changes in SLURM 1.4.0-pre2
=============================
-- Remove srun's --ctrl-comm-ifhn-addr option (for PMI/MPICH2). It is no
longer needed.
-- Modify power save mode so that nodes can be powered off when idle. See
https://computing.llnl.gov/linux/slurm/power_save.html or
"man slurm.conf" (SuspendProgram and related parameters) for more
information.
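A minimal slurm.conf sketch (the program paths are site-specific
placeholders; SuspendTime is in seconds of idle time):
    SuspendProgram=/usr/sbin/node_power_off
    ResumeProgram=/usr/sbin/node_power_on
    SuspendTime=1800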
-- Added configuration parameter PrologSlurmctld, which can be used to boot
nodes into a particular state for each job. See "man slurm.conf" for
details.
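For example (the script path is a site-specific placeholder):
    PrologSlurmctld=/usr/sbin/boot_nodes_for_job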
-- Add configuration parameter CompleteWait to control how long to wait for
a job's completion before allocating already released resources to pending
jobs. This can be used to reduce fragmentation of resources. See
"man slurm.conf" for details.
-- Make default CryptoType=crypto/munge. OpenSSL is now completely optional.
-- Make default AuthType=auth/munge rather than auth/none.
-- Change output format of "sinfo -R" from "%35R %N" to "%50R %N".
* Changes in SLURM 1.4.0-pre1
=============================
-- Save/restore a job's task_distribution option on slurmctld restart.
NOTE: SLURM must be cold-started on conversion from version 1.3.x.
-- Remove task_mem from job step credential (only job_mem is used now).
-- Remove --task-mem and --job-mem options from salloc, sbatch and srun
(use --mem-per-cpu or --mem instead).
-- Remove DefMemPerTask from slurm.conf (use DefMemPerCPU or DefMemPerNode
instead).
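For example, replacing the removed options (illustrative values, in
megabytes):
    srun --mem-per-cpu=1024 ./a.out    # limit per allocated CPU
    srun --mem=2048 ./a.out            # limit per node
and in slurm.conf:
    DefMemPerCPU=1024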
-- Modify slurm_step_launch API call. Move launch host from function argument
to element in the data structure slurm_step_launch_params_t, which is
used as a function argument.
-- Add state_reason_string to job state with optional details about why
a job is pending.
-- Make "scontrol show node" output match scontrol input for some fields
("Cores" changed to "CoresPerSocket", etc.).
-- Add support for a new node state "FUTURE" in slurm.conf. These node records
are created in SLURM tables for future use without a reboot of the SLURM
daemons, but are not reported by any SLURM commands or APIs.
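For example, to pre-define node records for later use (a slurm.conf
sketch; the node names and processor count are hypothetical):
    NodeName=tux[128-255] Procs=8 State=FUTURE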
* Changes in SLURM 1.3.11
=========================
-- Bluegene/P support added (minimally tested, but builds correctly,
no dynamic layout mode).
-- Fix infinite loop when using accounting_storage/mysql plugin either from
the slurmctld or slurmdbd daemon.
-- Added more thread safety for assoc_mgr in the controller.
-- For sched/wiki2 (Moab), permit clearing of a job's dependencies with the
JOB_MODIFY option "DEPEND=0".
-- Do not set a running or pending job's EndTime when changing its time
limit.
-- Fix bug in use of "include" parameter within the plugstack.conf file.
-- Fix bug in the parsing of negative numeric values in configuration files.
-- Propagate --cpus-per-task parameter from salloc or sbatch input line to
the SLURM_CPUS_PER_TASK environment variable in the spawned shell for
srun to use.
-- Add support for srun --cpus-per-task=0. This can be used to spawn tasks
without allocating resources for the job step from the job's allocation
when running multiple job steps with the --exclusive option.
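For example, a lightweight monitoring step that consumes none of the
allocation's CPUs while other exclusive steps run (a hypothetical batch
script fragment):
    srun --exclusive --cpus-per-task=0 -n1 ./monitor &
    srun --exclusive -n16 ./real_work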
-- Remove registration messages from saved messages when bringing down the
cluster. Otherwise a deadlock can result if the wrong cluster name is
given.
-- Correction to build link for srun debugger (export symbols).
-- sacct will now more properly display allocations made with salloc
having only one step.
* Changes in SLURM 1.3.10
=========================
-- Fix several bugs in the hostlist functions:
- Fix hostset_insert_range() to do proper accounting of hl->nhosts (count).
- Avoid assertion failure when calling hostset_create(NULL).
- Fix return type of hostlist and hostset string functions from size_t to
ssize_t.
- Add check for NULL return from hostlist_create().
- Rewrite of hostrange_hn_within(), avoids reporting "tst0" in the hostlist
"tst".
-- Modify squeue to accept "--nodes=<hostlist>" rather than
"--node=<node_name>" and report all jobs with any allocated nodes from set
of nodes specified. From Par Anderson, National Supercomputer Centre,
Sweden.
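For example, to list every job allocated any node from a given set (the
hostlist expression is arbitrary):
    squeue --nodes=tux[0-15]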
-- Fix bug preventing use of TotalView debugger with TaskProlog configured
or srun's --task-prolog option.
-- Improve reliability of batch job requeue logic in the event that the slurmd
daemon is temporarily non-responsive (for longer than the configured
MessageTimeout value but less than the SlurmdTimeout value).
-- In sched/wiki2 (Moab) report a job's MAXNODES (maximum number of permitted
nodes).
-- Fixed SLURM_TASKS_PER_NODE to better live up to its name on an
allocation. It will now contain the number of tasks per node instead of
the number of CPUs per node. This is only for a resource allocation; job
steps already have the environment variable set correctly.
-- Configuration parameter PropagateResourceLimits has new option of "NONE".
-- User's --propagate options take precedence over the
PropagateResourceLimits configuration parameter in both the srun and
sbatch commands.
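For example, a user request overrides the site-wide setting (a sketch;
the rlimit name is one the user chooses to propagate):
    PropagateResourceLimits=NONE      # in slurm.conf
    srun --propagate=NOFILE ./a.out   # still propagates the open-file limit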
-- When Moab is in use (salloc or sbatch is executed with the --get-user-env
option to be more specific), load the user's default resource limits rather
than propagating the Moab daemon's limits.
-- Fix bug in slurmctld restart logic for recovery of batch jobs that are
initiated as a job step rather than an independent job (used for LSF).
-- Fix bug that can cause slurmctld restart to fail, bug introduced in SLURM
version 1.3.9. From Eygene Ryabinkin, Kurchatov Institute, Russia.
-- Permit slurmd configuration parameters to be set to new values from
previously unset values.
* Changes in SLURM 1.3.9
========================
-- Fix jobs being cancelled by ctrl-C to have correct cancelled state in
accounting.
-- Slurmdbd will now cache only user data, making for faster start-up.
-- Improved support for job steps on FRONT_END systems.
-- Added support to dump and load association information in the
controller at start-up if slurmdbd is unresponsive.
-- BLUEGENE - Added support for sched/backfill plugin
-- sched/backfill modified to initiate multiple jobs per cycle.
-- Increase buffer size in srun to hold task list expressions. Critical
for jobs with 16k tasks or more.
-- Added support for eligible jobs and downed nodes to be sent to accounting
from the controller the first time accounting is turned on.
-- Correct srun logic to support --tasks-per-node option without task count.
-- Logic in place to handle multiple versions of RPCs within the slurmdbd.
THE SLURMDBD MUST BE UPGRADED TO THIS VERSION BEFORE UPGRADING THE
SLURMCTLD OR THEY WILL NOT TALK.
Older versions of the slurmctld will continue to talk to the new slurmdbd.
-- Add support for new job dependency type: singleton. Only one job from a
given user with a given name will execute with this dependency type.
From Matthieu Hautreux, CEA.
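For example (the job name and script are arbitrary):
    sbatch --dependency=singleton --job-name=nightly_build build.sh
Only one "nightly_build" job from the submitting user will run at a time;
additional submissions wait for it to complete.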
-- Updated contribs/python/hostlist to version 1.3: See "CHANGES" file in
that directory for details. From Kent Engstrom, NSC.
-- Add SLURM_JOB_NAME environment variable for jobs submitted using sbatch.
In order to prevent the job steps from all having the same name as the
batch job that spawned them, the SLURM_JOB_NAME environment variable is
ignored when setting the name of a job step from within an existing
resource allocation.
-- For use with sched/wiki2 (Moab only), set salloc's default shell based
upon the user who the job runs as rather than the user submitting the job
(user root).
-- Fix to sched/backfill when job specifies no time limit and the partition
time limit is INFINITE.
-- Validate a job's constraints (node features) at job submit or modification
time. Major re-write of resource allocation logic to support more complex
job feature requests.