This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 15.08.0pre3
-- CRAY - addition of acct_gather_energy/cray plugin.
-- Add job credential to "Run Prolog" RPC used with a configuration of
PrologFlags=alloc. This allows the Prolog to be passed identification of
GPUs allocated to the job.
-- Add SLURM_JOB_CONSTRAINTS to environment variables available to the Prolog.
-- Added "--mail-type=stage_out" option to job submission commands to notify
user when burst buffer stage out is complete.
* Changes in Slurm 15.08.0pre2
-- Add the environment variables SLURM_JOB_ACCOUNT, SLURM_JOB_QOS
and SLURM_JOB_RESERVATION in the batch/srun jobs.
-- Properly enforce partition Shared=YES option. Previously oversubscribing
resources required gang scheduling to be configured.
-- Enable per-partition gang scheduling resource resolution (e.g. the partition
can have SelectTypeParameters=CR_CORE, while the global value is CR_SOCKET).
-- Make it so a newer version of a slurmstepd can talk to an older srun.

allocation. Nodes could have been added while waiting for an allocation.
-- Expanded --cpu-freq parameters to include min-max:governor specifications.
--cpu-freq now supported on salloc and sbatch.
-- Add support for optimized job allocations with respect to SGI Hypercube
NOTE: Only supported with select/linear plugin.
NOTE: The program contribs/sgi/netloc_to_topology can be used to build
Slurm's topology.conf file.
-- Remove 64k validation of incoming RPC nodelist size. Validated at 64MB
when unpacking.
-- In slurmstepd() add the user primary group if it is not part of the
groups sent from the client.
-- Added BurstBuffer field to advanced reservations.
-- For advanced reservations, replace flag "License_only" with flag "Any_Nodes".
It can be used to indicate that an advanced reservation's resources (licenses
and/or burst buffers) can be used with any compute nodes.
-- Allow users to specify the srun --resv-ports as 0 in which case no ports
will be reserved. The default behaviour is to allocate one port per task.
-- Interpret a partition configuration of "Nodes=ALL" in slurm.conf as
including all nodes defined in the cluster.
-- Added new configuration parameters PowerParameters and PowerPlugin.
-- Added power management plugin infrastructure.
-- If a job has already exceeded one of its QOS/Accounting limits, do not
return an error if the user modifies job settings unrelated to QOS.
-- When caching user ids of AllowGroups use both getgrnam_r() and getgrent_r()
then remove any duplicate entries.
-- Remove rpm dependency between slurm-pam and slurm-devel.
-- Remove support for the XCPU (cluster management) package.
-- Add Slurmdb::jobs_get() interface to perl api.
-- Performance improvement when sending data from srun to stepds when
processing fencing.
-- Add the ability to specify an arbitrary field separator when running
sacct -p or sacct -P. The command line option is --separator.
-- Introduce slurm.conf parameter to use Proportional Set Size (PSS) instead
of RSS to determine the memory footprint of a job.
Add a slurm.conf option not to kill jobs that are over their memory limit.
-- Add job submission command options: --sicp (available for inter-cluster
dependencies) and --power (specify power management options) to salloc,
sbatch, and srun commands.
-- Add DebugFlags option of SICP (inter-cluster option logging).
-- In order to support inter-cluster job dependencies, the MaxJobID
configuration parameter default value has been reduced from 4,294,901,760
to 2,147,418,112 and its maximum value is now 2,147,463,647.
-- Add QOS name to the output of a partition in squeue/scontrol/sview/smap.
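The --separator entry above can be illustrated with a small parser for
"sacct -P" style output. This is only a minimal sketch: it assumes the first
line is a header row and that the chosen separator never occurs inside a field
value. The function name parse_delimited and the sample text are hypothetical
and are not part of Slurm.

```python
def parse_delimited(text, separator="|"):
    """Split header-plus-rows text into a list of dicts, one per row.

    Assumption: output shaped like "sacct -P" (no trailing separator).
    With "sacct -p" every line ends in the separator, which would yield
    an extra empty field here.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split(separator)
    return [dict(zip(header, ln.split(separator))) for ln in lines[1:]]


# Hypothetical sample, as if produced with a ";" separator.
sample = "JobID;State\n100;COMPLETED\n101;FAILED\n"
rows = parse_delimited(sample, separator=";")
```

Choosing a separator that cannot appear in job names or comments keeps this
kind of naive splitting reliable.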
* Changes in Slurm 15.08.0pre1
-- Add sbcast support for file transfer to resources allocated to a job step
rather than a job allocation.
-- Change structures with association in them to assoc to save space.
-- Add support for job dependencies joined with the OR operator (e.g.
-- Add "--bb" (burst buffer specification) option to salloc, sbatch, and srun.
-- Added configuration parameters BurstBufferParameters and BurstBufferType.
-- Added burst_buffer plugin infrastructure (needs many more functions).
-- Make it so when the fanout logic comes across a node that is down we abandon
the tree to avoid worst case scenarios when the entire branch is down and
we have to try each serially.
-- Add better error reporting of invalid partitions at submission time.
-- Move will-run test for multiple clusters from the sbatch code into the API
so that it can be used with DRMAA.
-- If a non-exclusive allocation requests --hint=nomultithread on a
CR_CORE/SOCKET system lay out tasks correctly.
-- Avoid including unused CPUs in a job's allocation when cores or sockets are
-- Added new job state of STOPPED indicating processes have been stopped with a
SIGSTOP (using scancel or sview), but retain its allocated CPUs. Job state
returns to RUNNING when SIGCONT is sent (also using scancel or sview).
-- Added EioTimeout parameter to slurm.conf. It is the number of seconds srun
waits for slurmstepd to close the TCP/IP connection used to relay data
between the user application and srun when the user application terminates.
-- Remove slurmctld/dynalloc plugin as the work was never completed, so it is
not worth the effort of continued support at this time.
-- Remove DynAllocPort configuration parameter.
-- Add advance reservation flag of "replace" that causes allocated resources
to be replaced with idle resources. This maintains a pool of available
resources that maintains a constant size (to the extent possible).
-- Added SchedulerParameters option of "bf_busy_nodes". When selecting
resources for pending jobs to reserve for future execution (i.e. the job
can not be started immediately), then preferentially select nodes that are
in use. This will tend to leave currently idle resources available for
backfilling longer running jobs, but may result in allocations having less
than optimal network topology. This option is currently only supported by
the select/cons_res plugin.
-- Permit "SuspendTime=NONE" as slurm.conf value rather than only a numeric
value to match "scontrol show config" output.
-- Add the 'scontrol show cache' command which displays the associations
in slurmctld.
-- Test more frequently for node boot completion before starting a job.
Provides better responsiveness.
-- Permit PreemptType=qos and PreemptMode=suspend,gang to be used together.
A high-priority QOS job will now oversubscribe resources and gang schedule,
but only if there are insufficient resources for the job to be started
without preemption. NOTE: With PreemptType=qos, the partition's
Shared=FORCE:# configuration option will permit one more job per resource
to be run than specified, but only if started by preemption.
-- Remove the CR_ALLOCATE_FULL_SOCKET configuration option. It is now the
-- Fix a race condition in PMI2 when fencing counters can be out of sync.
-- Increase the MAX_PACK_MEM_LEN define to avoid PMI2 failure when fencing
with a large number of ranks.
-- Add QOS option to a partition. This will allow a partition to have
all the limits a QOS has. If a limit is set in both the partition QOS and
the job's QOS, the partition QOS will override the job's QOS unless the
job's QOS has the PartitionQOS flag set.
-- The task_dist_states variable has been split into "flags" and "base"
to give users greater control over task distribution. The srun --dist option
has been modified to accept a "Pack" and "NoPack" option. These options can
be used to override the CR_PACK_NODE configuration option.
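The partition QOS rule described above (the partition QOS wins when both set a
limit, unless the job's QOS carries the PartitionQOS flag) can be sketched as
follows. The function name, the dictionary representation of a QOS and the
limit name "max_wall" are hypothetical illustrations, not Slurm's internal
data structures.

```python
def effective_limit(limit, job_qos, part_qos, job_qos_has_flag=False):
    """Pick which QOS supplies a limit (illustrative sketch only).

    job_qos / part_qos: dicts mapping limit names to values; a missing
    key or None means that QOS does not set the limit.
    job_qos_has_flag: True if the job's QOS has the PartitionQOS flag.
    """
    job_val = job_qos.get(limit)
    part_val = part_qos.get(limit)
    if job_val is None:
        return part_val      # only the partition QOS (or neither) sets it
    if part_val is None:
        return job_val       # only the job's QOS sets it
    # Both set the limit: partition QOS overrides unless the flag is set.
    return job_val if job_qos_has_flag else part_val


job = {"max_wall": 60}
part = {"max_wall": 30}
default_choice = effective_limit("max_wall", job, part)
flagged_choice = effective_limit("max_wall", job, part, job_qos_has_flag=True)
```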
* Changes in Slurm 14.11.5
-- Correct the squeue command taking into account that a node can
have NULL name if it is not in DNS but still in slurm.conf.
-- Fix slurmdbd regression which would cause a segfault when a node is set
down with no reason.
-- BGQ - Fix issue with job arrays not being handled correctly
in the runjob_mux plugin.
-- Print FAIR_TREE, if configured, in "scontrol show config" output for
-- Add SLURM_JOB_GPUS environment variable to those available in the Prolog.
-- Load lua-5.2 library if using lua5.2 for lua job submit plugin.
-- GRES logic: Prevent bad node_offset due to not preserving no_consume flag.
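As a rough illustration of the SLURM_JOB_GPUS entry above, a Prolog written in
Python might read the variable from its environment along these lines. The
comma-separated format assumed for the value and the helper name
prolog_job_info are assumptions made for this sketch, not behavior stated in
these notes.

```python
import os


def prolog_job_info(environ=None):
    """Collect job details a Prolog might inspect (sketch only)."""
    env = os.environ if environ is None else environ
    gpus_raw = env.get("SLURM_JOB_GPUS", "")
    # Assumption: the value is a comma-separated list of GPU indices.
    gpus = [g for g in gpus_raw.split(",") if g]
    return {"job_id": env.get("SLURM_JOB_ID"), "gpus": gpus}


if __name__ == "__main__":
    print(prolog_job_info())
```

Passing a dictionary instead of relying on os.environ makes the helper easy to
exercise outside of a real Prolog run.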
* Changes in Slurm 14.11.4
-- Make sure assoc_mgr locks are initialized correctly.
-- Correct check of enforcement when filling in an association.
-- Make sacctmgr print out classification correctly for clusters.
-- Add array_task_str to the perlapi job info.
-- Fix for slurmctld abort with GRES types configured and no CPU binding.
-- Fix for GRES scheduling where count > 1 per topology type (or GRES types).
-- Make CR_ONE_TASK_PER_CORE work correctly with task/affinity.
-- job_submit/pbs - Fix possible deadlock.
-- job_submit/lua - Add "alloc_node" to job information available.
-- Fix memory leak in mysql accounting when usage rollup happens.
-- If users specify ALL together with other variables using the
--export sbatch/srun command line option, propagate the users'
environ to the execution side.
-- Fix job array scheduling anomaly that can stop scheduling of valid tasks.
-- Fix perl api tests for libslurmdb to work correctly.
-- Remove some misleading logs related to non-consumable GRES.
-- Allow --ignore-pbs to take effect when read as an #SBATCH argument.

-- Fix Slurmdb::clusters_get() in perl api so that it returns information.
-- Stop TaskPluginParam=Cpusets from logging an error message about not being
able to remove a cpuset dir which was already removed by the release_agent.
-- Fix the file name substitution for job stderr when %A, %a, %j and %u
are specified.
-- Remove minor warning when compiling slurmstepd.
-- Fix database resources so they can add new clusters to them after they have
initially been added.
-- Use the slurm_getpwuid_r wrapper of getpwuid_r to handle possible
-- Correct the scontrol man page and command listing which node states can
be set by the command.
-- Stop sacct from printing non-existent stat information for
Front End systems.
-- Correct srun and acct_gather.conf man pages, mention Filesystem instead
of Lustre.
-- When a job using multiple partitions starts, send to slurmdbd only
the partition in which the job runs.
-- ALPS - Fix depth for MemoryAllocation in BASIL with CLE 5.2.3.
-- Fix assoc_mgr hash to deal with users that don't have a uid yet when making
-- When a job uses multiple partitions, set the environment variable
SLURM_JOB_PARTITION to be the one in which the job started.
-- Print spurious message about the absence of cgroup.conf at log level debug2
instead of info.
-- Enable CUDA v7.0+ use with a Slurm configuration of TaskPlugin=task/cgroup
ConstrainDevices=yes (in cgroup.conf). With that configuration
CUDA_VISIBLE_DEVICES will start at 0 rather than the device number.
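The file name substitution entry above (%A, %a, %j and %u) can be illustrated
with a simplified expander. This is a sketch under stated assumptions, not
Slurm's implementation: it handles only these four specifiers, taking %A as
the job array's master job ID, %a as the array task ID, %j as the job ID and
%u as the user name. The function name expand_pattern and the sample values
are hypothetical.

```python
import re


def expand_pattern(pattern, job_id, user, array_job_id=None, array_task_id=None):
    """Expand %A, %a, %j and %u in an output file name pattern (sketch)."""
    values = {"j": str(job_id), "u": user}
    if array_job_id is not None:
        values["A"] = str(array_job_id)
    if array_task_id is not None:
        values["a"] = str(array_task_id)
    # Specifiers without a known value are left in place unchanged.
    return re.sub(r"%([Aaju])", lambda m: values.get(m.group(1), m.group(0)), pattern)


name = expand_pattern("job_%A_%a_%j_%u.err", job_id=105, user="alice",
                      array_job_id=100, array_task_id=5)
```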