Newer
Older
This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 15.08.0pre2
==============================
-- Add the environment variables SLURM_JOB_ACCOUNT, SLURM_JOB_QOS
and SLURM_JOB_RESERVATION in the batch/srun jobs.
-- Properly enforce partition Shared=YES option. Previously oversubscribing
resources required gang scheduling to be configured.
-- Enable per-partition gang scheduling resource resolution (e.g. the partition
can have SelectTypeParameters=CR_CORE, while the global value is CR_SOCKET).
-- Make it so a newer version of a slurmstepd can talk to an older srun.

Brian Christiansen
committed
allocation. Nodes could have been added while waiting for an allocation.
-- Expanded --cpu-freq parameters to include min-max:governor specifications.
--cpu-freq now supported on salloc and sbatch.
-- Add support for optimized job allocations with respect to SGI Hypercube
topology.
NOTE: Only supported with select/linear plugin.
NOTE: The program contribs/sgi/netloc_to_topology can be used to build
Slurm's topology.conf file.
-- Remove 64k validation of incoming RPC nodelist size. Validated at 64MB
when unpacking.
-- In slurmstepd() add the user primary group if it is not part of the
groups sent from the client.
-- Added BurstBuffer field to advanced reservations.
-- For advanced reservation, replace flag "License_only" with flag "Any_Nodes".
It can be used to indicate the an advanced reservation resources (licenses
and/or burst buffers) can be used with any compute nodes.
-- Allow users to specify the srun --resv-ports as 0 in which case no ports
will be reserved. The default behaviour is to allocate one port per task.
-- Interpret a partition configuration of "Nodes=ALL" in slurm.conf as
including all nodes defined in the cluster.
-- Added new configuration parameters PowerParameters and PowerPlugin.
-- Added power management plugin infrastructure.
-- If job already exceeded one of its QOS/Accounting limits do not
return error if user modifies QOS unrelated job settings.
-- When caching user ids of AllowGroups use both getgrnam_r() and getgrent_r()
then remove eventual duplicate entries.
-- Remove rpm dependency between slurm-pam and slurm-devel.
* Changes in Slurm 15.08.0pre1
==============================
-- Add sbcast support for file transfer to resources allocated to a job step
rather than a job allocation.
-- Change structures with association in them to assoc to save space.
-- Add support for job dependencies jointed with OR operator (e.g.
"--depend=afterok:123?afternotok:124").
-- Add "--bb" (burst buffer specification) option to salloc, sbatch, and srun.
-- Added configuration parameters BurstBufferParameters and BurstBufferType.
-- Added burst_buffer plugin infrastructure (needs many more functions).
-- Make it so when the fanout logic comes across a node that is down we abandon
the tree to avoid worst case scenarios when the entire branch is down and
we have to try each serially.
-- Add better error reporting of invalid partitions at submission time.
-- Move will-run test for multiple clusters from the sbatch code into the API
so that it can be used with DRMAA.
-- If a non-exclusive allocation requests --hint=nomultithread on a
CR_CORE/SOCKET system lay out tasks correctly.
-- Avoid including unused CPUs in a job's allocation when cores or sockets are
allocated.
-- Added new job state of STOPPED indicating processes have been stopped with a
SIGSTOP (using scancel or sview), but retain its allocated CPUs. Job state
returns to RUNNING when SIGCONT is sent (also using scancel or sview).
-- Added EioTimeout parameter to slurm.conf. It is the number of seconds srun
waits for slurmstepd to close the TCP/IP connection used to relay data
between the user application and srun when the user application terminates.
-- Remove slurmctld/dynalloc plugin as the work was never completed, so it is
not worth the effort of continued support at this time.
-- Remove DynAllocPort configuration parameter.
-- Add advance reservation flag of "replace" that causes allocated resources
to be replaced with idle resources. This maintains a pool of available
resources that maintains a constant size (to the extent possible).
-- Added SchedulerParameters option of "bf_busy_nodes". When selecting
resources for pending jobs to reserve for future execution (i.e. the job
can not be started immediately), then preferentially select nodes that are
in use. This will tend to leave currently idle resources available for
backfilling longer running jobs, but may result in allocations having less
than optimal network topology. This option is currently only supported by
the select/cons_res plugin.
-- Permit "SuspendTime=NONE" as slurm.conf value rather than only a numeric
value to match "scontrol show config" output.
-- Add the 'scontrol show cache' command which displays the associations
in slurmctld.
-- Test more frequently for node boot completion before starting a job.
Provides better responsiveness.
-- Permit PreemptType=qos and PreemptMode=suspend,gang to be used together.
A high-priority QOS job will now oversubscribe resources and gang schedule,
but only if there are insufficient resources for the job to be started
without preemption. NOTE: That with PreemptType=qos, the partition's
Shared=FORCE:# configuration option will permit one job more per resource
to be run than than specified, but only if started by preemption.
-- Remove the CR_ALLOCATE_FULL_SOCKET configuration option. It is now the
default.
-- Fix a race condition in PMI2 when fencing counters can be out of sync.
-- Increase the MAX_PACK_MEM_LEN define to avoid PMI2 failure when fencing
with large amount of ranks.
-- Add QOS option to a partition. This will allow a partition to have
all the limits a QOS has. If a limit is set in both QOS the partition
QOS will override the job's QOS unless the job's QOS has the
PartitionQOS flag set.
-- The task_dist_states variable has been split into "flags" and "base"
components. Added SLURM_DIST_PACK_NODES and SLURM_DIST_NO_PACK_NODES values
to give user greater control over task distribution. The srun --dist options
has been modified to accept a "Pack" and "NoPack" option. These options can
be used to override the CR_PACK_NODE configuration option.
* Changes in Slurm 14.11.4
==========================
-- Make sure assoc_mgr locks are initialized correctly.
-- Correct check of enforcement when filling in an association.
-- Make sacctmgr print out classification correctly for clusters.
-- Add array_task_str to the perlapi job info.
-- Fix for slurmctld abort with GRES types configured and no CPU binding.
-- Fix for GRES scheduling where count > 1 per topology type (or GRES types).
-- Make CR_ONE_TASK_PER_CORE work correctly with task/affinity.
-- job_submit/pbs - Fix possible deadlock.
-- job_submit/lua - Add "alloc_node" to job information available.
-- Fix memory leak in mysql accounting when usage rollup happens.
-- If users specify ALL together with other variables using the
--export sbatch/srun command line option, propagate the users'
environ to the execution side.
-- Fix job array scheduling anomaly that can stop scheduling of valid tasks.
-- Fix perl api tests for libslurmdb to work correctly.
-- Remove some misleading logs related to non-consumable GRES.
-- Allow --ignore-pbs to take effect when read as an #SBATCH argument.
* Changes in Slurm 14.11.3
==========================
-- Prevent vestigial job record when canceling a pending job array record.
-- Fix job array hash table bug, could result in slurmctld infinite loop or
invalid memory reference.
-- In srun honor ntasks_per_node before looking at cpu count when the user
doesn't request a number of tasks.
-- Fix ghost job when submitting job after all jobids are exhausted.
-- MySQL - Enhanced coordinator security checks.
-- Fix for task/affinity if an admin configures a node for having threads
but then sets CPUs to only represent the number of cores on the node.
-- Make it so previous versions of salloc/srun work with newer versions
of Slurm daemons.
-- Avoid delay on commit for PMI rank 0 to improve performance with some
MPI implementations.
-- auth/munge - Correct logic to read old format AccountingStoragePass.
-- Reset node "RESERVED" state as appropriate when deleting a maintenance
reservation.
-- Prevent a job manually suspended from being resumed by gang scheduler once
free resources are available.
-- Prevent invalid job array task ID value if a task is started using gang
scheduling.
-- Fix documentation bugs in slurm.conf.5. DenyAccount should be DenyAccounts.
-- For backward compatibility with older versions of OMPI not compiled
with --with-pmi restore the SLURM_STEP_RESV_PORTS in the job environment.
-- Update the html documentation describing the integration with openmpi.
-- Fix sacct when searching by nodelist.
-- Fix cosmetic info statements when dealing with a job array task instead of
a normal job.
-- BGQ - Put print statement under a DebugFlag. This was just an oversight.
-- BLUEGENE - Remove check that would erroneously remove the CONFIGURING
flag from a job while the job is waiting for a block to boot.
-- Fix segfault in slurmstepd when job exceeded memory limit.
-- Fix race condition that could start a job that is dependent upon a job array
before all tasks of that job array complete.
* Changes in Slurm 14.11.2
==========================
-- Fix issue with association hash not getting the correct index which
could result in seg fault.
-- Avoid huge malloc if GRES configured with "Type" and huge "Count".
-- Fix jobs from starting in overlapping reservations that won't finish before
a "maint" reservation begins.
-- When node gets drained while in state mixed display its status as draining
in sinfo output.
-- Allow priority/multifactor to work with sched/wiki(2) if all priorities
have no weight. This allows for association and QOS decay limits to work.
-- Fix "squeue --start" to override SQUEUE_FORMAT env variable.

Brian Christiansen
committed
-- Fix scancel to be able to cancel multiple jobs that are space delimited.
-- Log Cray MPI job calling exit() without mpi_fini(), but do not treat it as
a fatal error. This partially reverts logic added in version 14.03.9.
-- sview - Fix displaying of suspended steps elapsed times.
-- Increase number of messages that get cached before throwing them away
when the DBD is down.
-- Fix jobs from starting in overlapping reservations that won't finish before
a "maint" reservation begins.
-- Restore GRES functionality with select/linear plugin. It was broken in
version 14.03.10.
-- Fix bug with GRES having multiple types that can cause slurmctld abort.

Brian Christiansen
committed
-- Fix squeue issue with not recognizing "localhost" in --nodelist option.
-- Make sure the bitstrings for a partitions Allow/DenyQOS are up to date
when running from cache.
-- Add smap support for job arrays and larger job ID values.
-- Fix possible race condition when attempting to use QOS on a system running
Loading
Loading full blame...