This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.2.3
========================
-- Update srun, salloc, and sbatch man page description of --distribution
option. Patches from Rod Schultz, Bull.
-- Applied patch from Martin Perry to fix "Incorrect results for task/affinity
block second distribution and cpus-per-task > 1" bug.
-- Avoid setting a job's eligible time while held (priority == 0).
-- Substantial performance improvement to backfill scheduling. Patch from
Bjorn-Helge Mevik, University of Oslo.
-- Make timeout for communications to the slurmctld be based upon the
MessageTimeout configuration parameter rather than always 3 seconds.
Patch from Matthieu Hautreux, CEA.
-- Add new scontrol option of "show aliases" to report every NodeName that is
associated with a given NodeHostName when running multiple slurmd daemons
per compute node (typically used for testing purposes). Patch from
Matthieu Hautreux, CEA.
-- Fix for handling job names with a "'" in the name within MySQL accounting.
Patch from Gerrit Renker, CSCS.
-- Modify the conditions under which salloc execution is delayed until moved
to the foreground. Patch from Gerrit Renker, CSCS.
a) salloc is not run in allocation-only (--no-shell) mode,
b) stdin is from a terminal (tcgetattr does not return ENOTTY),
c) salloc has been invoked from a login shell (not a nested one),
d) salloc has been configured at compile-time to support background
execution and is not currently in the background process group.
-- Abort salloc if no controlling terminal and --no-shell option is not used
("setsid salloc ..." is disabled). Patch from Gerrit Renker, CSCS.
-- Fix to gang scheduling logic which could cause jobs to not be suspended
or resumed when appropriate.
* Changes in SLURM 2.2.2
========================
-- Correct logic to set correct job hold state (admin or user) when setting
the job's priority using scontrol's "update jobid=..." rather than its
"hold" or "holdu" commands.
-- Modify squeue to report unset --mincores, --minthreads or --extra-node-info
values as "*" rather than 65534. Patch from Rod Schultz, Bull.
-- Report the StartTime of a job as "Unknown" rather than the year 2106 if its
expected start time was too far in the future for the backfill scheduler
to compute.
-- Prevent a pending job reason field from inappropriately being set to
"Priority".
-- In sched/backfill, for jobs having QOS_FLAG_NO_RESERVE set, don't
consider the job's time limit when attempting to backfill schedule. The job
will just be preempted as needed at any time.
-- Eliminated a bug in sbatch when no valid target clusters are specified.
-- When explicitly sending a signal to a job with the scancel command and that
job is in a pending state, then send the request directly to the slurmctld
daemon and do not attempt to send the request to slurmd daemons, which are
not running the job anyway.
-- In slurmctld, properly set the up_node_bitmap when setting a node's state
to IDLE (in case the previous node state was DOWN).
-- Fix smap to process block midplane names correctly when on a bluegene
system.
-- Fix smap to once again print out the Letter 'ID' for each line of a block/
partition view.
-- Corrected the NOTES section of the scancel man page.
-- Fix for accounting_storage/mysql plugin to correctly query cluster based
transactions.
-- Fix issue when updating database for clusters that were previously deleted
before upgrade to 2.2 database.
-- BLUEGENE - Handle mesh torus check better in dynamic mode.
-- BLUEGENE - Fixed race condition when freeing block, most likely only would
happen in emulation.
-- Fix for calculating used QOS limits correctly on a slurmctld reconfig.
-- BLUEGENE - Fix for bad conn-type set when running small blocks in HTC mode.
-- If salloc's --no-shell option is used, then do not attempt to preserve the
terminal's state.
-- Add new SLURM configure time parameter of --disable-salloc-background. If
set, then salloc can only execute in the foreground. If started in the
background, then a message will be printed and the job allocation halted
until brought into the foreground.
NOTE: THIS IS A CHANGE IN DEFAULT SALLOC BEHAVIOR FROM V2.2.1, BUT IS
CONSISTENT WITH V2.1 AND EARLIER.
-- Added the Multi-Cluster Operation web page.
-- Removed remnant code for enforcing max sockets/cores/threads in the
cons_res plugin (see last item in 2.1.0-pre5). This was responsible
for a bug reported by Rod Schultz.
-- BLUEGENE - Set correct env vars for HTC mode on a P system to get correct
block.
-- Correct RunTime reported by "scontrol show job" for pending jobs.
* Changes in SLURM 2.2.1
========================
-- Fix setting derived exit code correctly for jobs that happen to have the
same jobid.
-- Better checking for time overflow when rolling up in accounting.
-- Add scancel --reservation option to cancel all jobs associated with a
specific reservation.
-- Treat reservation with no nodes like one that starts later (let jobs of any
size get queued and do not block any pending jobs).
-- Fix bug in gang scheduling logic that would temporarily resume too many
jobs after a job completed.
-- Change srun message about job step being deferred due to SlurmctldProlog
running to be more clear and only print when --verbose option is used.
-- Made it so you could remove the hold on jobs with sview by setting the
priority to infinite.
-- BLUEGENE - Better checking of small blocks in dynamic mode to determine
whether a full midplane job could run.
-- Decrease the maximum sleep time between srun job step creation retry
attempts from 60 seconds to 29 seconds. This should eliminate a possible
synchronization problem with gang scheduling that could result in job
step creation requests only occurring when a job is suspended.
-- Fix to prevent changing a held job's state from HELD to DEPENDENCY
until the job is released. Patch from Rod Schultz, Bull.
-- Fixed sprio -M to reflect PriorityWeight values from remote cluster.
-- Fix bug in sview when trying to update arbitrary field on more than one
job. Formerly would display information about one job, but update next
selected job.
-- A QOS with UsageFactor set to 0 now causes jobs running under that QOS
to not add time to fairshare or association/QOS limits.
-- Fixed issue where QOS priority wasn't re-normalized until a slurmctld
restart when a QOS priority was changed.
-- Fix sprio to use calculated numbers from slurmctld instead of
calculating its own numbers.
-- BLUEGENE - fixed race condition with preemption where if the wind blows the
right way the slurmctld could lock up when preempting jobs to run others.
-- BLUEGENE - fixed epilog to wait until MMCS job is totally complete before
finishing.
-- BLUEGENE - more robust checking for states when freeing blocks.
-- Added correct files to the slurm.spec file for correct perl api rpm
creation.
-- Added flag "NoReserve" to a QOS to make it so all jobs are created equal
within a QOS. So if larger, higher priority jobs are unable to run they
don't prevent smaller jobs from running even if running the smaller
jobs delays the start of the larger, higher priority jobs.
-- BLUEGENE - Check preemptees one by one to preempt lower priority jobs first
instead of first fit.
-- In select/cons_res, correct handling of the option
SelectTypeParameters=CR_ONE_TASK_PER_CORE.
-- Fix for checking QOS to override partition limits. Previously, if not
using QOS, some limits would be overlooked.
-- Fix bug which would terminate a job step if any of the nodes allocated to
it were removed from the job's allocation. Now only the tasks on those
nodes are terminated.
-- Fixed issue where, when using an accounting_storage plugin directly
without the slurmDBD, updates weren't always sent correctly to the
slurmctld. Appears to be OS dependent. Reported by Fredrik Tegenfeldt.
* Changes in SLURM 2.2.0
========================
-- Change format of Duration field in "scontrol show reservation" output from
an integer number of minutes to "[days-]hours:minutes:seconds".
-- Add support for changing the reservation of pending or running jobs.
-- On Cray systems only, salloc sends SIGKILL to spawned process group when
job allocation is revoked. Patch from Gerrit Renker, CSCS.
-- Fix for sacctmgr to work correctly when modifying user associations where
all the associations contain a partition.
-- Minor mods to salloc signal handling logic: forwards more signals and
releases allocation on real-time signals. Patch from Gerrit Renker, CSCS.
-- Add salloc logic to preserve tty attributes after abnormal exit. Patch
from Mark Grondona, LLNL.
-- BLUEGENE - Fix for issue in dynamic mode when trying to create a block
overlapping a block with no job running on it but in configuring state.
-- BLUEGENE - Speedup by skipping blocks that are deallocating for other jobs
when starting overlapping jobs in dynamic mode.
-- Fix for sacct --state to work correctly when not specifying a start time.
-- Fix upgrade process in accounting from 2.1 for clusters named "cluster".
-- Export more jobacct_common symbols needed for the slurm api on some systems.
* Changes in SLURM 2.2.0.rc4
============================
-- Correction in logic to spread out over time highly parallel messages to
minimize lost messages. Affects slurmd epilog complete messages and PMI
key-pair transmissions. Patch from Gerrit Renker, CSCS.
-- Fixed issue where a system has unsent messages to the dbd in 2.1 and
upgrades to 2.2. Messages are now processed correctly.
-- Fixed issue where assoc_mgr cache wasn't always loaded correctly if the
slurmdbd wasn't running when the slurmctld was started.
-- Make sure on a pthread create in step launch that the error code is looked
at. Improves fault-tolerance of slurmd.
-- Fix setting up default acct/wckey when upgrading from 2.1 to 2.2.
-- Fix issue with associations attached to a specific partition with no other
association, and requesting a different partition.
-- Added perlapi to the slurmdb to the slurm.spec.
-- In sched/backfill, correct handling of CompleteWait parameter to avoid
backfill scheduling while a job is completing. Patch from Gerrit Renker,
CSCS.
-- Send message back to user when trying to launch job on a compute node
lacking that user ID. Patch from Hongjia Cao, NUDT.
-- BLUEGENE - Fix it so 1 midplane clusters will run small block jobs.
-- Add Command and WorkDir to the output of "scontrol show job" for job
allocations created using srun (not just sbatch).
-- Fixed sacctmgr to not add blank defaultqos' when doing a cluster dump.
-- Correct processing of memory and disk space specifications in the salloc,
sbatch, and srun commands to work properly with a suffix of "MB", "GB",
etc. and not only with a single letter (e.g. "M", "G", etc.).
-- Prevent nodes with suspended jobs from being powered down by SLURM.
-- Normalized the way pidfiles are created by the slurm daemons.
-- Fixed modifying the root association to not read in its last value
when clearing a limit being set.
-- Revert some recent signal handling logic from salloc so that SIGHUP sent
after the job allocation will properly release the allocation and cause
salloc to exit.
-- BLUEGENE - Fix for recreating a block in a ready state.
-- Fix debug flags for incorrect logic when dealing with DEBUG_FLAG_WIKI.