This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 14.11.10
===========================
-- Fix truncation of job reason in squeue.
-- If a node is in DOWN or DRAIN state, leave it unavailable for allocation
when powered down.
-- Update the slurm.conf man page to better document the nohold_on_prolog_fail
   variable.
-- Don't truncate task ID information in "squeue --array/-r" or "sview".
-- Fix a bug which caused scontrol to core dump when releasing or
holding a job by name.
-- Fix unit conversion bug in slurmd which caused wrong memory calculation
for cgroups.
-- Fix issue with GRES in steps so that if multiple exclusive steps consume
   all of a GRES, the requesting step is held until the GRES becomes available
   instead of reporting that the configuration isn't available.
-- Fix slurmdbd backup to use DbdAddr when contacting the primary.
-- Fix handling of job arrays with respect to the number of jobs submitted.
   Previously only one job was counted against MaxSubmitJobs when an array
   was submitted.
-- Correct counting for job array limits; a job count limit underflow was
   possible upon cancellation of the master job record.
-- For pending jobs have sacct print 0 for nnodes instead of the bogus 2.
-- Fix for tracking node state when jobs that have been allocated exclusive
access to nodes (i.e. entire nodes) and later relinquish some nodes. Nodes
would previously appear partly allocated and prevent use by other jobs.
-- Fix updating job in db after extending job's timelimit past partition's
timelimit.
-- Prevent srun -I<timeout> from flooding the controller with step create
   requests.
-- Requeue/hold batch job launch request if job already running (possible if
node went to DOWN state, but jobs remained active).
-- If a job's CPUs/task ratio is increased due to configured MaxMemPerCPU,
   then increase its allocated CPU count in order to enforce CPU limits.
-- Don't mark a powered down node as not responding. This could be triggered
   by a race condition between the node suspend and ping logic.
-- Don't requeue RPCs going out from slurmctld to DOWN nodes (doing so can
   generate repeating communication errors).
-- Propagate sbatch "--dist=plane=#" option to srun (illustrated after this
   version's entries).
-- Fix sacct to not return all jobs if the -j option is given with a trailing
','.
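
   As an illustration of the plane-distribution propagation fix above, a
   minimal, hypothetical batch script (./my_app is a placeholder for the real
   executable; node and task counts are arbitrary):

      #!/bin/bash
      #SBATCH --nodes=4
      #SBATCH --ntasks=16
      #SBATCH --distribution=plane=4
      # With the fix above, the step launched by this srun inherits the
      # plane=4 task distribution requested at submission time instead of
      # falling back to the default distribution.
      srun ./my_app
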
* Changes in Slurm 14.11.9
==========================
-- Correct "sdiag" backfill cycle time calculation if it yields locks. A
   microsecond value was being treated as a second value resulting in an
   overflow in the calculation.
-- Fix segfault when updating the timelimit on a job array task.
-- Fix to job array update logic that can result in a task ID of 4294967294.
-- Fix job array update logic; previously some fields of some tasks of a job
   array could fail to be updated.
-- CRAY - Fix seg fault if a blade is replaced and slurmctld is restarted.
-- Fix plane distribution to allocate in blocks rather than cyclically.
-- squeue - Remove newline from job array ID value printed.
-- squeue - Enable filtering for job state SPECIAL_EXIT.
-- Prevent job array task ID being inappropriately set to NO_VAL.
-- MYSQL - Make it so you don't have to restart the slurmctld
to gain the correct limit when a parent account is root and you
remove a subaccount's limit which exists on the parent account.
-- MYSQL - Close chance of setting the wrong limit on an association
when removing a limit from an association on multiple clusters
at the same time.
-- MYSQL - Fix minor memory leak when modifying an association but no
change was made.
-- srun command line of either --mem or --mem-per-cpu will override both the
SLURM_MEM_PER_CPU and SLURM_MEM_PER_NODE environment variables.
-- Prevent slurmctld abort on update of advanced reservation that contains no
nodes.

Danny Auble
committed
-- ALPS - Revert commit 2c95e2d22, which also removes commit 2e2de6a4,
   allowing Cray systems with the SubAllocate option to work as they did
   with 2.5.
-- Properly parse CPU frequency data on POWER systems.
-- Correct the sacct man page describing the -i option.
-- Capture salloc/srun information in sdiag statistics.
-- Fix bug in node selection with topology optimization.
-- Read in correct number of nodes from SLURM_HOSTFILE when specifying nodes
and --distribution=arbitrary.
-- Fix segfault in Bluegene setups where RebootQOSList is defined in
bluegene.conf and accounting is not setup.
-- MYSQL - Update mod_time when updating a start job record or adding one.
-- MYSQL - Fix issue where, if an association id ever changes while at least
   a portion of a job array is pending after its initial start in the
   database, another row could be created for the remaining array tasks
   instead of using the already existing row.
-- Fix scheduling anomaly with job arrays submitted to multiple partitions,
jobs could be started out of priority order.
-- If a host has suspended jobs do not reboot it. Reboot only hosts
   with no jobs in any state.
-- ALPS - Fix issue when using --exclusive flag on srun to do the correct
thing (-F exclusive) instead of -F share.
-- Fix a bug in the controller which displayed jobs in CF state as RUNNING.
-- Preserve advanced _core_ reservation when nodes added/removed/resized on
slurmctld restart. Rebuild core_bitmap as needed.
-- Fix for non-standard Munge port location for srun/pmi use.
-- Fix gang scheduling/preemption issue that could cancel job at startup.
-- Fix a bug in squeue which prevented squeue -tPD from printing array jobs.
-- Sort job arrays in job queue according to array_task_id when priorities are
equal.
-- Fix segfault in sreport when there was no response from the dbd.
-- ALPS - Fix compile to not link against -ljob and -lexpat with every lib
or binary.
-- Fix testing for CR_Memory when CR_Memory and CR_ONE_TASK_PER_CORE are used
with select/linear.
-- MySQL - Fix minor memory leak if a connection ever goes away while using it.
-- ALPS - Make it so srun --hint=nomultithread works correctly.
-- Prevent job array task ID from being reported as NO_VAL if last task in the
array gets requeued.
-- Fix some potential deadlock issues when state files don't exist in the
association manager.
-- Correct RebootProgram logic when executed outside of a maintenance
reservation.
-- Requeue job if possible when slurmstepd aborts.
* Changes in Slurm 14.11.8
==========================
-- Eliminate need for user to set user_id on job_update calls.
-- Correct list of unavailable nodes reported in a job's "reason" field when
that job can not start.
-- Map job --mem-per-cpu=0 to --mem=0.
-- Fix squeue -o %m and %d unit conversion to Megabytes.
-- Fix issue with incorrect time calculation in the priority plugin when
   a job runs past its time limit.
-- Prevent users from setting job's partition to an invalid partition.
-- Fix sreport core dump when requesting
'job SizesByAccount grouping=individual'.
-- select/linear: Correct count of CPUs allocated to job on system with
hyperthreads.
-- Fix race condition where last array task might not get updated in the db.
-- CRAY - Remove libpmi from the rpm install.
-- Fix squeue -o %X output to correctly handle NO_VAL and suffix.
-- When deleting a job from the system, set the job_id to 0 to avoid memory
   corruption if a thread uses the pointer, basing validity off the id.
-- Fix issue where sbatch would set ntasks-per-node to 0, causing any
   subsequent srun to hit a divide by zero error.
-- switch/cray: Refine logic to set PMI_CRAY_NO_SMP_ENV environment variable.
-- When sacctmgr loads archives with version less than 14.11 set the array
task id to NO_VAL, so sacct can display the job ids correctly.
-- When using a memory cgroup, if a task uses more memory than requested the
   failure is recorded by the cgroup in the memory.failcnt count file and
   slurmstepd notifies the user about it.
-- Fix scheduling inconsistency with GRES bound to specific CPUs.
-- If a user belongs to a group which has split entries in /etc/group, search
   for the username in all of those entries.
-- Do not consider nodes explicitly powered up as DOWN with a reason of "Node
   unexpectedly rebooted".
-- Use correct slurmd spooldir when creating cpu-frequency locks.
-- Note that TICKET_BASED fairshare will be deprecated in the future. Consider
using the FAIR_TREE algorithm instead.
-- Set job's reason to BadConstraints when the job can't run on any node.
-- Prevent abort on update of reservation with no nodes (licenses only).
-- Prevent slurmctld from dumping core if job_resrcs is missing in the
job data structure.
-- Fix squeue to print array task ids according to man page when
SLURM_BITSTR_LEN is defined in the environment.
-- In squeue, sort jobs based on array job ID if available.
-- Fix the calculation of job energy by not including the NO_VAL values.
-- Advanced reservation fixes: enable update of bluegene reservation, avoid
abort on multi-core reservations.
-- Set the totalview_stepid to the value of the job step instead of NO_VAL.
-- Fix slurmdbd core dump if the daemon does not have connection with
the database.
-- Display error message when attempting to modify priority of a held job.
-- Backfill scheduler: The configured backfill_interval value (default 30
   seconds) is now interpreted as a maximum run time for the backfill
   scheduler. Once reached, the scheduler will build a new job queue and
   start over, even if not all jobs have been tested (sample settings follow
   this version's entries).
-- Backfill scheduler now considers OverTimeLimit and KillWait configuration
parameters to estimate when running jobs will exit.
-- Correct task layout with CR_Pack_Node option and more than 1 CPU per task.
-- Fix the scontrol man page describing the release argument.
-- When job QOS is modified, do so before attempting to change partition in
order to validate the partition's Allow/DenyQOS parameter.
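
   A minimal slurm.conf sketch for the backfill-related entries above,
   assuming bf_interval is the SchedulerParameters name for the backfill
   interval described there (values are illustrative only, not
   recommendations):

      SchedulerParameters=bf_interval=30  # backfill interval, now also the max run time per cycle
      OverTimeLimit=5                     # minutes a job may exceed its time limit
      KillWait=30                         # seconds between SIGTERM and SIGKILL at termination

   With these set, the backfill scheduler rebuilds its job queue once a cycle
   exceeds 30 seconds and factors OverTimeLimit and KillWait into its estimate
   of when running jobs will finish.
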
* Changes in Slurm 14.11.7
==========================
-- Initialize some variables used with the srun --no-alloc option that could
   otherwise cause random failures.
-- Add SchedulerParameters option of sched_min_interval that controls the
   minimum time interval between successive job scheduling actions. The
   default value is zero (disabled). An illustrative setting appears after
   this version's entries.
-- Change default SchedulerParameters=max_sched_time from 4 seconds to 2.
-- Refactor scancel so that all pending jobs are cancelled before starting
cancellation of running jobs. Otherwise they happen in parallel and the
pending jobs can be scheduled on resources as the running jobs are being
cancelled.
-- ALPS - Add new cray.conf variable NoAPIDSignalOnKill. When set to yes, the
   slurmctld will not signal the apids in a batch job; instead it relies on
   the RPC coming from the slurmctld to kill the job in order to end things
   correctly.
-- ALPS - Have the slurmstepd running a batch job wait for an ALPS release
before ending the job.
-- Initialize variables in consumable resource plugin to prevent core dump.
-- Fix scancel bug which could return an error on attempt to signal a job step.
-- In slurmctld communication agent, make the thread timeout be the configured
value of MessageTimeout rather than 30 seconds.
-- Fix sshare -U/--Users (users only) flag being used uninitialized.
-- Cray systems: add "plugstack.conf.template" sample SPANK configuration file.
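
   A hypothetical slurm.conf line combining the scheduler options mentioned in
   this version's entries; the sched_min_interval value is illustrative only,
   so check the slurm.conf man page shipped with this release for its unit and
   reasonable values:

      SchedulerParameters=max_sched_time=2,sched_min_interval=1000

   Here max_sched_time caps each pass of the main scheduling loop at 2 seconds
   (the new default noted above), while a non-zero sched_min_interval enforces
   a minimum delay between job scheduling actions; the default of zero leaves
   it disabled.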