This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and admins.
* Changes in Slurm 14.03.8
-- Fix minor memory leak when a job doesn't have nodes on it (meaning the job
has finished).
-- Fix sinfo/sview to be able to query against nodes in reserved and other
-- Make sbatch read in (SLURM|SBATCH)_HINT in order to handle sruns in the
script that will use it.
-- srun properly interprets a leading "." in the executable name based upon
the working directory of the compute node rather than the submit host.
-- Fix Lustre misspellings in hdf5 guide
-- Fix wrong reference in slurm.conf man page to what --profile option should
be used for AcctGatherFilesystemType.
-- Update HDF5 document to point out the SlurmdUser is who creates the
ProfileHDF5Dir directory as well as all its sub-directories and files.
-- CRAY NATIVE - Remove error message for sruns run inside an salloc that
had --network= specified.
-- Defer job step initiation if required GRES are in use by other steps rather
than immediately returning an error.
-- Deprecate --cpu_bind from sbatch and salloc. These never worked correctly
and only caused confusion. Since the cpu_bind options mostly refer to a
step, we opted to only allow srun to set them in future versions.
-- Modify sgather to work if Nodename and NodeHostname differ.
-- Changed use of JobContainerPlugin where it should be JobContainerType.
-- Fix for possible error if job has GRES, but the step explicitly requests a
GRES count of zero.
-- Make "srun --gres=none ..." work when executed without a job allocation.
-- Change the global eio_shutdown_time to a field in eio handle.
-- Advanced reservation fixes for heterogeneous systems, especially when
reserving cores.
-- If --hint=nomultithread is used in a job allocation, make sure any sruns
run inside the allocation can read the environment correctly.
-- If batchdir can't be made set errno correctly so the slurmctld is notified
-- Remove repeated batch complete if batch directory isn't able to be made
since the slurmd will send the same message.
-- sacctmgr fix default format for list transactions.
* Changes in Slurm 14.03.7
-- Add note to MaxNodesPerUser and multiple jobs running on the same node
counting as multiple nodes.
-- PerlAPI - fix renamed call from slurm_api_set_conf_file to
-- Fix gres race condition that could result in job deallocation error message.
-- Correct NumCPUs count for jobs with --exclusive option.
-- When creating reservation with CoreCnt, check that Slurm uses
SelectType=select/cons_res, otherwise don't send the request to slurmctld
and return an error.
-- Save the state of scheduled node reboots so they will not be lost should the
slurmctld restart.
-- In select/cons_res plugin - Ensure the node count does not exceed the task
-- switch/nrt - Unload tables rather than windows at job end, to release CAU.
-- When HealthCheckNodeState is configured as IDLE don't run the
HealthCheckProgram for nodes in any other states than IDLE.
-- Minor sanity check to verify the string sent in isn't NULL when using
-- CRAY NATIVE - Fix issue on heavy systems to only run the NHC once per
job/step completion.
-- Remove unneeded step cleanup for pending steps.
-- Fix issue where if a batch job was manually requeued the batch step
information wasn't stored in accounting.
-- When a job is released from a requeue hold state, clean up its previous
exit code.
-- Correct the srun man page about how the output from the user application
is sent to srun.
-- Increase the timeout of the main thread while waiting for the i/o thread.
Allow up to 180 seconds for the i/o thread to complete.
-- When using sacct -c to read the job completion data compute the correct
job elapsed time.
-- Perl package: Define some missing node states.
-- When using AccountingStorageType=accounting_storage/mysql zero out the
database index for the array elements avoiding duplicate database values.
-- Reword the explanation of cputime and cputimeraw in the sacct man page.
-- JobCompType allows "jobcomp/mysql" as valid name but the code used
"job_comp/mysql" setting an incorrect default database.
-- Try to load only when necessary.
-- When nodes are scheduled for reboot, set state to DOWN rather than FUTURE so
they are still visible to sinfo. State is set to IDLE after reboot completes.
-- Apply BatchStartTimeout configuration to task launch and avoid aborting
srun commands due to long running Prolog scripts.
-- Fix minor memory leaks when freeing node_info_t structure.
-- If a batch script is requeued with running steps, report the correct exit
code/signal. Previously it was always -2.
-- If a step's exit code hasn't been set, have sacct display -2 instead
of treating it as a signal and exit code.
-- Send calculated step_rc for batch step instead of raw status as
done for normal steps.
-- If a job times out, set the exit code in accounting to 1 instead of the
signal 1.
-- Update the acct_gather.conf.5 man page removing the reference to
-- Fix gang scheduling for jobs submitted to multiple partitions.
-- Enable srun to submit job to multiple partitions.
-- Update slurm.conf man page. When Epilog or Prolog fail the node state
is set to DRAIN.
-- Start a job in the highest priority partition possible, even if it requires
preempting other jobs and delaying initiation, rather than using a lower
priority partition. Previous logic would preempt lower priority jobs, but
then might start the job in a lower priority partition and not use the
resources released by the preempted jobs.
-- Fix SelectTypeParameters=CR_PACK_NODES for srun making both job and step
resource allocation.
-- BGQ - Make it possible to pack multiple tasks on a core when not using
the entire cnode.
-- MYSQL - if unable to connect to mysqld close connection that was inited.
-- DBD - when connecting make sure we wait MessageTimeout + 5 since the
timeout when talking to the Database is the same timeout so a race
condition could occur in the requesting client when receiving the response
if the database is unresponsive.
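
Note on the exit code entries above (for example, a timed-out job now being
recorded with exit code 1 instead of signal 1): the distinction is easier to
see with a small sketch. The Python below assumes the status is kept in the
traditional wait(2) layout, with the exit code in the high byte and the
terminating signal in the low seven bits. That layout is an assumption made
for illustration and is not stated in these notes, and the helper name
decode_status is hypothetical.

```python
import os

def decode_status(status):
    """Split a wait(2)-style status word into (exit_code, signal).

    Assumes the Linux encoding: exit code in bits 8-15, signal in bits 0-6.
    """
    if os.WIFSIGNALED(status):
        return (0, os.WTERMSIG(status))
    if os.WIFEXITED(status):
        return (os.WEXITSTATUS(status), 0)
    return (None, None)  # stopped/continued states are not handled here

exited_with_1 = 1 << 8  # raw status of a process that called exit(1)
killed_by_1 = 1         # raw status of a process killed by signal 1

print(decode_status(exited_with_1))
print(decode_status(killed_by_1))
```

The two raw values differ only in which byte carries the 1, which is why
recording one in place of the other changes what accounting reports.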
* Changes in Slurm 14.03.6
-- Added examples to demonstrate the use of the sacct -T option to the man
-- Fix for regression in 14.03.5 with sacctmgr load when Parent has "'"
around it.
-- Update comments in sacctmgr dump header.
-- Fix for possible abort on change in GRES configuration.
-- CRAY - fix modules file (backport from 14.11 commit 78fe86192b).
-- Fix race condition which could result in requeue if batch job exit and node
registration occur at the same time.
-- switch/nrt - Unload job tables (in addition to windows) in user space mode.
-- Differentiate between two identical debug messages about purging vestigial
job scripts.
-- If the socket used by slurmstepd to communicate with slurmd exists when
slurmstepd attempts to create it, for example left over from a previous
requeue or crash, delete it and recreate it.
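
Note on the slurmstepd socket entry above: the general technique of removing
a stale Unix domain socket file before binding can be sketched as follows.
This is an illustration in Python, not the slurmstepd code; the function name
bind_unix_socket and the file name step.sock are hypothetical.

```python
import os
import socket
import tempfile

def bind_unix_socket(path):
    """Bind a listening Unix socket at path, replacing any leftover file."""
    try:
        os.unlink(path)  # remove a socket file left by a previous run
    except FileNotFoundError:
        pass             # nothing to clean up
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)      # would fail with EADDRINUSE if the file still existed
    sock.listen(1)
    return sock

workdir = tempfile.mkdtemp()
sock_path = os.path.join(workdir, "step.sock")

leftover = bind_unix_socket(sock_path)
leftover.close()         # closing the socket does not remove the file

fresh = bind_unix_socket(sock_path)  # succeeds because the stale file is removed
```

Without the unlink step, the second bind would fail because the path is
still present on disk even though nothing is listening on it.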
* Changes in Slurm 14.03.5
-- If an srun runs in an exclusive allocation and doesn't use the entire
allocation and CR_PACK_NODES is set, lay out tasks appropriately.
-- Correct Shared field in job state information seen by scontrol, sview, etc.
-- Print Slurm error string in scontrol update job and reset the Slurm errno
before each call to the API.
-- Fix task/cgroup to handle -mblock:fcyclic correctly
-- Fix for core-based advanced reservations where the distribution of cores
across nodes is not even.
-- Fix issue where association maxnodes wouldn't be evaluated correctly if a
QOS had a GrpNodes set.
-- GRES fix with multiple files defined per line in gres.conf.
-- When a job is requeued make sure accounting marks it as such.
-- Print the state of requeued job as REQUEUED.
-- If a job's partition was taken away from it, don't allow a requeue.
-- Make sure we lock on the conf when sending slurmd's conf to the slurmstepd.
-- Fix issue with sacctmgr 'load' not able to gracefully handle bad formatted
-- sched/backfill: Correct job start time estimate with advanced reservations.
-- Error message added when in proctrack/cgroup the step freezer path isn't
able to be destroyed for debug.
-- Added extra indexes into the database for better performance when
deleting users.
-- Fix issue with wckeys when tracking wckeys, but not enforcing them,
you could get multiple '*' wckeys.
-- Fix bug which could report to squeue the wrong partition for a running job
that is submitted to multiple partitions.
-- Report correct CPU count allocated to job when allocated whole node even if
not using all CPUs.
-- If job's constraints cannot be satisfied put it in pending state with reason
BadConstraints and don't remove it.
-- sched/backfill - If job started with infinite time limit, set its end_time
one year in the future.
-- Clear QOS GrpUsedCPUs when resetting raw usage if QOS is not using any cpus.
-- Remove log message left over from debugging.
-- When using CR_PACK_NODES, make --ntasks-per-node work correctly.
-- Report correct partition associated with a step if the job is submitted to
multiple partitions.
-- Fix to allow removing of preemption from a QOS
-- If the proctrack plugins fail to destroy the job container, print an error
message and avoid looping forever; give up after 120 seconds.
-- Make srun obey POSIX convention and increase the exit code by 128 when the
process is terminated by a signal.
-- Sanity check for acct_gather_energy/rapl
-- If the sbatch command specifies the option --signal=B:signum, send the signal
to the batch script only.
-- If we cancel a task and we have no other exit code send the signal and
exit code.
-- Added note about InnoDB storage engine being used with MySQL.
-- Set the job exit code when the job is signaled and set the log level to
debug2() when processing an already completed job.
-- Reset diagnostics time stamp when "sdiag --reset" is called.
-- squeue and scontrol to report a job's "shared" value based upon partition
options rather than reporting "unknown" if job submission does not use
--exclusive or --shared option.
-- task/cgroup - Fix cpuset binding for batch script.
-- sched/backfill - Fix anomaly that could result in jobs being scheduled out
of order.
-- Expand pseudo-terminal size data structure field sizes from 8 to 16 bits.
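
Note on the srun exit code entry above: the convention referred to is the
shell one, where a command killed by signal N is reported with exit status
128 + N. A minimal Python sketch of that mapping follows. It is not srun
code; the helper name shell_style_exit_code is hypothetical, and it takes
the returncode format used by Python's subprocess module (negative signal
number for a signaled process) as its input.

```python
import signal

def shell_style_exit_code(returncode):
    """Map a subprocess-style returncode to the 128 + signal convention.

    Python's subprocess module reports termination by signal N as -N;
    a normal exit code is passed through unchanged.
    """
    if returncode < 0:
        return 128 + (-returncode)
    return returncode

# A process terminated by SIGTERM versus one that exited normally with code 3.
print(shell_style_exit_code(-signal.SIGTERM))
print(shell_style_exit_code(3))
```

Keeping signaled exits above 128 lets a calling script tell "killed by a
signal" apart from an ordinary non-zero exit without extra bookkeeping.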