This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 16.05.0pre1
===============================
-- Add sbatch "--wait" option that waits for job completion before exiting.
Exit code will match that of spawned job.
-- Modify advanced reservation save/restore logic for core reservations to
support configuration changes (changes in configured nodes or cores counts).

-- Allow ControlMachine, BackupController, DbdHost and DbdBackupHost to be
either short or long hostname.
-- Job output and error files can now contain "%" character by specifying
a file name with two consecutive "%" characters. For example,
"sbatch -o slurm.%%.%j" for job ID 123 will generate an output file named
"slurm.%.123".
-- Pass user name in Prolog RPC from controller to slurmd when using
PrologFlags=Alloc. Allows SLURM_JOB_USER env variable to be set when using
Native Slurm on a Cray.
-- Add "NumTasks" to job information visible to Slurm commands.
-- Add mail wrapper script "smail" that will include job statistics in email
notification messages.
-- Remove vestigial "SICP" job option (inter-cluster job option). Completely
different logic will be forthcoming.

-- Fix case where the primary and backup dbds would both be performing rollup.
-- Add an ack reply from slurmd to slurmstepd when job setup is done and the
job is ready to be executed.
-- Removed support for authd. authd has not been developed and supported since
-- Introduce a new parameter requeue_setup_env_fail in SchedulerParameters.
A job that fails to setup the environment will be requeued and the node
drained.
-- Add ValidateTimeout and OtherTimeout to "scontrol show burst" output.
-- Increase default sbcast buffer size from 512KB to 8MB.
-- Enable the hdf5 profiling of the batch step.
-- Eliminate redundant environment and script files for job arrays.
-- Implement configuration checking using the new -C option of slurmctld.
To check for configuration errors in slurm.conf, run: 'slurmctld -C'.
-- Stop searching sbatch scripts for #PBS directives after 100 lines of
non-comments. Stop parsing #PBS or #SLURM directives after 1024 characters
into a line. Required for decent performance with huge scripts.
-- Add debug flag for timing Cray portions of the code.
-- Add Multi-Category Security (MCS) infrastructure to permit nodes to be bound
to specific users or groups.
-- Install the pmi2 unix sockets in slurmd spool directory instead of /tmp.
-- Use getaddrinfo and getnameinfo instead of gethostbyaddr and
gethostbyname.
-- Finished PMIx implementation.
-- Implemented the --without=package option for configure.
-- Fix sshare to show each individual cluster with -M,--clusters option.
-- Added --deadline option to salloc, sbatch and srun. Jobs which can not be
completed by the user specified deadline will be terminated with a state of
"Deadline" or "DL".
-- Implemented and documented PMIX protocol which is used to bootstrap an
MPI job. PMIX is an alternative to PMI and PMI2.
-- Change default CgroupMountpoint (in cgroup.conf) from "/cgroup" to
"/sys/fs/cgroup" to match current standard.
-- Add #BSUB options to sbatch to read in from the batch script.
-- HDF: Change group name of node from nodename to nodeid.
-- The partition-specific SelectTypeParameters parameter can now be used to
change the memory allocation tracking specification in the global
SelectTypeParameters configuration parameter. Supported partition-specific
values are CR_Core, CR_Core_Memory, CR_Socket and CR_Socket_Memory. If the
global SelectTypeParameters value includes memory allocation management and
the partition-specific value does not, then memory allocation management for
that partition will NOT be supported (i.e. memory can be over-allocated).
Likewise the global SelectTypeParameters might not include memory management
while the partition-specific value does.
-- Burst buffer/cray - Add support for multiple buffer pools including support
for different resource granularity by pool.
-- Burst buffer advanced reservation units treated as bytes (per documentation)
rather than GB.
-- Add an "scontrol top <jobid>" command to re-order the priorities of a user's
pending jobs. May be disabled with the "disable_user_top" option in the
SchedulerParameters configuration parameter.
-- Modify sview to display negative job nice values.
-- Increase job's nice value field from 16 to 32 bits.
-- Cray: Not running the Node Health Check after every job and step is now the
default. Configure SelectTypeParameters with the NHC and/or NHC_STEP to
run them.
-- Remove deprecated job_submit/cnode plugin.
-- Enhance slurm.conf option EnforcePartLimit to include options like "ANY" and
"ALL". "Any" is equivalent to "Yes" and "All" will check all partitions
a job is submitted to and if any partition limit is violated the job will
be rejected even if it could possibly run on another partition.
-- Add "features_act" field (currently active features) to the node
information. Output of scontrol, sinfo, and sview changed accordingly.
The field previously displayed as "Features" is now "AvailableFeatures"
while the new field is displayed as "ActiveFeatures".
-- Remove Sun Constellation, IBM Federation Switches (replaced by NRT switch
plugin) and long-defunct Quadrics Elan support.

-- Rework group caching to work better in environments with
enumeration disabled. Removed CacheGroups config directive, group
membership lists are now always cached, controlled by
GroupUpdateTime parameter. GroupUpdateForce parameter default
value changed to 1.
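
The "%%" escaping in job output and error file names described above can be illustrated with a minimal sketch. This is not Slurm's implementation: the helper name expand_filename is hypothetical, and it handles only "%%" (a literal "%") and "%j" (the job ID), leaving any other sequence untouched.

```python
def expand_filename(pattern: str, job_id: int) -> str:
    """Expand a simplified subset of output file name patterns.

    Hypothetical helper for illustration only: "%%" becomes a literal
    "%" and "%j" becomes the job ID. Everything else is copied as-is.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "%":
                out.append("%")          # escaped percent sign
                i += 2
                continue
            if nxt == "j":
                out.append(str(job_id))  # job ID substitution
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)


print(expand_filename("slurm.%%.%j", 123))  # slurm.%.123
```

Scanning left to right and consuming two characters per recognized sequence keeps "%%j" unambiguous: it yields the literal text "%j" rather than a job ID.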
* Changes in Slurm 15.08.7
==========================
-- sched/backfill: If a job can not be started within the configured
backfill_window, set its start time to 0 (unknown) rather than the end
of the backfill_window.
-- Remove the 1024-character limit on lines in batch scripts.
-- burst_buffer/cray: Round up swap size by configured granularity.
-- select/cray: Log repeated aeld reconnects.
-- task/affinity: Disable core-level task binding if more CPUs required than
available cores.
-- Preemption/gang scheduling: If a job is suspended at slurmctld restart or
reconfiguration time, then leave it suspended rather than resume+suspend.
-- Don't use lower weight nodes for job allocation when topology/tree used.
-- BGQ - If a cable goes into error state remove the underlying block on
a dynamic system and mark the block in error on a static/overlap system.
-- BGQ - Fix regression in 9cc4ae8add7f where blocks would be deleted on
static/overlap systems when some hardware issue happens when restarting
the slurmctld.

-- Log if CLOUD node configured without a resume/suspend program or suspend
time.
-- MYSQL - Better locking around g_qos_count which was previously unprotected.
-- Correct size of buffer used for jobid2str to avoid truncation.

-- Fix allocation/distribution of tasks across multiple nodes when
--hint=nomultithread is requested.
-- If a reservation's nodes value is "all" then track the current nodes in the
system, even if those nodes change.
-- Fix formatting if using "tree" option with sreport.
-- Make it so sreport prints out a line for non-existent TRES instead of
error message.
-- Set job's reason to "Priority" when higher priority job in that partition
(or reservation) can not start rather than leaving the reason set to
"Resources".
-- Fix memory corruption when a new non-generic TRES is added to the
DBD for the first time. The corruption is only noticed at shutdown.
-- burst_buffer/cray - Improve tracking of allocated resources to handle race
condition when reading state while buffer allocation is in progress.
-- If a job is submitted only with -c option and numcpus is updated before
the job starts, update the cpus_per_task appropriately.
-- Update salloc/sbatch/srun documentation to mention time granularity.
-- Fixed memory leak when freeing assoc_mgr_info_msg_t.
-- Prevent possible use of empty reservation core bitmap, causing abort.
-- Remove unneeded pack32's from qos_rec when qos_rec is NULL.
-- Make sacctmgr print MaxJobsPerUser when adding/altering a QOS.
-- Correct dependency formatting to print array task ids if set.
-- Update sacctmgr help with current QOS options.
-- Update slurmstepd to initialize authentication before task launch.
-- burst_buffer/cray: Eliminate need for dedicated nodes.
-- If no MsgAggregationParams is set don't set the internal string to
anything. The slurmd will process things correctly after the fact.
-- Fix output from api when printing job step not found.
-- Don't allow user specified reservation names to disrupt the normal
reservation sequence numbering scheme.
-- Fix scontrol to be able to accept TRES as an option when creating
a reservation.
-- contrib/torque/qstat.pl - return exit code of zero even with no records
printed for 'qstat -u'.
-- When a reservation is created or updated, compress user provided node names
using hostlist functions (e.g. translate user input of "Nodes=tux1,tux2"
into "Nodes=tux[1-2]").
-- Change output routines for scontrol show partition/reservation to handle
unexpectedly large strings.
-- Add more partition fields to "scontrol write config" output file.
-- Backfill scheduling fix: If a job can't be started due to a "group" resource
limit, rather than reserve resources for it when the next job ends, don't
reserve any resources for it.
-- Avoid slurmstepd abort if malloc fails during accounting gather operation.
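
The node name compression mentioned above for reservations ("Nodes=tux1,tux2" becoming "Nodes=tux[1-2]") can be sketched as follows. This is a simplified illustration, not Slurm's hostlist code: the helper name compress_hosts is hypothetical, and it assumes every name is a prefix followed by a decimal suffix without leading zeros.

```python
import re
from itertools import groupby


def compress_hosts(names: str) -> str:
    """Compress "tux1,tux2" style input into "tux[1-2]" style output.

    Hypothetical helper for illustration only. Assumes each name ends
    in a decimal suffix with no zero padding; raises ValueError otherwise.
    """
    groups = {}  # prefix -> set of numeric suffixes (insertion ordered)
    for name in names.split(","):
        m = re.fullmatch(r"(.*?)(\d+)", name)
        if m is None:
            raise ValueError(f"no numeric suffix in {name!r}")
        groups.setdefault(m.group(1), set()).add(int(m.group(2)))

    parts = []
    for prefix, nums in groups.items():
        ordered = sorted(nums)
        if len(ordered) == 1:
            parts.append(f"{prefix}{ordered[0]}")
            continue
        ranges = []
        # Consecutive numbers share the same (value - index) key.
        for _, run in groupby(enumerate(ordered), key=lambda p: p[1] - p[0]):
            run = [n for _, n in run]
            ranges.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
        parts.append(f"{prefix}[{','.join(ranges)}]")
    return ",".join(parts)


print(compress_hosts("tux1,tux2"))  # tux[1-2]
```

Grouping by prefix first and then collapsing runs of consecutive suffixes keeps non-contiguous input readable, for example "tux1,tux2,tux3,tux5" becomes "tux[1-3,5]" in this sketch.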
* Changes in Slurm 15.08.6
==========================
-- In slurmctld log file, log duplicate job ID found by slurmd. Previously was
being logged as prolog/epilog failure.
-- If a job is requeued while in the process of being launched, remove its
job ID from slurmd's record of active jobs in order to avoid generating a
duplicate job ID error when launched for the second time (which would
drain the node).

-- Cleanup messages when handling job script and environment variables in
older directory structure formats.
-- Prevent triggering gang scheduling within a partition if configured with
PreemptType=partition_prio and PreemptMode=suspend,gang.
-- Decrease parallelism in job cancel request to prevent denial of service
when cancelling huge numbers of jobs.
-- If all ephemeral ports are in use, try using other port numbers.
-- Revert way lib lua is handled when doing a dlopen, fixing a regression in
15.08.5.

-- Set the debug level of the rmdir message in xcgroup_delete() to debug2.
-- Fix the qstat wrapper when user is removed from the system but still
has running jobs.
-- Log the request to terminate a job at info level if DebugFlags includes
the Steps keyword.
-- Fix potential memory corruption in _slurm_rpc_epilog_complete as well as
_slurm_rpc_complete_job_allocation.
-- Fix cosmetic display of AccountingStorageEnforce option "nosteps" when
in use.
-- If a job can never be started due to unsatisfied job dependencies, report
the full original job dependency specification rather than the dependencies
remaining to be satisfied (typically NULL).
-- Refactor logic to synchronize active batch jobs and their script/environment
files, reducing overhead dramatically for large numbers of active jobs.
-- Avoid hard-link/copy of script/environment files for job arrays. Use the
master job record file for all tasks of the job array.
NOTE: Job arrays submitted to Slurm version 15.08.6 or later will fail if