This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.2.0.pre10
==============================

-- Fix issue where, with EnforcePartLimits=yes in slurm.conf, any job submitted
without a node count was treated as having maxnodes=0, which prevented the
job from running.

-- Fix the gang scheduler so that it follows the correct kill procedure when
a job is not being suspended.

-- Fixed some issues when dealing with jobs from a 2.1 system so they survive
an upgrade.
-- In srun, log if --cpu_bind options are specified, but not supported by the
current system configuration.

-- Various patches from Hongjia Cao dealing with bugs found in sacctmgr and
the slurmdbd.
-- Fix bug when changing the nodes allocated to a running job: if some of the
specified node names are invalid, avoid an invalid memory reference.
-- Fixed filename substitution of %h and %n based on patch from Ralph Bean
-- Added better job sorting logic when preempting jobs with qos.
-- Log the IP address and port number for some communication errors.
-- Fix bug in select/cons_res when --cpus_per_task option is used, could
oversubscribe resources.
-- In srun, do not implicitly set the job's maximum node count based upon a
required hostlist.
-- Avoid running the HealthCheckProgram on non-responding nodes rather than
DOWN nodes.
-- Fix bug in handling of poll() functions on OS X (SLURM was ignoring POLLIN
if POLLHUP flag was set at the same time).

-- Pulled Cray logic out of common/node_select.c into its own
select/cray plugin; cons_res is the default. To use linear add 'Linear' to
SelectTypeParameters.
-- Fixed bug where resizing jobs didn't set used limits correctly.
-- Change sched/backfill default time interval to 30 seconds and defer attempt
to backfill schedule if slurmctld has more than 5 active RPCs. General
improvements in logic scalability.
-- Add SchedulerParameters option of default_sched_depth=# to control how
many jobs on queue should be tested for attempted scheduling when a job
completes or other routine events. Default value is 100 jobs. The full job
queue is tested on a less frequent basis. This option can dramatically
improve performance on systems with thousands of queued jobs.
-- Gres/gpu now sets the CUDA_VISIBLE_DEVICES environment variable to control
which GPU devices should be used for each job or job step when CUDA version
3.1+ is used. NOTE: SLURM's generic resource support is still under
development.
-- Modify select/cons_res to pack jobs onto allocated nodes differently and
minimize system fragmentation. For example on nodes with 8 CPUs each, a
job needing 10 CPUs will now ideally be allocated 8 CPUs on one node and
2 CPUs on another node. Previously the job would have ideally been
allocated 5 CPUs on each node, fragmenting the unused resources more.
-- Modified the behavior of update_job() in job_mgr.c to return when the first
error is encountered instead of continuing with more job updates.
-- Removed all references to the following slurm.conf parameters, all of which
have been removed or replaced since version 2.0 or earlier: HashBase,
HeartbeatInterval, JobAcctFrequency, JobAcctLogFile (instead use
AccountingStorageLoc), JobAcctType, KillTree, MaxMemPerTask, and
MpichGmDirectSupport.
-- Fix bug in slurmctld restart logic that improperly reported jobs had
invalid features: "Job 65537 has invalid feature list: fat".
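
The select/cons_res packing change described above (a job needing 10 CPUs on
nodes with 8 CPUs each) can be illustrated with a small sketch. This is not
SLURM's allocation code: pack_cpus() and spread_cpus() are hypothetical helpers
that only reproduce the arithmetic of that example.

```python
def pack_cpus(cpus_needed, cpus_per_node):
    """Fill each node completely before moving on (the new behavior)."""
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    allocation = []
    remaining = cpus_needed
    while remaining > 0:
        take = min(remaining, cpus_per_node)
        allocation.append(take)
        remaining -= take
    return allocation


def spread_cpus(cpus_needed, cpus_per_node):
    """Spread CPUs evenly over the minimum node count (the old behavior)."""
    if cpus_per_node <= 0:
        raise ValueError("cpus_per_node must be positive")
    if cpus_needed <= 0:
        return []
    node_count = -(-cpus_needed // cpus_per_node)  # ceiling division
    base, extra = divmod(cpus_needed, node_count)
    # The first `extra` nodes take one additional CPU each.
    return [base + 1 if i < extra else base for i in range(node_count)]


if __name__ == "__main__":
    print("packed:", pack_cpus(10, 8))
    print("spread:", spread_cpus(10, 8))
```

With packing, the second node keeps 6 idle CPUs in one place; with spreading,
each node is left with 3 idle CPUs, which is the fragmentation the entry
refers to.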
* Changes in SLURM 2.2.0.pre9
=============================

-- sbatch can now submit jobs to multiple clusters and run on the earliest
available.
-- Fix bug introduced in pre8 that prevented job dependencies and job
triggers from working without the --enable-debug configure option.
-- Replaced slurm_addr with slurm_addr_t
-- Skeleton code added for BlueGeneQ.
-- Jobs can now be submitted to multiple partitions (job queues) and use the
one permitting earliest start time.

-- Change slurmdb_coord_table back to acct_coord_table to keep consistent
with < 2.1.
-- Introduced locking system similar to that in the slurmctld for the
assoc_mgr.
-- Added ability to change a user's name in accounting.
-- Restore squeue support for "%G" format (group id) accidentally removed in
2.2.0.pre7.
-- Added preempt_mode option to QOS.
-- Added a grouping=individual for sreport size reports.
-- Added remove_qos logic to jobs running under a QOS that was removed.
-- scancel now exits with a 1 if any job is non-existent when canceling.
-- Better handling of select plugins that don't exist on various systems for
cross cluster communication. Slurmctld, slurmd, and slurmstepd now only
load the default select plugin as well.
-- Prevent scontrol from aborting if getlogin() returns NULL.
-- Prevent scontrol segfault when there are hidden nodes.
-- Prevent srun segfault after task launch failure.
-- Added job_submit/lua plugin.
-- Fixed sinfo on a bluegene system to print correctly the output for:
sinfo -e -o "%9P %6m %.4c %.22F %f"
-- Add scontrol commands "hold" and "release" to simplify setting a job's
priority to 0 or 1. Also tests that the job is in pending state.
-- Increase maximum node list size (for incoming RPC) from 1024 bytes to 64k.
-- In the backup slurmctld, purge triggers before recovering trigger state to
avoid duplicate entries.
-- Fix bug in sacct processing of --fields= option.
-- Fix bug in checkpoint/blcr for jobs spanning multiple nodes introduced when
changing some variable names in version 2.2.0.pre5.
-- Removed the vestigial set_max_cluster_usage() function from the Priority
Plugin API.
-- Modify the output of "scontrol show job" for the field ReqS:C:T=. Fields
not specified by the user will be reported as "*" instead of 65534.
-- Added DefaultQOS option for an association.

-- BLUEGENE - Added -B option to the slurmctld to clear created blocks from
the system on start.
-- BLUEGENE - Added option to scontrol & sview to recreate existing blocks.

-- Fixed flags for returning messages to use the correct munge key when going
cross-cluster.

-- BLUEGENE - Added option to scontrol & sview to resume blocks in an error
state instead of just freeing them.

-- sview patched to allow multiple row selection of jobs, patch from Dan Rusak
(rosemontelabs@cox.net)
-- Lower default slurmctld server thread count from 1024 to 256. Some systems
process threads on a last-in first-out basis and the high thread count was
causing unexpectedly high delays for some RPCs.
-- Added to sacctmgr the ability for admins to reset the raw usage of a user
or account
-- Improved the efficiency of a few lines in sacctmgr
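
The change to the ReqS:C:T= field of "scontrol show job" noted above can be
sketched as follows. This is an illustration only, not scontrol's source:
format_req_sct() and the UNSPECIFIED constant are hypothetical names, and the
only value taken from this file is 65534, the number previously printed for
fields the user did not specify.

```python
# Value the changelog says used to be printed for unspecified fields.
UNSPECIFIED = 65534


def format_req_sct(sockets=UNSPECIFIED, cores=UNSPECIFIED, threads=UNSPECIFIED):
    """Render sockets:cores:threads, showing '*' for unspecified fields."""
    parts = []
    for value in (sockets, cores, threads):
        parts.append("*" if value == UNSPECIFIED else str(value))
    return "ReqS:C:T=" + ":".join(parts)


if __name__ == "__main__":
    print(format_req_sct(sockets=2))
    print(format_req_sct(2, 4, 1))
```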
* Changes in SLURM 2.2.0.pre8
=============================
-- Add DebugFlags parameter of "Backfill" for sched/backfill detailed logging.
-- Add DebugFlags parameter of "Gang" for detailed logging of gang scheduling
activities.
-- Add DebugFlags parameter of "Priority" for detailed logging of priority
multifactor activities.
-- Add DebugFlags parameter of "Reservation" for detailed logging of advanced
reservations.
-- Add run time to mail message upon job termination and queue time for mail
message upon job begin.
-- Add email notification option for job requeue.
-- Generate a fatal error if the srun --relative option is used when not
within an existing job allocation.
-- Modify the meaning of InactiveLimit slightly. It will now cancel the job
allocation created using the salloc or srun command if those commands
cease responding for the InactiveLimit regardless of any running job steps.
This parameter will no longer affect jobs spawned using sbatch.
-- Remove AccountingStoragePass and JobCompPass from configuration RPC and
scontrol show config command output. The use of SlurmDBD is still strongly
recommended as SLURM will have limited database functionality or protection
otherwise.
-- Add sbatch options of --export and SBATCH_EXPORT to control which
environment variables (if any) get propagated to the spawned job. This is
particularly important for jobs that are submitted on one cluster and run
on a different cluster.
-- Fix bug in select/linear when used with gang scheduling and there are
preempted jobs at the time slurmctld restarts that can result in over-
subscribing resources.
-- Added tracking in accounting of the qos a job is running with.

-- Fix handling of jobs that resize, and report correct stats on a job after
it finishes.

-- Modify gang scheduler so that, when SelectTypeParameter=CR_CPUS is set and
task affinity is enabled, it keeps track of the individual CPUs allocated to
jobs rather than just the count of CPUs allocated (which could overcommit
specific CPUs for running jobs).
-- Modify select/linear plugin data structures to eliminate underflow errors
for the exclusive_cnt and tot_job_cnt variables (previously happened when
slurmctld reconfigured while the job was in completing state).
-- Change slurmd's working directory (and location of core files) to match
that of the slurmctld daemon: the same directory used for log files,
SlurmdLogFile (if specified with an absolute pathname) otherwise the
directory used to save state, SlurmdSpoolDir.
-- Add sattach support for the --pty option.
-- Modify slurmctld communications logic to accept incoming messages on more
than one port for improved scalability.
-- Add SchedulerParameters option of "defer" to avoid trying to schedule a
job at submission time, but to attempt scheduling many jobs at once for
improved performance under heavy load.
-- Correct logic controlling slurmctld thread limit eliminating check of
RLIMIT_STACK.
-- Make slurmctld's trigger logic more robust in the event that job records
get purged before their trigger can be processed (e.g. MinJobAge=1).
-- Add support for users to hold/release their own jobs (submit the job with
srun/sbatch --hold/-H option or use "scontrol update jobid=# priority=0"
to hold and "scontrol update jobid=# priority=1" to release).
-- Added ability for sacct to query jobs by qos and a range of timelimits.
-- Added ability for sstat to query pids of steps running.
-- Support time specification in UTS format with a prefix of "uts" (e.g.
"sbatch --begin=uts458389988 my.script").
* Changes in SLURM 2.2.0.pre7
=============================
-- Fixed issue with sacctmgr so that querying against a non-existent cluster
works the same way as in 2.1.
-- Added infrastructure to support allocation of generic node resources (gres).
-Modified select/linear and select/cons_res plugins to allocate resources
at the level of a job without oversubscription.
-Get sched/backfill operating with gres allocations.
-Get gres configuration changes (reconfiguration) working.
-Have job steps allocate resources.
-Modified job step credential to include the job's and step's gres
allocation details.
-Integrate with HWLOC library to identify GPUs and NICs configured on each
node.
-- SLURM commands (squeue, sinfo, etc...) can now go cross-cluster on like
linux systems. Cross-cluster for bluegene to linux and such should
-- Added the ability to configure PreemptMode on a per-partition basis.
-- Change slurmctld's default thread limit count to 1024, but adjust that down