This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
-- Do not purge inactive interactive jobs that lack a port to ping (added
for MR+ operation).
-- Advanced reservations with hostnames and core counts now support asymmetric
reservations (e.g. a different core count for each node).
-- Added slurmctld/dynalloc plugin for MapReduce+ support.
-- Added "DynAllocPort" configuration parameter.
-- Added partition parameter of SelectType to override the system-wide value.
Added cr_type to the partition_info data structure.
* Changes in SLURM 2.6.0pre1
============================
-- Add "state" field to job step information reported by scontrol.
-- Notify srun to retry step creation upon completion of other job steps
rather than polling. This results in much faster throughput for job step
execution with --exclusive option.
-- Added "ResvEpilog" and "ResvProlog" configuration parameters to execute a
program at the beginning and end of each reservation.
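   For example, in slurm.conf (the script paths are illustrative):
      ResvProlog=/etc/slurm/resv_prolog.sh
      ResvEpilog=/etc/slurm/resv_epilog.sh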
-- Added "slurm_load_job_user" function. This is a variation of
"slurm_load_jobs", but accepts a user ID argument, potentially resulting
in substantial performance improvement for "squeue --user=ID"
-- Added "slurm_load_node_single" function. This is a variation of
"slurm_load_nodes", but accepts a node name argument, potentially resulting
in substantial performance improvement for "sinfo --nodes=NAME".
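   As an illustration of both new calls, a minimal, untested C sketch (it
   assumes the signatures mirror slurm_load_jobs/slurm_load_node; the UID and
   node name are placeholders):
      #include <stdio.h>
      #include <slurm/slurm.h>
      #include <slurm/slurm_errno.h>

      int main(void)
      {
          job_info_msg_t  *jobs  = NULL;
          node_info_msg_t *nodes = NULL;
          char node_name[] = "tux001";   /* placeholder node name */

          /* Load only the jobs belonging to one user (UID 1000 is a placeholder). */
          if (slurm_load_job_user(&jobs, 1000, SHOW_ALL) == SLURM_SUCCESS) {
              printf("user 1000 has %u job record(s)\n",
                     (unsigned) jobs->record_count);
              slurm_free_job_info_msg(jobs);
          } else
              slurm_perror("slurm_load_job_user");

          /* Load a single node record by name instead of the whole node table. */
          if (slurm_load_node_single(&nodes, node_name, SHOW_ALL) == SLURM_SUCCESS) {
              printf("loaded %u node record(s)\n",
                     (unsigned) nodes->record_count);
              slurm_free_node_info_msg(nodes);
          } else
              slurm_perror("slurm_load_node_single");

          return 0;
      }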
-- Added "HealthCheckNodeState" configuration parameter identify node states
on which HealthCheckProgram should be executed.
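   For example, in slurm.conf (the program path, interval and state value are
   illustrative; see "man slurm.conf" for the accepted states):
      HealthCheckProgram=/usr/sbin/nhc
      HealthCheckInterval=300
      HealthCheckNodeState=IDLE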
-- Remove sacct --dump --formatted-dump options which were deprecated in
2.5.
-- Added support for job arrays (phase 1 of effort). See "man sbatch" option
-a/--array for details.
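   For example, a minimal submission (the index range, script name and script
   contents are illustrative):
      $ sbatch --array=0-15 array_job.sh
   with array_job.sh selecting its input via the per-task index, e.g.:
      #!/bin/sh
      srun ./my_program input.$SLURM_ARRAY_TASK_ID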
-- Add new AccountingStorageEnforce options of 'nojobs' and 'nosteps', which
allow the use of accounting features like associations, qos and limits
without keeping track of jobs or steps in accounting.
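   For example, in slurm.conf (an illustrative combination of options):
      AccountingStorageEnforce=associations,limits,qos,nojobs,nosteps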
-- Cray - Add new cray.conf parameter of "AlpsEngine" to specify the
communication protocol to be used for ALPS/BASIL.
-- select/cons_res plugin: Correction to CPU allocation count logic for
cores without hyperthreading.
-- Added new SelectTypeParameter value of "CR_ALLOCATE_FULL_SOCKET".
-- Added PriorityFlags value of "TICKET_BASED" and merged priority/multifactor2
plugin into priority/multifactor plugin.
-- Add "KeepAliveTime" configuration parameter controlling how long sockets
used for srun/slurmstepd communications are kept alive after disconnect.
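   For example, in slurm.conf (the value, in seconds, is illustrative):
      KeepAliveTime=30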
-- Added SLURM_SUBMIT_HOST to salloc, sbatch and srun job environment.
-- Added SLURM_ARRAY_TASK_ID to environment of job array.
-- Added squeue --array/-r option to optimize output for job arrays.
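   For example (illustrative; output formatting depends on version and
   configuration):
      $ squeue -r -u $USER    # one line per job array task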
-- Added "SlurmctldPlugstack" configuration parameter for generic stack of
slurmctld daemon plugins.
-- Removed contribs/arrayrun tool. Use native support for job arrays.
-- Modify default installation locations for RPMs to match "make install":
_prefix /usr/local
_slurm_sysconfdir %{_prefix}/etc/slurm
_mandir %{_prefix}/share/man
_infodir %{_prefix}/share/info
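   These defaults can still be overridden at build time, e.g. (the tarball name
   and paths are illustrative):
      $ rpmbuild --define "_prefix /opt/slurm" \
                 --define "_slurm_sysconfdir /etc/slurm" \
                 -ta slurm-2.6.0.tar.bz2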
-- Add acct_gather_energy/ipmi plugin, which uses the freeipmi library for
energy gathering.
* Changes in SLURM 2.5.4
========================
-- Fix bug in PrologSlurmctld use that would block job steps until node
responds.
* Changes in SLURM 2.5.3
========================
-- Gres/gpu plugin - If no GPUs requested, set CUDA_VISIBLE_DEVICES=NoDevFiles.
This bug was introduced in 2.5.2 for the case where a GPU count was
configured, but without device files.
-- task/affinity plugin - Fix bug in CPU masks for some processors.
-- Modify sacct command to get format from SACCT_FORMAT environment variable.
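   For example (the field list is illustrative):
      $ export SACCT_FORMAT="JobID,JobName,Partition,State,Elapsed"
      $ sacct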
-- BGQ - Changed order of library inclusions and fixed an incorrect declaration
to compile correctly on newer compilers.
-- Fix for not building sview if glib exists on a system but not the gtk libs.
-- BGQ - Fix for handling a job cleanup on a small block if the job has long
since left the system.
-- Fix race condition in job dependency logic which can result in invalid
memory reference.
* Changes in SLURM 2.5.2
========================
-- Fix advanced reservation recovery logic when upgrading from version 2.4.
-- BLUEGENE - fix for QOS/Association node limits.
-- Add missing "safe" flag from print of AccountStorageEnforce option.
-- Fix logic to optimize GRES topology with respect to allocated CPUs.
-- Add job_submit/all_partitions plugin to set a job's default partition
to ALL available partitions in the cluster.
-- Modify switch/nrt logic to permit build without libnrt.so library.
-- Handle srun task launch failure without duplicate error messages or abort.
-- Fix bug in QOS limits enforcement when the slurmctld restarts and the user
has not yet been added to the QOS list.
-- Fix issue where sjstat and sjobexitmod were installed in two different RPMs.
-- Fix for job request of multiple partitions in which some partitions lack
nodes with required features.
-- Permit a job to use a QOS its user does not have access to if an
administrator manually set the job's QOS (previously the job would be
rejected).
-- Make more variables available to job_submit/lua plugin: slurm.MEM_PER_CPU,
slurm.NO_VAL, etc.
-- Fix topology/tree logic when nodes defined in slurm.conf get re-ordered.
-- In select/cons_res, correct logic to allocate whole sockets to jobs. Work
by Magnus Jonsson, Umea University.
-- In select/cons_res, correct logic when job removed from only some nodes.
-- Work around an apparent kernel bug in 2.6.32 which is solved in at least
3.5.0. This avoids a stack overflow when running jobs on more than
120k nodes.
-- BLUEGENE - If we made a block that isn't runnable because of an overlapping
block, destroy it correctly.
-- Switch/nrt - Dynamically load libnrt.so from within the plugin as needed.
This eliminates the need for libnrt.so on the head node.
-- BLUEGENE - Fix in reservation logic that could cause abort.
* Changes in SLURM 2.5.1
========================
-- Correction to hostlist sorting for hostnames that contain two numeric
components where the first numeric component varies in length (e.g.
"rack9blade1" should sort before "rack10blade1").
-- BGQ - Only poll on initialized blocks instead of calling getBlocks on
each block independently.
-- Fix of task/affinity plugin logic for Power7 processors having hyper-
threading disabled (cpu mask has gaps).
-- Fix of job priority ordering with sched/builtin and priority/multifactor.
-- CRAY - Fix for setting up the aprun for a large job (+2000 nodes).
-- Fix for race condition related to compute node boot resulting in the node
being set down with reason of "Node <name> unexpectedly rebooted".
-- RAPL - Fix for handling errors when opening msr files.
-- BGQ - Fix for salloc/sbatch to do the correct allocation when asking for
-N1 -n#.
-- BGQ - In emulation, make it possible to pretend to run large jobs
(>64k nodes).
-- BLUEGENE - Correct method to update conn_type of a job.
-- BLUEGENE - Fix issue with preemption when needing to preempt multiple jobs
to make one job run.
-- Fixed issue where, if an srun died abnormally inside of an allocation, it
would have also killed the allocation.
-- FRONTEND - Fixed issue where, if a system's nodes weren't defined in the
slurm.conf with NodeAddr, signals going to a step could be handled
incorrectly.
-- If sched/backfill starts a job with a QOS having NO_RESERVE and no job
time limit, start it with the partition time limit (or one year if the
partition has no time limit) rather than NO_VAL (a 140 year time limit).
-- Alter hostlist logic to allocate a large grid dynamically instead of on
the stack.
-- Change RPC version checks to support version 2.5 slurmctld with version 2.4
slurmd daemons.
-- Correct core reservation logic for use with select/serial plugin.
-- Exit scontrol command on stdin EOF.
-- Disable job --exclusive option with select/serial plugin.
* Changes in SLURM 2.5.0
========================
-- Add DenyOnLimit flag for QOS to deny jobs at submission time if they
request resources that reach a 'Max' limit.
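   For example, with sacctmgr (the QOS name is illustrative):
      $ sacctmgr modify qos normal set Flags=DenyOnLimit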
-- Permit SlurmUser or operator to change QOS of non-pending jobs (e.g.
running jobs).
-- BGQ - move initial poll to beginning of realtime interaction, which will
also cause it to run if the realtime server ever goes away.
-- Modify sbcast logic to survive slurmd daemon restart while a file
transmission is in progress.
-- Add retry logic to munge encode/decode calls. This is needed if the munge
daemon is under very heavy load (e.g. with 1000 slurmd daemons per compute
node).
-- Add launch and acct_gather_energy plugins to RPMs.
-- Restore support for srun "--mpi=list" option.
-- CRAY - Introduce step accounting for a Cray.
-- Modify srun to abandon I/O 60 seconds after the last task ends. Otherwise
an aborted slurmstepd can cause the srun process to hang indefinitely.
-- ENERGY - RAPL - alter code to close open files (and only open them once
where needed).
-- If the PrologSlurmctld fails, then requeue the job an indefinite number
of times instead of only one time.
-- Added Prolog and Epilog Guide (web page). Based upon work by Jason Sollom,
Cray Inc. and used by permission.
-- Restore gang scheduling functionality. Preemptor was not being scheduled.
Fix for bugzilla #3.
-- Add "cpu_load" to node information. Populate CPULOAD in node information
reported to Moab cluster manager.
-- Preempt jobs only when insufficient idle resources exist to start the job,
regardless of the node weight.
-- Added priority/multifactor2 plugin based upon ticket distribution system.
Work by Janne Blomqvist, Aalto University.
-- Add SLURM_NODELIST to environment variables available to Prolog and Epilog.
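   For example, a Prolog script might log it (an illustrative sketch; the
   script path is a placeholder):
      #!/bin/sh
      # /etc/slurm/prolog.sh (placeholder path, set via Prolog= in slurm.conf)
      logger "slurm job $SLURM_JOB_ID starting on $SLURM_NODELIST"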
-- Permit reservations to allow or deny access by account and/or user.
-- Add ReconfigFlags value of KeepPartState. See "man slurm.conf" for details.
-- Modify the task/cgroup plugin adding a task_pre_launch_priv function and
move slurmstepd outside of the step's cgroup. Work by Matthieu Hautreux.
-- Intel MIC processor support added using gres/mic plugin. BIG thanks to
Olli-Pekka Lehto, CSC-IT Center for Science Ltd.
-- Accounting - Treat empty jobacctinfo structs as unused: instead of putting
0's into the database, put NO_VALs and have sacct figure out that
jobacct_gather wasn't used.
-- Cray - Prevent calling basil_confirm more than once per job using a flag.
-- Fix bug with topology/tree and job with min-max node count. Now try to
get max node count rather than minimizing leaf switches used.