This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.5.0.pre4
=============================
-- Added Prolog and Epilog Guide (web page). Based upon work by Jason Sollom,
Cray Inc. and used by permission.
-- Restore gang scheduling functionality. Preemptor was not being scheduled.
Fix for bugzilla #3.
-- Add "cpu_load" to node information. Populate CPULOAD in node information
reported to Moab cluster manager.
-- Preempt jobs only when insufficient idle resources exist to start job,
regardless of the node weight.
-- Added priority/multifactor2 plugin based upon ticket distribution system.
Work by Janne Blomqvist, Aalto University.
-- Add SLURM_NODELIST to environment variables available to Prolog and Epilog.
-- Permit reservations to allow or deny access by account and/or user.
-- Add ReconfigFlags value of KeepPartState. See "man slurm.conf" for details.
-- Modify the task/cgroup plugin adding a task_pre_launch_priv function and
move slurmstepd outside of the step's cgroup. Work by Matthieu Hautreux.
-- Intel MIC processor support added using gres/mic plugin. BIG thanks to
Olli-Pekka Lehto, CSC-IT Center for Science Ltd.
* Changes in SLURM 2.5.0.pre3
=============================
-- Add Google search to all web pages.
-- Add sinfo -T option to print reservation information. Work by Bill Brophy,
Bull.
-- Force slurmd exit after 2 minute wait, even if threads are hung.
-- Change node_req field in struct job_resources from 8 to 32 bits so we can
run more than 256 jobs per node.
-- sched/backfill: Improve accuracy of expected job start with respect to
reservations.
-- sinfo partition field size will be set to the length of the longest
partition name by default.
-- Make parse_time return a valid 0 when given the epoch time, and set
errno to ESLURM_INVALID_TIME_VALUE on error instead.
-- Correct srun --no-alloc logic when node count exceeds node list or task
count is not a multiple of the node count. Work by Hongjia Cao, NUDT.
-- Completed integration with IBM Parallel Environment including POE and IBM's
NRT switch library.
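
The node_req change above (8 to 32 bits) can be illustrated with a minimal
sketch. It assumes, purely for illustration, that the field behaves like a
simple per-node counter; how SLURM actually uses node_req is not described
here, and the helper names below are hypothetical.

```python
def increment(counter, bits):
    """Add one to an unsigned counter that is only `bits` wide."""
    mask = (1 << bits) - 1          # e.g. 0xFF for an 8-bit field
    return (counter + 1) & mask     # values past the mask wrap around


def count_jobs(n_jobs, bits):
    """Count n_jobs using a field of the given width (hypothetical helper)."""
    counter = 0
    for _ in range(n_jobs):
        counter = increment(counter, bits)
    return counter


if __name__ == "__main__":
    # An 8-bit field wraps once 256 increments have been applied,
    # while a 32-bit field still holds the true count.
    print(count_jobs(256, 8))
    print(count_jobs(256, 32))
```

The wider field simply moves the wrap-around point far beyond any realistic
per-node job count.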
* Changes in SLURM 2.5.0.pre2
=============================
-- When running with multiple slurmd daemons per node, enable specifying a
range of ports on a single line of the node configuration in slurm.conf.
-- Add reservation flag of Part_Nodes to allocate all nodes in a partition to
a reservation and automatically change the reservation when nodes are
added to or removed from the partition. Based upon work by
Bill Brophy, Bull.
-- Add support for advanced reservation for specific cores rather than whole
nodes. Current limitations: homogeneous cluster, nodes idle when reservation
created, and no more than one reservation per node. Code is still under
development. Work by Alejandro Lucero Palau, et al., BSC.
-- Add DebugFlag of Switch to log switch plugin details.
-- Correct job node_cnt value in job completion plugin when job fails due to
down node. Previously was too low by one.
-- Add new srun option --cpu-freq to enable user control over the job's CPU
frequency and thus its power consumption. NOTE: cpu frequency is not
currently preserved for jobs being suspended and later resumed. Work by
Don Albert, Bull.
* Changes in SLURM 2.5.0.pre1
=============================
-- Add new output to "scontrol show configuration" of LicensesUsed. Output is
"name:used/total"
-- Changed jobacct_gather plugin infrastructure to be cleaner and easier to
maintain.
-- Change license option count separator from "*" to ":" for consistency with
the gres option (e.g. "--licenses=foo:2 --gres=gpu:2"). The "*" will still
be accepted, but is no longer documented.
-- Permit more than 100 jobs to be scheduled per node (new limit is 250).
-- Restructure of srun code to allow outside programs to utilize existing
logic.
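
The license count separator change above can be sketched as a small parser
that accepts both the new ":" and the legacy "*" separator. This is an
illustrative sketch, not SLURM's own parsing code: the function name is
hypothetical, and the comma-separated list form and the default count of 1
are assumptions.

```python
def parse_licenses(spec):
    """Parse a license request such as "foo:2,bar" into {name: count}.

    Hypothetical helper: ":" is the documented separator, "*" is the
    legacy one that is still accepted. A missing count is assumed to be 1.
    """
    licenses = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue                      # ignore empty list entries
        for sep in (":", "*"):
            if sep in item:
                name, _, count_text = item.partition(sep)
                count = int(count_text)   # raises ValueError on a bad count
                break
        else:
            name, count = item, 1         # no separator: assume one license
        licenses[name] = count
    return licenses


if __name__ == "__main__":
    print(parse_licenses("foo:2"))        # new style
    print(parse_licenses("foo*2,bar"))    # legacy separator, default count
```

Trying ":" before "*" keeps the documented form as the primary path while
leaving existing "*"-style requests working.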
* Changes in SLURM 2.4.4
========================
-- BGQ - minor fix to make build work in emulated mode.
-- BGQ - Fix if large block goes into error and the next highest priority jobs
are planning on using the block. Previously it would fail those jobs
erroneously.
-- BGQ - Fix issue with a cnode going into an error (not SoftwareError) state
while a job is running or trying to run on it.
-- Execute slurm_spank_job_epilog when there is no system Epilog configured.
-- Fix for srun --test-only to work correctly with time limits.
-- BGQ - If a job goes away while still trying to free it up in the
database, and the job is running on a small block make sure we free up
the correct node count.
-- BGQ - Logic added to make sure a job has finished on a block before it is
purged from the system if its front-end node goes down.
-- Modify strigger so that a filter option of "--user=0" is supported.
-- Correct --mem-per-cpu logic for core or socket allocations with multiple
threads per core.
-- Fix for older (< glibc 2.4) systems to use euidaccess() instead of eaccess().
-- BLUEGENE - Do not alter a pending job's node count when changing its
partition.
-- BGQ - Add functionality to make it so we track the actions on a block.
This is needed for when a free request is added to a block but there are
jobs finishing up so we don't start new jobs on the block since they will
fail on start.
-- BGQ - Fixed InactiveLimit to work correctly to avoid scenarios where a
user's pending allocation was started with srun and then for some reason
the slurmctld was brought down and while it was down the srun was removed.
-- Fixed InactiveLimit math to work correctly
-- BGQ - Add logic to make it so blocks can't use a midplane with a nodeboard
in error for passthrough.
-- BGQ - Make it so if a nodeboard goes in error any block using that midplane
for passthrough gets removed on a dynamic system.
-- BGQ - Fix for printing realtime server debug correctly.
-- BGQ - Cleaner handling of cnode failures when reported through the runjob
interface instead of through the normal method.
-- smap - spread node information across multiple lines for larger systems.
-- Cray - Defer salloc until after PrologSlurmctld completes.
-- Correction to slurmdbd communications failure handling logic, incorrect
error codes returned in some cases.
* Changes in SLURM 2.4.3
========================
-- Accounting - Fix so complete 32 bit numbers can be put in for a priority.
-- cgroups - Fix so that SLURM correctly creates the initial directory if it
does not exist. Previously the errno wasn't being checked correctly.
-- BGQ - fixed srun when only requesting a task count and not a node count
to operate the same way salloc or sbatch did and assign a task per cpu
by default instead of task per node.
-- Fix salloc --gid to work correctly. Reported by Brian Gilmer
-- BGQ - fix smap to set the correct default MloaderImage
-- Close the batch job's environment file when it contains no data to avoid
leaking file descriptors.
-- Fix sbcast's credential to last till the end of a job instead of the
previous 20 minute time limit. The previous behavior would fail for
large files 20 minutes into the transfer.
-- Return ESLURM_NODES_BUSY rather than ESLURM_NODE_NOT_AVAIL error on job
submit when required nodes are up, but completing a job or in exclusive
job allocation.
-- Add HWLOC_FLAGS so linking to libslurm works correctly
-- BGQ - If using backfill and a shared block is running at least one job,
and a job comes through backfill that can fit on the block without ending
jobs, don't set an end_time for the running jobs since they don't need to
end to start the job.
-- Initialize bind_verbose when using task/cgroup.
-- BGQ - Fix for handling backfill much better when sharing blocks.
-- BGQ - Fix for making small blocks on first pass if not sharing blocks.
-- BLUEGENE - Remove force of default conn_type instead of leaving NAV
when none are requested. The Block allocator sets it up temporarily so
this isn't needed.
-- BLUEGENE - Fix deadlock issue when dealing with bad hardware if using
static blocks.
-- Fix to mysql plugin during rollup to only query suspended table when jobs
reported some suspended time.
-- Fix compile with glibc 2.16 (Kacper Kowalik)
-- BGQ - fix for deadlock where a block has error on it and all jobs
running on it are preemptable by scheduling job.
-- proctrack/cgroup: Exclude internal threads from "scontrol list pids".
Patch from Matthieu Hautreux, CEA.
-- Memory leak fixed for select/linear when preempting jobs.
-- When updating the begin time of a job, update the eligible time in
accounting as well.
-- BGQ - make it so you can signal steps when signaling the job allocation.
-- BGQ - Remove extra overhead if a large block has many cnode failures.
-- Priority/Multifactor - Fix issue with age factor when a job is estimated to
start in the future but is able to run now.
-- CRAY - update to work with ALPS 5.1
-- BGQ - Handle issue of speed and mutexes when polling instead of using the
realtime server.
-- BGQ - Fix minor sorting issue with sview when sorting by midplanes.
-- Accounting - Fix for handling per user max node/cpus limits on a QOS
correctly for current job.
-- Update documentation for -/+= when updating a reservation's
users/accounts/flags
-- Update pam module to work if using aliases on nodes instead of actual
host names.
-- Correction to task layout logic in select/cons_res for job with minimum
and maximum node count.
-- BGQ - Put final poll after realtime comes back into service to avoid
having the realtime server go down over and over again while waiting
for the poll to finish.
-- task/cgroup/memory - ensure that ConstrainSwapSpace=no is correctly
handled. Work by Matthieu Hautreux, CEA.
-- CRAY - Fix for sacct -N option to work correctly
-- CRAY - Update documentation to describe installation from rpm instead
of previous piecemeal method.
-- Fix sacct to work with QOS' that have previously been deleted.
-- Added all available limits to the output of sacctmgr list qos
* Changes in SLURM 2.4.2
========================
-- BLUEGENE - Correct potential deadlock issue when hardware goes bad and
there are jobs running on that hardware.
-- If job is submitted to more than one partition, its partition pointer can
be set to an invalid value. This can result in the count of CPUs allocated
on a node being bad, resulting in over- or under-allocation of its CPUs.
Patch by Carles Fenoy, BSC.
-- Fix bug in task layout with select/cons_res plugin and --ntasks-per-node
option. Patch by Martin Perry, Bull.
-- BLUEGENE - remove race condition where if a block is removed while waiting
for a job to finish on it the number of unused cpus wasn't updated
correctly.