This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and admins.
* Changes in Slurm 14.11.0pre1
==============================
-- Modify etc/cgroup.release_common.example to specify the full path to the
   scontrol command. Also find the cgroup mount point by reading the
   cgroup.conf file.
-- Improve qsub wrapper support for passing environment variables.
-- Modify sdiag to report Slurm RPC traffic by user, type, count and time
consumed.
-- In select plugins, stop triggering extra logging based upon the debug flag.
-- Added SchedulerParameters options of bf_yield_interval and bf_yield_sleep
to control how frequently and for how long the backfill scheduler will
relinquish its locks.
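   For illustration, these options would appear in slurm.conf like the
   following (values are examples only; see slurm.conf(5) for units and
   defaults):
       SchedulerParameters=bf_yield_interval=2000000,bf_yield_sleep=500000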
-- To support larger numbers of jobs when the StateSaveDirectory is on a
file system that supports a limited number of files in a directory, add a
subdirectory called "hash.#" based upon the last digit of the job ID.
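   For example, assuming StateSaveDirectory=/var/spool/slurmctld, state for
   job 1237 would be stored under a directory keyed by the job ID's last
   digit (the exact layout shown is illustrative):
       /var/spool/slurmctld/hash.7/job.1237/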
-- More gracefully handle missing batch script file. Just kill the job and do
not drain the compute node.
-- Add support for allocation of GRES by model type for heterogeneous systems
   (e.g. request a Kepler GPU, a Tesla GPU, or a GPU of any type).
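   A sketch of such requests (GRES type names depend on the local gres.conf
   definitions):
       srun --gres=gpu:kepler:2 ...   (two Kepler GPUs)
       srun --gres=gpu:2 ...          (two GPUs of any type)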
-- Record and enable display of nodes anticipated to be used for pending jobs.
-- Modify squeue --start option to print the nodes expected to be used for a
   pending job (in addition to expected start time, etc.).
-- Add association hash to the assoc_mgr.
-- Better logic to handle resized jobs when the DBD is down.
-- Introduce MemLimitEnforce=yes|no in slurm.conf. If set to "no", Slurm will
   not terminate jobs that exceed their requested memory.
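   A minimal slurm.conf fragment disabling the enforcement:
       MemLimitEnforce=no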
* Changes in Slurm 14.03.2
==========================
-- Update configure to set correct version without having to run autogen.sh.
* Changes in Slurm 14.03.1
==========================
-- Add support for job std_in, std_out and std_err fields in Perl API.
-- Add "Scheduling Configuration Guide" web page.
-- BGQ - Fix check for jobinfo when it is NULL.
-- Do not check cleaning on "pending" steps.
-- task/cgroup plugin - Fix for building on older hwloc (v1.0.2).
-- In the PMI implementation, do not check for duplicate keys by default.
   Set the SLURM_PMI_KVS_DUP_KEYS environment variable if you want the code
   to check for duplicate keys.
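   A sketch of enabling the check, assuming any non-empty value suffices
   (the required value is not specified here):
       export SLURM_PMI_KVS_DUP_KEYS=1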
-- Permit user root to propagate resource limits higher than the hard limit
   slurmd has on that compute node (i.e. raise both current and maximum
   limits).
-- Fix issue with license used count when doing an scontrol reconfig.
-- Fix the PMI iterator to not report duplicated keys.
-- Fix issue with sinfo when -o is used without the %P option.
-- Rather than immediately executing the scheduling logic on every event that
   could enable the start of a new job, queue its execution. This permits
   faster execution of some operations, such as modifying large numbers of
   jobs, by running the scheduling logic less frequently, but still in a
   timely fashion.
-- If an environment variable is longer than MAX_ENV_STRLEN, do not set it
   in the job environment; otherwise the exec() fails.
-- Optimize scontrol hold/release logic for job arrays.
-- Modify srun to report an exit code of zero rather than nine if some tasks
   exit with a return code of zero and others are killed with SIGKILL.
   Previously an exit code of zero was reported only if all tasks exited
   with zero.
-- Avoid slurmctld crash getting job info if detail_ptr is NULL.
-- Fix sacctmgr add user where both defaultaccount and accounts are specified.
-- Added SchedulerParameters option of max_sched_time to limit how long the
   main scheduling loop can execute.
-- Added SchedulerParameters option of sched_interval to control how frequently
the main scheduling loop will execute.
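   As an illustration, the two new options above would be combined in
   slurm.conf like this (values are examples only; see slurm.conf(5) for
   units and defaults):
       SchedulerParameters=max_sched_time=4,sched_interval=60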
-- Move start time of main scheduling loop timeout after locks are acquired.
-- Add squeue job format option of "%y" to print a job's nice value.
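   For example, printing the job ID alongside its nice value (column widths
   are illustrative):
       squeue -o "%.10i %.6y"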
-- Update scontrol update jobID logic to operate on entire job arrays.
-- Fix PrologFlags=Alloc to run the prolog on each of the nodes in the
allocation instead of just the first.
-- Fix race condition if a step is starting while the slurmd is being
restarted.
-- Make sure a job's prolog has run before starting a step.
-- BGQ - Fix invalid memory read when using DefaultConnType in the
   bluegene.conf file.
-- Make sure we send node state to the DBD on clean start of controller.
-- Fix some sinfo and squeue sorting anomalies due to differences in data
types.
-- Only send message back to slurmctld when PrologFlags=Alloc is used on a
Cray/ALPS system, otherwise use the slurmd to wait on the prolog to gate
the start of the step.
-- Remove need to check PrologFlags=Alloc in slurmd since we can tell if the
   prolog has run yet or not.
-- Fix squeue to use a correct macro to check job state.
-- BGQ - Fix incorrect logic issues if MaxBlockInError=0 in the bluegene.conf.
-- priority/basic - Ensure job priorities continue to decrease when jobs are
   submitted with the --nice option.
-- Make PrologFlags=Alloc work on batch scripts.
-- Make PrologFlags=NoHold (which automatically sets PrologFlags=Alloc) not
   block in salloc/srun; instead, wait in the slurmd when a step reaches a
   node and the prolog is still running.
-- Added --cpu-freq=highm1 (high minus one) option.
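   For example (application name hypothetical):
       srun --cpu-freq=highm1 ./my_app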
-- Expand StdIn/Out/Err string length output by "scontrol show job" from 128
to 1024 bytes.
-- squeue %F format will now print the job ID for non-array jobs.
-- Use quicksort for all priority based job sorting, which improves performance
significantly with large job counts.
-- If a job has already been released from a held state, ignore subsequent
   release requests.
-- Fix srun/salloc/sbatch man pages for the --no-kill option.
-- Add squeue -L/--licenses option to filter jobs by license names.
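   For example, filtering on two license names (names hypothetical):
       squeue -L matlab,ansys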
-- Handle abort job on node on front end systems without core dumping.
-- Fix dependency support for job arrays.
-- When updating jobs verify the update request is not identical to
the current settings.
-- When sorting jobs with equal priorities, sort by job_id.
-- Do not overwrite existing reason for node being down or drained.
-- Requeue batch job if Munge is down and credential can not be created.
-- Make _slurm_init_msg_engine() tolerate a bug in bind() that can return a
   busy ephemeral port.
-- Don't block scheduling of entire job array if it could run in multiple
partitions.
-- Introduce a new debug flag Protocol to print protocol requests received
together with the remote IP address and port.
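   The flag is enabled in slurm.conf like the other debug flags:
       DebugFlags=Protocol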
-- CRAY - Set up the network even when only using 1 node.
-- CRAY - Greatly reduce the number of error messages produced from the task
plugin and provide more information in the message.
* Changes in Slurm 14.03.0
==========================
-- job_submit/lua: Fix invalid memory reference if script returns error message
for user.
-- Add logic to sleep and retry if slurm.conf can't be read.
-- Reset a node's CpuLoad value at least once each SlurmdTimeout seconds.
-- Scheduler enhancements for reservations: When a job needs to run in a
   reservation but cannot due to busy resources, do not block all jobs in
   that partition from being scheduled, only the jobs in that reservation.
-- Export "SLURM*" environment variables from sbatch even if --export=NONE.
-- When recovering node state, if the Slurm version is 2.6 or 2.5, set the
   protocol version to SLURM_2_5_PROTOCOL_VERSION, the minimum supported
   version.
-- Update the scancel man page documenting the -s option.
-- Update sacctmgr man page documenting how to modify account's QOS.
-- Fix sjstat to correctly print memory values greater than 1TB.
-- Change xmalloc()/xfree() to malloc()/free() in hostlist.c for better
performance.
-- Update squeue.1 man page describing the SPECIAL_EXIT state.
-- Added scontrol option of errnumstr to return the error message given a
   Slurm error number.
-- If srun is invoked with the --multi-prog option but no task count, use
   the task count provided in the MPMD configuration file.
-- Prevent sview abort on some systems when adding or removing columns to the
display for nodes, jobs, partitions, etc.
-- Add job array hash table for improved performance.
-- Make AccountingStorageEnforce=all not include nojobs or nosteps.
-- Added sacctmgr mod qos set RawUsage=0.
-- Modify hostlist functions to accept more than two numeric ranges (e.g.
   "row[1-3]rack[0-8]slot[0-63]").
-- Run job scheduling logic immediately when nodes enter service.
-- Added sbatch '--parsable' option to output only the job id number and the
cluster name separated by a semicolon. Errors will still be displayed.
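   For example (job ID and cluster name are illustrative):
       $ sbatch --parsable job.sh
       12345;cluster1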
-- Added failure management "slurmctld/nonstop" plugin.
-- Prevent jobs being killed when a checkpoint plugin is enabled or disabled.
-- Update the documentation about SLURM_PMI_KVS_NO_DUP_KEYS environment
variable.
-- select/cons_res bug fix for range of node counts with --cpus-per-task
option (e.g. "srun -N2-3 -c2 hostname" would allocate 2 CPUs on the first
node and 0 CPUs on the second node).
-- Change reservation flags field from 16 to 32-bits.
-- Add reservation flag value of "FIRST_CORES".
-- Added the idea of Resources to the database, a framework for handling
   license servers outside of Slurm.
-- When starting the slurmctld only send past job/node state information to
accounting if running for the first time (should speed up startup
dramatically on systems with lots of nodes or lots of jobs).
-- Make job array expressions more flexible to accept multiple step counts in
the expression (e.g. "--array=1-10:2,50-60:5,123").
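   The example expression above expands to the following task IDs:
       1,3,5,7,9   (1-10 with step 2)
       50,55,60    (50-60 with step 5)
       123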
-- switch/cray - add state save/restore logic tracking allocated ports.
-- SchedulerParameters - Replace max_job_bf with bf_max_job_start (both will
work for now).
-- Add SchedulerParameters options of preempt_reorder_count and
preempt_strict_order.
-- Make memory types in acct_gather uint64_t to handle systems with more than
4TB of memory on them.
-- BGQ - --export=NONE option for srun to make it so only the SLURM_JOB_ID
and SLURM_STEP_ID env vars are set.
-- Munge plugins - Add sleep between retries if unable to connect to the
   socket.
-- Added DebugFlags value of "License".
-- Added --enable-developer which will give you -Werror when compiling.
-- Fix for job request with GRES count of zero.
-- Job array dependency logic: Cache results for major performance improvement.
-- Modify squeue to support filter on job states Special_Exit and Resizing.
-- Defer purging job record until after EpilogSlurmctld completes.
-- Fix handling of RPCs from a 14.03 slurmctld to a 2.6 slurmd.
* Changes in Slurm 14.03.0pre6
==============================
-- Modify slurmstepd to log messages according to the LogTimeFormat
parameter in slurm.conf.
-- Ensure that overlapping reservations do not oversubscribe available
   licenses.
-- Added core specialization logic to select/cons_res plugin.
-- Added whole_node field to job_resources structure and enable gang scheduling
for jobs with core specialization.
-- When using FastSchedule=1, nodes with less than the configured resources
   are no longer set DOWN; they are set to DRAIN instead.
-- Modified 'sacctmgr show associations' command to show GrpCPURunMins
by default.
-- Replace the hostlist_push() function with a more efficient
hostlist_push_host().
-- Modify the reading of Lustre file system statistics to print more
   information when debugging and when I/O errors occur.
-- Add specialized core count field to job credential data.
NOTE: This changes the communications protocol from other pre-releases of
version 14.03. All programs must be cancelled and daemons upgraded from
previous pre-releases of version 14.03. Upgrades from version 2.6 or earlier
   can take place without loss of jobs.
-- Add version number to node and front-end configuration information visible
using the scontrol tool.
-- Add the idea of a RESERVED node state flag so that idle resources in a
   reservation are not marked "idle".
-- Added core specialization plugin infrastructure.
-- Added new job_submit/throttle plugin to control the rate at which a user
   can submit jobs.
-- CRAY - added network performance counters option.
-- Allow scontrol suspend/resume to accept jobid in the format jobid_taskid
to suspend/resume array elements.
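   For example, suspending task 7 of job array 1234 (IDs hypothetical):
       scontrol suspend 1234_7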
-- In the slurmctld job record, split "shared" variable into "share_res" (share
resource) and "whole_node" fields.
-- Fix the format of SLURM_STEP_RESV_PORTS. It was generated incorrectly
when using the hostlist_push_host function and input surrounded by [].
-- Modify the srun --slurmd-debug option to accept debug string tags
   (quiet, fatal, error, info, verbose) besides the numerical values.
-- Fix the bug where --cpu_bind=map_cpu is interpreted as mask_cpu.
-- Update the documentation regarding the state of cpu frequencies after
   a step using --cpu-freq completes.
-- CRAY - Fix issue when a job is requeued while NHC is still running as the
   job is being scheduled to run again. This would erase the previous job
   info that was still needed to clean up the nodes from the previous job
   run. (Bug 526).
-- Set the SLURM_JOB_PARTITION environment variable for all job allocations.
-- Set SLURM_JOB_PARTITION environment variable for Prolog program.
-- Added SchedulerParameters option of partition_job_depth to limit scheduling
logic depth by partition.
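   For example (depth value hypothetical):
       SchedulerParameters=partition_job_depth=50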
-- Handle the case in which errno is not reset to 0 after calling
getgrent_r(), which causes the controller to core dump.
-- Added squeue format option of "%X" (core specialization count).
-- Added core specialization web page (just a start for now).
-- Added the SLURM_ARRAY_JOB_ID and SLURM_ARRAY_TASK_ID environment
   variables.
-- Fix bug in job step allocation failing due to memory limit.
-- Modify the pbsnodes script to reflect its output on a TORQUE system.
-- Add ability to clear a node's DRAIN flag using scontrol or sview by setting
   its state to "UNDRAIN". The node's base state (e.g. "DOWN" or "IDLE") will
   not be changed.
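   For example (node name hypothetical):
       scontrol update NodeName=node0001 State=UNDRAIN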
-- Modify the output of 'scontrol show partition' by displaying
DefMemPerCPU=UNLIMITED and MaxMemPerCPU=UNLIMITED when these limits are
configured as 0.
-- mpirun-mic - Major re-write of the command wrapper for Xeon Phi use.
-- Add new configuration parameter of AuthInfo to specify port used by
authentication plugin.
-- Fixed conditional RPM compiling.
-- Corrected slurmstepd ident name when logging to syslog.
-- Fixed sh5util loop when there are no node-step files.
-- Add SLURM_CLUSTER_NAME to environment variables passed to PrologSlurmctld,
   Prolog, EpilogSlurmctld, and Epilog.