This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 1.4.0-pre4
=============================
-- For task/affinity, force jobs to use a particular task binding by setting
the TaskPluginParam configuration parameter rather than slurmd's
SLURM_ENFORCED_CPU_BIND environment variable.
-- Enable full preemption of jobs by partition with select/cons_res
(cons_res_preempt.patch from Chris Holmes, HP).

* Changes in SLURM 1.4.0-pre3
=============================
-- Internal changes: CPUs per node changed from 32-bit to 16-bit size.
Node count fields changed from 16-bit to 32-bit size in some structures.
-- Remove select plugin functions select_p_get_extra_jobinfo(),
select_p_step_begin() and select_p_step_fini().
-- Remove the following slurmctld job structure fields: num_cpu_groups,
cpus_per_node, cpu_count_reps, alloc_lps_cnt, alloc_lps, and used_lps.
Use equivalent fields in new "select_job" structure, which is filled
in by the select plugins.
-- Modify mem_per_task in job step request from 16-bit to 32-bit size.
Use new "select_job" structure for the job step's memory management.
-- Add core_bitmap_job to slurmctld's job step structure to identify

-- Add new configuration option OverTimeLimit to permit jobs to exceed
their (soft) time limit by a configurable amount. Backfill scheduling
will be based upon the soft time limit.

-- Remove select_g_get_job_cores(). That data is now within the slurmctld's
job structure.
* Changes in SLURM 1.4.0-pre2
=============================
-- Remove srun's --ctrl-comm-ifhn-addr option (for PMI/MPICH2). It is no
longer needed.
-- Modify power save mode so that nodes can be powered off when idle. See
https://computing.llnl.gov/linux/slurm/power_save.html or
"man slurm.conf" (SuspendProgram and related parameters) for more
information.
-- Added configuration parameter PrologSlurmctld, which can be used to boot
nodes into a particular state for each job. See "man slurm.conf" for
details.
-- Add configuration parameter CompleteTime to control how long to wait for
a job's completion before allocating already released resources to pending
jobs. This can be used to reduce fragmentation of resources. See
"man slurm.conf" for details.
-- Make default CryptoType=crypto/munge. OpenSSL is now completely optional.
-- Make default AuthType=auth/munge rather than auth/none.
-- Change output format of "sinfo -R" from "%35R %N" to "%50R %N".
* Changes in SLURM 1.4.0-pre1
=============================
-- Save/restore a job's task_distribution option on slurmctld restart.
NOTE: SLURM must be cold-started on conversion from version 1.3.x.
-- Remove task_mem from job step credential (only job_mem is used now).
-- Remove --task-mem and --job-mem options from salloc, sbatch and srun
(use --mem-per-cpu or --mem instead).
-- Remove DefMemPerTask from slurm.conf (use DefMemPerCPU or DefMemPerNode
instead).
-- Modify slurm_step_launch API call. Move launch host from function argument
to element in the data structure slurm_step_launch_params_t, which is
used as a function argument.
-- Add state_reason_string to job state with optional details about why
a job is pending.
-- Make "scontrol show node" output match scontrol input for some fields
("Cores" changed to "CoresPerSocket", etc.).
-- Add support for a new node state "FUTURE" in slurm.conf. These node records
are created in SLURM tables for future use without a reboot of the SLURM
daemons, but are not reported by any SLURM commands or APIs.
* Changes in SLURM 1.3.10
=========================
-- Fix several bugs in the hostlist functions:
- Fix hostset_insert_range() to do proper accounting of hl->nhosts (count).
- Avoid assertion failure when calling hostset_create(NULL).
- Fix return type of hostlist and hostset string functions from size_t to
ssize_t.
- Add check for NULL return from hostlist_create().
- Rewrite of hostrange_hn_within(), avoids reporting "tst0" in the hostlist
"tst".
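
The hostrange_hn_within() fix above is about exact host-name matching.
The following is an illustration only: a Python sketch, not SLURM's C
implementation. The function names expand_hostlist and host_within are
hypothetical, and only a single trailing bracketed range list per name
is handled. It shows the intended behavior, where "tst0" is found only
when the hostlist really contains it:

```python
import re

def expand_hostlist(expr):
    """Expand a simple hostlist expression such as "tst[0-2],login".

    Simplified for illustration: one trailing [..] range list per name.
    """
    hosts = []
    # Find each comma-separated item, keeping bracketed ranges intact.
    for part in re.findall(r"[^,\[]+(?:\[[^\]]*\])?", expr):
        m = re.fullmatch(r"([^\[]+)\[([^\]]*)\]", part)
        if not m:
            hosts.append(part)  # plain name, no numeric range
            continue
        prefix, ranges = m.groups()
        for rng in ranges.split(","):
            lo, _, hi = rng.partition("-")
            hi = hi or lo        # a single number is a one-element range
            width = len(lo)      # preserve zero padding such as "01"
            for n in range(int(lo), int(hi) + 1):
                hosts.append(f"{prefix}{n:0{width}d}")
    return hosts

def host_within(name, expr):
    """Return True only if name is exactly one of the hosts in expr."""
    return name in expand_hostlist(expr)

print(host_within("tst0", "tst"))       # False
print(host_within("tst0", "tst[0-3]"))  # True
```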
* Changes in SLURM 1.3.9
========================
-- Fix jobs being cancelled by ctrl-C to have correct cancelled state in
accounting.
-- Slurmdbd now caches only user data, making for faster start up.
-- Improved support for job steps in FRONT_END systems
-- Added support to dump and load association information in the controller
on start up if slurmdbd is unresponsive
-- BLUEGENE - Added support for sched/backfill plugin
-- sched/backfill modified to initiate multiple jobs per cycle.
-- Increase buffer size in srun to hold task list expressions. Critical
for jobs with 16k tasks or more.
-- Added support for eligible jobs and downed nodes to be sent to accounting
from the controller the first time accounting is turned on.
-- Correct srun logic to support --tasks-per-node option without task count.
-- Logic in place to handle multiple versions of RPCs within the slurmdbd.
THE SLURMDBD MUST BE UPGRADED TO THIS VERSION BEFORE UPGRADING THE
SLURMCTLD OR THEY WILL NOT TALK.
Older versions of the slurmctld will continue to talk to the new slurmdbd.
-- Add support for new job dependency type: singleton. Only one job from a
given user with a given name will execute with this dependency type.
From Matthieu Hautreux, CEA.
-- Updated contribs/python/hostlist to version 1.3: See "CHANGES" file in
that directory for details. From Kent Engstrom, NSC.
-- Add SLURM_JOB_NAME environment variable for jobs submitted using sbatch.
In order to prevent the job steps from all having the same name as the
batch job that spawned them, the SLURM_JOB_NAME environment variable is
ignored when setting the name of a job step from within an existing
resource allocation.
-- For use with sched/wiki2 (Moab only), set salloc's default shell based
upon the user who the job runs as rather than the user submitting the job
(user root).
-- Fix to sched/backfill when job specifies no time limit and the partition
time limit is INFINITE.
-- Validate a job's constraints (node features) at job submit or modification
time. Major re-write of resource allocation logic to support more complex
job feature requests.
-- For sched/backfill, correct logic to support job constraint specification
(e.g. node features).
-- Correct power save logic to avoid trying to wake DOWN node. From Matthieu
Hautreux, CEA.
-- Cancel a job step when one of its nodes goes DOWN based upon the job
step's --no-kill option. By default the step is killed (previously the
job step remained running even without the --no-kill option).
-- Fix bug in logic to remove whitespace from plugstack.conf.
-- Add new configuration parameter SallocDefaultCommand to control what
shell salloc launches by default.
-- When enforcing PrivateData configuration parameter, failures return
"Access/permission denied" rather than "Invalid user id".
-- From sbatch and srun, if the --dependency option is specified then set
the environment variable SLURM_JOB_DEPENDENCY to the same value.
-- In plugin jobcomp/filetxt, use ISO8601 formats for time by default (e.g.
YYYY-MM-DDTHH:MM:SS rather than MM/DD-HH:MM:SS). This restores the default
behavior from Slurm version 1.2. Change the value of USE_ISO8601 in
src/plugins/jobcomp/filetxt/jobcomp_filetxt.c to revert the behavior.
-- Add support for configuration option of ReturnToService=2, which will
return a DOWN node to use if the node was previously set DOWN for any reason.
-- Removed Gold accounting plugin. This plugin was to be used for accounting
but has not been maintained and is no longer needed. If using this
plugin, please contact slurm-dev@llnl.gov.
-- When not enforcing associations and running accounting, if a user
submits a job to an account that does not have an association on the
cluster, the account will be changed to the default account to help
avoid trash in the accounting system. If the user's default account
does not have an association on the cluster, the requested account
will be used.
-- Add configuration parameter "--have-front-end" to define HAVE_FRONT_END
in config.h and run slurmd only on a front end (suitable only for SLURM
development and testing).
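
The jobcomp/filetxt entry above names two time formats. As a minimal
sketch of the difference (Python for illustration only; the plugin
itself is C, and the names format_job_time, ISO8601_FORMAT and
LEGACY_FORMAT are hypothetical), the two layouts can be compared as
follows:

```python
from datetime import datetime

ISO8601_FORMAT = "%Y-%m-%dT%H:%M:%S"  # YYYY-MM-DDTHH:MM:SS
LEGACY_FORMAT = "%m/%d-%H:%M:%S"      # MM/DD-HH:MM:SS (no year)

def format_job_time(when, use_iso8601=True):
    """Format a timestamp in either of the two layouts named above."""
    fmt = ISO8601_FORMAT if use_iso8601 else LEGACY_FORMAT
    return when.strftime(fmt)

sample = datetime(2008, 11, 5, 14, 30, 0)
print(format_job_time(sample))                     # 2008-11-05T14:30:00
print(format_job_time(sample, use_iso8601=False))  # 11/05-14:30:00
```

Unlike the older layout, the ISO8601 form carries the year and sorts
chronologically as plain text.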
* Changes in SLURM 1.3.8
========================
-- Added PrivateData flags for Users, Usage, and Accounts to Accounting.
If using slurmdbd, set in the slurmdbd.conf file. Otherwise set in the
slurm.conf file. See "man slurm.conf" or "man slurmdbd.conf" for details.
-- Reduce frequency of resending job kill RPCs. Helpful in the event of
network problems or down nodes.
-- Fix memory leak caused under heavy load when running with select/cons_res
plus sched/backfill.
-- For salloc, if no local command is specified, execute the user's default
shell.
-- BLUEGENE - When starting a job, check that no job is running on any
block that must be freed. If one is found, the new job is requeued.
No job will be lost.
-- BLUEGENE - Set MPI environment variables from salloc.
-- BLUEGENE - Fix threading issue for overlap mode
-- Reject batch scripts containing DOS linebreaks.
-- BLUEGENE - Added wait for block boot to salloc
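
For the entry above on rejecting batch scripts with DOS linebreaks, a
script can be checked before submission. This is a sketch under stated
assumptions: the source does not say how sbatch performs its test, so
the check below (looking for a CR LF byte pair) and the helper names
has_dos_linebreaks and to_unix_linebreaks are hypothetical.

```python
def has_dos_linebreaks(script_bytes):
    """Return True if the script contains a CR LF (DOS) line ending."""
    return b"\r\n" in script_bytes

def to_unix_linebreaks(script_bytes):
    """Return a copy of the script with CR LF replaced by LF."""
    return script_bytes.replace(b"\r\n", b"\n")

dos_script = b"#!/bin/sh\r\nhostname\r\n"
print(has_dos_linebreaks(dos_script))                      # True
print(has_dos_linebreaks(to_unix_linebreaks(dos_script)))  # False
```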
* Changes in SLURM 1.3.7
========================
-- Add jobid/stepid to MESSAGE_TASK_EXIT to address race condition when
a job step is cancelled, another is started immediately (before the
first one completely terminates) and ports are reused.
NOTE: This change requires that SLURM be updated on all nodes of the
cluster at the same time. There will be no impact upon currently running
jobs (they will ignore the jobid/stepid at the end of the message).
-- Added Python module to process hostlists as used by SLURM. See
contribs/python/hostlist. Supplied by Kent Engstrom, National
Supercomputer Centre, Sweden.
-- Report task termination due to signal (restored functionality present
in slurm v1.2).
-- Remove sbatch test for script size being no larger than 64k bytes.
The current limit is 4GB.
-- Disable FastSchedule=0 use with SchedulerType=sched/gang. Node
configuration must be specified in slurm.conf for gang scheduling now.
-- For sched/wiki and sched/wiki2 (Maui or Moab scheduler) disable the ability
of a non-root user to change a job's comment field (used by Maui/Moab for
storing scheduler state information).
-- For sched/wiki (Maui) add pending job's future start time to the state
info reported to Maui.
-- Improve reliability of job requeue logic on node failure.
-- Add logic to ping non-responsive nodes even if SlurmdTimeout=0. This permits
the node to be returned to use when it starts responding rather than
remaining in a non-usable state.
-- Honor HealthCheckInterval values that are smaller than SlurmdTimeout.
-- For non-responding nodes, log them all on a single line with a hostlist
expression rather than one line per node. Frequency of log messages is
dependent upon SlurmctldDebug value from 300 seconds at SlurmctldDebug<=3