This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 17.11.1
==========================
-- Fix --with-shared-libslurm option to work correctly.
-- Make it so only daemons log errors on configuration option duplicates.
-- Fix for ConstrainDevices=yes to work correctly.
-- Fix to purge old jobs using burst buffer if slurmctld daemon restarted
after the job's burst buffer work was already completed.
-- Set the logging prefix for slurmstepd as early as possible.
-- mpi/pmix: Fix the job registration for the PMIx v2.1.
-- Fix uid check for signaling a step with anything but SIGKILL.
-- Fix uid check when requesting a jobid from a pid.
-- Return ESLURM_TRANSITION_STATE_NO_UPDATE instead of EAGAIN when trying to
signal a step that is still running a prolog.

-- Update Cray slurm_playbook.yaml with latest recommended version.
-- Only say a prolog is done running after the extern step is launched.
-- Wait to start a batch step until the prolog and extern step are
fully run/launched. Only matters if running with
PrologFlags=[contain|alloc].
-- Truncate a range for SlurmctldPort to FD_SETSIZE elements and throw an
error, otherwise network traffic may be lost due to poll() not detecting
traffic.
-- Fix for srun --pack-group option that can reuse/corrupt memory.
-- Fix handling ultra long hostlists in a hostfile.
-- X11: fix xauth regex to handle '-' in hostnames again.
-- Fix potential node reboot timeout problem for "scontrol reboot" command.
-- Add ability for squeue to sort jobs by submit time.
-- CRAY - Switch to standard pid files on Cray systems.
-- Update jobcomp records on duplicate inserts.

-- Make slurmd oom-unkillable by default, using oom.h OOM_SCORE_ADJ_MIN macro.
-- If unrecognized configuration file option found then print an appropriate
fatal error message rather than relying upon random errno value.
-- Initialize job_desc_msg_t's instead of just memset'ing them.
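
The SlurmctldPort truncation noted in the 17.11.1 entries above can be illustrated with a minimal Python sketch. This is not Slurm's C implementation: the function name `truncate_port_range`, the "first-last" string input, and the `FD_SETSIZE = 1024` value (common on Linux, but platform dependent) are assumptions made purely for illustration.

```python
# Hypothetical sketch of the truncation rule; not taken from the Slurm source.
FD_SETSIZE = 1024  # assumed value; the real constant comes from the C library


def truncate_port_range(spec, limit=FD_SETSIZE):
    """Return (first, last, truncated) for a 'first-last' or single-port spec."""
    first_text, _, last_text = spec.partition("-")
    first = int(first_text)
    last = int(last_text) if last_text else first
    if last < first:
        raise ValueError(f"invalid port range: {spec}")
    if last - first + 1 > limit:
        # Keep only the first `limit` ports; the flag lets the caller log an error.
        return first, first + limit - 1, True
    return first, last, False
```

A caller would log an error whenever the third element is true, mirroring the "truncate and throw an error" behavior described in the entry.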
* Changes in Slurm 17.11.0
==========================
-- Fix documentation for MaxQueryTimeRange option in slurmdbd.conf.
-- Avoid srun abort trying to run on heterogeneous job component that has
ended.
-- Add SLURM_PACK_JOB_ID,SLURM_PACK_JOB_OFFSET to PrologSlurmctld and
EpilogSlurmctld environment.
-- Treat ":" in #SBATCH arguments as fatal error. The "#SBATCH packjob" syntax
must be used instead.
-- job_submit/lua plugin: expose pack_job fields to get.
-- Prevent scheduling deadlock with multiple components of heterogeneous job
in different partitions (i.e. one heterogeneous job component is higher
priority in one partition and another component is lower priority in a
different partition).
-- Fix for heterogeneous job starvation bug.
-- Add SLURM_PACK_JOB_NODELIST to PrologSlurmctld and EpilogSlurmctld
environment.
-- If PrologSlurmctld fails for pack job leader then requeue or kill all
components of the job.
-- Fix for multiple --pack-group srun arguments given out of order.
-- Update slurm.conf(5) man page with updated example logrotate script.
-- Add SchedulerParameters=whole_pack configuration parameter. If set, then
hold, release and cancel operations on any component of a heterogeneous job
will be applied to all components.
-- Handle FQDNs in xauth cookies for x11 display forwarding properly.
-- For heterogeneous job steps, the srun --open-mode option default value will
be set to "append".
-- Pack job scheduling list not being cleared between runs of the backfill
scheduler resulted in various anomalies.
-- Fix backward compatibility for pmix versions < 1.1.5.
-- Fix use-after-free that can lead to slurmstepd segfaulting when setting
ulimit values.
-- Add heterogeneous job start data to sdiag output.
-- X11 forwarding - handle systems with X11UseLocalhost=no set in sshd_config.
-- Fix potential issue with missing symbols in gres plugins.
-- Ignore querying clusters in federation that are down from status commands.
-- Base federated jobs off of origin job and not the local cluster in API.
-- Remove erroneous double '-' on rpath for libslurmfull.
-- Remove version from libslurmfull and move it to $LIBDIR/slurm since the ABI
could change from one version to the other.
-- Fix unused wall time for reservations.
-- Convert old reservation records to insert unused wall into the rows.
-- slurm.spec: further restructuring and improvements.
-- Allow nodes state to be updated between FAIL and DRAIN.
-- x11 forwarding: handle build with alternate location for libssh2.
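
The 17.11.0 entries above add SLURM_PACK_JOB_ID, SLURM_PACK_JOB_OFFSET and SLURM_PACK_JOB_NODELIST to the PrologSlurmctld and EpilogSlurmctld environment. As a rough sketch of how a site script might pick them up, the hypothetical helper below (`pack_job_info` is an illustrative name, not part of Slurm) collects whichever of those variables are set. It assumes the values arrive as plain strings and makes no claim about their exact format.

```python
import os

# Variable names taken from the changelog entries; everything else is illustrative.
PACK_JOB_VARS = (
    "SLURM_PACK_JOB_ID",
    "SLURM_PACK_JOB_OFFSET",
    "SLURM_PACK_JOB_NODELIST",
)


def pack_job_info(environ=None):
    """Return the heterogeneous (pack) job variables present in the environment."""
    if environ is None:
        environ = os.environ
    return {name: environ[name] for name in PACK_JOB_VARS if name in environ}
```

An empty result would suggest that the script is running for a job that is not a heterogeneous job component.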
* Changes in Slurm 17.11.0rc3
==============================
-- Fix extern step to wait until launched before allowing job to start.
-- Add missing locks around figuring out TRES when clean starting the
slurmctld.
-- Cray modulefile: avoid removing /usr/bin from path on module unload.
-- Make reoccurring reservations show up in the database.
-- Adjust related resources (cpus, tasks, gres, mem, etc.) when updating
NumNodes with scontrol.
-- Don't initialize MPI plugins for batch or extern steps.
-- slurm.spec - do not install a slurm.conf file under /etc/ld.so.conf.d.
-- X11 forwarding - fix keepalive message generation code.
-- If heterogeneous job step is unable to acquire MPI reserved ports then
avoid referencing NULL pointer. Retry assigning ports ONLY for
non-heterogeneous job steps.

-- If any acct_gather_*_init fails, treat it as fatal instead of logging an
error and continuing.
-- launch/slurm plugin - Avoid using global variable for heterogeneous job
steps, which could corrupt memory.
* Changes in Slurm 17.11.0rc2
==============================
-- Prevent slurmctld abort with NodeFeatures=knl_cray and non-KNL nodes lacking
any configured features.
-- The --cpu_bind and --mem_bind options have been renamed to --cpu-bind
and --mem-bind for consistency with the rest of Slurm's options. Both
old and new syntaxes are supported for now.
-- Add slurmdb_connection_commit to the slurmdb api to commit when needed.
-- Add the federation APIs to the slurmdb.h file.
-- Fix sacct to always use the db_api instead of sometimes calling functions
directly.
-- Fix sacctmgr to always use the db_api instead of sometimes calling functions
directly.
-- Fix sreport to always use the db_api instead of sometimes calling functions
directly.
-- Add a global uid to the db_api to minimize calls to getuid().
-- Added more validation logic for updates to node features.
-- Added node_features_p_node_update_valid() function to node_features plugin.
-- If a job is held due to bad constraints and a node's features change then
test the job again to see if it can run with the new features.
-- Added node_features_p_changible_feature() function to node_features plugin.
-- Avoid rebooting a node if a job's requested feature is not under the control
of the node_features plugin and is not currently active.
-- node_features/knl_generic plugin: Do not clear a node's non-KNL features
specified in slurm.conf.
-- Added SchedulerParameters configuration option "disable_hetero_steps" to
disable job steps that span multiple components of a heterogeneous job.
Disabled by default except with mpi/none plugin. This limitation to be
removed in Slurm version 18.08.
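
The 17.11.0rc2 entries above rename --cpu_bind and --mem_bind to --cpu-bind and --mem-bind while still accepting the old spellings. One way to picture that kind of dual-spelling support is the Python `argparse` sketch below. It is only an analogy: Slurm's commands are written in C and do not use argparse, and the parser name `srun-sketch` and helper `parse_bind_options` are hypothetical.

```python
import argparse

# Illustrative parser only; it accepts both the new and the old option spelling.
parser = argparse.ArgumentParser(prog="srun-sketch", allow_abbrev=False)
parser.add_argument("--cpu-bind", "--cpu_bind", dest="cpu_bind")
parser.add_argument("--mem-bind", "--mem_bind", dest="mem_bind")


def parse_bind_options(argv):
    """Return (cpu_bind, mem_bind) regardless of which spelling was used."""
    args = parser.parse_args(argv)
    return args.cpu_bind, args.mem_bind
```

Both spellings store into the same destination, so downstream code never needs to know which form the user typed.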
* Changes in Slurm 17.11.0rc1
==============================
-- Added the following jobcomp/script environment variables: CLUSTER,
DEPENDENCY, DERIVED_EC, EXITCODE, GROUPNAME, QOS, RESERVATION, USERNAME.
The format of LIMIT (job time limit) has been modified to D-HH:MM:SS.
-- Fix QOS usage factor applying to individual TRES run minute usage.
-- Print numbers using exponential format if required to fit in allocated
field width. The sacctmgr and sshare commands are impacted.
-- Make it so a backup DBD doesn't attempt to create database tables and
relies on the primary to do so.

-- By default have Slurm dynamically link to libslurm.so instead of static
linking. If static linking is desired configure with
--without-shared-libslurm.

-- Change --workdir in sbatch to be --chdir as in all other commands (salloc,
srun).
-- Add WorkDir to the job record in the database.
-- Make the UsageFactor of a QOS work when a qos has the nodecay flag.
-- Add MaxQueryTimeRange option to slurmdbd.conf to limit accounting query
ranges when fetching job records.
-- Add LaunchParameters=batch_step_set_cpu_freq to allow the setting of the cpu
frequency on the batch step.
-- CRAY - Fix statically linked applications to CRAY's PMI.
-- Fix - Raise an error back to the user when trying to update currently
unsupported core-based reservations.
-- Do not print TmpDisk space as part of 'slurmd -C' line.

-- Fix to test MaxMemPerCPU/Node partition limits when scheduling, previously
only checked on submit.
-- Work for heterogeneous job support (complete solution in v17.11):
* Set SLURM_PROCID environment variable to reflect global task rank (needed
by MPI).
* Set SLURM_NTASKS environment variable to reflect global task count (needed
by MPI).
* In srun, if only some steps are allocated and one step allocation fails,
then delete all allocated steps.
* Get SPANK plugins working with heterogeneous jobs. The
spank_init_post_opt() function is executed once per job component.
* Modify sbcast command and srun's --bcast option to support heterogeneous
jobs.
* Set more environment variables for MPI: SLURM_GTIDS and SLURM_NODEID.
* Prevent a heterogeneous job allocation from including the same nodes in
multiple components (required by MPI jobs spanning components).
* Modify step create logic so that all components of a heterogeneous job
launched by a single srun command have the same step ID value.
-- Modify output of "--mpi=list" to avoid duplicates for version numbers in
mpi/pmix plugin names.
-- Allow nodes to be rebooted while in a maintenance reservation.
-- Show nodes as down even when nodes are in a maintenance reservation.
-- Harden the slurmctld HA stack to mitigate certain split-brain issues.
-- Work for heterogeneous job support (complete solution in v17.11):
* Add burst buffer support.
* Remove srun's --mpi-combine option (always combined).
* Add SchedulerParameters configuration option "enable_hetero_steps" to
enable job steps that span multiple components of a heterogeneous job.
Disabled by default as most MPI implementations and Slurm configurations
are not currently supported. Limitation to be removed in Slurm version
18.08.
* Synchronize application launch across multiple components with debugger.
* Modify slurm_kill_job_step() to cancel all components of a heterogeneous
job step (used by MPI).
* Set SLURM_JOB_NUM_NODES environment variable as needed by MVAPICH.
* Base time limit upon the time that the latest job component is available
(after all nodes in all components booted and ready for use).
-- Add cluster name to smail tool email header.
-- Speedup arbitrary distribution algorithm.
-- Modify "srun --mpi=list" output to match valid option input by removing the
"mpi/" prefix on each line of output.