"README.rst" did not exist on "3e7af4ec4f97ee944572c9b8e407e3392a232073"
This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.
* Changes in Slurm 19.05.0pre2
==============================
-- Remove 512-character line length limit in slurm_print_topo_record().
(Used by "scontrol show topology".)
-- Tweak the sdiag gettimeofday() line format for greater clarity.
-- Add support for SALLOC/SBATCH/SLURM_NO_KILL environment variables.
Add salloc/sbatch/srun support for optional "--no-kill=off" option to
disable the environment variables.
-- Alter the backfill scheduler behavior to prevent it from scheduling lower
priority jobs on resources that become available during the backfill
scheduling cycle when bf_continue is enabled. This behavior was available
as the bf_ignore_newly_avail_nodes option in 18.08.4+, but is now enabled
by default. (The SchedulerParameters option of bf_ignore_newly_avail_nodes
is also now removed, although harmless if still set.)
-- Make LaunchParameters=send_gids the default, introducing the reverse option
"disable_send_gids" to go back to the original behavior.
-- Limit pam_slurm_adopt to run only in the sshd context by default, for
security reasons. A new module option 'service=<name>' can be used to
allow different PAM applications to work. The option 'service=*' can be
used to restore the old behavior of always performing the adopt logic
regardless of the PAM application context.
-- pam_slurm_adopt: Use uid to determine whether root is logging in.
-- Remove sbatch --x11 option. Slurm's internal X11 forwarding is now only
supported from salloc, or an allocating srun command.
-- Suppressed printing of job id in sbatch when quiet flag is set.
-- Changed sreport 'SizesByAccount' and 'SizesByAccountAndWckey' default
behavior and added new 'AcctAsParent' option.
-- Added printf attribute to setenvf() and corrected related warnings.
-- Kill running/pending job if it is allocated GRES, that GRES has a "File"
configuration, and the GRES count changes.
-- Add new DebugFlag=Accrue for accrue accounting debugging purposes.
-- Change CryptoType option to CredType, and rename crypto/munge plugin to
cred/munge.
-- Add slurmd -G option to print GRES configuration and exit. This is useful
for testing and debugging.
-- Support GRES types that include numbers (e.g. "--gres=gpu:123g:2").
-- Remove MemLimitEnforce parameter and move functionality into
JobAcctGatherParam=OverMemoryKill.
-- sview - disable admin mode option (which would not work anyway) if the
user is not an admin in SlurmDBD.
-- Remove joules reporting from sview and scontrol.
-- Change the default fair share algorithm to "fair tree". The new
PriorityFlags option of NO_FAIR_TREE can be used to revert to "classic"
fair share scheduling instead.
-- libslurmdb has been merged into libslurm.
-- Added -b as a short option for --begin and removed the -b option which
was a leftover artifact from the Moab compatibility work.
-- Add ArrayTaskThrottle to "scontrol show job" output.
-- Add batch step at the beginning of a batch job so that squeue, sstat, and
sacct will show the batch step.
-- Make -l and -o mutually exclusive in sacct, squeue, sinfo, and sprio.
* Changes in Slurm 19.05.0pre1
==============================
-- Run epilog and clean up allocation when a job is resized to zero and its
resources transferred to another job (--depend=expand).
-- If GRES are associated with specific sockets, identify those sockets in the
output of "scontrol show node". For example, if all 4 GPUs on a node are
associated with socket zero, then "Gres=gpu:4(S:0)". If associated
with sockets 0 and 1 then "Gres=gpu:4(S:0-1)". The information of which
specific GPUs are associated with specific sockets is not reported, but only
available by parsing the gres.conf file.
-- Add configuration parameter "GpuFreqDef" to control a job's default GPU
frequency.
-- Add job flags to the database. Currently used to determine which scheduler
scheduled the job.
-- Add constraints/features to the database.
-- Add last reason job didn't run before resources/priority to the database.
-- Make it so we set the alloc_node in a resource allocation based on the auth
plugin instead of the rpc call.
* Changes in Slurm 18.08.6
==========================
-- Fix slurmsmwd build on 32-bit systems.
* Changes in Slurm 18.08.5-2
============================
-- Fix Perl build for 32-bit systems.
* Changes in Slurm 18.08.5
==========================
-- Backfill - If a job has a time_limit guess the end time of a job better
if OverTimeLimit is Unlimited.
-- Fix "sacctmgr show events event=cluster".
-- Fix sacctmgr show runawayjobs from sibling cluster.
-- Avoid bit offset of -1 in call to bit_nclear().
-- Ensure that "hbm" is a configured GresType on knl systems.
-- Fix NodeFeaturesPlugins=node_features/knl_generic to allow gres
other than knl.
-- cons_res: Prevent overflow on multiply.
-- Better debug for bad values in gres.conf.
-- Fix double accounting of energy at end of job.
-- Read gres.conf for cloud nodes on slurmctld.
-- Don't assume the first node of a job is the batch host when purging jobs
from a node.
-- Better debugging when a job doesn't have a job_resrcs ptr.
-- Add XCC plugin for reading Lenovo Power.
-- Fix minor memory leak when scheduling rebootable nodes.
-- Fix printing correct SLURM_JOB_ACCOUNT_PACK_GROUP_* in env for a Het Job.
-- sbatch - search current working directory first for job script.
-- Make it so held jobs reset the AccrueTime and do not count against any
AccrueTime limits.
-- Add SchedulerParameters option of bf_hetjob_prio=[min|avg|max] to alter the
job sorting algorithm for scheduling heterogeneous jobs.
-- Fix initialization of assoc_mgr_locks and slurmctld_locks lock structures.
-- Fix segfault with job arrays using X11 forwarding.
-- Revert regression caused by e0ee1c7054 which caused negative values and
values starting with a decimal to be invalid for PriorityWeightTRES and
TRESBillingWeight.
-- Fix possibility to update a job's reservation to none.
-- Suppress connection errors to primary slurmdbd when backup dbd is active.
-- Suppress connection errors to primary db when backup db kicks in.
-- Add missing fields for sacct --completion when using jobcomp/filetxt.
-- Fix incorrect values set for UserCPU, SystemCPU, and TotalCPU sacct fields
when JobAcctGatherType=jobacct_gather/cgroup.
-- Fix srun printing the invalid option message twice.
-- Remove unused -b flag from getopt call in sbatch.
-- Disable reporting of node TRES in sreport.
-- Re-enable features combined by OR within parentheses for non-knl setups.
-- Prevent sending duplicate requests to reboot a node before ResumeTimeout.
-- Down nodes that don't reboot by ResumeTimeout.
-- Update seff to reflect API change from rss_max to tres_usage_in_max.
-- Add missing TRES constants from perl API.
-- Fix issue where sacct would return incorrect array tasks when querying
specific tasks.
-- Add missing variables to slurmdb_stats_t in the perlapi.
-- Fix nodes not getting reboot RPC when job requires reboot of nodes.
-- Fix failure to update the partition list of a job.
-- Use slurm.conf gres ids instead of gres.conf names to get a gres type name.
-- Add mitigation for a potential heap overflow on 32-bit systems in xmalloc.
CVE-2019-6438.
* Changes in Slurm 18.08.4
==========================
-- burst_buffer/cray - avoid launching a job that would be immediately
cancelled due to a DataWarp failure.
-- Fix message sent to user to display preempted instead of time limit when
a job is preempted.
-- Fix memory leak when a failure happens processing a node's gres config.
-- Improve error message when failures happen processing a node's gres config.
-- When building rpms ignore redundant standard rpaths and insecure relative
rpaths, for RHEL based distros which use "check-rpaths" tool.
-- Avoid locking the job_list when unneeded.
-- Allow --cpu-bind=verbose to be used with SLURM_HINT environment variable.
-- Make it so fixing runaway jobs will not alter the same job requeued
when not runaway.
-- Avoid checking state when searching for runaway jobs.
-- Remove redundant check for end time of job when searching for runaway jobs.
-- Make sure that we properly check for runawayjobs where another job might
have the same id (for example, if a job was requeued) by also checking the
submit time.
-- Add scontrol update job ResetAccrueTime to clear a job's time
previously accrued for priority.
-- cons_res: Delay exiting cr_job_test until after cores/cpus are calculated
and distributed.
-- Fix bug where binary in cwd would trump binary in PATH with test_exec.
-- Fix check to test printf("%s\n", NULL); to not require
-Wno-format-truncation CFLAG.
-- Fix JobAcctGatherParams=UsePss to report the correct usage.
-- Fix minor memory leak in pmix plugin.
-- Fix minor memory leak in slurmctld when reading configuration.
-- Handle return codes correctly from pthread_* functions.
-- Fix minor memory leak when a slurmd is unable to contact a slurmctld
when trying to register.
-- Fix sreport sizesbyaccount report when using Flatview and accounts.
-- Fix incorrect shift when dealing with node weights and scheduling.
-- libslurm/perl - Fix segfault caused by incorrect hv_to_slurm_ctl_conf.
-- Add qos and assoc options to confirmation dialogs.
-- Handle updating identical license or partition information correctly.
-- Make sure accounts and QOS' are all lower case to match documentation
when read in from the slurm.conf file.
-- Don't consider partitions without enough nodes in reservation,
main scheduler.
-- Set SLURM_NTASKS correctly if having to determine from other options.
-- Removed GCP scripts from contribs. Now located at:
https://github.com/SchedMD/slurm-gcp.
-- Don't check existence of srun --prolog or --epilog executables when set to
"none" and SLURM_TEST_EXEC is used.
-- Add "P" suffix support to job and step tres specifications.
-- When doing a reconfigure handle QOS' GrpJobsAccrue correctly.
-- Remove unneeded extra parentheses from sh5util.
-- Fix jobacct_gather/cgroup to work correctly when more than one task is
started on a node.
-- If requesting --ntasks-per-node with no tasks, set tasks correctly.
-- Accept modifiers for TRES originally added in 6f0342e0358.
-- Don't remove reservation on slurmctld restart if nodes are removed from
configuration.