This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.

* Changes in Slurm 19.05.0pre2
==============================
 -- Removed select/serial plugin.
 -- Remove 512-character line length limit in slurm_print_topo_record().
    (Used by "scontrol show topology".)
 -- Removed crypto/openssl plugin.
 -- Tweak the sdiag gettimeofday() line format for greater clarity.
 -- Add support for SALLOC/SBATCH/SLURM_NO_KILL environment variables.
    Add salloc/sbatch/srun support for optional "--no-kill=off" option to
    disable the environment variables.
 -- Fix salloc and missing SLURM_NTASKS.
 -- Alter the backfill scheduler behavior to prevent it from scheduling lower
    priority jobs on resources that become available during the backfill
    scheduling cycle when bf_continue is enabled. This behavior was available
    as the bf_ignore_newly_avail_nodes option in 18.08.4+, but is now enabled
    by default. (The SchedulerParameters option of bf_ignore_newly_avail_nodes
    is also now removed, although harmless if still set.)
 -- Make LaunchParameters=send_gids the default, introducing the reverse option
    "disable_send_gids" to go back to the original behavior.
 -- Limit pam_slurm_adopt to run only in the sshd context by default, for
    security reasons. A new module option 'service=<name>' can be used to
    allow a different PAM application to work. The option 'service=*' can be
    used to restore the old behavior of always performing the adopt logic
    regardless of the PAM application context.
 -- pam_slurm_adopt: Use uid to determine whether root is logging in.
 -- Remove sbatch --x11 option. Slurm's internal X11 forwarding is now only
    supported from salloc, or an allocating srun command.
 -- Suppressed printing of job id in sbatch when quiet flag is set.
 -- Changed sreport 'SizesByAccount' and 'SizesByAccountAndWckey' default
    behavior and added new 'AcctAsParent' option.
 -- Add ave watts to api and sview.
 -- Added printf attribute to setenvf() and corrected related warnings.
 -- Kill running/pending job if it is allocated GRES, that GRES has a "File"
    configuration, and the GRES count changes.
 -- Add new DebugFlag=Accrue for accrue accounting debugging purposes.
 -- Change CryptoType option to CredType, and rename crypto/munge plugin to
    cred/munge.
 -- Add slurmd -G option to print GRES configuration and exit. This is useful
    for testing and debugging.
 -- Support GRES types that include numbers (e.g. "--gres=gpu:123g:2").
 -- Remove MemLimitEnforce parameter and move functionality into
    JobAcctGatherParams=OverMemoryKill.
 -- sview - disable admin mode option (which would not work anyway) if the
    user is not an admin in SlurmDBD.
 -- Remove joules reporting from sview and scontrol.
 -- Change the default fair share algorithm to "fair tree". The new
    PriorityFlags option of NO_FAIR_TREE can be used to revert to "classic"
    fair share scheduling instead.
 -- libslurmdb has been merged into libslurm.
 -- Added -b as a short option for --begin and removed the -b option which
    was a leftover artifact from the Moab compatibility work.
 -- Add ArrayTaskThrottle to "scontrol show job" output.
 -- Added SPRIO_FORMAT env variable to the sprio command.
 -- Add batch step at the beginning of a batch job so that squeue, sstat, and
    sacct will show the batch step.
 -- Deprecated 32-bit builds.
 -- Make -l and -o mutually exclusive in sacct, squeue, sinfo, and sprio.

* Changes in Slurm 19.05.0pre1
==============================
 -- Run epilog and clean up allocation when a job is resized to zero and its
    resources transferred to another job (--depend=expand).
 -- If GRES are associated with specific sockets, identify those sockets in the
    output of "scontrol show node". For example, if all 4 GPUs on a node are
    associated with socket zero, then "Gres=gpu:4(S:0)". If associated
    with sockets 0 and 1, then "Gres=gpu:4(S:0-1)". The information of which
    specific GPUs are associated with which sockets is not reported, but is
    only available by parsing the gres.conf file.
 -- Add configuration parameter "GpuFreqDef" to control a job's default GPU
    frequency.
 -- Add job flags to the database.  Currently used to determine which scheduler
    scheduled the job.
 -- Add constraints/features to the database.
 -- Add last reason job didn't run before resources/priority to the database.
 -- Make it so we set the alloc_node in a resource allocation based on the auth
    plugin instead of the rpc call.

* Changes in Slurm 18.08.6
==========================
 -- Added parsing of -H flag with scancel.
 -- Fix slurmsmwd build on 32-bit systems.

* Changes in Slurm 18.08.5-2
============================
 -- Fix Perl build for 32-bit systems.

* Changes in Slurm 18.08.5
==========================
 -- Backfill - If a job has a time_limit guess the end time of a job better
    if OverTimeLimit is Unlimited.
 -- Fix "sacctmgr show events event=cluster".
 -- Fix sacctmgr show runawayjobs from sibling cluster.
 -- Avoid bit offset of -1 in call to bit_nclear().
 -- Ensure that "hbm" is a configured GresType on knl systems.
 -- Fix NodeFeaturesPlugins=node_features/knl_generic to allow gres other
    than knl.
 -- cons_res: Prevent overflow on multiply.
 -- Better debug for bad values in gres.conf.
 -- Fix double accounting of energy at end of job.
 -- Read gres.conf for cloud nodes on slurmctld.
 -- Don't assume the first node of a job is the batch host when purging jobs
    from a node.
 -- Better debugging when a job doesn't have a job_resrcs ptr.
 -- Store ave watts in energy plugins.
 -- Add XCC plugin for reading Lenovo Power.
 -- Fix minor memory leak when scheduling rebootable nodes.
 -- Fix debug2 prefix for sched log.
 -- Fix printing correct SLURM_JOB_ACCOUNT_PACK_GROUP_* in env for a Het Job.
 -- sbatch - search current working directory first for job script.
 -- Make it so held jobs reset the AccrueTime and do not count against any
    AccrueTime limits.
 -- Add SchedulerParameters option of bf_hetjob_prio=[min|avg|max] to alter the
    job sorting algorithm for scheduling heterogeneous jobs.
 -- Fix initialization of assoc_mgr_locks and slurmctld_locks lock structures.
 -- Fix segfault with job arrays using X11 forwarding.
 -- Revert regression caused by e0ee1c7054 which caused negative values and
    values starting with a decimal to be invalid for PriorityWeightTRES and
    TRESBillingWeight.
 -- Fix possibility to update a job's reservation to none.
 -- Suppress connection errors to primary slurmdbd when backup dbd is active.
 -- Suppress connection errors to primary db when backup db kicks in.
 -- Add missing fields for sacct --completion when using jobcomp/filetxt.
 -- Fix incorrect values set for UserCPU, SystemCPU, and TotalCPU sacct fields
    when JobAcctGatherType=jobacct_gather/cgroup.
 -- Fixed srun printing invalid option msg twice.
 -- Remove unused -b flag from getopt call in sbatch.
 -- Disable reporting of node TRES in sreport.
 -- Re-enable features combined by OR within parentheses for non-knl setups.
 -- Prevent sending duplicate requests to reboot a node before ResumeTimeout.
 -- Down nodes that don't reboot by ResumeTimeout.
 -- Update seff to reflect API change from rss_max to tres_usage_in_max.
 -- Add missing TRES constants to perl API.
 -- Fix issue where sacct would return incorrect array tasks when querying
    specific tasks.
 -- Add missing variables to slurmdb_stats_t in the perlapi.
 -- Fix nodes not getting reboot RPC when job requires reboot of nodes.
 -- Fix failure to update the partition list of a job.
 -- Use slurm.conf gres ids instead of gres.conf names to get a gres type name.
 -- Add mitigation for a potential heap overflow on 32-bit systems in xmalloc.
    CVE-2019-6438.

* Changes in Slurm 18.08.4
==========================
 -- burst_buffer/cray - avoid launching a job that would be immediately
    cancelled due to a DataWarp failure.
 -- Fix message sent to user to display preempted instead of time limit when
    a job is preempted.
 -- Fix memory leak when a failure happens processing a node's gres config.
 -- Improve error message when failures happen processing a node's gres config.
 -- When building rpms ignore redundant standard rpaths and insecure relative
    rpaths, for RHEL based distros which use "check-rpaths" tool.
 -- Don't skip jobs in scontrol hold.
 -- Avoid locking the job_list when unneeded.
 -- Allow --cpu-bind=verbose to be used with SLURM_HINT environment variable.
 -- Make it so fixing runaway jobs will not alter the same job requeued
    when not runaway.
 -- Avoid checking state when searching for runaway jobs.
 -- Remove redundant check for end time of job when searching for runaway jobs.
 -- Make sure that we properly check for runawayjobs where another job might
    have the same id (for example, if a job was requeued) by also checking the
    submit time.
 -- Add scontrol update job ResetAccrueTime to clear a job's time
    previously accrued for priority.
 -- cons_res: Delay exiting cr_job_test until after cores/cpus are calculated
    and distributed.
 -- Fix bug where binary in cwd would trump binary in PATH with test_exec.
 -- Fix check to test printf("%s\n", NULL); to not require
    -Wno-format-truncation CFLAG.
 -- Fix JobAcctGatherParams=UsePss to report the correct usage.
 -- Fix minor memory leak in pmix plugin.
 -- Fix minor memory leak in slurmctld when reading configuration.
 -- Handle return codes correctly from pthread_* functions.
 -- Fix minor memory leak when a slurmd is unable to contact a slurmctld
    when trying to register.
 -- Fix sreport sizesbyaccount report when using Flatview and accounts.
 -- Fix incorrect shift when dealing with node weights and scheduling.
 -- libslurm/perl - Fix segfault caused by incorrect hv_to_slurm_ctl_conf.
 -- Add qos and assoc options to confirmation dialogs.
 -- Handle updating identical license or partition information correctly.
 -- Make sure accounts and QOS' are all lower case to match documentation
    when read in from the slurm.conf file.
 -- Don't consider partitions without enough nodes in reservation,
    main scheduler.
 -- Set SLURM_NTASKS correctly if having to determine from other options.
 -- Removed GCP scripts from contribs. Now located at:
    https://github.com/SchedMD/slurm-gcp.
 -- Don't check existence of srun --prolog or --epilog executables when set to
    "none" and SLURM_TEST_EXEC is used.
 -- Add "P" suffix support to job and step tres specifications.
 -- When doing a reconfigure handle QOS' GrpJobsAccrue correctly.
 -- Remove unneeded extra parentheses from sh5util.
 -- Fix jobacct_gather/cgroup to work correctly when more than one task is
    started on a node.
 -- If requesting --ntasks-per-node with no tasks, set tasks correctly.
 -- Accept modifiers for TRES originally added in 6f0342e0358.
 -- Don't remove reservation on slurmctld restart if nodes are removed from
    configuration.