This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.

* Changes in Slurm 16.05.6
==========================
 -- Docs - the correct default value for GroupUpdateForce is 0.
 -- mpi/pmix - improve point to point communication performance.
 -- SlurmDB - include pending jobs in search during 'sacctmgr show runawayjobs'.
 -- Add client side out-of-range checks to --nice flag.
 -- Fix support for sbatch "-W" option, previously needed to use "--wait".
 -- node_features/knl_cray plugin and capmc_suspend/resume programs modified to
    sleep and retry capmc operations if the Cray State Manager is down. Added
    CapmcRetries configuration parameter to knl_cray.conf.
 -- node_features/knl_cray plugin: Remove any KNL MCDRAM or NUMA features from
    node's configuration if capmc does NOT report the node as being KNL.
 -- node_features/knl_cray plugin: drain any node not reported by
    "capmc node_status" on startup or reconfig.
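As an illustration of the client side out-of-range check on --nice listed
above, here is a minimal sketch in Python. It is not the Slurm source (which
is C). The parse_nice helper is hypothetical, and the NICE_LIMIT value is an
assumption based on the "+/- 2147483645" range given in the sbatch man page;
verify it against the version in use.

```python
# Hypothetical sketch of a client-side range check for a --nice value.
# NICE_LIMIT is an assumption taken from the sbatch man page wording
# ("+/- 2147483645"); confirm it for your Slurm version.
NICE_LIMIT = 2147483645


def parse_nice(arg: str) -> int:
    """Return the nice adjustment, or raise ValueError if it is invalid."""
    try:
        value = int(arg)
    except ValueError:
        raise ValueError(f"invalid nice value: {arg!r}") from None
    if abs(value) > NICE_LIMIT:
        raise ValueError(f"nice value {value} out of range +/- {NICE_LIMIT}")
    return value
```

Rejecting the value in the client gives the user an immediate error instead
of submitting a request that the controller would have to refuse or clamp.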
* Changes in Slurm 16.05.5
==========================
 -- Fix accounting for jobs requeued after the previous job was finished.
 -- slurmstepd modified to pre-load all relevant plugins at startup to avoid
    the possibility of modified plugins later resulting in inconsistent API
    or data structures and a failure of slurmstepd.
 -- Export functions from parse_time.c in libslurm.so.
 -- Export unit convert functions from slurm_protocol_api.c in libslurm.so.
 -- Fix scancel to allow multiple steps from a job to be cancelled at once.
 -- Update and expand upgrade guide (in Quick Start Administrator web page).
 -- burst_buffer/cray: Requeue, but do not hold a job which fails the pre_run
    operation.
 -- Ensure reported expected job start time is not in the past for pending jobs.
 -- Add support for PMIx v2.
 -- mpi/pmix: support for passing TMPDIR path through info key
 -- Cray: update slurmconfgen_smw.py script to correctly identify service nodes
    versus compute nodes.
 -- FreeBSD - fix build issue in knl_cray plugin.
 -- Corrections to gres.conf parsing logic.
 -- Make partition State independent of EnforcePartLimits value.
 -- Fix multipart srun submission with EnforcePartLimits=NO and job violating
    the partition limits.
 -- Fix problem updating job state_reason.
 -- pmix - Provide HWLOC topology in the job-data if Slurm was configured
    with hwloc.
 -- Cray - Fix issue restoring jobs when blade count increases due to hardware
    reconfiguration.
 -- burst_buffer/cray - Hold job after 3 failed pre-run operations.
 -- sched/backfill - Check that a user's QOS is allowed to use a partition
    before trying to schedule resources on that partition for the job.
 -- sacctmgr - Fix displaying nodenames when printing out events or
    reservations.
 -- Fix mpiexec wrapper to accept task count with more than one digit.
 -- Add mpiexec man page to the script.
 -- Add salloc_wait_nodes option to the SchedulerParameters parameter in the
    slurm.conf file controlling when the salloc command returns in relation to
    when nodes are ready for use (i.e. booted).
 -- Handle case when slurmctld daemon restarts while a compute node reboot is
    in progress. Return node to service rather than setting it DOWN.
 -- Preserve node "RESERVATION" state when one of multiple overlapping
    reservations ends.
 -- Restructure srun command locking for task_exit processing logic for improved
    parallelism.
 -- Modify srun task completion handling to only build the task/node string for
    logging purposes if it is needed. Modified for performance purposes.
 -- Docs - update salloc/sbatch/srun man pages to mention corresponding
    environment variables for --mem/--mem-per-cpu and allowed suffixes.
 -- Silence srun warning when overriding the job ntasks-per-node count
    with a lower task count for the step.
 -- Docs - assorted spelling fixes.
 -- node_features/knl_cray: Fix bug where MCDRAM state could be taken from
    capmc rather than cnselect.
 -- node_features/knl_cray: If a node is rebooted outside of Slurm's direction,
    update its active features with current MCDRAM and NUMA mode information.
 -- Restore ability to manually power down nodes, broken in 15.08.12.
 -- Don't log error for job end_time being zero if node health check is still
    running.
 -- When powering up a node to change its state (e.g. KNL NUMA or MCDRAM mode)
    then pass to the ResumeProgram the job ID assigned to the nodes in the
    SLURM_JOB_ID environment variable.
 -- Allow a node's PowerUp state flag to be cleared using update_node RPC.
 -- capmc_suspend/resume - If a request to modify NUMA or MCDRAM state on a set
    of nodes or to reboot a set of nodes fails, then just requeue the job and
    abort the entire operation rather than trying to operate on individual
    nodes.
 -- node_features/knl_cray plugin: Increase default CapmcTimeout parameter from
    10 to 60 seconds.
 -- Fix squeue filter by job license when a job has requested more than 1
    license of a certain type.
 -- Fix bug in PMIX_Ring in the pmi2 plugin so that it supports singleton mode.
    It also updates the testpmixring.c test program so it can be used to check
    singleton runs.
 -- Automatically cleanup task/cgroup cpuset and devices cgroups after steps are
    done.
 -- Testsuite - Fix test1.83 to handle gaps in node names properly.
 -- BlueGene - correctly scale node counts when enforcing MaxNodes limit.
 -- Make sure no attempt is made to schedule a requeued job until all steps are
    cleaned (Node Health Check completes for all steps on a Cray).
 -- KNL: Correct task affinity logic for some NUMA modes.
 -- Add salloc/sbatch/srun --priority option of "TOP" to set job priority to
    the highest possible value. This option is only available to Slurm operators
    and administrators.
 -- Add salloc/sbatch/srun option --use-min-nodes to prefer smaller node counts
    when a range of node counts is specified (e.g. "-N 2-4").
 -- Validate salloc/sbatch --wait-all-nodes argument.
 -- Add "sbatch_wait_nodes" to SchedulerParameters to control default sbatch
    behaviour with respect to waiting for all allocated nodes to be ready for
    use. Job can override the configuration option using the --wait-all-nodes=#
    option.
 -- Prevent partition group access updates from resetting last_part_update when
    no changes have been made. Prevents backfill scheduler from restarting
    mid-cycle unnecessarily.
 -- Cray - add NHC_ABSOLUTELY_NO to never run NHC, even on certain edge cases
    that it would otherwise be run on with NHC_NO.
 -- Ignore GRES/QOS updates that maintain the same value as before.
 -- mpi/pmix - prepare temp directory for application.
 -- Fix display for the nice and priority values in sprio/scontrol/squeue.
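As an illustration of the --use-min-nodes entry above, the following Python
sketch shows how a node count might be chosen from a range such as "-N 2-4".
It is purely illustrative and is not Slurm's scheduler logic. The helper
names parse_node_range and pick_node_count are hypothetical, and the
behaviour without the option (take as many nodes as are available within the
range) is an assumption based on the -N description in the sbatch man page.

```python
from typing import Optional, Tuple


def parse_node_range(spec: str) -> Tuple[int, int]:
    """Parse a node count spec such as "3" or "2-4" into (minimum, maximum)."""
    low, sep, high = spec.partition("-")
    minimum = int(low)
    maximum = int(high) if sep else minimum
    if minimum < 1 or maximum < minimum:
        raise ValueError(f"invalid node range: {spec!r}")
    return minimum, maximum


def pick_node_count(minimum: int, maximum: int, idle: int,
                    use_min_nodes: bool) -> Optional[int]:
    """Choose a node count, or return None if the request cannot start now."""
    if idle < minimum:
        return None                # not enough idle nodes for the minimum
    if use_min_nodes:
        return minimum             # prefer the smaller count
    return min(maximum, idle)      # assumed default: as many as are available
```

For example, with three idle nodes a "2-4" request would get three nodes by
default in this sketch, and two nodes when use_min_nodes is set.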
* Changes in Slurm 16.05.4
==========================
 -- Fix potential deadlock if running with message aggregation.
 -- Streamline when schedule() is called on batch script completion when
    running with message aggregation.
 -- Fix incorrect casting when [un]packing derived_ec on slurmdb_job_rec_t.
 -- Document that persistent burst buffers can not be created or destroyed using
    the salloc or srun --bb options.
 -- Add support for setting the SLURM_JOB_ACCOUNT, SLURM_JOB_QOS and
    SLURM_JOB_RESERVATION environment variables for the salloc command.
    Document the same environment variables for the salloc, sbatch and srun
    commands in their man pages.
 -- Fix issue where sacctmgr load cluster.cfg wouldn't load associations
    that had a partition in them.
 -- Don't return the extern step from sstat by default.
 -- In sstat print 'extern' instead of 4294967295 for the extern step.
 -- Make advanced reservations work properly with core specialization.
 -- Fix race condition in the account_gather plugin that could result in job
    stuck in COMPLETING state.
 -- Regression test fixes if SelectTypePlugin not managing memory and no node
    memory size set (defaults to 1 MB per node).
 -- Add missing partition write locks to _slurm_rpc_dump_nodes/node_single to
    prevent a race condition leading to inconsistent sinfo results.
 -- Fix task:CPU binding logic for some processors. This bug was introduced
    in version 16.05.1 to address KNL binding problem.
 -- Fix two minor memory leaks in slurmctld.
 -- Improve partition-specific limit logging from slurmctld daemon.
 -- Fix incorrect access check when using MaxNodes setting on the partition.
 -- Fix issue with sacctmgr when specifying a list of clusters to query.
 -- Fix issue when calculating future StartTime for a job.
 -- Make EnforcePartLimits support logic work with any ordering of partitions
    in job submit request.
 -- Prevent restoration of wrong CPU governor and frequency when using
    multiple task plugins.
 -- Prevent slurmd abort if hwloc library fails to populate the "children"
    arrays (observed with hwloc version "dev-333-g85ea6e4").
 -- burst_buffer/cray: Add "--groupid" to DataWarp "setup" command.
 -- Fix lustre profiling putting it in the Filesystem dataset instead of the
    Network dataset.
 -- Fix profiling documentation and code to be consistent with
    Filesystem instead of Lustre.
 -- Correct the way watts is calculated in the rapl plugin when using a poll
    frequency other than AcctGatherNodeFreq.
 -- Don't abort step launch if job reaches expected end time while node is
    configuring/booting (NOTE: The job end time will be adjusted after node
    becomes ready for use).
 -- Fix several print routines to respect a custom output delimiter when
    printing NO_VAL or INFINITE.
 -- Correct documented configurations where --ntasks-per-core and
    --ntasks-per-socket are supported.
 -- Fix task/affinity plugin buffer being allocated too small, which could
    corrupt memory.
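As an illustration of the two sstat entries above (print 'extern' instead of
4294967295, and do not return the extern step by default), here is a minimal
Python sketch. It is not the sstat source. The function names and the
include_extern switch are hypothetical; only the value 4294967295 comes from
the entry itself.

```python
from typing import Iterable, List

# Step ID that the entry above says is shown as "extern" (0xFFFFFFFF).
EXTERN_STEP_ID = 4294967295


def format_step(job_id: int, step_id: int) -> str:
    """Render a job step as "jobid.stepid", naming the extern step."""
    if step_id == EXTERN_STEP_ID:
        return f"{job_id}.extern"
    return f"{job_id}.{step_id}"


def select_steps(step_ids: Iterable[int],
                 include_extern: bool = False) -> List[int]:
    """Drop the extern step unless the caller asks for it explicitly."""
    return [s for s in step_ids
            if include_extern or s != EXTERN_STEP_ID]
```

Treating the reserved ID as a named step in one formatting helper keeps the
raw 32-bit value out of user-facing output.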
* Changes in Slurm 16.05.3
==========================
 -- Make it so the extern step uses a reverse tree when cleaning up.
 -- If extern step doesn't get added into the proctrack plugin make sure the
    sleep is killed.
 -- Fix areas the slurmctld can segfault if an extern step is in the system
    cleaning up on a restart.
 -- Prevent possible incorrect counting of GRES of a given type if a node has
    multiple "types" of a given GRES "name", which could over-subscribe
    GRES of a given type.
 -- Add web links to Slurm Diamond Collectors (from Harvard University) and
    collectd (from EDF).
 -- Add job_submit plugin for the "reboot" field.
 -- Make some more Slurm constants (INFINITE, NO_VAL64, etc.) available to
    job_submit/lua plugins.
 -- Send in a -1 for a taskid into spank_task_post_fork for the extern_step.
 -- MYSQL - Slightly better logic if a job completion comes in with an end time
    of 0.
 -- If task/cgroup plugin is configured with ConstrainRAMSpace=yes, then set
    soft memory limit to allocated memory limit (previously no soft limit was
    set).
 -- Document limitations in burst buffer use by the salloc command (possible
    access problems from a login node).
 -- Fix proctrack plugin to only add the pid of a process once
    (regression in 16.05.2).
 -- Fix for sstat to print correct info when requesting jobid.batch as part of
    a comma-separated list.
 -- CRAY - Fix issue if pid has already been added to another job container.
 -- CRAY - Fix add of extern step to AELD.
 -- burst_buffer/cray: avoid batch submit error condition if waiting for stagein.
 -- CRAY - Fix for reporting steps lingering after they are already finished.
 -- Testsuite - fix test1.29 / 17.15 for limits with values above 32-bits.
 -- CRAY - Simplify when a NHC is called on a step that has unkillable
    processes.