This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.

* Changes in Slurm 16.05.7
==========================
 -- Fix issue in the priority/multifactor plugin where, on a slurmctld restart,
    more time is accounted for than should be allowed.
 -- cray/burst_buffer - If total_space in a pool decreases, reset used_space
    rather than trying to account for buffer allocations in progress.
 -- cray/burst_buffer - Fix for double counting of used_space at slurmctld
    startup.
 -- Fix regression in 16.05.6 where requesting multiple cpus per task (-c2)
    with --ntasks-per-core=1 and only 1 task on the node would cause the
    slurmd to abort with an infinite loop fatal error.
 -- cray/burst_buffer - Internally track both allocated and unusable space.
    The reported UsedSpace in a pool is now the allocated space (previously was
    unusable space). Base available space on whichever value leaves least free
    space.
 -- cray/burst_buffer - Preserve job ID and don't translate to job array ID.
 -- cray/burst_buffer - Update "instance" parsing to match updated dw_wlm_cli
    output.
 -- sched/backfill - Ensure we don't try to start a job that was already started
    and requeued by the main scheduling logic.
 -- job_submit/lua - add access to the job features field in job_record.
 -- select/linear plugin modified to better support heterogeneous clusters when
    topology/none is also configured.
 -- Permit cancellation of jobs in configuring state.
 -- acct_gather_energy/rapl - prevent segfault in slurmd from race to gather
    data at slurmd startup.
 -- Integrate node_feature/knl_generic with "hbm" GRES information.
 -- Fix output routines to prevent rounding the TRES values for memory or BB.
 -- switch/cray plugin - fix use after free error.
 -- docs - elaborate on how to clear TRES limits in sacctmgr.
 -- knl_cray plugin - Avoid abort from backup slurmctld at start time.
 -- cgroup plugins - fix two minor memory leaks.
 -- If a node is booting for some job, don't allocate additional jobs to the
    node until the boot completes.
 -- testsuite - fix job id output in test17.39.
 -- Modify backfill algorithm to improve performance with large numbers of
    running jobs. Group running jobs that end in a "similar" time frame using a
    time window that grows exponentially rather than linearly. After one second
    of wall time, simulate the termination of all remaining running jobs in
    order to respond in a reasonable time frame.
 -- Fix slurm_job_cpus_allocated_str_on_node_id() API call.
 -- sched/backfill plugin: Make malloc match data type (defined as uint32_t and
    allocated as int).
 -- srun - prevent segfault when terminating job step before step has launched.
 -- sacctmgr - prevent segfault when trying to reset usage for an invalid
    account name.
 -- Make the openssl crypto plugin compile with openssl >= 1.1.
 -- Fix SuspendExcNodes and SuspendExcParts on slurmctld reconfiguration.
 -- sbcast - prevent segfault in slurmd due to race condition between file
    transfers from separate jobs using zlib compression.
 -- cray/burst_buffer - Increase time to synchronize operations between threads
    from 5 to 60 seconds ("setup" operation time observed over 17 seconds).
 -- node_features/knl_cray - Fix possible race condition when changing node
    state that could result in an old KNL mode remaining as an active feature.
 -- If a job can't run because of resources, also check accounting limits
    after node selection. If the job would violate those limits, change its
    reason for waiting so that resources are not reserved for jobs violating
    accounting limits.
 -- NRT - Make it so a system running against IBM's PE will work with PE
    version 1.3.
 -- NRT - Make it so protocols pgas and test are allowed to be used.
 -- NRT - Make it so you can have more than 1 protocol listed in MP_MSG_API.
 -- cray/burst_buffer - If slurmctld daemon restarts with pending job and burst
    buffer having unknown file stage-in status, teardown the buffer, defer the
    job, and start stage-in over again.
 -- On state restore in the slurmctld don't overwrite the mem_spec_limit given
    from the slurm.conf when using FastSchedule=0.
 -- Recognize a KNL's proper NUMA count (rather than setting it to the value
    in slurm.conf) when using FastSchedule=0.
 -- Fix parsing in regression test1.92 for some prompts.
 -- sbcast - use slurmd's gid cache rather than a separate lookup.
 -- slurmd - return error if setgroups() call fails in _drop_privileges().
 -- Remove error messages about gres counts changing when a job is resized on
    a slurmctld restart or reconfig, as they aren't really error messages.
 -- Fix possible memory corruption if a job is using GRES and changing size.
 -- jobcomp/elasticsearch - fix printf format for a value on 32-bit builds.
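
The backfill entry above groups running jobs by end time using a window that
grows exponentially. The following is only an illustrative sketch of that idea
in Python, not the sched/backfill implementation: the function name, the
initial window width of 2 seconds, and the growth factor of 2 are all
assumptions chosen for the example.

```python
def group_end_times(end_offsets, first_window=2.0, growth=2.0):
    """Group job end offsets (seconds from now) into windows of growing width.

    Hypothetical helper: first_window and growth are illustrative values,
    not settings taken from Slurm.
    """
    groups = []
    width = first_window
    window_end = first_window
    current = []
    for offset in sorted(end_offsets):
        # Advance to the window that contains this offset, widening each time.
        while offset >= window_end:
            if current:
                groups.append(current)
                current = []
            width *= growth
            window_end += width
        current.append(offset)
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Windows are [0,2), [2,6), [6,14), [14,30), ...
    print(group_end_times([1, 3, 5, 7, 13, 20]))
```

With a linearly growing window, jobs ending far in the future would each fall
into their own group; widening the window geometrically keeps the number of
groups small as the number of running jobs grows.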
* Changes in Slurm 16.05.6
==========================
 -- Docs - the correct default value for GroupUpdateForce is 0.
 -- mpi/pmix - improve point to point communication performance.
 -- SlurmDB - include pending jobs in search during 'sacctmgr show runawayjobs'.
 -- Add client side out-of-range checks to --nice flag.
 -- Fix support for sbatch "-W" option, previously needed to use "--wait".
 -- node_features/knl_cray plugin and capmc_suspend/resume programs modified to
    sleep and retry capmc operations if the Cray State Manager is down. Added
    CapmcRetries configuration parameter to knl_cray.conf.
 -- node_features/knl_cray plugin: Remove any KNL MCDRAM or NUMA features from
    node's configuration if capmc does NOT report the node as being KNL.
 -- node_features/knl_cray plugin: drain any node not reported by
    "capmc node_status" on startup or reconfig.
 -- node_features/knl_cray plugin: Substantially streamline and speed up logic
    to load current node state on reconfigure failure or unexpected node boot.
 -- node_features/knl_cray plugin: Add separate thread to interact with capmc
    in response to unexpected node reboots.
 -- node_features plugin - Add "mode" argument to node_features_p_node_xlate()
    function to fix some bugs updating a node's features using the node update
    RPC.
 -- node_features/knl_cray plugin: If the reconfiguration of nodes for an
    interactive job fails, kill the job (it can't be requeued like a batch job).
 -- Testsuite - Added srun/salloc/sbatch tests with --use-min-nodes option.
 -- Fix typo when an error occurs when discovering pmix version on
    configure.
 -- Fix configuring pmix support when you have your lib dir symlinked to lib64.
 -- Fix waiting reason if a job is waiting for a specific limit instead of
    always just AccountingPolicy.
 -- Correct SchedulerParameters=bf_busy_nodes logic with respect to the job's
    minimum node count. Previous logic would not decrement the counter in some
    locations and would reject valid job requests for not reaching the minimum
    node count.
 -- Fix FreeBSD-11 build by using llabs() function in place of abs().
 -- Cray: The slurmd can manipulate the socket/core/thread values reported based
    upon the configuration. The logic failed to consider select/cray with
    SelectTypeParameters=other_cons_res as equivalent to select/cons_res.
 -- If a node's socket or core count is changed at registration time (e.g. a
    KNL node's NUMA mode is changed), change its board count to match.
 -- Prevent possible divide by zero in select/cons_res if a node's board count
    is higher than its socket count.
 -- Allow an advanced reservation to contain a license count of zero.
 -- Preserve non-KNL node features when updating the KNL node features for a
    multi-node job in which the non-KNL node features vary by node.
 -- task/affinity plugin: Honor a job's --ntasks-per-socket and
    --ntasks-per-core options in task binding.
 -- slurmd - do not print ClusterName when using 'slurmd -C'.
 -- Correct a bitmap test function (used only by the select/bluegene plugin).
 -- Do not propagate SLURM_UMASK environment variable to batch script.
 -- Added node_features/knl_generic plugin for KNL support on non-Cray systems.
 -- Cray: Prevent abort in backfill scheduling logic for requeued job that has
    been cancelled while NHC is running.
 -- Improve reported estimates of start and end times for pending jobs.
 -- pbsnodes: Show OS value as "unknown" for down nodes.
 -- BlueGene - correctly scale node counts when enforcing MaxNodes limit take 2.
 -- Fix "sbatch --hold" to set JobHeldUser correctly instead of JobHeldAdmin.
 -- Cray - print warning that task/cgroup is required, and must be after
    task/cray in the TaskPlugin settings.
 -- Document that node Weight takes precedence over load with LLN scheduling.
 -- Fix issue where gang scheduling could happen even with OverSubscribe=NO.
 -- Expose JOB_SHARED_* values to job_submit/lua plugin.
 -- Fix issue where the number of nodes is not properly allocated when srun is
    run with a task count (-n) lower than the number of hosts in the -w
    hostlist.
 -- Update srun documentation for -N, -w and -m arbitrary.
 -- Fix bug that was clearing MAINT mode on nodes scheduled for reboot (bug
    introduced in version 16.05.5 to address bug in overlapping reservations).
 -- Add logging of node reboot requests.
 -- Docs - remove recommendation for ReleaseAgent setting in cgroup.conf.
 -- Make sure a job cleans up completely if it has a node fail.  Mostly an
    issue with gang scheduling.
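
The board count entries above concern a node whose board count exceeds its
socket count. The sketch below shows, in Python, why integer division can then
yield zero and lead to a divide by zero later on. It is an assumption about the
general mechanism, not the select/cons_res code, and both function names are
hypothetical.

```python
def sockets_per_board(sockets: int, boards: int) -> int:
    """Integer division, as unsigned integer counts would behave in C."""
    return sockets // boards  # evaluates to 0 when boards > sockets


def clamp_boards(sockets: int, boards: int) -> int:
    """Hypothetical guard: keep the board count between 1 and the socket count."""
    return max(1, min(boards, sockets))


if __name__ == "__main__":
    sockets, boards = 2, 4
    # Unguarded: 2 // 4 == 0, so any later division by this value fails.
    print(sockets_per_board(sockets, boards))
    # Guarded: boards is reduced to 2, giving 1 socket per board.
    print(sockets_per_board(sockets, clamp_boards(sockets, boards)))
```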
* Changes in Slurm 16.05.5
==========================
 -- Fix accounting for jobs requeued after the previous job was finished.
 -- slurmstepd modified to pre-load all relevant plugins at startup to avoid
    the possibility of modified plugins later resulting in inconsistent API
    or data structures and a failure of slurmstepd.
 -- Export functions from parse_time.c in libslurm.so.
 -- Export unit convert functions from slurm_protocol_api.c in libslurm.so.
 -- Fix scancel to allow multiple steps from a job to be cancelled at once.
 -- Update and expand upgrade guide (in Quick Start Administrator web page).
 -- burst_buffer/cray: Requeue, but do not hold a job which fails the pre_run
    operation.
 -- Ensure reported expected job start time is not in the past for pending jobs.
 -- Add support for PMIx v2.
 -- mpi/pmix: support for passing TMPDIR path through info key.
 -- Cray: update slurmconfgen_smw.py script to correctly identify service nodes
    versus compute nodes.
 -- FreeBSD - fix build issue in knl_cray plugin.
 -- Corrections to gres.conf parsing logic.
 -- Make partition State independent of EnforcePartLimits value.
 -- Fix multipart srun submission with EnforcePartLimits=NO and job violating
    the partition limits.
 -- Fix problem updating job state_reason.
 -- pmix - Provide HWLOC topology in the job-data if Slurm was configured
    with hwloc.
 -- Cray - Fix issue restoring jobs when blade count increases due to hardware
    reconfiguration.
 -- burst_buffer/cray - Hold job after 3 failed pre-run operations.
 -- sched/backfill - Check that a user's QOS is allowed to use a partition
    before trying to schedule resources on that partition for the job.
 -- sacctmgr - Fix displaying nodenames when printing out events or
    reservations.
 -- Fix mpiexec wrapper to accept task count with more than one digit.
 -- Add mpiexec man page to the script.
 -- Add salloc_wait_nodes option to the SchedulerParameters parameter in the
    slurm.conf file controlling when the salloc command returns in relation to
    when nodes are ready for use (i.e. booted).
 -- Handle case where the slurmctld daemon restarts while a compute node reboot
    is in progress. Return node to service rather than setting DOWN.
 -- Preserve node "RESERVATION" state when one of multiple overlapping
    reservations ends.
 -- Restructure srun command locking for task_exit processing logic for improved
    parallelism.
 -- Modify srun task completion handling to only build the task/node string for
    logging purposes if it is needed. Modified for performance purposes.
 -- Docs - update salloc/sbatch/srun man pages to mention corresponding
    environment variables for --mem/--mem-per-cpu and allowed suffixes.
 -- Silence srun warning when overriding the job ntasks-per-node count
    with a lower task count for the step.
 -- Docs - assorted spelling fixes.