This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 2.5.0
========================
 -- Modify sbcast logic to survive slurmd daemon restart while a file
    transmission is in progress.
 -- Add retry logic to munge encode/decode calls. This is needed if the munge
    daemon is under very heavy load (e.g. with 1000 slurmd daemons per compute
    node).
 -- Add launch and acct_gather_energy plugins to RPMs.
 -- Restore support for srun "--mpi=list" option.
 -- CRAY - Introduce step accounting for a Cray.
 -- Modify srun to abandon I/O 60 seconds after the last task ends. Otherwise
    an aborted slurmstepd can cause the srun process to hang indefinitely.
 -- ENERGY - RAPL - alter code to close open files (and only open them once
    where needed)
 -- If the PrologSlurmctld fails, then requeue the job an indefinite number
    of times instead of only one time.
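
The munge retry entry above can be illustrated with a minimal sketch. It is
not the SLURM implementation (which is C): the function name, the attempt
count, the delay, and the use of OSError as the "daemon busy" failure are all
assumptions made for illustration.

```python
import time


def call_with_retry(func, attempts=3, delay=0.1, sleep=time.sleep):
    """Call func(), retrying on OSError with a growing delay.

    Hypothetical helper: 'attempts', 'delay' and the OSError failure type
    are illustrative assumptions, not values taken from SLURM.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except OSError as exc:
            last_error = exc
            # Back off a little longer after each failure, except the last.
            if attempt + 1 < attempts:
                sleep(delay * (attempt + 1))
    raise last_error
```

The sleep function is injectable so the helper can be exercised without
real delays.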

* Changes in SLURM 2.5.0.rc1
============================
 -- Added Prolog and Epilog Guide (web page). Based upon work by Jason Sollom,
    Cray Inc. and used by permission.
 -- Restore gang scheduling functionality. Preemptor was not being scheduled.
    Fix for bugzilla #3.
 -- Add "cpu_load" to node information. Populate CPULOAD in node information
    reported to Moab cluster manager.
 -- Preempt jobs only when insufficient idle resources exist to start job,
    regardless of the node weight.
 -- Added priority/multifactor2 plugin based upon ticket distribution system.
    Work by Janne Blomqvist, Aalto University.
 -- Add SLURM_NODELIST to environment variables available to Prolog and Epilog.
 -- Permit reservations to allow or deny access by account and/or user.
 -- Add ReconfigFlags value of KeepPartState. See "man slurm.conf" for details.
 -- Modify the task/cgroup plugin adding a task_pre_launch_priv function and
    move slurmstepd outside of the step's cgroup. Work by Matthieu Hautreux.
 -- Intel MIC processor support added using gres/mic plugin. BIG thanks to
    Olli-Pekka Lehto, CSC-IT Center for Science Ltd.
 -- Accounting - Change empty jobacctinfo structs to not actually be used.
    Instead of putting 0's into the database we put NO_VALS and have sacct
    figure out jobacct_gather wasn't used.
 -- Cray - Prevent calling basil_confirm more than once per job using a flag.
 -- Fix bug with topology/tree and job with min-max node count. Now try to
    get max node count rather than minimizing leaf switches used.
 -- Add AccountingStorageEnforce=safe option to provide method to avoid jobs
    launching that wouldn't be able to run to completion because of a
    GrpCPUMins limit.
 -- Add support for RFC 5424 timestamps in logfiles. Disable with configuration
    option of "--disable-rfc5424time". By Janne Blomqvist, Aalto University.
 -- CRAY - Replace srun.pl with launch/aprun plugin to use srun to wrap the
    aprun process instead of a perl script.
 -- srun - Rename --runjob-opts to --launcher-opts to be used on systems other
    than BGQ.
 -- Added new DebugFlags - Energy for AcctGatherEnergy plugins.
 -- start deprecation of sacct --dump --fdump
 -- BGQ - added --verbose=OFF when srun --quiet is used
 -- Added acct_gather_energy/rapl plugin to record power consumption by job.
    Work by Yiannis Georgiou, Martin Perry, et al., Bull
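
For readers unfamiliar with the RFC 5424 timestamps mentioned above, the
sketch below shows the general shape of such a timestamp (date, "T", time
with fractional seconds, numeric UTC offset). The function name and the
choice of millisecond precision are assumptions; the exact precision SLURM
writes to its logfiles is not stated in this file.

```python
from datetime import datetime, timedelta, timezone


def rfc5424_timestamp(when):
    """Format a timezone-aware datetime in RFC 5424 style.

    Hypothetical helper; millisecond precision is an illustrative choice.
    """
    if when.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return when.isoformat(timespec="milliseconds")


# Example value chosen for illustration only.
example = datetime(2012, 11, 5, 14, 3, 21, 123000,
                   tzinfo=timezone(timedelta(hours=-8)))
stamp = rfc5424_timestamp(example)  # "2012-11-05T14:03:21.123-08:00"
```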

* Changes in SLURM 2.5.0.pre3
=============================
 -- Add Google search to all web pages.
 -- Add sinfo -T option to print reservation information. Work by Bill Brophy,
    Bull.
 -- Force slurmd exit after 2 minute wait, even if threads are hung.
 -- Change node_req field in struct job_resources from 8 to 32 bits so we can
    run more than 256 jobs per node.
 -- sched/backfill: Improve accuracy of expected job start with respect to
    reservations.
 -- sinfo partition field size will be set to the length of the longest
    partition name by default.
 -- Make parse_time return a valid 0 if given the epoch time and set errno
    to ESLURM_INVALID_TIME_VALUE on error instead.
 -- Correct srun --no-alloc logic when node count exceeds node list or
    task count is not a multiple of the node count. Work by Hongjia Cao, NUDT.
 -- Completed integration with IBM Parallel Environment including POE and IBM's
    NRT switch library.

* Changes in SLURM 2.5.0.pre2
=============================
 -- When running with multiple slurmd daemons per node, enable specifying a
    range of ports on a single line of the node configuration in slurm.conf.
 -- Add reservation flag of Part_Nodes to allocate all nodes in a partition to
    a reservation and automatically change the reservation when nodes are
    added to or removed from the reservation. Based upon work by
    Bill Brophy, Bull.
 -- Add support for advanced reservation for specific cores rather than whole
    nodes. Current limitations: homogeneous cluster, nodes idle when reservation
    created, and no more than one reservation per node. Code is still under
    development. Work by Alejandro Lucero Palau, et al., BSC.
 -- Add DebugFlag of Switch to log switch plugin details.
 -- Correct job node_cnt value in job completion plugin when job fails due to
    down node. Previously was too low by one.
 -- Add new srun option --cpu-freq to enable user control over the job's CPU
    frequency and thus its power consumption. NOTE: cpu frequency is not
    currently preserved for jobs being suspended and later resumed. Work by
    Don Albert, Bull.
 -- Add node configuration information about "boards" and optimize task
    placement on minimum number of boards. Work by Rod Schultz, Bull.

* Changes in SLURM 2.5.0.pre1
=============================
 -- Add new output to "scontrol show configuration" of LicensesUsed. Output is
    "name:used/total"
 -- Changed jobacct_gather plugin infrastructure to be cleaner and easier to
    maintain.
 -- Change license option count separator from "*" to ":" for consistency with
    the gres option (e.g. "--licenses=foo:2 --gres=gpu:2"). The "*" will still
    be accepted, but is no longer documented.
 -- Permit more than 100 jobs to be scheduled per node (new limit is 250).
 -- Restructure of srun code to allow outside programs to utilize existing
    logic.
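
The license separator change above can be sketched as a small parser that
accepts both the new ":" and the legacy "*" form. This is only an
illustration, not SLURM's parser: the function name is hypothetical and the
default count of 1 for a bare license name is an assumption of this sketch.

```python
import re

# One license item: a name, optionally followed by ":" or "*" and a count.
_LICENSE_ITEM = re.compile(r"([^:*,\s]+)(?:[:*](\d+))?")


def parse_licenses(spec):
    """Parse e.g. "foo:2,bar*3,baz" into {"foo": 2, "bar": 3, "baz": 1}.

    Hypothetical helper; a missing count is assumed to mean 1.
    """
    licenses = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        match = _LICENSE_ITEM.fullmatch(item)
        if match is None:
            raise ValueError("invalid license specification: %r" % item)
        name, count = match.groups()
        licenses[name] = int(count) if count is not None else 1
    return licenses
```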

* Changes in SLURM 2.4.5
========================
 -- Cray - On job kill request, send SIGCONT, SIGTERM, wait KillWait and send
    SIGKILL. Previously just sent SIGKILL to tasks.
 -- BGQ - Fix issue when running srun outside of an allocation and only
    specifying the number of tasks and not the number of nodes.
 -- BGQ - validate correct ntasks_per_node
 -- BGQ - when srun -Q is given make runjob be quiet
 -- Modify use of OOM (out of memory protection) for Linux 2.6.36 kernel
    or later. NOTE: If you were setting the environment variable
    SLURMSTEPD_OOM_ADJ=-17, it should be set to -1000 for Linux 2.6.36 kernel
    or later.
 -- BGQ - Fix job step timeout to actually happen when done from within an
    allocation.
 -- Reset node MAINT state flag when a reservation's nodes or flags change.
 -- Accounting - Fix issue where QOS usage was being zeroed out on a
    slurmctld restart.
 -- BGQ - Add 64 tasks per node as a valid option for srun when used with
    overcommit.
 -- BLUEGENE - With Dynamic layout mode - Fix issue where if a larger block
    was already in error and isn't deallocating and underlying hardware goes
    bad one could get overlapping blocks in error making the code assert when
    a new job request comes in.
 -- BGQ - handle pending actions on a block better when trying to deallocate it.
 -- Accounting - Fixed issue where if nodenames have changed on a system and
    you query against that with -N and -E you will get all jobs during that
    time instead of only the ones running on -N.
 -- BGP - Fix for HTC mode
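
The OOM note above reflects the kernel's move from oom_adj (where -17
disables OOM killing) to oom_score_adj (where -1000 does) in Linux 2.6.36.
The sketch below picks the SLURMSTEPD_OOM_ADJ value from a kernel release
string. The function name is hypothetical and this is not code from SLURM.

```python
import re


def stepd_oom_adj(kernel_release):
    """Return the "never kill" value to use for SLURMSTEPD_OOM_ADJ.

    Hypothetical helper: -1000 for Linux 2.6.36 or later, -17 before that.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", kernel_release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % kernel_release)
    # A missing patch level (e.g. "4.4") is treated as 0.
    version = tuple(int(part) if part is not None else 0
                    for part in match.groups())
    return -1000 if version >= (2, 6, 36) else -17
```

A release string such as the output of "uname -r" can be passed in directly.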

* Changes in SLURM 2.4.4
========================
 -- BGQ - minor fix to make build work in emulated mode.
 -- BGQ - Fix if large block goes into error and the next highest priority jobs
    are planning on using the block.  Previously it would fail those jobs
    erroneously.
 -- BGQ - Fix issue with a cnode going to an error (not SoftwareError) state
    with a job running or trying to run on it.
 -- Execute slurm_spank_job_epilog when there is no system Epilog configured.
 -- Fix for srun --test-only to work correctly with timelimits
 -- BGQ - If a job goes away while still trying to free it up in the
    database, and the job is running on a small block make sure we free up
    the correct node count.
 -- BGQ - Logic added to make sure a job has finished on a block before it is
    purged from the system if its front-end node goes down.
 -- Modify strigger so that a filter option of "--user=0" is supported.
 -- Correct --mem-per-cpu logic for core or socket allocations with multiple
    threads per core.
 -- Fix for older (< glibc 2.4) systems to use euidaccess() instead of eaccess().
 -- BLUEGENE - Do not alter a pending job's node count when changing its
 -- BGQ - Add functionality to make it so we track the actions on a block.
    This is needed for when a free request is added to a block but there are
    jobs finishing up so we don't start new jobs on the block since they will
    fail on start.
 -- BGQ - Fixed InactiveLimit to work correctly to avoid scenarios where a
    user's pending allocation was started with srun and then for some reason
    the slurmctld was brought down and while it was down the srun was removed.
 -- Fixed InactiveLimit math to work correctly
 -- BGQ - Add logic to make it so blocks can't use a midplane with a nodeboard
    in error for passthrough.
 -- BGQ - Make it so if a nodeboard goes in error any block using that midplane
    for passthrough gets removed on a dynamic system.
 -- BGQ - Fix for printing realtime server debug correctly.
 -- BGQ - Cleaner handling of cnode failures when reported through the runjob
    interface instead of through the normal method.
 -- smap - spread node information across multiple lines for larger systems.
 -- Cray - Defer salloc until after PrologSlurmctld completes.
 -- Correction to slurmdbd communications failure handling logic, incorrect
    error codes returned in some cases.

* Changes in SLURM 2.4.3
========================
 -- Accounting - Fix so complete 32 bit numbers can be put in for a priority.
 -- cgroups - Fix so that if the initial directory is non-existent SLURM
    creates it correctly.  Previously the errno wasn't being checked correctly.
 -- BGQ - fixed srun when only requesting a task count and not a node count
    to operate the same way salloc or sbatch did and assign a task per cpu
    by default instead of task per node.
 -- Fix salloc --gid to work correctly.  Reported by Brian Gilmer
 -- BGQ - fix smap to set the correct default MloaderImage
 -- BLUEGENE - updated documentation.
 -- Close the batch job's environment file when it contains no data to avoid
    leaking file descriptors.
 -- Fix sbcast's credential to last till the end of a job instead of the
    previous 20 minute time limit.  The previous behavior would fail for