This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.

* Changes in Slurm 17.11.1
==========================
 -- Fix --with-shared-libslurm option to work correctly.
 -- Make it so only daemons log errors on configuration option duplicates.
 -- Fix for ConstrainDevices=yes to work correctly.
 -- Fix to purge old jobs using burst buffer if slurmctld daemon restarted
    after the job's burst buffer work was already completed.
 -- Set the slurmstepd logging prefix as early as possible.
 -- mpi/pmix: Fix job registration for PMIx v2.1.
 -- Fix uid check for signaling a step with anything but SIGKILL.
 -- Fix uid check when requesting a jobid from a pid.
 -- Return ESLURM_TRANSITION_STATE_NO_UPDATE instead of EAGAIN when trying to
    signal a step that is still running a prolog.
 -- Update Cray slurm_playbook.yaml with latest recommended version.
 -- Only say a prolog is done running after the extern step is launched.
 -- Wait to start a batch step until the prolog has fully run and the extern
    step has fully launched.  Only matters if running with
    PrologFlags=[contain|alloc].
 -- Truncate a range for SlurmctldPort to FD_SETSIZE elements and throw an
    error, otherwise network traffic may be lost due to poll() not detecting
    traffic.
 -- Fix for srun --pack-group option that can reuse/corrupt memory.
 -- Fix handling ultra long hostlists in a hostfile.
 -- X11: fix xauth regex to handle '-' in hostnames again.
 -- Fix potential node reboot timeout problem for "scontrol reboot" command.
 -- Add ability for squeue to sort jobs by submit time.
 -- CRAY - Switch to standard pid files on Cray systems.
 -- Update jobcomp records on duplicate inserts.
 -- Make slurmd oom-unkillable by default, using oom.h OOM_SCORE_ADJ_MIN macro.
 -- If unrecognized configuration file option found then print an appropriate
    fatal error message rather than relying upon random errno value.
 -- Initialize job_desc_msg_t's instead of just memset'ing them.
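
The SlurmctldPort entry above (truncate the range to FD_SETSIZE elements and
report an error) can be illustrated with a minimal sketch. This is not
slurmctld's code: the function name, the "first-last" range syntax, and the
value 1024 for FD_SETSIZE are assumptions made for the example.

```python
# Assumption: 1024 is a common FD_SETSIZE on Linux; the real value comes from
# the C headers of the system that slurmctld was built on.
FD_SETSIZE = 1024


def truncate_port_range(spec, limit=FD_SETSIZE):
    """Parse "first-last" (or a single port) and cap the range at `limit` ports.

    Hypothetical helper. Returns (ports, truncated); `truncated` tells the
    caller that the range was cut and an error should be logged.
    """
    first_text, _, last_text = spec.partition("-")
    first = int(first_text)
    last = int(last_text) if last_text else first
    if last < first:
        raise ValueError(f"invalid port range: {spec!r}")
    truncated = (last - first + 1) > limit
    if truncated:
        # Keep only the first `limit` ports of the requested range.
        last = first + limit - 1
    return range(first, last + 1), truncated
```

A caller would log an error when `truncated` is true and listen only on the
returned ports.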
* Changes in Slurm 17.11.0
==========================
 -- Fix documentation for MaxQueryTimeRange option in slurmdbd.conf.
 -- Avoid srun abort trying to run on heterogeneous job component that has
    ended.
 -- Add SLURM_PACK_JOB_ID,SLURM_PACK_JOB_OFFSET to PrologSlurmctld and
    EpilogSlurmctld environment.
 -- Treat ":" in #SBATCH arguments as fatal error. The "#SBATCH packjob" syntax
    must be used instead.
 -- job_submit/lua plugin: expose pack_job fields to get.
 -- Prevent scheduling deadlock with multiple components of heterogeneous job
    in different partitions (i.e. one heterogeneous job component is higher
    priority in one partition and another component is lower priority in a
    different partition).
 -- Fix for heterogeneous job starvation bug.
 -- Fix some slurmctld memory leaks.
 -- Add SLURM_PACK_JOB_NODELIST to PrologSlurmctld and EpilogSlurmctld
    environment.
 -- If PrologSlurmctld fails for pack job leader then requeue or kill all
    components of the job.
 -- Fix for multiple --pack-group srun arguments given out of order.
 -- Update slurm.conf(5) man page with updated example logrotate script.
 -- Add SchedulerParameters=whole_pack configuration parameter. If set, then
    hold, release and cancel operations on any component of a heterogeneous job
    will be applied to all components.
 -- Handle FQDNs in xauth cookies for x11 display forwarding properly.
 -- For heterogeneous job steps, the srun --open-mode option default value will
    be set to "append".
 -- Pack job scheduling list not being cleared between runs of the backfill
    scheduler resulted in various anomalies.
 -- Fix backward compatibility for pmix versions < 1.1.5.
 -- Fix use-after-free that can lead to slurmstepd segfaulting when setting
    ulimit values.
 -- Add heterogeneous job start data to sdiag output.
 -- X11 forwarding - handle systems with X11UseLocalhost=no set in sshd_config.
 -- Fix potential issue with missing symbols in gres plugins.
 -- Ignore querying clusters in federation that are down from status commands.
 -- Base federated jobs off of origin job and not the local cluster in API.
 -- Remove erroneous double '-' on rpath for libslurmfull.
 -- Remove version from libslurmfull and move it to $LIBDIR/slurm since the ABI
    could change from one version to the other.
 -- Fix unused wall time for reservations.
 -- Convert old reservation records to insert unused wall into the rows.
 -- slurm.spec: further restructuring and improvements.
 -- Allow nodes state to be updated between FAIL and DRAIN.
 -- x11 forwarding: handle build with alternate location for libssh2.
* Changes in Slurm 17.11.0rc3
==============================
 -- Fix extern step to wait until launched before allowing job to start.
 -- Add missing locks around figuring out TRES when clean starting the
    slurmctld.
 -- Cray modulefile: avoid removing /usr/bin from path on module unload.
 -- Make reoccurring reservations show up in the database.
 -- Adjust related resources (cpus, tasks, gres, mem, etc.) when updating
    NumNodes with scontrol.
 -- Don't initialize MPI plugins for batch or extern steps.
 -- slurm.spec - do not install a slurm.conf file under /etc/ld.so.conf.d.
 -- X11 forwarding - fix keepalive message generation code.
 -- If heterogeneous job step is unable to acquire MPI reserved ports then
    avoid referencing NULL pointer. Retry assigning ports ONLY for
    non-heterogeneous job steps.
 -- If any acct_gather_*_init fails, issue a fatal error instead of logging an
    error and continuing.
 -- launch/slurm plugin - Avoid using global variable for heterogeneous job
    steps, which could corrupt memory.
* Changes in Slurm 17.11.0rc2
==============================
 -- Prevent slurmctld abort with NodeFeatures=knl_cray and non-KNL nodes lacking
    any configured features.
 -- The --cpu_bind and --mem_bind options have been renamed to --cpu-bind
    and --mem-bind for consistency with the rest of Slurm's options. Both
    old and new syntaxes are supported for now.
 -- Add slurmdb_connection_commit to the slurmdb api to commit when needed.
 -- Add the federation APIs to the slurmdb.h file.
 -- Add job functions to the db_api.
 -- Fix sacct to always use the db_api instead of sometimes calling functions
    directly.
 -- Fix sacctmgr to always use the db_api instead of sometimes calling functions
    directly.
 -- Fix sreport to always use the db_api instead of sometimes calling functions
    directly.
 -- Make the uid global to the db_api to minimize calls to getuid().
 -- Add support for HWLOC version 2.0.
 -- Added more validation logic for updates to node features.
 -- Added node_features_p_node_update_valid() function to node_features plugin.
 -- If a job is held due to bad constraints and a node's features change then
    test the job again to see if it can run with the new features.
 -- Added node_features_p_changible_feature() function to node_features plugin.
 -- Avoid rebooting a node if a job's requested feature is not under the control
    of the node_features plugin and is not currently active.
 -- node_features/knl_generic plugin: Do not clear a node's non-KNL features
    specified in slurm.conf.
 -- Added SchedulerParameters configuration option "disable_hetero_steps" to
    disable job steps that span multiple components of a heterogeneous job.
    Disabled by default except with mpi/none plugin. This limitation to be
    removed in Slurm version 18.08.
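
The --cpu_bind/--mem_bind rename listed above means wrapper scripts may see
either spelling. A minimal sketch of one way a site wrapper could normalize
old spellings before invoking srun follows. The function and table names are
hypothetical, and this is not how Slurm itself parses options.

```python
# Old spelling -> new spelling, as described in the rename entry above.
RENAMED_OPTIONS = {
    "--cpu_bind": "--cpu-bind",
    "--mem_bind": "--mem-bind",
}


def normalize_args(argv):
    """Return a copy of argv with the old option spellings replaced.

    Handles both "--opt=value" and bare "--opt" forms; every other argument
    is passed through unchanged.
    """
    normalized = []
    for arg in argv:
        name, sep, value = arg.partition("=")
        normalized.append(RENAMED_OPTIONS.get(name, name) + sep + value)
    return normalized
```

For example, `normalize_args(["srun", "--cpu_bind=cores", "./a.out"])` yields
`["srun", "--cpu-bind=cores", "./a.out"]`.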
* Changes in Slurm 17.11.0rc1
==============================
 -- Added the following jobcomp/script environment variables: CLUSTER,
    DEPENDENCY, DERIVED_EC, EXITCODE, GROUPNAME, QOS, RESERVATION, USERNAME.
    The format of LIMIT (job time limit) has been modified to D-HH:MM:SS.
 -- Fix QOS usage factor applying to individual TRES run minute usage.
 -- Print numbers using exponential format if required to fit in allocated
    field width. The sacctmgr and sshare commands are impacted.
 -- Make it so a backup DBD doesn't attempt to create database tables and
    relies on the primary to do so.
 -- By default have Slurm dynamically link to libslurm.so instead of static
    linking.  If static linking is desired configure with
    --without-shared-libslurm.
 -- Change --workdir in sbatch to be --chdir as in all other commands (salloc,
    srun).
 -- Add WorkDir to the job record in the database.
 -- Make the UsageFactor of a QOS work when a qos has the nodecay flag.
 -- Add MaxQueryTimeRange option to slurmdbd.conf to limit accounting query
    ranges when fetching job records.
 -- Add LaunchParameters=batch_step_set_cpu_freq to allow the setting of the cpu
    frequency on the batch step.
 -- CRAY - Fix statically linked applications to CRAY's PMI.
 -- Fix - Raise an error back to the user when trying to update currently
    unsupported core-based reservations.
 -- Do not print TmpDisk space as part of 'slurmd -C' line.
 -- Fix to test MaxMemPerCPU/Node partition limits when scheduling, previously
    only checked on submit.
 -- Work for heterogeneous job support (complete solution in v17.11):
    * Set SLURM_PROCID environment variable to reflect global task rank (needed
      by MPI).
    * Set SLURM_NTASKS environment variable to reflect global task count (needed
      by MPI).
    * In srun, if only some steps are allocated and one step allocation fails,
      then delete all allocated steps.
    * Get SPANK plugins working with heterogeneous jobs. The
      spank_init_post_opt() function is executed once per job component.
    * Modify sbcast command and srun's --bcast option to support heterogeneous
      jobs.
    * Set more environment variables for MPI: SLURM_GTIDS and SLURM_NODEID.
    * Prevent a heterogeneous job allocation from including the same nodes in
      multiple components (required by MPI jobs spanning components).
    * Modify step create logic so that all components of a heterogeneous job
      launched by a single srun command have the same step ID value.
 -- Modify output of "--mpi=list" to avoid duplicates for version numbers in
    mpi/pmix plugin names.
 -- Allow nodes to be rebooted while in a maintenance reservation.
 -- Show nodes as down even when nodes are in a maintenance reservation.
 -- Harden the slurmctld HA stack to mitigate certain split-brain issues.
 -- Work for heterogeneous job support (complete solution in v17.11):
    * Add burst buffer support.
    * Remove srun's --mpi-combine option (always combined).
    * Add SchedulerParameters configuration option "enable_hetero_steps" to
      enable job steps that span multiple components of a heterogeneous job.
      Disabled by default as most MPI implementations and Slurm configurations
      are not currently supported. Limitation to be removed in Slurm version
      18.08.
    * Synchronize application launch across multiple components with debugger.
    * Modify slurm_kill_job_step() to cancel all components of a heterogeneous
      job step (used by MPI).
    * Set SLURM_JOB_NUM_NODES environment variable as needed by MVAPICH.
    * Base time limit upon the time that the latest job component is available
      (after all nodes in all components booted and ready for use).
 -- Add cluster name to smail tool email header.
 -- Speed up arbitrary distribution algorithm.
 -- Modify "srun --mpi=list" output to match valid option input by removing the
    "mpi/" prefix on each line of output.
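
The D-HH:MM:SS layout adopted for the jobcomp/script LIMIT variable in
17.11.0rc1 can be illustrated with a short sketch. The helper below is
hypothetical: it assumes the limit is available as a whole number of seconds
and that the day field is always printed. How Slurm pads the fields, or
renders an unlimited time limit, is not specified here.

```python
def format_limit(total_seconds):
    """Format a non-negative duration in seconds as D-HH:MM:SS.

    Hypothetical helper for illustration; not taken from Slurm's source.
    """
    if total_seconds < 0:
        raise ValueError("duration must be non-negative")
    days, remainder = divmod(total_seconds, 86400)   # 86400 seconds per day
    hours, remainder = divmod(remainder, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"
```

A script that consumes LIMIT can reverse the process by splitting on "-" and
then on ":".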