This file describes changes in recent versions of Slurm. It primarily
documents those changes that are of interest to users and administrators.

* Changes in Slurm 16.05.0pre1
===============================
 -- Add sbatch "--wait" option that waits for job completion before exiting.
    Exit code will match that of spawned job.
 -- Modify advanced reservation save/restore logic for core reservations to
    support configuration changes (changes in configured nodes or cores counts).
 -- Allow ControlMachine, BackupController, DbdHost and DbdBackupHost to be
    either short or long hostname.
 -- Job output and error files can now contain a "%" character by specifying
    a file name with two consecutive "%" characters. For example,
    "sbatch -o slurm.%%.%j" for job ID 123 will generate an output file named
    "slurm.%.123".
 -- Pass user name in Prolog RPC from controller to slurmd when using
    PrologFlags=Alloc. Allows SLURM_JOB_USER env variable to be set when using
    Native Slurm on a Cray.
 -- Add "NumTasks" to job information visible to Slurm commands.
 -- Add mail wrapper script "smail" that will include job statistics in email
    notification messages.
 -- Remove vestigial "SICP" job option (inter-cluster job option). Completely
    different logic will be forthcoming.
 -- Fix case where the primary and backup dbds would both be performing rollup.
 -- Add an ack reply from slurmd to slurmstepd when job setup is done and the
    job is ready to be executed.
 -- Removed support for authd. authd has not been developed or supported for
    several years.
 -- Introduce a new parameter requeue_setup_env_fail in SchedulerParameters.
    A job that fails to setup the environment will be requeued and the node
    drained.
 -- Add ValidateTimeout and OtherTimeout to "scontrol show burst" output.
 -- Increase default sbcast buffer size from 512KB to 8MB.
 -- Enable the hdf5 profiling of the batch step.
 -- Eliminate redundant environment and script files for job arrays.
 -- Implemented configuration checking using the new -C option of slurmctld.
    To check for configuration errors in slurm.conf run: 'slurmctld -C'.
 -- Stop searching sbatch scripts for #PBS directives after 100 lines of
    non-comments. Stop parsing #PBS or #SLURM directives after 1024 characters
    into a line. Required for decent performance with huge scripts.
 -- Add debug flag for timing Cray portions of the code.
 -- Remove all *.la files from RPMs.
 -- Add Multi-Category Security (MCS) infrastructure to permit nodes to be bound
    to specific users or groups.
 -- Install the pmi2 unix sockets in slurmd spool directory instead of /tmp.
 -- Use getaddrinfo and getnameinfo instead of gethostbyaddr and
    gethostbyname.
 -- Finished PMIx implementation.
 -- Implemented the --without=package option for configure.
 -- Fix sshare to show each individual cluster with -M,--clusters option.
 -- Added --deadline option to salloc, sbatch and srun. Jobs which can not be
    completed by the user specified deadline will be terminated with a state of
    "Deadline" or "DL".
 -- Implemented and documented PMIX protocol which is used to bootstrap an
    MPI job. PMIX is an alternative to PMI and PMI2.
 -- Change default CgroupMountpoint (in cgroup.conf) from "/cgroup" to
    "/sys/fs/cgroup" to match current standard.
 -- Add #BSUB options to sbatch to read in from the batch script.
 -- HDF: Change group name of node from nodename to nodeid.
 -- The partition-specific SelectTypeParameters parameter can now be used to
    change the memory allocation tracking specification in the global
    SelectTypeParameters configuration parameter. Supported partition-specific
    values are CR_Core, CR_Core_Memory, CR_Socket and CR_Socket_Memory. If the
    global SelectTypeParameters value includes memory allocation management and
    the partition-specific value does not, then memory allocation management for
    that partition will NOT be supported (i.e. memory can be over-allocated).
    Likewise the global SelectTypeParameters might not include memory management
    while the partition-specific value does.
 -- Burst buffer/cray - Add support for multiple buffer pools including support
    for different resource granularity by pool.
 -- Burst buffer advanced reservation units treated as bytes (per documentation)
    rather than GB.
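
The "%%" file name escape described above for job output and error files can
be illustrated with a minimal Python sketch. This is not Slurm's code: the
function name expand_output_name is hypothetical, and only the "%%" and "%j"
specifiers mentioned in this file are handled. Other specifiers are passed
through unchanged.

```python
def expand_output_name(pattern: str, job_id: int) -> str:
    """Expand a simplified output file name pattern (illustrative only).

    Handles just two specifiers, as an assumption for this sketch:
      %%  -> a literal "%"
      %j  -> the job ID
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt == "%":
                out.append("%")          # escaped percent sign
                i += 2
                continue
            if nxt == "j":
                out.append(str(job_id))  # job ID substitution
                i += 2
                continue
        out.append(ch)                   # ordinary character, copied as-is
        i += 1
    return "".join(out)


print(expand_output_name("slurm.%%.%j", 123))  # slurm.%.123
```
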
* Changes in Slurm 15.08.7
==========================
 -- sched/backfill: If a job can not be started within the configured
    backfill_window, set its start time to 0 (unknown) rather than the end
    of the backfill_window.
 -- Remove the 1024-character limit on lines in batch scripts.
 -- burst_buffer/cray: Round up swap size by configured granularity.
 -- select/cray: Log repeated aeld reconnects.
 -- task/affinity: Disable core-level task binding if more CPUs required than
    available cores.
 -- Preemption/gang scheduling: If a job is suspended at slurmctld restart or
    reconfiguration time, then leave it suspended rather than resume+suspend.
 -- Don't use lower weight nodes for job allocation when topology/tree used.
 -- BGQ - If a cable goes into error state remove the underlying block on
    a dynamic system and mark the block in error on a static/overlap system.
 -- BGQ - Fix regression in 9cc4ae8add7f where blocks would be deleted on
    static/overlap systems when some hardware issue happens when restarting
    the slurmctld.
 -- Log if CLOUD node configured without a resume/suspend program or suspend
    time.
 -- MYSQL - Better locking around g_qos_count which was previously unprotected.
 -- Correct size of buffer used for jobid2str to avoid truncation.
 -- Fix allocation/distribution of tasks across multiple nodes when
    --hint=nomultithread is requested.
 -- If a reservation's nodes value is "all" then track the current nodes in the
    system, even if those nodes change.
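
The burst_buffer/cray entry above rounds a swap size up to the configured
granularity. The arithmetic can be sketched in Python as follows. This is an
illustration of rounding up to a multiple, not the plugin's code: the
function name round_up_to_granularity and the unit-free integers are
assumptions.

```python
def round_up_to_granularity(size: int, granularity: int) -> int:
    """Return the smallest multiple of granularity that is >= size.

    Both arguments are plain non-negative integers in the same
    (unspecified) unit; this is an assumption of the sketch.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    if size < 0:
        raise ValueError("size must be non-negative")
    # Integer ceiling division, then scale back up to a multiple.
    return ((size + granularity - 1) // granularity) * granularity


print(round_up_to_granularity(5, 4))  # 8
print(round_up_to_granularity(8, 4))  # 8
```
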
* Changes in Slurm 15.08.6
==========================
 -- In slurmctld log file, log duplicate job ID found by slurmd. Previously
    this was logged as a prolog/epilog failure.
 -- If a job is requeued while in the process of being launched, remove its
    job ID from slurmd's record of active jobs in order to avoid generating a
    duplicate job ID error when launched for the second time (which would
    drain the node).
 -- Cleanup messages when handling job script and environment variables in
    older directory structure formats.
 -- Prevent triggering gang scheduling within a partition if configured with
    PreemptType=partition_prio and PreemptMode=suspend,gang.
 -- Decrease parallelism in job cancel request to prevent denial of service
    when cancelling huge numbers of jobs.
 -- If all ephemeral ports are in use, try using other port numbers.
 -- Revert the way the Lua library is handled when doing a dlopen, fixing a
    regression in 15.08.5.
 -- Set the debug level of the rmdir message in xcgroup_delete() to debug2.
 -- Fix the qstat wrapper when user is removed from the system but still
    has running jobs.
 -- Log the request to terminate a job at info level if DebugFlags includes
    the Steps keyword.
 -- Fix potential memory corruption in _slurm_rpc_epilog_complete as well as
    _slurm_rpc_complete_job_allocation.
 -- Fix cosmetic display of AccountingStorageEnforce option "nosteps" when
    in use.
 -- If a job can never be started due to unsatisfied job dependencies, report
    the full original job dependency specification rather than the dependencies
    remaining to be satisfied (typically NULL).
 -- Refactor logic to synchronize active batch jobs and their script/environment
    files, reducing overhead dramatically for large numbers of active jobs.
 -- Avoid hard-link/copy of script/environment files for job arrays. Use the
    master job record file for all tasks of the job array.
    NOTE: Job arrays submitted to Slurm version 15.08.6 or later will fail if
    the slurmctld daemon is downgraded to an earlier version of Slurm.
 -- Move slurmctld mail handler to separate thread for improved performance.
 -- Fix containment of adopted processes from pam_slurm_adopt.
 -- If a pending job array has multiple reasons for being in a pending state,
    then print all reasons in a comma separated list.
* Changes in Slurm 15.08.5
==========================
 -- Prevent "scontrol update job" from updating jobs that have already finished.
 -- Show requested TRES in "squeue -O tres" when job is pending.
 -- Backfill scheduler: Test association and QOS node limits before reserving
    resources for pending job.
 -- burst_buffer/cray: If teardown operation fails, sleep and retry.
 -- Clean up the external pids when using the PrologFlags=Contain feature
    and the job finishes.
 -- burst_buffer/cray: Support file staging when job lacks job-specific buffer
    (i.e. only persistent burst buffers).
 -- Added srun option of --bcast to copy executable file to compute nodes.
 -- Fix for advanced reservation of burst buffer space.
 -- BurstBuffer/cray: Add logic to terminate dw_wlm_cli child processes at
    shutdown.
 -- If job can't be launched or requeued, then terminate it.
 -- BurstBuffer/cray: Enable clearing of burst buffer string on completed job
    as a means of recovering from a failure mode.
 -- Fix wrong memory free when parsing SrunPortRange=0-0 configuration.
 -- BurstBuffer/cray: Fix job record purging if cancelled from pending state.
 -- BGQ - Handle database throw correctly when syncing users on blocks.
 -- MySQL - Make sure we don't have a NULL string returned when not
    requesting any specific association.
 -- sched/backfill: If max_rpc_cnt is configured and the backlog of RPCs has
    not cleared after yielding locks, then continue to sleep.
 -- Preserve the job dependency description displayed in 'scontrol show job'
    even if the dependee job was terminated and cleaned up, causing the
    dependent to never run because of DependencyNeverSatisfied.
 -- Correct job task count calculation if only node count and ntasks-per-node
    options supplied.
 -- Make sure the association manager converts any string to be lower case
    as all the associations from the database will be lower case.
 -- Sanity check for xcgroup_delete() to verify incoming parameter is valid.
 -- Fix formatting for sacct with variables that switched from uint32_t to
    uint64_t.
 -- Fix a typo in sacct man page.
 -- Set up extern step to track any children of an ssh if it leaves anything
    else behind.
 -- Prevent slurmdbd divide by zero if no associations defined at rollup time.
 -- Multifactor - Add sanity check to make sure pending jobs are handled
    correctly when PriorityFlags=CALCULATE_RUNNING is set.
 -- Add slurmdb_find_tres_count_in_string() to slurm db perl api.
 -- Make lua dlopen() conditional on version found at build.
 -- sched/backfill - Delay backfill scheduler for completing jobs only if
    CompleteWait configuration parameter is set (make code match documentation).
 -- Release a job's allocated licenses only after epilog runs on all nodes
    rather than at start of termination process.
 -- Cray job NHC delayed until after burst buffer released and epilog completes
    on all allocated nodes.
 -- Fix abort of srun if using PrologFlags=NoHold.
 -- Let devices step_extern cgroup inherit attributes of job cgroup.
 -- Add new hook to Task plugin to be able to put adopted processes in the
    step_extern cgroups.
 -- Fix AllowUsers documentation in burst_buffer.conf man page. Usernames are
    comma separated, not colon delimited.
 -- Fix issue with time limit not being set correctly from a QOS when a job
    requests no time limit.
 -- Various CLANG fixes.
 -- In both sched/basic and backfill: If a job can not be started due to some