This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.
* Changes in SLURM 1.4.0-pre8
=============================
 -- In order to create a new partition using the scontrol command, use
    the "create" option rather than "update" (which will only operate
    upon partitions that already exist).
 -- Added environment variable SLURM_RESTART_COUNT to batch jobs to
    indicate the count of job restarts made.
 -- Added sacctmgr command "show config".
 -- Added the scancel option --nodelist to cancel any jobs running on a
    given list of nodes.
 -- Add partition-specific DefaultTime (default time limit for jobs; if
    not specified, use MaxTime for the partition). Patch from Par
    Andersson, National Supercomputer Centre, Sweden.
 -- Add support for the scontrol command to be able to change the Weight
    associated with nodes. Patch from Krishnakumar Ravi[KK] (HP).
 -- Add DebugFlag configuration option of "CPU_Bind" for detailed CPU
    binding information to be logged.
 -- Fix some significant bugs in task binding logic (possible infinite loops
    and memory corruption).
 -- Add new node state flag of NODE_STATE_MAINT indicating the node is in
    a reservation of type MAINT.
 -- Modified task/affinity plugin to automatically bind tasks to sockets,
    cores, or threads as appropriate based upon resource allocation and
    task count. User can override with srun's --cpu_bind option.
 -- Fix bug in backfill logic for select/cons_res plugin, which resulted in
    error "cons_res:_rm_job_from_res: node_state mis-count".
 -- Add logic to bind a batch job to the resources allocated to that job.
 -- Add configuration parameter MpiParams for (future) OpenMPI port 
    management. Add resv_port_cnt and resv_ports fields to the job step 
    data structures. Add environment variable SLURM_STEP_RESV_PORTS to
    show what ports are reserved for a job step.
 -- Add support for SchedulerParameters=interval=<sec> to control the time
    interval between executions of the backfill scheduler logic.
 -- NOTE: Cold-start (without preserving state) required for upgrade from 
    version 1.4.0-pre7.
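
The SLURM_RESTART_COUNT variable added in this release can be read from inside a batch job to tell a first run from a restart. The following is a minimal sketch, not part of SLURM itself: the helper name restart_count is hypothetical, and treating a missing or malformed value as zero is an assumption rather than documented behavior.

```python
import os


def restart_count(environ=os.environ):
    """Return the job restart count, or 0 if it is unset or not an integer.

    Assumption: an absent SLURM_RESTART_COUNT means the job has not been
    restarted. The source only states that the variable holds the count.
    """
    raw = environ.get("SLURM_RESTART_COUNT", "0")
    try:
        return int(raw)
    except ValueError:
        return 0


if __name__ == "__main__":
    count = restart_count()
    if count > 0:
        print(f"Resuming after {count} restart(s)")
    else:
        print("First run of this job")
```

A job script could use the returned value to decide whether to load a checkpoint instead of starting from scratch.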
* Changes in SLURM 1.4.0-pre7
=============================
 -- Bug fix for preemption with select/cons_res when there are no idle nodes.
 -- Bug fix for use of srun options --exclusive and --cpus-per-task together
    for job step resource allocation (tracking of cpus in use was bad).
 -- Added the srun option --preserve-env to pass the current values of 
    environment variables SLURM_NNODES and SLURM_NPROCS through to the 
    executable, rather than computing them from commandline parameters.
 -- For select/cons_res or sched/gang only: Validate a job's resource 
    allocation socket and core count on each allocated node. If the node's
    configuration has been changed, then abort the job.
 -- For select/cons_res or sched/gang only: Disable updating a node's 
    processor count if FastSchedule=0. Administrators must set a valid
    processor count although the memory and disk space configuration can
    be loaded from the compute node when it starts.
 -- Add configure option "--disable-iso8601" to disable SLURM use of ISO 8601
    time format at the time of SLURM build. Default output for all commands
    is now ISO 8601 (yyyy-mm-ddThh:mm:ss).
 -- Add support for scontrol to explicitly power a node up or down using the
    configured SuspendProg and ResumeProg programs.
 -- Fix bug in select/cons_res logic for tracking the number of allocated
    CPUs on a node when a partition's Shared value is YES or FORCE.
 -- Added configure options "--enable-cray-xt" and "--with-apbasil=PATH" for
    eventual support of Cray-XT systems.
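
Scripts that parse command output may need updating for the ISO 8601 default (yyyy-mm-ddThh:mm:ss) noted above. The following is a minimal sketch of reading and writing that layout with the Python standard library. The function names are hypothetical, and it assumes the timestamp has already been extracted from the surrounding output as a bare string.

```python
from datetime import datetime

# strftime/strptime pattern matching yyyy-mm-ddThh:mm:ss
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_slurm_time(text):
    """Parse a yyyy-mm-ddThh:mm:ss string into a naive datetime."""
    return datetime.strptime(text, ISO_FORMAT)


def format_slurm_time(moment):
    """Render a datetime in the same yyyy-mm-ddThh:mm:ss layout."""
    return moment.strftime(ISO_FORMAT)


if __name__ == "__main__":
    stamp = parse_slurm_time("2009-02-13T23:31:30")
    print(stamp.year, stamp.hour)
    print(format_slurm_time(stamp))
```

Builds configured with "--disable-iso8601" use a different time format, so this pattern would not apply to their output.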
* Changes in SLURM 1.4.0-pre6
=============================
 -- Fix job preemption when sched/gang and select/linear are configured with
    non-sharing partitions.
 -- In select/cons_res ensure that required nodes have available resources.

* Changes in SLURM 1.4.0-pre5
=============================
 -- Correction in setting of SLURM_CPU_BIND environment variable.
 -- Rebuild slurmctld's job select_jobinfo->node_bitmap on restart/reconfigure
    of the daemon rather than restoring the bitmap since the nodes in a system
    can change (be added or removed).
 -- Add configuration option "--with-cpusetdir=PATH" for non-standard 
    locations.
 -- Get new multi-core data structures working on BlueGene systems.
 -- Modify PMI_Get_clique_ranks() to return an array of integers rather 
    than a char * to satisfy PMI standard. Correct logic in 
    PMI_Get_clique_size() for when srun --overcommit option is used.
 -- Fix bug in select/cons_res, allocated a job all of the processors on a 
    node when the --exclusive option is specified as a job submit option.
 -- Add NUMA cpu_bind support to the task affinity plugin. Binds tasks to
    a set of CPUs that belong to a NUMA locality domain with the appropriate
    --cpu-bind option (ldoms, rank_ldom, map_ldom, and mask_ldom), see
    "man srun" for more information.
* Changes in SLURM 1.4.0-pre4
=============================
 -- For task/affinity, force jobs to use a particular task binding by setting
    the TaskPluginParam configuration parameter rather than slurmd's
    SLURM_ENFORCED_CPU_BIND environment variable.
 -- Enable full preemption of jobs by partition with select/cons_res 
    (cons_res_preempt.patch from Chris Holmes, HP).
 -- Add configuration parameter DebugFlags to provide detailed logging for
    specific subsystems (steps and triggers so far).
 -- srun's --no-kill option is passed to slurmctld so that a job step is
    killed even if the node where srun executes goes down, unless the
    --no-kill option is used (the previous termination logic would fail if
    srun was not responding).
 -- Transfer a job step's core bitmap from the slurmctld to the slurmd
    within the job step credential.
 -- Add cpu_bind, cpu_bind_type, mem_bind and mem_bind_type to job allocation
    request and job_details structure in slurmctld. Add support for --cpu_bind
    and --mem_bind options from salloc and sbatch commands.
* Changes in SLURM 1.4.0-pre3
=============================
 -- Internal changes: CPUs per node changed from 32-bit to 16-bit size.
    Node count fields changed from 16-bit to 32-bit size in some structures.
 -- Remove select plugin functions select_p_get_extra_jobinfo(),
    select_p_step_begin() and select_p_step_fini().
 -- Remove the following slurmctld job structure fields: num_cpu_groups,
    cpus_per_node, cpu_count_reps, alloc_lps_cnt, alloc_lps, and used_lps.
    Use equivalent fields in new "select_job" structure, which is filled
    in by the select plugins.
 -- Modify mem_per_task in job step request from 16-bit to 32-bit size.
    Use new "select_job" structure for the job step's memory management.
 -- Add core_bitmap_job to slurmctld's job step structure to identify
    which specific cores are allocated to the step.
 -- Add new configuration option OverTimeLimit to permit jobs to exceed 
    their (soft) time limit by a configurable amount. Backfill scheduling
    will be based upon the soft time limit.
 -- Remove select_g_get_job_cores(). That data is now within the slurmctld's
    job structure.

* Changes in SLURM 1.4.0-pre2
=============================
 -- Remove srun's --ctrl-comm-ifhn-addr option (for PMI/MPICH2). It is no
    longer needed.
 -- Modify power save mode so that nodes can be powered off when idle. See
    https://computing.llnl.gov/linux/slurm/power_save.html or 
    "man slurm.conf" (SuspendProgram and related parameters) for more 
    information.
 -- Added configuration parameter PrologSlurmctld, which can be used to boot
    nodes into a particular state for each job. See "man slurm.conf" for 
    details.
 -- Add configuration parameter CompleteTime to control how long to wait for 
    a job's completion before allocating already released resources to pending
    jobs. This can be used to reduce fragmentation of resources. See
    "man slurm.conf" for details.
 -- Make default CryptoType=crypto/munge. OpenSSL is now completely optional.
 -- Make default AuthType=auth/munge rather than auth/none.
 -- Change output format of "sinfo -R" from "%35R %N" to "%50R %N".
* Changes in SLURM 1.4.0-pre1
=============================
 -- Save/restore a job's task_distribution option on slurmctld restart.
    NOTE: SLURM must be cold-started on conversion from version 1.3.x.
 -- Remove task_mem from job step credential (only job_mem is used now).
 -- Remove --task-mem and --job-mem options from salloc, sbatch and srun
    (use --mem-per-cpu or --mem instead).
 -- Remove DefMemPerTask from slurm.conf (use DefMemPerCPU or DefMemPerNode
    instead).
 -- Modify slurm_step_launch API call. Move launch host from function argument
    to element in the data structure slurm_step_launch_params_t, which is
    used as a function argument.
 -- Add state_reason_string to job state with optional details about why
    a job is pending.
 -- Make "scontrol show node" output match scontrol input for some fields
    ("Cores" changed to "CoresPerSocket", etc.).
 -- Add support for a new node state "FUTURE" in slurm.conf. These node records
    are created in SLURM tables for future use without a reboot of the SLURM
    daemons, but are not reported by any SLURM commands or APIs.

* Changes in SLURM 1.3.14
=========================
 -- Fix bug in squeue command with sort on job name ("-S j" option) for jobs
    that lack a name. Previously generated an invalid memory reference.
 -- Permit the TaskProlog to write to the job's standard output by writing
    a line containing the prefix "print " to its standard output.
 -- Fix for making the slurmdbd agent thread start up correctly when 
    stopped and then started again.
 -- Prevent the Linux out of memory killer from killing the slurmd or
    slurmstepd daemons. Patch from Hongjia Cao, NUDT.
 -- Add squeue option to report jobs by account (-U or --account). Patch from
    Par Andersson, National Supercomputer Centre, Sweden.
 -- Add -DNUMA_VERSION1_COMPATIBILITY to Makefile CFLAGS for proper behavior
    when building with NUMA version 2 APIs.
 -- BLUEGENE - slurm works on a BGP system.
 -- BLUEGENE - slurm handles HTC blocks.
 -- BLUEGENE - Added option DenyPassthrough in the bluegene.conf.  Can be set
    to any combination of X,Y,Z to not allow passthroughs when running in 
    dynamic layout mode.
 -- Fix bug in logic to remove a job's dependency, could result in abort.
 -- Add new error message to sched/wiki and sched/wiki2 (Maui and Moab) for
    STARTJOB request: "TASKLIST includes non-responsive nodes".
 -- Fix bug in select/linear when used with sched/gang that can result in a 
    job's required or excluded node specification being ignored.
 -- Add logic to handle message connect timeouts (timed-out.patch from 
    Chuck Clouston, Bull).
 -- Update python-hostlist code from Kent Engström (NSC) to v1.5
    - Add hostgrep utility to search for lines matching a hostlist.
    - Make each "-" on the command line count as one hostlist argument.
      If multiple hostlists are given on stdin they are combined into a
      union hostlist before being used in the way requested by the
      options.
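
The TaskProlog "print " prefix described above can be illustrated with a short program. The following is a minimal sketch, assuming the TaskProlog is configured to run an executable Python script. The helper name prolog_lines and the message text are hypothetical.

```python
import sys


def prolog_lines(messages):
    """Prefix every line of every message with "print ".

    Multi-line messages are split so that each output line carries the
    prefix on its own.
    """
    lines = []
    for message in messages:
        for part in message.splitlines():
            lines.append("print " + part)
    return lines


if __name__ == "__main__":
    # Each prefixed line written to this script's standard output is meant
    # to be relayed to the job's standard output by SLURM.
    for line in prolog_lines(["Task prolog finished"]):
        sys.stdout.write(line + "\n")
```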
