RELEASE NOTES FOR SLURM VERSION 2.1
Moe Jette's avatar
Moe Jette committed
14 October 2009 (through SLURM 2.1.0-pre4)
SLURM state files in version 2.1 are different from those of version 2.0.
After installing SLURM version 2.1, plan to restart without preserving 
jobs or other state information. While SLURM version 2.0 is still running, 
cancel all pending and running jobs (e.g.
"scancel --state=pending; scancel --state=running"). Then stop and restart 
daemons with the "-c" option or use "/etc/init.d/slurm startclean".

If using the slurmdbd (SLURM DataBase Daemon) you must update this first.  
The 2.1 slurmdbd will work with SLURM daemons at version 2.0.0 and above.  
You will not need to update all clusters at the same time, but it is very 
important to update slurmdbd first and have it running before updating 
any clusters that make use of it.  No real harm will come from updating 
your systems before the slurmdbd, but they will not talk to each other 
until you do.
There are substantial changes in the slurm.conf configuration file. It 
is recommended that you rebuild your configuration file using the tool
doc/html/configurator.html that comes with the distribution.


* The sched/gang plugin has been removed. The logic is now directly within the 
  slurmctld daemon so that gang scheduling and/or job preemption can be 
  performed with a backfill scheduler.
* Preempted jobs can now be canceled, checkpointed or requeued rather than 
  only suspended.
* Support for QOS (Quality Of Service) has been added to the accounting 
  database with configurable limits, priority and preemption rules.
* Added "--signal=<int>@<time>" option to salloc, sbatch and srun commands to
  notify programs before reaching the end of their time limit.
* Added squeue option "--start" to report expected start time of pending jobs.
  The times are only set if the backfill scheduler is in use.
* The pam_slurm Pluggable Authentication Module for SLURM previously
  distributed separately has been moved within the main SLURM distribution
  and is packaged as a separate RPM.
* Support has been added for OpenSolaris.
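
The "--signal" option above only helps if the program reacts to the signal.
The following is a minimal Python sketch of one way an application might do
that, not a SLURM-provided interface. The choice of SIGUSR1 and the names
on_warning, run and save_state are assumptions made for illustration; the
signal must match whatever <int> is passed to --signal.

```python
import os
import signal

# Set by the handler; the work loop checks it between work units.
time_limit_warning = False


def on_warning(signum, frame):
    """Record that the pre-time-limit signal has arrived."""
    global time_limit_warning
    time_limit_warning = True


# Assumption: the job is launched with the number of SIGUSR1 as <int>.
signal.signal(signal.SIGUSR1, on_warning)


def run(work_units, save_state):
    """Run callables in order; save progress and stop once warned.

    save_state is a hypothetical callback that receives the number of
    completed units so the job can resume later.
    """
    done = 0
    for unit in work_units:
        if time_limit_warning:
            save_state(done)
            break
        unit()
        done += 1
    return done
```

A job using this sketch might be started with something like
"srun --signal=10@60 python app.py", assuming 10 is the number of SIGUSR1 on
the system and <time> is given in seconds.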

CONFIGURATION FILE CHANGES (see "man slurm.conf" for details)
* Added PreemptType parameter to specify the plugin used to identify 
  preemptable jobs (partition priority or quality of service) and 
  PreemptMode to identify how to preempt jobs (requeue, cancel, checkpoint,
  or suspend).
* The sched/gang plugin has been removed; use PreemptType=preempt/partition_prio
  and PreemptMode=suspend,gang.
* ControlMachine changed to accept multiple comma-separated hostnames for 
  support of some high-availability architectures.
* Added MaxTasksPerNode to control how many tasks the slurmd can launch.
* Removed SrunIOTimeout parameter.
* Added SchedulerParameters option of "max_job_bf=#" to control how far down
  the queue of pending jobs SLURM searches when attempting to backfill 
  schedule them. The default value is 50 jobs.
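
As an illustration of the parameters listed above, a slurm.conf excerpt that
replaces the old sched/gang setup while keeping the backfill scheduler might
look like the sketch below. The combination shown and the value 50 (the stated
default) are assumptions for illustration, not a recommended configuration;
see "man slurm.conf" for the authoritative syntax.

```
# Illustrative slurm.conf excerpt (assumed values, adjust for your site)
SchedulerType=sched/backfill
SchedulerParameters=max_job_bf=50
PreemptType=preempt/partition_prio
PreemptMode=suspend,gang
```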

COMMAND CHANGES (see man pages for details)
* Added a --detail option to "scontrol show job" to display the cpu/memory
  allocation information on a node-by-node basis.
* sacctmgr show problems command added to display problems in the accounting 
  database (e.g. accounts with no users, users with no UID, etc.).
* Several redundant squeue output and sorting options have been removed: 
  "%o" (use "%D"), "%b" (use "%S"), "%X", "%Y", and "%Z" (use "%z").
* Standardized on the use of the '-Q' flag for all commands that offer the
  --quiet option.
* salloc's --wait=<secs> option has been deprecated in favor of the 
  --immediate=<secs> option to match the srun command.
* Scalability of sview dramatically improved.
* Added reservation flag of "OVERLAP" to permit a new reservation to use
  nodes already in another reservation.
* Added sacct ability to use --format NAME%LENGTH similar to sacctmgr.
* For salloc, sbatch and srun commands, ignore _maximum_ values for
  --sockets-per-node, --cores-per-socket and --threads-per-core options.
  Remove --mincores, --minsockets, --minthreads options (map them to 
  minimum values of --sockets-per-node, --cores-per-socket and 
  --threads-per-core for now).
* Change scontrol show job info: ReqProcs (number of processors requested) 
  is replaced by NumProcs (number of processors requested or actually 
  allocated) and ReqNodes (number of nodes requested) is replaced by NumNodes 
  (number of nodes requested or actually allocated).
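
Scripts that pass a saved format string to squeue may need updating for the
removed options listed above. The following Python sketch shows one possible
way to rewrite such strings. The function name migrate_squeue_format is
hypothetical, and the sketch only substitutes the replacements that are
spelled out one-to-one ("%o", "%b", "%Z"); it merely flags "%X" and "%Y" for
manual review, which is a cautious assumption rather than documented behavior.

```python
import re

# Replacements taken from the note above (removed code -> replacement).
ONE_TO_ONE = {"o": "D", "b": "S", "Z": "z"}
# Removed codes that this sketch does not rewrite automatically.
NEEDS_REVIEW = {"X", "Y"}

# A field looks like %[[.]size]type, for example "%.7i" or "%9P".
_FIELD = re.compile(r"%(\.?\d*)([A-Za-z])")


def migrate_squeue_format(fmt):
    """Return (new_format, flagged) for an squeue format string."""
    flagged = []

    def swap(match):
        size, code = match.group(1), match.group(2)
        if code in NEEDS_REVIEW:
            flagged.append(code)
        return "%" + size + ONE_TO_ONE.get(code, code)

    return _FIELD.sub(swap, fmt), flagged


new_fmt, flagged = migrate_squeue_format("%.7i %9P %o %b %X")
print(new_fmt)   # %.7i %9P %D %S %X
print(flagged)   # ['X']
```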


BLUEGENE SPECIFIC CHANGES
* scontrol show blocks option added.
* scontrol delete block and update block can now remove blocks on dynamic 
  layout configuration.
* sinfo and sview now display correct CPU counts for partitions.
* Jobs waiting for a block to boot will now be reported in Configuring state.
* Vastly improved the dynamic layout mode algorithm.
* Environment variables such as SLURM_NNODES, SLURM_JOB_NUM_NODES and
  SLURM_JOB_CPUS_PER_NODE now reference cnode counts instead of midplane
  counts.  SLURM_NODELIST still references midplane names.

* A mechanism has been added for SPANK plugins to set environment variables 
  for Prolog, Epilog, PrologSlurmctld and EpilogSlurmctld programs using the
  functions spank_get_job_env, spank_set_job_env, and spank_unset_job_env. See 
  "man spank" for more information.
* Set a node's power_up/configuring state flag while PrologSlurmctld is
  running for a job allocated to that node.
* Added sched/wiki2 (Moab) JOBMODIFY command support for VARIABLELIST option
  to set supplemental environment variables for pending batch jobs.
* The RPM previously named "slurm-aix-federation-<version>.rpm" has been 
  renamed to just "slurm-aix-<version>.rpm" (the federation switch plugin may 
  not be present).
* Environment variables SLURM_TOPOLOGY_ADDR and SLURM_TOPOLOGY_ADDR_PATTERN
  added to describe the network topology for each launched task when 
  TopologyType=topology/tree is configured.
* Add new job wait reason, ReqNodeNotAvail: Required node is not available 
  (down or drained).
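
A launched task could read the two topology variables mentioned above roughly
as in the Python sketch below. It assumes the period-separated layout described
in the srun man page (switch names from the top level down, ending with the
node name, with a matching list of "switch"/"node" component types). The
function name topology_path and the sample values in the comment are
hypothetical.

```python
import os


def topology_path(environ=os.environ):
    """Pair each component type with its name, top-level switch first.

    Returns an empty list when the variables are not set, for example
    when topology/tree is not configured.
    """
    addr = environ.get("SLURM_TOPOLOGY_ADDR")
    pattern = environ.get("SLURM_TOPOLOGY_ADDR_PATTERN")
    if not addr or not pattern:
        return []
    names = addr.split(".")
    kinds = pattern.split(".")
    if len(names) != len(kinds):
        raise ValueError("topology address and pattern differ in length")
    return list(zip(kinds, names))


# Hypothetical values: SLURM_TOPOLOGY_ADDR=s0.s4.tux1 and
# SLURM_TOPOLOGY_ADDR_PATTERN=switch.switch.node
for kind, name in topology_path():
    print(kind, name)
```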