Skip to content
Snippets Groups Projects
Commit a7a3664e authored by Moe Jette's avatar Moe Jette
Browse files

Start work on slurm version 1.3

parent 8bb40a88
No related branches found
No related tags found
No related merge requests found
This file describes changes in recent versions of SLURM. It primarily This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins. documents those changes that are of interest to users and admins.
* Changes in SLURM 1.3.0-pre1
=============================
* Changes in SLURM 1.2.10 * Changes in SLURM 1.2.10
========================= =========================
......
RELEASE NOTES FOR SLURM VERSION 1.2 RELEASE NOTES FOR SLURM VERSION 1.3
13 March 2007 6 June 2007
IMPORTANT NOTE: IMPORTANT NOTE: (This may not apply, check back later)
SLURM state files in version 1.2 are different from those of version 1.1. SLURM state files in version 1.3 are different from those of version 1.2.
After installing SLURM version 1.2, plan to restart without preserving After installing SLURM version 1.2, plan to restart without preserving
jobs or other state information. Restart daemons with the "-c" option or jobs or other state information. Restart daemons with the "-c" option or
use "/etc/init.d/slurm startclean". use "/etc/init.d/slurm startclean".
NEW COMMANDS COMMAND CHANGES
* Several new commands have been added to perform individual srun functions. * The srun options --allocate, --attach and --batch have been removed.
The srun command will continue to exist, but these commands may offer Use the new commands added in SLURM version 1.2 for this functionality:
greater clarity and ease of use. The srun options --allocate, --attach,
and --batch will eventually cease being supported. The new commands are
salloc - Create a job allocation (functions like "srun --allocate") salloc - Create a job allocation (functions like "srun --allocate")
sattach - Attach to an existing job step (functions like "srun --attach") sattach - Attach to an existing job step (functions like "srun --attach")
sbatch - Submit a batch job script (functions like "srun --batch") sbatch - Submit a batch job script (functions like "srun --batch")
See the individual man pages for more information. See the individual man pages for more information.
* A new command, strigger, was added to manage event triggers. This was
added in SLURM version 1.2.2. See strigger's man pages for details.
* A new GUI is available for viewing and modifying state information, sview.
Note that sview will only be built on systems which have libglade-2.0 and
gtk+-2.0 installed.
CONFIGURATION FILE CHANGES CONFIGURATION FILE CHANGES
* The slurm.conf configuration file now supports a "Include" directive to
include other files inline.
* Added new configuration parameter MessageTimeout (replaces #define in the
code).
* Added new configuration parameter MailProg (in case mail program is not
at "/bin/mail").
* Added new configuration parameter TaskPluginParam (for future use, to
control which task affinity functions are used, "cpusets" or "sched" for
schedsetaffinity).
* Removed defunct configuration parameter, ShedulerAuth. It has been rendered
obsolete by the new wiki.conf file.
* Several new configuration parameters can be used to specify architectural
details of the nodes: Sockets, CoresPerSocket, and ThreadsPerCore.
For multi-core systems, these parameters will typically need to be
specified in the configuration file.
* For select/cons_res, an assortment of consumable resources can be supported
including: CPUs, cores, sockets and/or memory. See new configuration
parameter SelectTypeParameters.
* For BlueGene systems only: Alternate BlrtsImage, LinuxImage, MloaderImage,
and RamDiskImage file can be specified along with specific groups that
can use them.
* MPI plugin for use with MPICH-GM renamed from "mpich-gm" to "mpichgm".
(There was previously some inconsistent naming.)
OTHER CHANGES OTHER CHANGES
* Several srun options are available to control layout of tasks across the
cores on a node: --extra-node-info, --ntasks-per-node, --sockets-per-node,
--cores-per-socket, --threads-per-core, --minsockets, --mincores,
--minthreads, --ntasks-per-socket, --ntasks-per-core, and --ntasks-per-node.
* New scontrol, sinfo and squeue options can be used to view socket, core,
and task details by job and/or node.
* The scontrol and squeue commands will provide a more detailed explanation
of why a job was put into FAILED state.
* Permit batch jobs to be requeued ("scontrol requeue <jobid>" or
"srun --batch --no-requeue ..." to prevent).
* View a job's exit code using "scontrol show job".
* Added "account" field to job and step accounting information and sacct output.
* Added new job field, "comment". Set by srun, salloc and sbatch. View
with "scontrol show job". Used by sched/wiki2.
* Added support for OS X operating system.
* A job step's actual run time is maintained with suspend/resume use.
* Added support for Portable Linux Processor Affinity in the task/affinity
plugin (PLPA, see http://www.open-mpi.org/software/plpa).
* There is a new version of the Wiki scheduler plugin to interface with
the Moab Scheduler. It can be accessed with the slurm.conf parameter
"SchedulerType=sched/wiki2". Continue using "SchedulerType=sched/wiki"
for the Maui Scheduler at this time and see the section below for details.
If you use sched/wiki2, you will at least need to add a wiki.conf file.
Key differences include:
- Node and job data returned is correct (several errors in old plugin)
- Node data includes partition information (CCLASS field)
- Improved error handling
- Support added for configuration file ("wiki.conf" in same directory
as "slurm.conf" file, see "man wiki.conf" for details)
- Support added for job modify, suspend/resume, and requeue
- Authentication of communications now supported
- Notification of scheduler on events (job submitted or termination)
- There is no longer a "sched-wiki" RPM. All files are in the main RPM.
* The sched/wiki plugin has been re-written for use with the Maui Scheduler.
Some time in early 2007 the Maui Scheduler should be upgraded to use
the additional capabilities offered by sched/wiki2. Changes in sched/wiki
include:
- Node and job data returned is correct (several errors in old plugin)
- Support added for configuration file ("wiki.conf" in same directory
as "slurm.conf" file, see "man wiki.conf" for details). Currently
wiki.conf is only used to store an authentication key, which is not
used by most versions of the Maui Scheduler.
- There is no longer a "sched-wiki" RPM. All files are in the main RPM.
* The features associated with a node can now be changed using the
"scontrol update" and "sview" commands.
* For BlueGene system only: Added "--reboot" option to srun, salloc, and
sbatch commands to force booting of nodes before starting the job.
* For BlueGene Systems only: Changed way of keeping track of smaller
partitions using ionode range instead of quarter nodecard notation.
(i.e. bgl000[0-3] instead of bgl000.0.0)
* For BlueGene Systems only: New scontrol options to down specific ionodes
without having to create blocks there (i.e.
"scontrol update subbp=bgl000[0-3] state=error"). This will down the first
four ionodes or on a BGL system the first nodecard in the base partition.
* For BlueGene Systems only: sinfo will now display correct node counts for
blocks in an error state or blocks and small blocks that are allocated.
This results in some overlap of nodes displayed if you are running with
Small blocks but the node counts will be correct for the states given.
* For BlueGene Systems only: A state save/recover file has been added to save
the state of the system (primarily for blocks in an error state) to be loaded
when the system has been brought back up.
* Added "scontrol listpids" command to identify processes associated with
specific jobs and/or steps.
See the file NEWS for more details. See the file NEWS for more details.
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment