Commit a8e86557 authored by Moe Jette

minor updates to RELEASE_NOTES, complete re-write of RELEASE_NOTES_LLNL

parent 5c63b6a5
@@ -34,6 +34,7 @@ HIGHLIGHTS
 * Added -"-signal=<int>@<time>" option to salloc, sbatch and srun commands to
 notify programs before reaching the end of their time limit.
 * Added squeue option "--start" to report expected start time of pending jobs.
+The times are only set if the backfill scheduler is in use.
 * The pam_slurm Pluggable Authentication Module for SLURM previously
 distributed separately has been moved within the main SLURM distribution
 and is packaged as a separate RPM.
@@ -60,7 +61,7 @@ COMMAND CHANGES (see man pages for details)
 * Added a --detail option to "scontrol show job" to display the cpu/memory
 allocation informaton on a node-by-node basis.
 * sacctmgr show problems command added to display problems in the accounting
-database (e.g. accounts with no users, users with no UID, etc.)
+database (e.g. accounts with no users, users with no UID, etc.).
 * Several redundant squeue output and sorting options have been removed:
 "%o" (use %D"), "%b" (use "%S"), "%X", %Y, and "%Z" (use "%z").
 * Standardized on the use of the '-Q' flag for all commands that offer the
......
-LLNL-SPECIFIC RELEASE NOTES FOR SLURM VERSION 2.0
-19 February 2009
+LLNL CHAOS-SPECIFIC RELEASE NOTES FOR SLURM VERSION 2.1
+16 October 2009

-For processor-scheduled clusters (*not* allocating whole nodes to jobs):
-Set "DefMemPerCPU" and "MaxMemPerCPU" as appropriate to restrict memory
-available to a job. Also set "JobAcctGatherType=jobacct_gather/linux"
-for enforcement (periodic sampling of memory use by the job). You can change
-said sampling rate from the default (every 30 seconds) by setting the
-"JobAcctGatherFrequency" option to a different number of seconds in
-the slurm.conf.
+This lists only the most significant changes from Slurm v2.0 to v2.1
+with respect to Chaos systems. See the file RELEASE_NOTES for other
+changes.

-For InfiniBand switch systems, set TopologyType=topology/tree in slurm.conf
-and add switch topology information to a new file called topology.conf.
-Options used are SwitchName, Switches, and Nodes. The SwitchName is any
-convenient name for bookkeeping purposes only. For example:
-# Switch Topology Information
-SwitchName=s0 Nodes=tux[0-11]
-SwitchName=s1 Nodes=tux[12-23]
-SwitchName=s2 Nodes=tux[24-35]
-SwitchName=s3 Switches=s[0-2]
+For system administrators:
+* The pam_slurm Pluggable Authentication Module for SLURM previously
+distributed separately has been moved within the main SLURM distribution
+and is packaged as a separate RPM.
+* Configuration parameter MaxTasksPerNode has been added to control how many
+tasks that the slurmd can launch.
+* Added command "sacctmgr show problems" to display problems in the accounting
+database (e.g. accounts with no users, users with no UID, etc.).

-Remove the "preserve-env.so" SPANK plugin. The functionality is now
-directly in SLURM.
+Mostly for users:
+* Added -"-signal=<int>@<time>" option to salloc, sbatch and srun commands to
+notify programs before reaching the end of their time limit.
+* Added a --detail option to "scontrol show job" to display the cpu/memory
+allocation informaton on a node-by-node basis.
+* Add new job wait reason, ReqNodeNotAvail: Required node is not available
+(down or drained).

-SLURM version 2.0 must use a database daemon (slurmdbd) at version 2.0
-or higher. While we are testing version 2.0, set "AccountingStoragePort=????".
-Once we upgrade the production slurmdbd to version 2.0, this change will
-not be required. You can likewise test 1.3.7+ clusters with the same port
-since 2.0 slurmdbd will talk to 1.3.7+ SLURM.
-SLURM state files in version 2.0 are different from those of version 1.3.
-After installing SLURM version 2.0, plan to restart without preserving
-jobs or other state information. While SLURM version 1.3 is still running,
+SLURM state files in version 2.1 are different from those of version 2.1.
+After installing SLURM version 2.1, plan to restart without preserving
+jobs or other state information. While SLURM version 2.0 is still running,
 cancel all pending and running jobs (e.g.
 "scancel --state=pending; scancel --state=running"). Then stop and restart
 daemons with the "-c" option or use "/etc/init.d/slurm startclean".
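
The "--signal=<int>@<time>" option described in the notes above only helps if the job's program reacts to the signal. The following is a minimal sketch of one way a Python program could do that on a POSIX system. It assumes the job was submitted with SIGUSR1 as the warning signal; the names run, work_steps and save_state are hypothetical and appear in neither SLURM nor the release notes.

```python
import os
import signal

# Assumption: the job requested SIGUSR1 as its time-limit warning signal.
WARN_SIGNAL = signal.SIGUSR1


def run(work_steps, save_state, warn_signal=WARN_SIGNAL):
    """Run callables in order and stop early once the warning signal arrives.

    work_steps: iterable of zero-argument callables (hypothetical work units).
    save_state: callable that receives the number of completed steps.
    Returns the number of steps that completed.
    """
    warned = {"flag": False}

    def on_warning(signum, frame):
        # Only record the event here; the real work happens in the main loop.
        warned["flag"] = True

    previous = signal.signal(warn_signal, on_warning)
    completed = 0
    try:
        for step in work_steps:
            if warned["flag"]:
                # Time limit is near: checkpoint and stop cleanly.
                save_state(completed)
                break
            step()
            completed += 1
    finally:
        # Restore whatever handler was installed before.
        signal.signal(warn_signal, previous)
    return completed
```

The handler only sets a flag, so the checkpoint is written from the main loop between work units and not from inside the signal handler. The signal number and lead time passed to "--signal" are site and job choices and are not specified here.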