Commit 95fd6cc6 authored by Moe Jette
parent 663dd99f
@@ -3,7 +3,7 @@ documents those changes that are of interest to users and admins.
 * Changes in SLURM 1.2.0-pre4
 =============================
- -- added node_inx to job_step_info_t to get the node indices for mapping out
+ -- Added node_inx to job_step_info_t to get the node indices for mapping out
 steps in a job by nodes.
 -- sview grid added
......
@@ -54,6 +54,7 @@ identification. For example, "elx0000" might be used to designate
 the ethernet address for node "lx0000".
 By default the \fBBackupAddr\fR will be identical in value to
 \fBBackupController\fR.
 .TP
 \fBBackupController\fR
 The name of the machine where SLURM control functions are to be
@@ -63,6 +64,8 @@ as a controller only upon the failure of ControlMachine and will revert
 to a "standby" mode when the ControlMachine becomes available once again.
 This should be a node name without the full domain name (e.g. "lx0002").
 While not essential, it is recommended that you specify a backup controller.
+See the \fBRELOCATING CONTROLLERS\fR section if you change this.
 .TP
 \fBCacheGroups\fR
 If set to 1, the slurmd daemon will cache /etc/groups entries.
@@ -86,11 +89,14 @@ identification. For example, "elx0000" might be used to designate
 the ethernet address for node "lx0000".
 By default the \fBControlAddr\fR will be identical in value to
 \fBControlMachine\fR.
 .TP
 \fBControlMachine\fR
 The name of the machine where SLURM control functions are executed.
 This should be a node name without the full domain name (e.g. "lx0001").
 This value must be specified.
+See the \fBRELOCATING CONTROLLERS\fR section if you change this.
 .TP
 \fBEpilog\fR
 Fully qualified pathname of a script to execute as user root on every
@@ -890,6 +896,31 @@ The default value is "NO".
 \fBState\fR
 State of partition or availability for use. Possible values
 are "UP" or "DOWN". The default value is "UP".
+.SH "RELOCATING CONTROLLERS"
+If the cluster's computers used for the primary or backup controller
+will be out of service for an extended period of time, it may be
+desirable to relocate them.
+In order to do so, follow this procedure:
+.LP
+1. Stop the SLURM daemons
+.br
+2. Modify the slurm.conf file on all nodes
+.br
+3. Restart the SLURM daemons
+.LP
+There should be no loss of any running or pending jobs.
+Ensure that any nodes added to the cluster have the current
+slurm.conf file installed.
+.LP
+\fBCAUTION:\fR If two nodes are simultaneously configured as the
+primary controller (two nodes on which \fBControlMachine\fR specifies
+the local host and the \fBslurmctld\fR daemon is executing on each),
+system behavior will be destructive.
+If a compute node has an incorrect \fBControlMachine\fR or
+\fBBackupController\fR parameter, that node may be rendered
+unusable, but no serious harm will result.
 .SH "EXAMPLE"
 .LP
 #
......
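The RELOCATING CONTROLLERS text added by this commit requires that every node carry the same slurm.conf after step 2, and warns about nodes that disagree on \fBControlMachine\fR or \fBBackupController\fR. The following is a minimal Python sketch of how such a consistency check might look before restarting the daemons. It is not part of SLURM: the helper names `parse_controllers` and `group_by_controllers` are hypothetical, and the parser assumes each of the two parameters sits on its own `Name=value` line with `#` starting a comment.

```python
def parse_controllers(conf_text):
    """Extract ControlMachine and BackupController from slurm.conf text.

    Assumption: each parameter is on its own "Name=value" line and
    parameter names are matched case-insensitively.
    """
    wanted = {
        "controlmachine": "ControlMachine",
        "backupcontroller": "BackupController",
    }
    found = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        name = wanted.get(key.strip().lower())
        if name:
            found[name] = value.strip()
    return found


def group_by_controllers(confs_by_node):
    """Group node names by the (ControlMachine, BackupController) pair
    found in each node's copy of slurm.conf.

    confs_by_node maps a node name to the text of its slurm.conf.
    More than one group in the result means the copies disagree.
    """
    groups = {}
    for node, text in confs_by_node.items():
        settings = parse_controllers(text)
        key = (settings.get("ControlMachine"), settings.get("BackupController"))
        groups.setdefault(key, []).append(node)
    return {key: sorted(nodes) for key, nodes in groups.items()}


if __name__ == "__main__":
    # Hypothetical cluster: one node still has a stale ControlMachine.
    confs = {
        "lx0001": "ControlMachine=lx0001\nBackupController=lx0002\n",
        "lx0002": "ControlMachine=lx0001\nBackupController=lx0002\n",
        "lx0003": "ControlMachine=lx0009  # stale copy\n",
    }
    for pair, nodes in group_by_controllers(confs).items():
        print(pair, nodes)
```

How the per-node file contents are collected (copying, a shared file system, and so on) is left open here; the sketch only compares text that has already been gathered.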