Commit d407b4dd, authored 11 years ago by Morris Jette
Clarify node state configuration in slurm.conf
parent 25b346b2
doc/man/man5/slurm.conf.5 (48 additions, 24 deletions)
@@ -2998,20 +2998,54 @@ The default value is 1.
 .TP
 \fBState\fR
 State of the node with respect to the initiation of user jobs.
-Acceptable values are "DOWN", "DRAIN", "FAIL", "FAILING" and "UNKNOWN".
-"DOWN" indicates the node failed and is unavailable to be allocated work.
-"DRAIN" indicates the node is unavailable to be allocated work.
-"FAIL" indicates the node is expected to fail soon, has
+Acceptable values are "CLOUD", "DOWN", "DRAIN", "FAIL", "FAILING", "FUTURE"
+and "UNKNOWN".
+Node states of "BUSY" and "IDLE" should not be specified in the node
+configuration, but set the node state to "UNKNOWN" instead.
+Setting the node state to "UNKNOWN" will result in the node state being set to
+"BUSY", "IDLE" or other appropriate state based upon recovered system state
+information.
+The default value is "UNKNOWN".
+Also see the \fBDownNodes\fR parameter below.
+.RS
+.TP 10
+\fBCLOUD\fP
+Indicates the node exists in the cloud.
+It's initial state will be treated as powered down.
+The node will be available for use after it's state is recovered from SLURM's
+state save file or the slurmd daemon starts on the compute node.
+.TP
+\fBDOWN\fP
+Indicates the node failed and is unavailable to be allocated work.
+.TP
+\fBDRAIN\fP
+Indicates the node is unavailable to be allocated work.on.
+.TP
+\fBFAIL\fP
+Indicates the node is expected to fail soon, has
 no jobs allocated to it, and will not be allocated
 to any new jobs.
-"FAILING" indicates the node is expected to fail soon, has
+.TP
+\fBFAILING\fP
+Indicates the node is expected to fail soon, has
 one or more jobs allocated to it, but will not be allocated
 to any new jobs.
-"UNKNOWN" indicates the node's state is undefined (BUSY or IDLE),
+.TP
+\fBFUTURE\fP
+Indicates the node is defined for future use and need not
+exist when the SLURM daemons are started. These nodes can be made available
+for use simply by updating the node state using the scontrol command rather
+than restarting the slurmctld daemon. After these nodes are made available,
+change their \fRState\fR in the slurm.conf file. Until these nodes are made
+available, they will not be seen using any SLURM commands or nor will
+any attempt be made to contact them.
+.TP
+\fBUNKNOWN\fP
+Indicates the node's state is undefined (BUSY or IDLE),
 but will be established when the \fBslurmd\fR daemon on that node
 registers.
 The default value is "UNKNOWN".
-Also see the \fBDownNodes\fR parameter below.
+.RE
 .TP
 \fBThreadsPerCore\fR
@@ -3085,16 +3119,15 @@ Identifies the reason for a node being in state "DOWN", "DRAIN",
 .TP
 \fBState\fR
 State of the node with respect to the initiation of user jobs.
-Acceptable values are "BUSY", "DOWN", "DRAIN", "FAIL",
-"FAILING, "IDLE", and "UNKNOWN".
+Acceptable values are "DOWN", "DRAIN", "FAIL", "FAILING" and "UNKNOWN".
+Node states of "BUSY" and "IDLE" should not be specified in the node
+configuration, but set the node state to "UNKNOWN" instead.
+Setting the node state to "UNKNOWN" will result in the node state being set to
+"BUSY", "IDLE" or other appropriate state based upon recovered system state
+information.
+The default value is "UNKNOWN".
 .RS
 .TP 10
-\fBCLOUD\fP
-Indicates the node exists in the cloud.
-It's initial state will be treated as powered down.
-The node will be available for use after it's state is recovered from SLURM's
-state save file or the slurmd daemon starts on the compute node.
-.TP
 \fBDOWN\fP
 Indicates the node failed and is unavailable to be allocated work.
 .TP
@@ -3111,15 +3144,6 @@ Indicates the node is expected to fail soon, has
 one or more jobs allocated to it, but will not be allocated
 to any new jobs.
 .TP
-\fBFUTURE\fP
-Indicates the node is defined for future use and need not
-exist when the SLURM daemons are started. These nodes can be made available
-for use simply by updating the node state using the scontrol command rather
-than restarting the slurmctld daemon. After these nodes are made available,
-change their \fRState\fR in the slurm.conf file. Until these nodes are made
-available, they will not be seen using any SLURM commands or nor will
-any attempt be made to contact them.
-.TP
 \fBUNKNOWN\fP
 Indicates the node's state is undefined (BUSY or IDLE),
 but will be established when the \fBslurmd\fR daemon on that node
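Illustration only, not part of the commit: a minimal slurm.conf sketch of how the State values documented in this change might be used. The node names (tux, cloud) and ranges are hypothetical, and a real configuration would normally carry further per-node parameters that are omitted here.

```
# Hypothetical node names; only the State keyword is the point of this sketch.

# Regular nodes: use UNKNOWN rather than BUSY or IDLE, so the state is
# established from recovered state information or when slurmd registers.
NodeName=tux[0-15] State=UNKNOWN

# Nodes defined for future use; they need not exist when the daemons start.
NodeName=tux[16-31] State=FUTURE

# Nodes that exist in the cloud; initially treated as powered down.
NodeName=cloud[0-7] State=CLOUD

# DownNodes accepts only DOWN, DRAIN, FAIL, FAILING and UNKNOWN after this change.
DownNodes=tux[3] State=DOWN Reason="hypothetical hardware fault"
```

As the revised text puts it, CLOUD and FUTURE belong to the NodeName State description, which is why this commit removes them from the DownNodes State list.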