Commit 22a598db authored by Morris Jette

Try to better HealthCheckProgram use

parent 34e0e482
@@ -606,14 +606,15 @@ The default value is zero, which disables execution.
 .TP
 \fBHealthCheckProgram\fR
 Fully qualified pathname of a script to execute as user root periodically
-on all compute nodes that are not in the NOT_RESPONDING state. This may be
-used to verify the node is fully operational and DRAIN the node or send email
-if a problem is detected.
+on all compute nodes that are \fBnot\fR in the NOT_RESPONDING state. This
+program may be used to verify the node is fully operational and DRAIN the node
+or send email if a problem is detected.
 Any action to be taken must be explicitly performed by the program
 (e.g. execute
 "scontrol update NodeName=foo State=drain Reason=tmp_file_system_full"
 to drain a node).
-The interval is controlled using the \fBHealthCheckInterval\fR parameter.
+The execution interval is controlled using the \fBHealthCheckInterval\fR
+parameter.
 Note that the \fBHealthCheckProgram\fR will be executed at the same time
 on all nodes to minimize its impact upon parallel programs.
 This program is will be killed if it does not terminate normally within
......
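
The changed text says the health check program must itself perform any action, such as draining the node with scontrol. A minimal sketch of such a program follows, written in Python purely for illustration. The 95% threshold, the choice of /tmp, the helper names, and the assumption that the Slurm node name matches the short hostname are all hypothetical and do not come from the man page. Only the scontrol command line is taken from the text above.

```python
#!/usr/bin/env python3
"""Illustrative health check: drain this node when /tmp is nearly full."""
import shutil
import socket
import subprocess

# Hypothetical threshold, not taken from the Slurm documentation.
TMP_FULL_FRACTION = 0.95


def tmp_usage_fraction(path="/tmp"):
    """Return the used fraction (0.0 to 1.0) of the file system holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def build_drain_command(node, reason):
    """Build the scontrol invocation quoted in the man page text."""
    return [
        "scontrol", "update",
        f"NodeName={node}",
        "State=drain",
        f"Reason={reason}",
    ]


def main():
    if tmp_usage_fraction() >= TMP_FULL_FRACTION:
        # Assumption: the Slurm node name equals the short hostname.
        node = socket.gethostname().split(".")[0]
        # The program must act explicitly; Slurm takes no action on its own.
        subprocess.run(build_drain_command(node, "tmp_file_system_full"),
                       check=False)


if __name__ == "__main__":
    main()
```

A script along these lines would be named by its fully qualified pathname in HealthCheckProgram, with HealthCheckInterval controlling how often it runs. It should finish quickly, since the program is killed if it does not terminate in time.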