Commit 8b7bcfd3 authored by Moe Jette

Change "silent reboot" to "unexpected reboot"

Change the reason that a node is marked DOWN and the log message
from node "silent reboot" to "unexpected reboot"
parent 776a2497
@@ -1310,7 +1310,7 @@ and resumes communications).
 A DOWN node will become available for use upon registration with a
 valid configuration only if it was set DOWN due to being non\-responsive.
 If the node was set DOWN for any other reason (low memory, prolog failure,
-epilog failure, silently rebooting, etc.), its state will not automatically
+epilog failure, unexpected reboot, etc.), its state will not automatically
 be changed.
 .TP
 \fB2\fR
......
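The rule stated in the man page text of the first hunk can be sketched as a small predicate. This is only an illustration of the documented behavior, not Slurm's implementation: the `down_reason` enum and the `node_returns_to_service` function are hypothetical names introduced here, and the hunk does not show which option value the paragraph belongs to.

```c
#include <stdbool.h>

/* Hypothetical reasons a node may have been set DOWN; not Slurm's own types. */
enum down_reason {
    DOWN_NON_RESPONSIVE,
    DOWN_LOW_MEMORY,
    DOWN_PROLOG_FAILURE,
    DOWN_EPILOG_FAILURE,
    DOWN_UNEXPECTED_REBOOT
};

/*
 * Return true if a DOWN node should become available again when it
 * registers: only when its configuration is valid and it was set DOWN
 * for being non-responsive. Any other reason leaves the state unchanged.
 */
static bool node_returns_to_service(enum down_reason reason, bool valid_config)
{
    return valid_config && (reason == DOWN_NON_RESPONSIVE);
}
```

Under this sketch, a node marked DOWN for an unexpected reboot stays DOWN even if it later registers with a valid configuration, which matches the wording changed by this commit.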
@@ -1718,9 +1718,9 @@ extern int validate_node_specs(slurm_node_registration_status_msg_t *reg_msg)
 		node_ptr->reason_uid =
 			slurm_get_slurm_user_id();
 		node_ptr->reason = xstrdup(
-			"Node silently failed and came back");
+			"Node unexpectedly rebooted");
 	}
-	info("Node %s silently failed and came back",
+	info("Node %s unexpectedly rebooted",
 	     reg_msg->node_name);
 	_make_node_down(node_ptr, now);
 	kill_running_job_by_node_name(reg_msg->node_name);
@@ -1790,8 +1790,8 @@ static front_end_record_t * _front_end_reg(
 	    (front_end_ptr->boot_time > front_end_ptr->last_response) &&
 	    (slurmctld_conf.ret2service != 2)) {
 		set_front_end_down(front_end_ptr,
-			"Front end silently failed and came back");
-		info("Front end %s silently failed and came back",
+			"Front end unexpectedly rebooted");
+		info("Front end %s unexpectedly rebooted",
 		     reg_msg->node_name);
 		reg_msg->job_count = 0;
 	}
......
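The part of the condition visible in the `_front_end_reg` hunk compares the reported boot time with the last response time and checks the `ret2service` setting. A minimal sketch of just those two clauses follows. The hunk starts mid-condition, so any earlier clauses of the real test are not reproduced here, and `is_unexpected_reboot` is a hypothetical helper rather than a Slurm function.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper, not part of Slurm: true when the node booted after
 * the controller last heard from it and ret2service is not 2. Only the
 * clauses visible in the diff hunk are modeled.
 */
static bool is_unexpected_reboot(time_t boot_time, time_t last_response,
                                 int ret2service)
{
    return (boot_time > last_response) && (ret2service != 2);
}
```

When the check holds, the code in the diff sets the front end DOWN with the reason "Front end unexpectedly rebooted" and logs the same message.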