From 086db8f6ef2e600aa648a9a78e71412897ac9116 Mon Sep 17 00:00:00 2001
From: Moe Jette <jette1@llnl.gov>
Date: Fri, 27 Feb 2004 21:38:50 +0000
Subject: [PATCH] Add job and node state descriptions to the squeue and sinfo man pages.

---
 NEWS                  |  1 +
 doc/man/man1/sinfo.1  | 48 ++++++++++++++++++++++++++++++++++++++++++-
 doc/man/man1/squeue.1 | 22 +++++++++++++++++++-
 3 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index bfd7a186bc6..17a2f6fa239 100644
--- a/NEWS
+++ b/NEWS
@@ -22,6 +22,7 @@ documents those changes that are of interest to users and admins.
     moved from common into plugin modules
  -- Documentation for writing scheduler, switch, and job completion
     logging plugins added
+ -- Added job and node state descriptions to the squeue and sinfo man pages
 
 * Changes in SLURM 0.3.0.0-pre6
 ===============================
diff --git a/doc/man/man1/sinfo.1 b/doc/man/man1/sinfo.1
index bbb0d54e21a..91b5dfc3ba4 100644
--- a/doc/man/man1/sinfo.1
+++ b/doc/man/man1/sinfo.1
@@ -1,4 +1,4 @@
-.TH SINFO "1" "January 2004" "sinfo 0.3" "Slurm components"
+.TH SINFO "1" "February 2004" "sinfo 0.3" "Slurm components"
 
 .SH "NAME"
 sinfo \- Used to view information about Slurm nodes and partitions.
@@ -214,6 +214,52 @@ Note that the suffix "*" identifies nodes that are presently not responding.
 \fBTMP_DISK\fR
 Size of temporary disk space in megabytes on these nodes.
 
+.SH "NODE STATE CODES"
+.PP
+Node state codes are shortened as required for the field size.
+If the node state code is followed by "*", this indicates the node
+is presently not responding and will not be allocated any new work.
+If the node remains non-responsive, it will be placed in the \fBDOWN\fR
+state.
+.TP 12
+ALLOCATED
+The node has been allocated to one or more jobs.
+.TP
+COMPLETING
+One or more jobs have been allocated this node and are in the process
+of COMPLETING. This node state will be left when all of the job's
+processes have terminated and the SLURM epilog program (if any) has
+terminated. See the \fBEpilog\fR parameter description in the
+\fBslurm.conf\fR man page for more information.
+.TP
+DOWN
+The node is unavailable for use. SLURM can automatically place nodes
+in this state if some failure occurs. System administrators may also
+explicitly place nodes in this state. If a node resumes normal operation,
+SLURM can automatically return it to service. See the \fBReturnToService\fR
+and \fBSlurmdTimeout\fR parameter descriptions in the \fBslurm.conf\fR(5)
+man page for more information.
+.TP
+DRAINED
+The node is unavailable for use per system administrator request.
+See the \fBupdate node\fR command in the \fBscontrol\fR(1) man page
+or the \fBslurm.conf\fR(5) man page for more information.
+.TP
+DRAINING
+The node is currently executing a job, but will not be allocated to
+additional jobs. The node state will be changed to state \fBDRAINED\fR
+when the last job on it completes. Nodes enter this state per system
+administrator request. See the \fBupdate node\fR command in the
+\fBscontrol\fR(1) man page or the \fBslurm.conf\fR(5) man page for
+more information.
+.TP
+IDLE
+The node is not allocated to any jobs and is available for use.
+.TP
+UNKNOWN
+The SLURM controller has just started and the node's state has not
+yet been determined.
+
 .SH "ENVIRONMENT VARIABLES"
 .PP
 Some \fBsinfo\fR options may be set via environment variables. These
diff --git a/doc/man/man1/squeue.1 b/doc/man/man1/squeue.1
index 6ff82407650..260a718d586 100644
--- a/doc/man/man1/squeue.1
+++ b/doc/man/man1/squeue.1
@@ -1,4 +1,4 @@
-.TH SQUEUE "1" "January 2004" "squeue 0.3" "Slurm components"
+.TH SQUEUE "1" "February 2004" "squeue 0.3" "Slurm components"
 
 .SH "NAME"
 squeue \- Used to view information of jobs located in the scheduling queue.
@@ -157,6 +157,26 @@ Report details of squeues actions.
 \fB\-V\fR , \fB\-\-version\fR
 Print version information and exit.
 
+.SH "JOB STATE CODES"
+.TP 17
+CD COMPLETED
+Job has terminated all processes on all nodes.
+.TP
+CG COMPLETING
+Job is in the process of completing. Some processes on some nodes may still be active.
+.TP
+F FAILED
+Job terminated with non-zero exit code or other failure condition.
+.TP
+NF NODE_FAIL
+Job terminated due to failure of one or more allocated nodes.
+.TP
+PD PENDING
+Job is awaiting resource allocation.
+.TP
+TO TIMEOUT
+Job terminated upon reaching its time limit.
+
 .SH "ENVIRONMENT VARIABLES"
 .PP
 Some \fBsqueue\fR options may be set via environment variables. These
-- 
GitLab
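
Reviewer note: the state tables added by this patch can be consumed by scripts that post-process squeue and sinfo output. The following is a minimal sketch in Python, not part of the patch or of SLURM. The names `expand_job_state` and `parse_node_state` are hypothetical, and the sketch assumes that a shortened node state code is a leading prefix of the full name, which the man page text does not state explicitly.

```python
# Illustrative sketch only: the tables mirror the state descriptions added by
# this patch. The function names are hypothetical and not part of SLURM.

JOB_STATE_CODES = {
    "CD": "COMPLETED",
    "CG": "COMPLETING",
    "F": "FAILED",
    "NF": "NODE_FAIL",
    "PD": "PENDING",
    "TO": "TIMEOUT",
}

NODE_STATES = (
    "ALLOCATED",
    "COMPLETING",
    "DOWN",
    "DRAINED",
    "DRAINING",
    "IDLE",
    "UNKNOWN",
)


def expand_job_state(code):
    """Return the full job state name for a compact code, or None if unknown."""
    return JOB_STATE_CODES.get(code.strip().upper())


def parse_node_state(field):
    """Split a node state field into (candidate states, responding flag).

    A trailing "*" marks a node that is presently not responding.
    Assumption: a shortened code is a prefix of the full state name, so a
    short field may match more than one state.
    """
    text = field.strip().upper()
    responding = not text.endswith("*")
    prefix = text.rstrip("*")
    candidates = [s for s in NODE_STATES if prefix and s.startswith(prefix)]
    return candidates, responding
```

A shortened field such as `DRAIN` is ambiguous between DRAINED and DRAINING, so the helper returns every candidate rather than guessing one.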