.TH SINFO "1" "September 2008" "sinfo 1.4" "Slurm components"
.SH "NAME"
sinfo \- view information about SLURM nodes and partitions.

.SH "SYNOPSIS"
\fBsinfo\fR [\fIOPTIONS\fR...] 
.SH "DESCRIPTION"
\fBsinfo\fR is used to view partition and node information for a 
system running SLURM. 

.SH "OPTIONS"
.TP
\fB\-a\fR, \fB\-\-all\fR
Display information about all partitions. This causes information to be
displayed about partitions that are configured as hidden and partitions that
are unavailable to the user's group.
.TP
\fB\-b\fR, \fB\-\-bgl\fR
Display information about bglblocks (on Blue Gene systems only).
.TP
\fB\-d\fR, \fB\-\-dead\fR
If set, only report state information for non\-responding (dead) nodes.
.TP
\fB\-e\fR, \fB\-\-exact\fR
If set, do not group node information on multiple nodes unless
their configurations to be reported are identical. Otherwise
CPU count, memory size, and disk space for nodes will be listed
with the minimum value followed by a "+" for nodes with the
same partition and state (e.g., "250+").
.TP
\fB\-h\fR, \fB\-\-noheader\fR
Do not print a header on the output.

.TP
\fB\-\-help\fR
Print a message describing all \fBsinfo\fR options.
.TP
\fB\-\-hide\fR
Do not display information about hidden partitions. This is the default
behavior: partitions that are configured as hidden or are not available
to the user's group are not displayed.

.TP
\fB\-i <seconds>\fR, \fB\-\-iterate=<seconds>\fR
Print the state on a periodic basis. 
Sleep for the indicated number of seconds between reports.
By default, prints a time stamp with the header.
.TP
\fB\-l\fR, \fB\-\-long\fR
Print more detailed information. 
This is ignored if the \fB\-\-format\fR option is specified.
.TP
\fB\-n <nodes>\fR, \fB\-\-nodes=<nodes>\fR
Print information only about the specified node(s). 
Multiple nodes may be comma separated or expressed using a 
node range expression. For example "linux[00\-07]" would 
indicate eight nodes, "linux00" through "linux07."
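.IP
As an illustration only, the expansion of a simple node range expression
can be sketched in Python. The function name \fBexpand_node_range\fR is
hypothetical and is not part of SLURM. The sketch assumes a single
bracketed list of indices and ranges and may not cover every form that
SLURM accepts.
.nf

```python
def expand_node_range(expr):
    """Expand a simple expression such as "linux[00-07]" into node names."""
    prefix, bracket, rest = expr.partition("[")
    if not bracket:
        return [expr]  # plain node name, nothing to expand
    body, closing, suffix = rest.partition("]")
    if not closing:
        raise ValueError("missing closing bracket in " + repr(expr))
    names = []
    for part in body.split(","):
        low, dash, high = part.partition("-")
        if not dash:
            high = low  # single index such as the "2" in "[2,4-7]"
        width = len(low)  # keep zero padding: "00" stays two digits wide
        for index in range(int(low), int(high) + 1):
            names.append(prefix + str(index).zfill(width) + suffix)
    return names


print(expand_node_range("linux[00-07]"))
```
.fi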
.TP
\fB\-N\fR, \fB\-\-Node\fR
Print information in a node\-oriented format.
The default is to print information in a partition\-oriented format.
This is ignored if the \fB\-\-format\fR option is specified.
.TP
\fB\-o <output_format>\fR, \fB\-\-format=<output_format>\fR
Specify the information to be displayed using an \fBsinfo\fR
format string. Format strings transparently used by \fBsinfo\fR
when running with various options are
.RS
.TP 15
.I "default"
"%9P %5a %.10l %.5D %6t %N"
.TP
.I "\-\-long"
"%9P %5a %.10l %.8s %4r %5h %10g %.5D %11T %N"
.TP
.I "\-\-long \-\-Node"
"%#N %.5D %9P %11T %.4c %.8z %.6m %.8d %.6w %8f %R"
.TP
.I "\-\-long \-\-list\-reasons"
"%50R %6t %N"
.RE
.IP
In the above format strings, "#" represents the
maximum length of a node list to be printed.
.IP
The field specifications available include:
.RS
.TP 4
\fB%a\fR 
State/availability of a partition
.TP
\fB%A\fR 
Number of nodes by state in the format "allocated/idle".
Do not use this with a node state option ("%t" or "%T") or
the different node states will be placed on separate lines.
.TP
\fB%C\fR
Number of CPUs by state in the format
"allocated/idle/other/total". Do not use this with a node
state option ("%t" or "%T") or the different node states will
be placed on separate lines.
.TP
\fB%d\fR 
Size of temporary disk space per node in megabytes
.TP
\fB%D\fR 
Number of nodes
.TP
\fB%E\fR
The reason a node is unavailable (down, drained, or draining states).
This is the same as \fB%R\fR except the entries will be sorted by 
time rather than the reason string.
.TP
\fB%f\fR 
Features associated with the nodes
.TP
\fB%F\fR 
Number of nodes by state in the format
"allocated/idle/other/total".  Do not use this with a node
state option ("%t" or "%T") or the different node states will
be placed on separate lines.
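.IP
A script that reads this field can split it on the "/" separator.
The following Python sketch is illustrative only: the names
\fBNodeCounts\fR and \fBparse_node_counts\fR are hypothetical and
are not provided by SLURM.
.nf

```python
from collections import namedtuple

# Field names follow the "allocated/idle/other/total" layout described above.
NodeCounts = namedtuple("NodeCounts", "allocated idle other total")


def parse_node_counts(text):
    """Split a value such as "2/6/0/8" into four integer counts."""
    parts = text.strip().split("/")
    if len(parts) != 4:
        raise ValueError("expected four counts, got " + repr(text))
    return NodeCounts(*(int(part) for part in parts))


counts = parse_node_counts("2/6/0/8")
print(counts.idle, "of", counts.total, "nodes are idle")
```
.fi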
.TP
\fB%g\fR 
Groups which may use the nodes
.TP
\fB%h\fR 
Jobs may share nodes, "yes", "no", or "force"
.TP
\fB%l\fR 
Maximum time for any job in the format "days\-hours:minutes:seconds"
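.IP
A time limit in this format can be converted to seconds as sketched
below in Python. The function name \fBtime_limit_seconds\fR is
hypothetical. The sketch assumes that shorter values such as "30:00"
omit the leading fields and that the word "infinite" means no limit.
.nf

```python
def time_limit_seconds(text):
    """Convert "days-hours:minutes:seconds" to seconds, or None if unlimited."""
    if text == "infinite":
        return None
    days, _dash, clock = text.rpartition("-")
    fields = [int(value) for value in clock.split(":")]
    while len(fields) < 3:
        fields.insert(0, 0)  # assumption: missing leading fields are zero
    hours, minutes, seconds = fields
    return ((int(days or 0) * 24 + hours) * 60 + minutes) * 60 + seconds


print(time_limit_seconds("30:00"))
```
.fi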
.TP
\fB%m\fR 
Size of memory per node in megabytes
.TP
\fB%N\fR 
List of node names
.TP
\fB%P\fR 
Partition name
.TP
\fB%r\fR 
Only user root may initiate jobs, "yes" or "no"
.TP
\fB%R\fR 
The reason a node is unavailable (down, drained, draining, 
fail or failing states)
.TP
\fB%s\fR 
Maximum job size in nodes
.TP
\fB%t\fR 
State of nodes, compact form
.TP
\fB%T\fR 
State of nodes, extended form
.TP
\fB%w\fR 
Scheduling weight of the nodes
.TP
\fB%X\fR 
Number of sockets per node
.TP
\fB%Y\fR 
Number of cores per socket
.TP
\fB%Z\fR 
Number of threads per core
.TP
\fB%z\fR 
Extended processor information: number of sockets, cores, threads (S:C:T) per node
.TP
\fB%.<*>\fR 
right justification of the field
.TP
\fB%<Number><*>\fR 
size of field
.RE

.TP
\fB\-p <partition>\fR, \fB\-\-partition=<partition>\fR
Print information only about the specified partition.  

.TP
\fB\-r\fR, \fB\-\-responding\fR
If set, only report state information for responding nodes.
.TP
\fB\-R\fR, \fB\-\-list\-reasons\fR
List reasons nodes are in the down, drained, fail or failing state.
When nodes are in these states SLURM supports optional inclusion
of a "reason" string by an administrator.
This option will display the first 35 characters of the reason
field and the list of nodes with that reason for all nodes that are,
by default, down, drained, draining or failing.
This option may be used with other node filtering options
(e.g. \fB\-r\fR, \fB\-d\fR, \fB\-t\fR, \fB\-n\fR).
However, combinations of these options that result in a
list of nodes that are not down or drained or failing will
not produce any output.
When used with \fB\-l\fR the output additionally includes
the current node state.
.TP
\fB\-s\fR, \fB\-\-summarize\fR
List only a partition state summary with no node state details.
This is ignored if the \fB\-\-format\fR option is specified.
.TP
\fB\-S <sort_list>\fR, \fB\-\-sort=<sort_list>\fR
Specification of the order in which records should be reported.
This uses the same field specification as the <output_format>.
Multiple sorts may be performed by listing multiple sort fields
separated by commas.  The field specifications may be preceded
by "+" or "\-" for ascending (default) and descending order
respectively.  The partition field specification, "P", may be
preceded by a "#" to report partitions in the same order that
they appear in SLURM's configuration file, \fBslurm.conf\fR.
For example, a sort value of "+P,\-m" requests that records
be printed in order of increasing partition name and within a
partition by decreasing memory size.  The default value of sort
is "#P,\-t" (partitions ordered as configured then decreasing
node state).  If the \fB\-\-Node\fR option is selected, the
default sort value is "N" (increasing node name).
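.IP
The way a sort list is read can be sketched in Python as follows.
This is an illustration under stated assumptions, not the \fBsinfo\fR
implementation: the function names and the record layout (one
dictionary per record, keyed by field letter) are hypothetical, and a
field marked with "#" is simply left in its input order.
.nf

```python
def parse_sort_list(sort_list):
    """Return (field, descending, config_order) for each sort field."""
    keys = []
    for item in sort_list.split(","):
        descending = item.startswith("-")
        field = item.lstrip("+-")
        config_order = field.startswith("#")
        keys.append((field.lstrip("#"), descending, config_order))
    return keys


def sort_records(records, sort_list):
    """Sort dictionaries keyed by field letter according to sort_list."""
    ordered = list(records)
    # Apply the last key first; the sort is stable, so earlier keys win.
    for field, descending, config_order in reversed(parse_sort_list(sort_list)):
        if config_order:
            continue  # assumption: the input is already in configured order
        ordered.sort(key=lambda record: record[field], reverse=descending)
    return ordered


records = [
    {"P": "debug", "m": 3384},
    {"P": "batch", "m": 246},
    {"P": "debug", "m": 3448},
]
for record in sort_records(records, "+P,-m"):
    print(record["P"], record["m"])
```
.fi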
.TP
\fB\-t <states>\fR, \fB\-\-states=<states>\fR
List nodes only having the given state(s).  Multiple states
may be comma separated and the comparison is case insensitive.
Possible values include: ALLOC, ALLOCATED,
COMP, COMPLETING, DOWN, DRAIN, DRAINED, DRNG, DRAINING, FAIL,
FAILING, IDLE, UNK, and UNKNOWN.
By default nodes in the specified state are reported whether
they are responding or not.
The \fB\-\-dead\fR and \fB\-\-responding\fR options may be
used to filter nodes by the responding flag.
.TP
\fB\-\-usage\fR
Print a brief message listing the \fBsinfo\fR options.

.TP
\fB\-v\fR, \fB\-\-verbose\fR
Provide detailed event logging through program execution.
.TP
\fB\-V\fR, \fB\-\-version\fR
Print version information and exit.

.SH "OUTPUT FIELD DESCRIPTIONS"
.TP
\fBAVAIL\fR
Partition state: \fBup\fR or \fBdown\fR.
.TP
\fBCPUS\fR
Count of CPUs (processors) on these nodes.
.TP
\fBS:C:T\fR
Count of sockets (S), cores (C), and threads (T) on these nodes.
.TP
\fBSOCKETS\fR
Count of sockets on these nodes.
.TP
\fBCORES\fR
Count of cores on these nodes.
.TP
\fBTHREADS\fR
Count of threads on these nodes.
.TP
\fBGROUPS\fR
Resource allocations in this partition are restricted to the
named groups.  \fBall\fR indicates that all groups may use
this partition.
.TP
\fBJOB_SIZE\fR
Minimum and maximum node count that can be allocated to any
user job.  A single number indicates the minimum and maximum
node count are the same.  \fBinfinite\fR is used to identify
partitions without a maximum node count.
.TP
\fBTIMELIMIT\fR
Maximum time limit for any user job in
days\-hours:minutes:seconds.  \fBinfinite\fR is used to identify
partitions without a job time limit.
.TP
\fBMEMORY\fR
Size of real memory in megabytes on these nodes.
.TP
\fBNODELIST\fR or \fBBP_LIST\fR (BlueGene systems only)
Names of nodes associated with this configuration/partition.
.TP
\fBNODES\fR
Count of nodes with this particular configuration.
.TP
\fBNODES(A/I)\fR
Count of nodes with this particular configuration by node
state in the form "allocated/idle".
.TP
\fBNODES(A/I/O/T)\fR
Count of nodes with this particular configuration by node
state in the form "allocated/idle/other/total".
.TP
\fBPARTITION\fR
Name of a partition.  Note that the suffix "*" identifies the
default partition.
.TP
\fBROOT\fR
Is the ability to allocate resources in this partition
restricted to user root, \fByes\fR or \fBno\fR.
.TP
\fBSHARE\fR
Whether jobs allocated resources in this partition will share those
resources.
\fBno\fR indicates resources are never shared.
\fBexclusive\fR indicates whole nodes are dedicated to jobs
(equivalent to the srun \-\-exclusive option, may be used even
with shared/cons_res managing individual processors).
\fBforce\fR indicates resources are always available to be shared.
\fByes\fR indicates resources may be shared or not
per job's resource allocation.
.TP
\fBSTATE\fR
State of the nodes.
Possible states include: allocated, completing, down,
drained, draining, fail, failing, idle, and unknown plus
their abbreviated forms: alloc, comp, down, drain, drng,
fail, failg, idle, and unk respectively.
Note that the suffix "*" identifies nodes that are presently
not responding.
.TP
\fBTMP_DISK\fR
Size of temporary disk space in megabytes on these nodes.

.SH "NODE STATE CODES"
.PP
Node state codes are shortened as required for the field size.
If the node state code is followed by "*", this indicates the
node is presently not responding and will not be allocated
any new work.  If the node remains non\-responsive, it will
be placed in the \fBDOWN\fR state (except in the case of
\fBCOMPLETING\fR, \fBDRAINED\fR, \fBDRAINING\fR,
\fBFAIL\fR, and \fBFAILING\fR nodes).
If the node state code is followed by "~", this indicates
the node is presently in a power saving mode (typically
running at reduced frequency).
.TP 12
\fBALLOCATED\fR
The node has been allocated to one or more jobs.
.TP
\fBALLOCATED+\fR
The node is allocated to one or more active jobs plus
one or more jobs are in the process of COMPLETING.
.TP
\fBCOMPLETING\fR
All jobs associated with this node are in the process of 
COMPLETING.  This node state will be removed when
all of the job's processes have terminated and the SLURM
epilog program (if any) has terminated. See the \fBEpilog\fR
parameter description in the \fBslurm.conf\fR(5) man page for
more information.
.TP
\fBDOWN\fR
The node is unavailable for use. SLURM can automatically
place nodes in this state if some failure occurs. System
administrators may also explicitly place nodes in this state. If
a node resumes normal operation, SLURM can automatically
return it to service. See the \fBReturnToService\fR
and \fBSlurmdTimeout\fR parameter descriptions in the
\fBslurm.conf\fR(5) man page for more information.
.TP
\fBDRAINED\fR
The node is unavailable for use per system administrator
request.  See the \fBupdate node\fR command in the
\fBscontrol\fR(1) man page or the \fBslurm.conf\fR(5) man page
for more information.
.TP
\fBDRAINING\fR
The node is currently executing a job, but will not be allocated
to additional jobs. The node state will be changed to state
\fBDRAINED\fR when the last job on it completes. Nodes enter
this state per system administrator request. See the \fBupdate
node\fR command in the \fBscontrol\fR(1) man page or the
\fBslurm.conf\fR(5) man page for more information.
.TP
\fBFAIL\fR
The node is expected to fail soon and is unavailable for 
use per system administrator request.  
See the \fBupdate node\fR command in the \fBscontrol\fR(1) 
man page or the \fBslurm.conf\fR(5) man page for more information.
.TP
\fBFAILING\fR
The node is currently executing a job, but is expected to fail 
soon and is unavailable for use per system administrator request.  
See the \fBupdate node\fR command in the \fBscontrol\fR(1) 
man page or the \fBslurm.conf\fR(5) man page for more information.
.TP
\fBIDLE\fR
The node is not allocated to any jobs and is available for use.
.TP
\fBUNKNOWN\fR
The SLURM controller has just started and the node's state
has not yet been determined.
.SH "ENVIRONMENT VARIABLES" 
.PP 
Some \fBsinfo\fR options may
be set via environment variables. These environment variables,
along with their corresponding options, are listed below. (Note:
Command line options will always override these settings.)
.TP 20
\fBSINFO_FORMAT\fR
\fB\-o <output_format>, \-\-format=<output_format>\fR
.TP
\fBSINFO_PARTITION\fR
\fB\-p <partition>, \-\-partition=<partition>\fR
.TP
\fBSINFO_SORT\fR
\fB\-S <sort>, \-\-sort=<sort>\fR
.TP
\fBSLURM_CONF\fR
The location of the SLURM configuration file.
.SH "EXAMPLES"
Report basic node and partition configurations:
.nf
> sinfo
PARTITION AVAIL TIMELIMIT NODES STATE  NODELIST
batch     up     infinite     2 alloc  adev[8-9]
batch     up     infinite     6 idle   adev[10-15]
debug*    up        30:00     8 idle   adev[0-7]
.fi
.PP
Report partition summary information:
.nf
> sinfo \-s
PARTITION AVAIL TIMELIMIT NODES(A/I/O/T) NODELIST
batch     up     infinite 2/6/0/8        adev[8-15]
debug*    up        30:00 0/8/0/8        adev[0-7]
.fi
.PP
Report more complete information about the partition debug:
.nf
> sinfo \-\-long \-\-partition=debug
PARTITION AVAIL TIMELIMIT JOB_SIZE ROOT SHARE GROUPS NODES STATE NODELIST
debug*    up        30:00        8 no   no    all        8 idle  dev[0-7]
.fi
.PP
Report only those nodes that are in state DRAINED:
.nf
> sinfo \-\-states=drained
PARTITION AVAIL NODES TIMELIMIT STATE  NODELIST
debug*    up        2     30:00 drain  adev[6-7]
.fi
.PP
Report node\-oriented information with details and exact matches:
.nf
> sinfo \-Nel
NODELIST    NODES PARTITION STATE  CPUS MEMORY TMP_DISK WEIGHT FEATURES REASON
adev[0-1]       2 debug*    idle      2   3448    38536     16 (null)   (null)
adev[2,4-7]     5 debug*    idle      2   3384    38536     16 (null)   (null)
adev3           1 debug*    idle      2   3394    38536     16 (null)   (null)
adev[8-9]       2 batch     allocated 2    246    82306     16 (null)   (null)
adev[10-15]     6 batch     idle      2    246    82306     16 (null)   (null)
.fi
.PP
Report only down, drained and draining nodes and their reason field:
.nf
> sinfo \-R
REASON                              NODELIST
Memory errors                       dev[0,5]
Not Responding                      dev8
.fi
.SH "COPYING"
Copyright (C) 2002\-2007 The Regents of the University of California.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
LLNL\-CODE\-402394.
.LP
This file is part of SLURM, a resource management program.
For details, see <https://computing.llnl.gov/linux/slurm/>.
.LP
SLURM is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your option)
any later version.
.LP
SLURM is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
details.
.SH "SEE ALSO"
\fBscontrol\fR(1), \fBsmap\fR(1), \fBsqueue\fR(1), 
\fBslurm_load_ctl_conf\fR(3), \fBslurm_load_jobs\fR(3), \fBslurm_load_node\fR(3), 
\fBslurm_load_partitions\fR(3), 
\fBslurm_reconfigure\fR(3), \fBslurm_shutdown\fR(3), 
\fBslurm_update_job\fR(3), \fBslurm_update_node\fR(3), 
\fBslurm_update_partition\fR(3),
\fBslurm.conf\fR(5)