Commit 4a8f837d authored by Mark Grondona
 o change "regular expression" to "node range expression" to avoid
   confusion with real regular expressions.
 o fix up srun.1 man page a bit
parent 60627207
...@@ -140,7 +140,7 @@ adev0: srun -n4 -l /bin/hostname ...@@ -140,7 +140,7 @@ adev0: srun -n4 -l /bin/hostname
<p> <p>
Submit the script <i>my.script</i> for later execution (<i>-b</i>). Submit the script <i>my.script</i> for later execution (<i>-b</i>).
Explicitly use the nodes adev9 and adev10 (<i>-w "adev[9-10]"</i>, Explicitly use the nodes adev9 and adev10 (<i>-w "adev[9-10]"</i>,
note the use of a regular expression). note the use of a node range expression).
One processor per task will be used by default One processor per task will be used by default
The output will appear in the file <i>my.stdout</i> (<i>-o my.stdout</i>). The output will appear in the file <i>my.stdout</i> (<i>-o my.stdout</i>).
By default, one task will be initiated per processor on the nodes. By default, one task will be initiated per processor on the nodes.
...@@ -239,11 +239,11 @@ well as various timer values. ...@@ -239,11 +239,11 @@ well as various timer values.
<p> <p>
A description of the nodes and their grouping into non-overlapping A description of the nodes and their grouping into non-overlapping
partitions is required. partitions is required.
Partition and node specifications use regular expressions to Partition and node specifications use node range expressions to
identify nodes in a concise fashion. identify nodes in a concise fashion.
This configuration file defines a 1154 node cluster for SLURM, but This configuration file defines a 1154 node cluster for SLURM, but
might be used for a much larger cluster by just changing a might be used for a much larger cluster by just changing a
few regular expressions. few node range expressions.
<pre> <pre>
# #
......
...@@ -73,7 +73,7 @@ Display the state of the specified entity with the specified identification. ...@@ -73,7 +73,7 @@ Display the state of the specified entity with the specified identification.
entity: the configuration parameter name, job ID, node name, partition name, entity: the configuration parameter name, job ID, node name, partition name,
or job step ID for entities \fIconfig\fP, \fIjob\fP, \fInode\fP, \fIpartition\fP, or job step ID for entities \fIconfig\fP, \fIjob\fP, \fInode\fP, \fIpartition\fP,
and \fIstep\fP respectively. and \fIstep\fP respectively.
Multiple node names may be specified using simple regular expressions Multiple node names may be specified using simple node range expressions
(e.g. "lx[10-20]"). All other \fIID\fP values must identify a single (e.g. "lx[10-20]"). All other \fIID\fP values must identify a single
element. The job step ID is of the form "job_id.step_id", (e.g. "1234.1"). element. The job step ID is of the form "job_id.step_id", (e.g. "1234.1").
By default, all elements of the entity type specified are printed. By default, all elements of the entity type specified are printed.
...@@ -140,7 +140,7 @@ Set the job's priority to the specified value. ...@@ -140,7 +140,7 @@ Set the job's priority to the specified value.
.TP .TP
\fIReqNodeList\fP=<nodes> \fIReqNodeList\fP=<nodes>
Set the job's list of required node. Multiple node names may be specified using Set the job's list of required node. Multiple node names may be specified using
simple regular expressions (e.g. "lx[10-20]"). simple node range expressions (e.g. "lx[10-20]").
.TP .TP
\fIReqNodes\fP=<count> \fIReqNodes\fP=<count>
Set the job's count of required nodes to the specified value. Set the job's count of required nodes to the specified value.
...@@ -159,7 +159,7 @@ Set the job's time limit to the specified value. ...@@ -159,7 +159,7 @@ Set the job's time limit to the specified value.
.TP .TP
\fINodeName\fP=<name> \fINodeName\fP=<name>
Identify the node(s) to be updated. Multiple node names may be specified using Identify the node(s) to be updated. Multiple node names may be specified using
simple regular expressions (e.g. "lx[10-20]"). This specification is required. simple node range expressions (e.g. "lx[10-20]"). This specification is required.
.TP .TP
\fIState\fP=<state> \fIState\fP=<state>
Identify the state to be assigned to the node. Possible values are "NoResp", Identify the state to be assigned to the node. Possible values are "NoResp",
...@@ -182,7 +182,7 @@ identify a partition to use. Possible values are"YES" and "NO". ...@@ -182,7 +182,7 @@ identify a partition to use. Possible values are"YES" and "NO".
.TP .TP
\fINodes\fP=<name> \fINodes\fP=<name>
Identify the node(s) to be associated with this partition. Multiple node names Identify the node(s) to be associated with this partition. Multiple node names
may be specified using simple regular expressions (e.g. "lx[10-20]"). may be specified using simple node range expressions (e.g. "lx[10-20]").
Note that jobs may only be associated with one partition at any time. Note that jobs may only be associated with one partition at any time.
.TP .TP
\fIPartitionName\fP=<name> \fIPartitionName\fP=<name>
......
...@@ -46,7 +46,7 @@ Provide detailed event logging through program execution. ...@@ -46,7 +46,7 @@ Provide detailed event logging through program execution.
.SH "ARGUMENTS" .SH "ARGUMENTS"
.TP .TP
\fBnode\fR \fBnode\fR
The names of one or more comma separated nodes. Names may be specified using regular expressions. The names of one or more comma separated nodes. Names may be specified using node range expressions.
For example "linux[00-07]" would indicate eight nodes, "linux00" through "linux07". For example "linux[00-07]" would indicate eight nodes, "linux00" through "linux07".
.TP .TP
\fBpartition\fR \fBpartition\fR
......
...@@ -8,6 +8,9 @@ srun \- run parallel jobs ...@@ -8,6 +8,9 @@ srun \- run parallel jobs
[\fIOPTIONS\fR...] \fIexecutable \fR[\fIargs\fR...] [\fIOPTIONS\fR...] \fIexecutable \fR[\fIargs\fR...]
.br .br
.B srun .B srun
\-\-batch [\fIOPTIONS\fR...] job_script
.br
.B srun
\-\-allocate [\fIOPTIONS\fR...] [job_script] \-\-allocate [\fIOPTIONS\fR...] [job_script]
.br .br
.B srun .B srun
...@@ -405,13 +408,46 @@ session of \fBsrun\fR had started the job. (stdin, however, cannot ...@@ -405,13 +408,46 @@ session of \fBsrun\fR had started the job. (stdin, however, cannot
be forwarded to the job). be forwarded to the job).
.PP .PP
There are two ways to reattach to a running job. The default method There are two ways to reattach to a running job. The default method
is to steal any current connections to the job. In this case, the is to attach to the current job read-only. In this case,
\fBsrun\fR process currently managing the job will be terminated, and stdout and stderr are duplicated to the attaching \fBsrun\fR, but
control will be relegated to the caller. To allow the current signals are not forwarded to the remote processes (A single
\fBsrun\fR to continue managing the running job, the \fB\-j\fB Ctrl-C will detach this read-only \fBsrun\fR from the job). If
(\fB\-\-join\fR) option may be specified. When joining with the the \fB-j\fR (\fB\-\-join\fR) option is is also specified,
running job, stdout and stderr are duplicated to the new \fBsrun\fR \fBsrun\fR "joins" the running job, and is able to forward signals
session, but signals are not forwarded to the remote job. and acts for the most part much like the \fBsrun\fR process that
initiated the job.
.PP
Attaching to running batch jobs is also supported, if the batch
job is being managed by SLURM (That is, a script submitted with
\fBsrun \-b\fR). The stdout and stderr from the \fIbatch script\fR
will then be copied to the attaching \fBsrun\fR, and if \fB-j\fR
is also specified, signals will be sent to the batch script.
This feature provides a good method for determining the status
of a running \fBsrun\fR within a batch script. For example,
consider attaching to a running batch job with jobid 483:
.br
.br
> srun --join --attach 483
.br
.br
After pressing Ctrl-C twice within one second, SIGINT is forwarded
to the batch job script, and the running srun reports its status:
.br
.br
attach[483]: interrupt (one more within 1 sec to abort)
.br
attach[483]: sending Ctrl-C to job
.br
srun: interrupt (one more within 1 sec to abort)
.br
srun: task[0-15]: running
.br
.br
showing that all 16 tasks in the current job step are running.
.PP .PP
Node and CPU selection options do not make sense when specifying Node and CPU selection options do not make sense when specifying
\fB\-\-attach\fR, and it is an error to use \fB-n\fR, \fB-c\fR, \fB\-\-attach\fR, and it is an error to use \fB-n\fR, \fB-c\fR,
......
...@@ -241,10 +241,10 @@ The node configuration specifies the following information: ...@@ -241,10 +241,10 @@ The node configuration specifies the following information:
\fBNodeName\fR \fBNodeName\fR
Name of a node as returned by the hostname command, Name of a node as returned by the hostname command,
without the full domain name (e.g. "lx0012"). without the full domain name (e.g. "lx0012").
A simple regular expression may optionally A simple node range expression may optionally
be used to specify ranges be used to specify ranges
of nodes to avoid building a configuration file with large numbers of nodes to avoid building a configuration file with large numbers
of entries. The regular expression can contain one of entries. The node range expression can contain one
pair of square brackets with a sequence of comma separated pair of square brackets with a sequence of comma separated
numbers and/or ranges of numbers separated by a "-" numbers and/or ranges of numbers separated by a "-"
(e.g. "linux[0-64,128]", or "lx[15,18,32-33]"). (e.g. "linux[0-64,128]", or "lx[15,18,32-33]").
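The node range syntax described in the hunk above (a name prefix followed by one pair of square brackets holding comma separated numbers and/or "-" ranges) can be illustrated with a small sketch. This is not SLURM's own parser: the function name `expand_node_range` is hypothetical, and the zero-padding rule is an assumption inferred from the "linux[00-07]" example elsewhere in this commit.

```python
import re


def expand_node_range(expr):
    """Expand a node range expression such as "lx[15,18,32-33]".

    Hypothetical helper for illustration only; not part of SLURM.
    Assumes at most one bracket pair, placed at the end of the name.
    """
    match = re.fullmatch(r"([^\[\],]+)\[([0-9,\-]+)\]", expr)
    if match is None:
        if "[" in expr or "]" in expr:
            raise ValueError("malformed node range expression: %r" % expr)
        return [expr]  # a plain node name expands to itself

    prefix, body = match.groups()
    names = []
    for item in body.split(","):
        low, sep, high = item.partition("-")
        if not sep:
            high = low  # a single number is a range of one
        if not (low.isdigit() and high.isdigit()):
            raise ValueError("malformed range item: %r" % item)
        if int(low) > int(high):
            raise ValueError("descending range: %r" % item)
        # Assumption: a leading zero in the lower bound fixes the field width,
        # so "linux[00-07]" yields "linux00" through "linux07".
        width = len(low) if low.startswith("0") else 0
        for number in range(int(low), int(high) + 1):
            names.append(prefix + str(number).zfill(width))
    return names


if __name__ == "__main__":
    print(expand_node_range("lx[15,18,32-33]"))
    print(len(expand_node_range("linux[0-64,128]")))
```

Unlike a real regular expression, the brackets here enumerate numeric suffixes rather than a character class, which is the confusion the rename in this commit is meant to avoid.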
...@@ -272,13 +272,13 @@ Name that a node should be referred to in establishing ...@@ -272,13 +272,13 @@ Name that a node should be referred to in establishing
a communications path. This name will be used as an a communications path. This name will be used as an
argument to the gethostbyname() function for identification. argument to the gethostbyname() function for identification.
For example, "elx0012" might be used to designate For example, "elx0012" might be used to designate
the ethernet address for node "lx0012". A simple regular the ethernet address for node "lx0012". A simple node range
expression may optionally be used to specify ranges expression may optionally be used to specify ranges
of nodes. The regular expression can contain one of nodes. The node range expression can contain one
pair of square brackets with a sequence of comma separated pair of square brackets with a sequence of comma separated
numbers and/or ranges of numbers separated by a "-" numbers and/or ranges of numbers separated by a "-"
(e.g. "elinux[0-64,128]"). (e.g. "elinux[0-64,128]").
If a regular express is used to designate multiple nodes, If a node range expression is used to designate multiple nodes,
they must exactly match the entries in the \fBNodeName\fR they must exactly match the entries in the \fBNodeName\fR
(e.g. "NodeName=lx[0-7] NodeAddr="elx[0-7]"). (e.g. "NodeName=lx[0-7] NodeAddr="elx[0-7]").
By default the \fBNodeAddr\fR will be identical in value to By default the \fBNodeAddr\fR will be identical in value to
...@@ -374,7 +374,7 @@ The default value is 1. ...@@ -374,7 +374,7 @@ The default value is 1.
\fBNodes\fR \fBNodes\fR
Comma separated list of nodes which are associated with this Comma separated list of nodes which are associated with this
partition. Node names may be specified using the partition. Node names may be specified using the
regular expression syntax described above. A blank list of nodes node range expression syntax described above. A blank list of nodes
(i.e. "Nodes= ") can be used if one wants a partition to exist, (i.e. "Nodes= ") can be used if one wants a partition to exist,
but have no resources (possibly on a temporary basis). but have no resources (possibly on a temporary basis).
.TP .TP
......
.TH SLURMD "8" "October 2002" "slurmd 0.1" "Slurm components" .TH SLURMD "8" "October 2002" "slurmd 0.1" "Slurm components"
.SH "NAME" .SH "NAME"
slurmd \- The compute node daemon of Slurm. slurmd \- The compute node daemon for SLURM.
.SH "SYNOPSIS" .SH "SYNOPSIS"
\fBslurmd\fR [\fIOPTIONS\fR...] \fBslurmd\fR [\fIOPTIONS\fR...]
.SH "DESCRIPTION" .SH "DESCRIPTION"
......