tud-zih-energy / Slurm

Commit 25d1b57e, authored 23 years ago by Moe Jette

Change partition RootKey to Interactive. - Jette

Parent: d3792319
Changes: 1 changed file, doc/html/admin.guide.html (+36 additions, −48 deletions)
@@ -139,10 +139,8 @@ The default value is 1.
 <p>
 Only the NodeName must be supplied in the configuration file; all other
 items are optional.
-Other configuration information can be gathered through communications
-with the SLURM Daemon, slurmd actually running on each node.
-Alternately, you can explicitly establish baseline values in the
-configuration file.
+It is advisable to establish baseline node configurations in the configuration
+file, especially if the cluster is heterogeneous.
 Nodes which register to the system with less than the configured resources
 (e.g. too little memory), will be placed in the "DOWN" state to
 avoid scheduling jobs on them.
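The registration check described in this hunk can be read directly off a configuration fragment. A minimal sketch with hypothetical node names and values, in the same format as the samples later in this diff: any node in the range that registers with less than the configured baseline would be placed in the "DOWN" state.

# Hypothetical fragment: the DEFAULT record sets the baseline.
# A node of lx[0001-0016] registering with, say, only 1024 MB of
# RealMemory (or fewer than 16 CPUs) is marked DOWN and excluded
# from scheduling.
NodeName=DEFAULT CPUs=16 RealMemory=2048 TmpDisk=16384
NodeName=lx[0001-0016]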
@@ -150,16 +148,18 @@ The resources checked at node registration time are: CPUs,
 RealMemory and TmpDisk.
 The default values for each node can be specified with a record in which
 "NodeName" is "DEFAULT".
-The "NodeName=" specification must be placed on every line
-describing the configuration of that node(s).
-When a NodeName specification exists on two or more separate lines
-in the configuration, only values specified in the second
-or subsequent lines will be set (SLURM will not re-apply default values).
-All required information can typically be placed on a single line.
-The field descriptors above are case sensitive.
 The default entry values will apply only to lines following it in the
 configuration file and the default values can be reset multiple times
 in the configuration file with multiple entries where "NodeName=DEFAULT".
+The "NodeName=" specification must be placed on every line
+describing the configuration of nodes.
+Each nodes configuration must be specified on a single line
+rather than having the various values established on multiple lines.
+In fact, it is generally possible <i>and desirable</i> to define the
+configurations of all nodes in only a few lines.
+This convension permits significant optimization in the scheduling
+of larger clusters.
+The field descriptors above are case sensitive.
 In order to support the concept of jobs requiring consecutive nodes
 on some architectures,
 node specifications should be place in this file in consecutive order.
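Since a default entry applies only to the lines that follow it and can be reset, a heterogeneous cluster can be described compactly. A minimal sketch with hypothetical node names and values:

# Hypothetical fragment: defaults apply only to subsequent lines.
NodeName=DEFAULT CPUs=16 RealMemory=2048
NodeName=lx[0003-0016]
# A second DEFAULT record resets the default values for the
# larger nodes described below.
NodeName=DEFAULT CPUs=32 RealMemory=4096
NodeName=lx[0017-0032]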
@@ -219,10 +219,10 @@ without this optimization.
 A sample SLURM configuration file (node information only) follows.
 <pre>
 # Node specifications
-NodeName=DEFAULT CPUs=16 RealMemory=2048 TmpDisk=16384
-NodeName=lx[01-02] State=DRAINED
-NodeName=lx[03-16]
-NodeName=lx[17-32] CPUs=32 RealMemory=4096 Feature=1200MHz,VizTools
+NodeName=DEFAULT TmpDisk=16384
+NodeName=lx[0001-0002] State=DRAINED
+NodeName=lx[0003-0016] CPUs=16 RealMemory=2048
+NodeName=lx[0017-8096] CPUs=32 RealMemory=4096 Feature=1200MHz,VizTools
 </pre>
 <p>
 The partition configuration permits you to establish different job
@@ -242,6 +242,14 @@ specification will utilize this partition.
 Possible values are "YES" and "NO".
 The default value is "NO".
+<dt>Interactive
+<dd>
+Specifies if interactive jobs (executed immediately and not
+queued) may execute in this partition.
+See the <a href="user.guide.html">SLURM User's Guide</a> for more
+information about interactive and batch (queued) jobs.
+Possible values are "YES" and "NO".
+The default value is "YES".
 <dt>MaxNodes
 <dd>
 Maximum count of nodes which may be allocated to any single job,
 The default value is "UNLIMITED", which is represented internally as -1.
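A minimal sketch of the new keyword with hypothetical partition names (the full partition examples later in this commit follow the same pattern):

# Hypothetical: interactive jobs allowed by default,
# but refused in the batch partition.
PartitionName=DEFAULT Interactive=YES
PartitionName=batch Interactive=NO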
@@ -261,28 +269,6 @@ but have no resources (possibly on a temporary basis).
 <dd>
 Name by which the partition may be referenced (e.g. "Interactive").
 This name can be specified by users when submitting jobs.
-<dt>RootKey
-<dd>
-If this keyword is set, the job must be submitted with a
-valid "Key=value" specified.
-Valid key values are provided to user <b>root</b> upon request.
-This mechanism can be used to restrict access to a partition.
-For example, a batch system might execute as root, acquire
-a key via the SLURM API, then set its user ID to that of a
-non-privileged user and initiate his job.
-The user's job has no special privileges other than access
-to the partition.
-The non-privileged user would not be able to submit jobs
-directly to this partition for lack of a key.
-Issued keys will remain valid for a single use only.
-Possible values are "YES" and "NO".
-The default value is "NO".
-<dt>Shared
-<dd>
-Specify if more than one job may execute on each node in
-a partition simultaneously. Possible values are
-"YES" and "NO". The default value is "NO".
-If nodes are shared, job performance will vary.
 <dt>State
 <dd>
 State of partition or availability for use. Possible values
 are "UP" or "DOWN". The default value is "UP".
@@ -312,9 +298,10 @@ The job may specify a particular PartitionName, if so desired,
 or use the system's default partition.
 <pre>
 # Partition specifications
-PartitionName=batch MaxNodes=10 MaxTime=UNLIMITED Nodes=lx[10-30] RootKey=YES
+PartitionName=DEFAULT Interactive=YES
+PartitionName=batch MaxNodes=10 MaxTime=UNLIMITED Nodes=lx[10-30] Interactive=NO
 PartitionName=debug MaxNodes=2 MaxTime=60 Nodes=lx[03-09] Default=YES
 PartitionName=class MaxNodes=1 MaxTime=10 Nodes=lx[31-32] AllowGroups=students
 </pre>
 <p>
 APIs and an administrative tool can be used to alter the SLRUM
@@ -333,6 +320,9 @@ The job configuration format specified below is used by the
 slurm_admin administration tool to modify job state information:
 <dl>
+<dt>Group
+<dd>
+Comma separated list of group names to which the user belongs.
 <dt>Number
 <dd>
 Unique number by which the job can be referenced. This value
 may not be changed by slurm_admin.
@@ -356,9 +346,6 @@ This value may not be changed by slurm_admin.
 <dt>User
 <dd>
 Name of the user executing this job.
-<dt>Group
-<dd>
-Comma separated list of group names to which the user belongs.
 </dl>
 <a name="Build"><h2>Build Parameters</h2></a>
@@ -535,8 +522,9 @@ NodeName=lx[01-02] State=DRAINED
 NodeName=lx[03-16] Feature=CoolDebugger
 #
 # Default partition specification
-PartitionName=batch MaxCpus=128 MaxTime=240 Nodes=lx[10-30] RootKey=YES
-PartitionName=debug MaxCpus=16 MaxTime=60 Nodes=lx[03-09] Default=YES Shared=YES
+PartitionName=DEFAULT Interactive=YES
+PartitionName=batch MaxCpus=128 MaxTime=240 Nodes=lx[10-30] Interactive=NO
+PartitionName=debug MaxCpus=16 MaxTime=60 Nodes=lx[03-09] Default=YES
 PartitionName=class MaxCpus=16 MaxTime=10 Nodes=lx[31-32] AllowGroups=students
 </pre>
@@ -555,8 +543,8 @@ Remove node lx30 from service, removing jobs as needed:
 # slurm_admin
 slurm_admin: update NodeName=lx30 State=DRAINING
 slurm_admin: show job
-ID=1234 Name=Simulation MaxTime=100 Nodes=lx[29-30] State=RUNNING User=grondo
-ID=1235 Name=MyBigTest MaxTime=100 Nodes=lx20,lx23 State=RUNNING User=grondo
+ID=1234 Name=Simulation MaxTime=100 Nodes=lx[29-30] State=RUNNING User=smith
+ID=1235 Name=MyBigTest MaxTime=100 Nodes=lx20,lx23 State=RUNNING User=smith
 slurm_admin: update job ID=1234 State=ENDING
 slurm_admin: show job 1234
 Job 1234 not found
@@ -567,7 +555,7 @@ Remove node lx30 from service, removing jobs as needed:
 <hr>
 URL = http://www-lc.llnl.gov/dctg-lc/slurm/admin.guide.html
-<p>Last Modified February 15, 2002</p>
+<p>Last Modified February 20, 2002</p>
 <address>Maintained by <a href="mailto:slurm-dev@lists.llnl.gov">
 slurm-dev@lists.llnl.gov</a></address>
 </body>