tud-zih-energy / Slurm / Commits

Commit b345c8d2
authored 15 years ago by Don Lipari

Fixed typos in faq.shtml

parent e0db5afd
Showing 1 changed file: doc/html/faq.shtml (+26 additions, −26 deletions)
...
...
@@ -99,12 +99,12 @@ execute on DOWN nodes?</li>
<li><a href="#batch_lost">What is the meaning of the error
"Batch JobId=# missing from master node, killing it"?</a></li>
<li><a href="#accept_again">What does the messsage
"srun: error: Unable to accept connection: Resources te
r
mporarily unavailable"
"srun: error: Unable to accept connection: Resources temporarily unavailable"
indicate?</a></li>
<li><a href="#task_prolog">How could I automatically print a job's
SLURM job ID to its standard output?</li>
<li><a href="#moab_start">I run SLURM with the Moab or Maui scheduler.
-How can I start a job under SLURM wihtout the scheduler?</li>
+How can I start a job under SLURM without the scheduler?</li>
<li><a href="#orphan_procs">Why are user processes and <i>srun</i>
running even though the job is supposed to be completed?</li>
<li><a href="#slurmd_oom">How can I prevent the <i>slurmd</i> and
...
...
@@ -129,7 +129,7 @@ for an extended period of time.
This may be indicative of processes hung waiting for a core file
to complete I/O or operating system failure.
If this state persists, the system administrator should check for processes
-associated with the job that can not be terminated then use the
+associated with the job that cannot be terminated then use the
<span class="commandline">scontrol</span> command to change the node's
state to DOWN (e.g. "scontrol update NodeName=<i>name</i> State=DOWN Reason=hung_completing"),
reboot the node, then reset the node's state to IDLE
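For reference, a minimal sketch of the recovery sequence this hunk describes; the node name and the reason string are placeholders:

$ scontrol update NodeName=node01 State=DOWN Reason=hung_completing
# reboot the node, then:
$ scontrol update NodeName=node01 State=IDLE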
...
...
@@ -184,7 +184,7 @@ until no previously submitted job is pending. If the scheduler type is <b>backfi
then jobs will generally be executed in the order of submission for a given partition
with one exception: later submitted jobs will be initiated early if doing so does
not delay the expected execution time of an earlier submitted job. In order for
-backfill scheduling to be effective, users jobs should specify reasonable time
+backfill scheduling to be effective, users' jobs should specify reasonable time
limits. If jobs do not specify time limits, then all jobs will receive the same
time limit (that associated with the partition), and the ability to backfill schedule
jobs will be limited. The backfill scheduler does not alter job specifications
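A hedged illustration of the point about time limits; the script name and the 30-minute limit are placeholders, and sbatch also accepts the short form -t:

$ sbatch --time=30:00 my_job_script.sh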
...
...
@@ -226,7 +226,7 @@ more information.</p>
SLURM has a job purging mechanism to remove inactive jobs (resource allocations)
before reaching its time limit, which could be infinite.
This inactivity time limit is configurable by the system administrator.
-You can check it's value with the command</p>
+You can check its value with the command</p>
<blockquote>
<p><span class="commandline">scontrol show config | grep InactiveLimit</span></p>
</blockquote>
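A short sketch of how this limit is inspected and, on the administrator's side, where it is set; the value shown is only an example:

$ scontrol show config | grep InactiveLimit
# in slurm.conf (seconds; zero disables the purge):
InactiveLimit=300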
...
...
@@ -251,7 +251,7 @@ the command. For example:</p>
</blockquote>
<p>srun processes "-N2" as an option to itself. "hostname" is the
command to execute and "-pdebug" is treated as an option to the
-hostname command. Which will change the name of the computer
+hostname command. This will change the name of the computer
on which SLURM executes the command - Very bad, <b>Don't run
this command as user root!</b></p>
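A hedged sketch contrasting the two orderings described above; "debug" is an assumed partition name:

$ srun -N2 -pdebug hostname    # -pdebug is parsed by srun (intended)
$ srun -N2 hostname -pdebug    # -pdebug is passed to the hostname command (dangerous as root)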
...
...
@@ -305,7 +305,7 @@ that the processes associated with the switch have been terminated
to avoid the possibility of re-using switch resources for other
jobs (even on different nodes).
SLURM considers jobs COMPLETED when all nodes allocated to the
-job are either DOWN or confirm termination of all it's processes.
+job are either DOWN or confirm termination of all its processes.
This enables SLURM to purge job information in a timely fashion
even when there are many failing nodes.
Unfortunately the job step information may persist longer.</p>
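If it helps to see which jobs are still tearing down, a minimal (assumed) query for jobs in the COMPLETING state:

$ squeue --states=COMPLETING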
...
...
@@ -490,7 +490,7 @@ indicate?</b></a><br>
The srun command normally terminates when the standard output and
error I/O from the spawned tasks end. This does not necessarily
happen at the same time that a job step is terminated. For example,
-a file system problem could render a spawned tasks non-killable
+a file system problem could render a spawned task non-killable
at the same time that I/O to srun is pending. Alternately a network
problem could prevent the I/O from being transmitted to srun.
In any event, the srun command is notified when a job step is
...
...
@@ -526,8 +526,8 @@ If the user's resource limit is not propagated, the limit in
effect for the <i>slurmd</i> daemon will be used for the spawned job.
A simple way to control this is to insure that user <i>root</i> has a
sufficiently large resource limit and insuring that <i>slurmd</i> takes
-full advantage of this limit. For example, you can set user's
-root's locked memory limit limit to be unlimited on the compute nodes (see
+full advantage of this limit. For example, you can set user root's
+locked memory limit ulimit to be unlimited on the compute nodes (see
<i>"man limits.conf"</i>) and insuring that <i>slurmd</i> takes
full advantage of this limit (e.g. by adding something like
<i>"ulimit -l unlimited"</i> to the <i>/etc/init.d/slurm</i>
...
...
@@ -632,11 +632,11 @@ accommodate all jobs allocated to a node, either running or suspended.
<p><a name="fast_schedule"><b>2. How can I configure SLURM to use
the resources actually found on a node rather than what is defined
in <i>slurm.conf</i>?</b></a><br>
-SLURM can either base it's scheduling decisions upon the node
+SLURM can either base its scheduling decisions upon the node
configuration defined in <i>slurm.conf</i> or what each node
actually returns as available resources.
This is controlled using the configuration parameter <i>FastSchedule</i>.
-Set it's value to zero in order to use the resources actually
+Set its value to zero in order to use the resources actually
found on each node, but with a higher overhead for scheduling.
A value of one is the default and results in the node configuration
defined in <i>slurm.conf</i> being used. See "man slurm.conf"
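A one-line slurm.conf sketch matching the behavior described above (use the resources actually found on each node):

FastSchedule=0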
...
...
@@ -667,7 +667,7 @@ See the slurm.conf and srun man pages for more information.</p>
<p><a name="multi_job"><b>5. How can I control the execution of multiple
jobs per node?</b></a><br>
-There are two mechanism to control this.
+There are two mechanisms to control this.
If you want to allocate individual processors on a node to jobs,
configure <i>SelectType=select/cons_res</i>.
See <a href="cons_res.html">Consumable Resources in SLURM</a>
...
...
@@ -691,7 +691,7 @@ for more information (e.g. "slurmctld -Dvvvvv").
<p><a name="sigpipe"><b>7. Why are user tasks intermittently dying
at launch with SIGPIPE error messages?</b></a><br>
-If you are using ldap or some other remote name service for
+If you are using LDAP or some other remote name service for
username and groups lookup, chances are that the underlying
libc library functions are triggering the SIGPIPE. You can likely
work around this problem by setting <i>CacheGroups=1</i> in your slurm.conf
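The workaround named in the hunk, as it would appear in slurm.conf:

CacheGroups=1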
...
...
@@ -765,7 +765,7 @@ to relocate them. In order to do so, follow this procedure:</p>
<li>Stop all SLURM daemons</li>
<li>Modify the <i>ControlMachine</i>, <i>ControlAddr</i>,
<i>BackupController</i>, and/or <i>BackupAddr</i> in the <i>slurm.conf</i> file</li>
-<li>Distribute the updated <i>slurm.conf</i> file file to all nodes</li>
+<li>Distribute the updated <i>slurm.conf</i> file to all nodes</li>
<li>Restart all SLURM daemons</li>
</ol>
<p>There should be no loss of any running or pending jobs. Insure that
...
...
@@ -803,7 +803,7 @@ cluster?</b></a><br>
Yes, this can be useful for testing purposes.
It has also been used to partition "fat" nodes into multiple SLURM nodes.
There are two ways to do this.
-The best method for most conditins is to run one <i>slurmd</i>
+The best method for most conditions is to run one <i>slurmd</i>
daemon per emulated node in the cluster as follows.
<ol>
<li>When executing the <i>configure</i> program, use the option
...
...
@@ -822,7 +822,7 @@ slurm.conf. </li>
of the node that it is supposed to serve on the execute line.</li>
</ol>
<p>It is strongly recommended that SLURM version 1.2 or higher be used
-for this due to it's improved support for multiple slurmd daemons.
+for this due to its improved support for multiple slurmd daemons.
See the
<a href="programmer_guide.html#multiple_slurmd_support">Programmers Guide</a>
for more details about configuring multiple slurmd support.</p>
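A hedged sketch of starting several slurmd daemons on one host, each told which emulated node it serves; the node names are placeholders:

$ slurmd -N node01
$ slurmd -N node02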
...
...
@@ -983,7 +983,7 @@ This error indicates that a job credential generated by the slurmctld daemon
corresponds to a job that the slurmd daemon has already revoked.
The slurmctld daemon selects job ID values based upon the configured
value of <b>FirstJobId</b> (the default value is 1) and each job gets
-an value one large than the previous job.
+a value one larger than the previous job.
On job termination, the slurmctld daemon notifies the slurmd on each
allocated node that all processes associated with that job should be
terminated.
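For reference, the parameter mentioned above as it would appear in slurm.conf (the default value is shown):

FirstJobId=1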
...
...
@@ -1031,7 +1031,7 @@ of the program.
<p><a name="rpm"><b>27. Why isn't the auth_none.so (or other file) in a
SLURM RPM?</b></a><br>
-The auth_none plugin is in a separete RPM and not built by default.
+The auth_none plugin is in a separate RPM and not built by default.
Using the auth_none plugin means that SLURM communications are not
authenticated, so you probably do not want to run in this mode of operation
except for testing purposes. If you want to build the auth_none RPM then
...
...
@@ -1080,7 +1080,7 @@ sinfo -t drain -h -o "scontrol update nodename='%N' state=drain reason='%E'"
execute on DOWN nodes?</a></b><br>
Hierarchical communications are used for sending this message. If there
are DOWN nodes in the communications hierarchy, messages will need to
-be re-routed. This limits SLURM's ability to tightly synchroize the
+be re-routed. This limits SLURM's ability to tightly synchronize the
execution of the <i>HealthCheckProgram</i> across the cluster, which
could adversely impact performance of parallel applications.
The use of CRON or node startup scripts may be better suited to insure
...
...
@@ -1113,14 +1113,14 @@ is executing. If a batch program is expected to be running on some
node (i.e. node zero of the job's allocation) and is not found, the
message above will be logged and the job cancelled. This typically is
associated with exhausting memory on the node or some other critical
-failure that can not be recovered from. The equivalent message in
+failure that cannot be recovered from. The equivalent message in
earlier releases of slurm is
"Master node lost JobId=#, killing it".
<p><a name="accept_again"><b>33. What does the messsage
"srun: error: Unable to accept connection: Resources te
r
mporarily unavailable"
"srun: error: Unable to accept connection: Resources temporarily unavailable"
indicate?</b></a><br>
-This has been reported on some larger clusters running Suse Linux when
+This has been reported on some larger clusters running SUSE Linux when
a user's resource limits are reached. You may need to increase limits
for locked memory and stack size to resolve this problem.
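A minimal sketch of inspecting and raising the two limits named above for the current shell:

$ ulimit -l            # locked memory, in KB
$ ulimit -s            # stack size, in KB
$ ulimit -l unlimited
$ ulimit -s unlimited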
...
...
@@ -1128,7 +1128,7 @@ for locked memory and stack size to resolve this problem.
SLURM job ID to its standard output?</b></a></br>
The configured <i>TaskProlog</i> is the only thing that can write to
the job's standard output or set extra environment variables for a job
-or job step. To write to the job's standard output, preceed the message
+or job step. To write to the job's standard output, precede the message
with "print ". To export environment variables, output a line of this
form "export name=value". The example below will print a job's SLURM
job ID and allocated hosts for a batch job only.
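The file's own example sits in the collapsed lines that follow; as a minimal sketch of the rules just described, task zero of a job writes a "print " line and an "export " line to its standard output (MY_JOB_ID is a hypothetical variable):

#!/bin/sh
# assumed TaskProlog sketch, not the FAQ's own example
if [ "$SLURM_PROCID" = "0" ]; then
    echo "print SLURM_JOB_ID = $SLURM_JOB_ID"
    echo "export MY_JOB_ID=$SLURM_JOB_ID"
fi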
...
...
@@ -1150,7 +1150,7 @@ fi
</pre>
<p><a name="moab_start"><b>35. I run SLURM with the Moab or Maui scheduler.
-How can I start a job under SLURM wihtout the scheduler?</b></a></br>
+How can I start a job under SLURM without the scheduler?</b></a></br>
When SLURM is configured to use the Moab or Maui scheduler, all submitted
jobs have their priority initialized to zero, which SLURM treats as a held
job. The job only begins when Moab or Maui decide where and when to start
...
...
@@ -1163,7 +1163,7 @@ $ scontrol update jobid=1234 priority=1000000
</pre>
<p>Note that changes in the configured value of <i>SchedulerType</i> only
take effect when the <i>slurmctld</i> daemon is restarted (reconfiguring
-SLURM will not change this parameter. You will also manuallly need to
+SLURM will not change this parameter. You will also manually need to
modify the priority of every pending job.
When changing to Moab or Maui scheduling, set every job priority to zero.
When changing from Moab or Maui scheduling, set every job priority to a
...
...
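Finally, a hedged sketch of the bulk priority change the last hunk describes when switching schedulers; the squeue query for pending job IDs is illustrative:

$ for j in $(squeue -h -t PD -o %i); do scontrol update jobid=$j priority=0; done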