diff --git a/doc/html/hdf5_profile_user_guide.shtml b/doc/html/hdf5_profile_user_guide.shtml
index ddb5fa79d16fef2aa8da279345b291f84f1fa6a0..782a08075f6f8b9ddda48fb8a2dd3ce9183bf597 100644
--- a/doc/html/hdf5_profile_user_guide.shtml
+++ b/doc/html/hdf5_profile_user_guide.shtml
@@ -31,7 +31,7 @@
 cpu consumption, and memory use from a jobacct_gather plugin. Data from other
 sources may be added in the future.</p>
 <p>The data is collected into a file on a shared file system for each step on
-each allocated node of a job and then merged into a HDF5 file.
+each allocated node of a job and then merged into an HDF5 file.
 Individual files on a shared file system was chosen because it is possible
 that the data is voluminous so solutions that pass data to the Slurm
 control daemon via RPC may not scale to very large clusters or jobs with
@@ -39,7 +39,7 @@
 many allocated nodes.</p>
 <p>A separate <a href="acct_gather_profile_plugins.html">
 SLURM Profile Accounting Plugin API (AcctGatherProfileType)</a> documents how
-write other Profile Accounting plugins.</P>
+to write other Profile Accounting plugins.</P>
 
 <a id="Administration"></a>
 <h2>Administration</h2>
@@ -57,13 +57,13 @@
 option in the acct_gather.conf file.
 The directory will be created by Slurm if it doesn't exist.
 Each user will have their own directory created in the ProfileHDF5Dir
 which contains the HDF5 files. All the directories and files are created by the
-SlurmdUser which is usually root. The user specific directories as well
-as the files inside are chowned to the user running the job so they
+SlurmdUser which is usually root. The user specific directories, as well
+as the files inside, are chowned to the user running the job so they
 can access the files.
 Since user root is usually creating these files/directories a root
 squashed file system will not work for the ProfileHDF5Dir.</p>
-<p>Each user that creates a profile will have a subdirector to the profile
+<p>Each user that creates a profile will have a subdirectory in the profile
 directory that has read/write permission only for the user.</p>
 </span>
 </div>
@@ -85,14 +85,14 @@
 This sets the sampling frequency for data types:
 </div>
 </div>
 <div style="margin-left: 20px;">
-<h4>act_gather.conf parameters</h4>
+<h4>acct_gather.conf parameters</h4>
 <div style="margin-left: 20px;">
 <p>These parameters are directly used by the HDF5 Profile Plugin.</p>
 <dl>
 <dt><b>ProfileHDF5Dir</b> = <path></dt>
 <p>
 This parameter is the path to the shared folder into which the
-acct_gather_profile plugin will write detailed data as a HDF5 file.
+acct_gather_profile plugin will write detailed data as an HDF5 file.
 The directory is assumed to be on a file system shared by the controller and
 all compute nodes. This is a required parameter.<p>
@@ -207,7 +207,7 @@
 to be attached to groups to store application defined properties.</p>
 
 <p>There are commodity programs, notably
 <a href="http://www.hdfgroup.org/hdf-java-html/hdfview/index.html">
-HDFView</a> for viewing and manipulating these files.
+HDFView</a>, for viewing and manipulating these files.
 <p>Below is a screen shot from HDFView expanding the job tree and showing
 the attributes for a specific task.</p>
diff --git a/doc/man/man1/scontrol.1 b/doc/man/man1/scontrol.1
index 3eae7d03538e9998ac95776bddafeec84e449112..bf32ae3f6c405186cf2fff9e08c1f681658e56bc 100644
--- a/doc/man/man1/scontrol.1
+++ b/doc/man/man1/scontrol.1
@@ -288,7 +288,7 @@ The job_list argument is a comma separated list of job IDs.
 Requeue a running, suspended or finished SLURM batch job into pending state,
 moreover the job is put in held state (priority zero).
 The job_list argument is a comma separated list of job IDs.
-A held job can be release using scontrol to reset its priority (e.g.
+A held job can be released using scontrol to reset its priority (e.g.
 "scontrol release <job_id>"). The command accepts the following option:
 .RS
 .TP 12
diff --git a/doc/man/man1/slurm.1 b/doc/man/man1/slurm.1
index 65d9a97c5c0b96daa736292bb88b60447032abcc..c98f0536c816e66f740608a9e0f40315c43d3c10 100644
--- a/doc/man/man1/slurm.1
+++ b/doc/man/man1/slurm.1
@@ -64,7 +64,7 @@
 details.
 \fBsacct\fR(1), \fBsacctmgr\fR(1), \fBsalloc\fR(1), \fBsattach\fR(1),
 \fBsbatch\fR(1), \fBsbcast\fR(1), \fBscancel\fR(1), \fBscontrol\fR(1),
 \fBsinfo\fR(1), \fBsmap\fR(1), \fBsqueue\fR(1), \fBsreport\fR(1),
-\fBsrun\fR(1), \fBsshare\fR(1), \fBsstate\fR(1), \fBstrigger\fR(1),
+\fBsrun\fR(1), \fBsshare\fR(1), \fBsstat\fR(1), \fBstrigger\fR(1),
 \fBsview\fR(1), \fBbluegene.conf\fR(5), \fBslurm.conf\fR(5),
 \fBslurmdbd.conf\fR(5), \fBwiki.conf\fR(5),
diff --git a/doc/man/man5/slurm.conf.5 b/doc/man/man5/slurm.conf.5
index cacf2d3004dd1c8c6716d3fd0a862d08ffdefc3f..dcd9ca4a4b826db97b876ea1c6356b06fe88ff24 100644
--- a/doc/man/man5/slurm.conf.5
+++ b/doc/man/man5/slurm.conf.5
@@ -1686,13 +1686,13 @@
 enable user login, etc. By default there is no prolog.
 Any configured script is expected to complete execution quickly
 (in less time than \fBMessageTimeout\fR).
 If the prolog fails (returns a non\-zero exit code), this will result in the
-node being set to a DRAIN state and the job requeued to executed on another node.
+node being set to a DRAIN state and the job being requeued in a held state.
 See \fBProlog and Epilog Scripts\fR for more information.
 .TP
 \fBPrologFlags\fR
 Flags to control the Prolog behavior. By default no flags are set.
-Currently the only option defined is:
+Currently the options are:
 .RS
 .TP 6
 \fBAlloc\fR
@@ -1839,7 +1839,7 @@ NOTE: This configuration option does not apply to IBM BlueGene systems.
 .TP
 \fBReconfigFlags\fR
 Flags to control various actions that may be taken when an "scontrol
-reconfig" command is issued. Currently the only option defined is:
+reconfig" command is issued. Currently the options are:
 .RS
 .TP 17
 \fBKeepPartInfo\fR
@@ -4023,7 +4023,7 @@ node being set to a DRAIN state.
 If the EpilogSlurmctld fails (returns a non\-zero exit code),
 this will only be logged.
 If the Prolog fails (returns a non\-zero exit code), this will result in the
-node being set to a DRAIN state and the job requeued to executed on another node.
+node being set to a DRAIN state and the job being requeued in a held state.
 If the PrologSlurmctld fails (returns a non\-zero exit code), this will result
 in the job requeued to executed on another node if possible. Only batch jobs can
 be requeued. Interactive jobs (salloc and srun) will be cancelled if the