Review loadleveler page
- Martin Schroschk authored
@@ -29,7 +36,7 @@ An example job file may look like this:
@@ -58,9 +65,9 @@ mpirun -x OMP_NUM_THREADS=1 -x LD_LIBRARY_PATH -np 16 ./my_mpi_program
@@ -105,10 +112,10 @@ mpirun -x OMP_NUM_THREADS=8 -x LD_LIBRARY_PATH -np 4 --bynode ./my_hybrid_progra
@@ -119,14 +126,14 @@ about resource usage.
|:-------------------|:------------------------------------------------|:-------------------------------------------------------------------------------------|
| `notification` | `always`, `error`, `start`, `never`, `complete` | When to send a notification email. |
| `resources` | `name(count)` ... `name(count)` | Specifies the quantities of consumable resources consumed by each task of a job step. |
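
As an illustration of the two keywords above, a job command file could set them as follows. This is a minimal sketch, not the page's own example: the resource name `ConsumableCpus` is an assumption that depends on how consumable resources are configured on the cluster, and `./my_program` is a placeholder.

```shell
#!/bin/bash
# Lines starting with "# @" are job command file keywords;
# other lines starting with "#" are ordinary comments.

# Send a notification email only when the job completes.
# @ notification = complete

# Request one consumable CPU per task (assumed resource name).
# @ resources = ConsumableCpus(1)

# "queue" closes the keyword block and queues the job step.
# @ queue

# Placeholder for the actual program of the job step.
./my_program
```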
@@ -148,30 +155,35 @@ options. Afterwards, the job file will be passed to the command
|:----------------------|:---------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `-J <name>` | `llsub` | Specifies the name of the job. You can name the job using any combination of letters, numbers, or both. The job name only appears in the long reports of the `llq`, `llstatus`, and `llsummary` commands. |
| `-n` | `1` | Specifies the total number of tasks of a parallel job you want to run on all available nodes. |
| `-T` | not specified | Specifies the maximum number of OpenMP threads to use per process by setting the environment variable `OMP_NUM_THREADS` to the given number. |
| `-o, -oo <filename>` | `<jobname>.<hostname>.<jobid>.out` | Specifies the name of the file to use as standard output (stdout) when your job step runs. |
| `-e, -oe <filename>` | `<jobname>.<hostname>.<jobid>.err` | Specifies the name of the file to use as standard error (stderr) when your job step runs. |
| `-I` | not specified | Submits an interactive job and sends the job's standard output (or standard error) to the terminal. |
| `-q <name>` | non-interactive: `short`; interactive (n=1): `interactive`; interactive (n>1): `interactive_par` | Specifies the name of a job class defined locally in your cluster. You can use the `llclass` command to find out information on job classes. |
| `-x` | not specified | Puts the node running your job into exclusive execution mode. In exclusive execution mode, your job runs by itself on a node. It is dispatched only to a node with no other jobs running, and LoadLeveler does not send any other jobs to the node until the job completes. |
| `-hosts <number>` | automatically | Specifies the number of nodes requested by a job step. This option is equivalent to the bsub option `-R "span[hosts=number]"`. |
| `-ptile <number>` | automatically | Specifies the number of tasks per node requested by a job step. This option is equivalent to the bsub option `-R "span[ptile=number]"`. |
| `-mem <size>` | not specified | Specifies the amount of memory, in MB, that the job needs on a single node. This option is equivalent to the bsub option `-R "rusage[mem=size]"`. |
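
The `bsub` equivalences stated in the last three rows can be sketched as a small shell helper. This is only an illustration: the function name `bsub_equivalent` is hypothetical and not part of any wrapper or scheduler; it merely prints the `-R` resource string that the table gives for each option.

```shell
#!/bin/sh
# Hypothetical helper (not part of LoadLeveler or LSF): prints the bsub
# resource string that the table lists for a given option and value.
bsub_equivalent() {
    option=$1
    value=$2
    case "$option" in
        -hosts) printf '%s\n' "-R \"span[hosts=${value}]\"" ;;
        -ptile) printf '%s\n' "-R \"span[ptile=${value}]\"" ;;
        -mem)   printf '%s\n' "-R \"rusage[mem=${value}]\"" ;;
        *)      # Any other option has no bsub equivalent in the table.
                printf '%s\n' "no bsub equivalent listed for: ${option}" >&2
                return 1 ;;
    esac
}

bsub_equivalent -hosts 2
bsub_equivalent -ptile 8
bsub_equivalent -mem 2048
```

Options without a listed equivalent make the helper return a non-zero status, so it can be used in a conditional.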
@@ -209,8 +221,8 @@ The `llclass` command provides information about each queue. Example
triton_ism undefined undefined 8 80 exclusive, serial + parallel queue, nodes shared, unlimited runtime
@@ -226,13 +238,13 @@ short undefined undefined 272 384 serial + parallel queu
@@ -262,14 +274,14 @@ Total number of available initiators of this class on all machines in the cluste
@@ -277,41 +289,41 @@ This command will give you detailed job information.
|------------------|-----|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Deferred | D | The job will not be assigned until a specified date. The start date may have been specified by the user in the Job Command file or it may have been set by LoadLeveler because a parallel job could not obtain enough machines to run the job. |
| Idle | I | The job is being considered to run on a machine though no machine has been selected yet. |
| NotQueued | NQ | The job is not being considered to run. A job may enter this state due to an error in the command file or because LoadLeveler cannot obtain information that it needs to act on the request. |
| Not Run | NR | The job will never run because a stated dependency in the Job Command file evaluated to be false. |
| Pending | P | The job is in the process of starting on one or more machines. The request to start the job has been sent but has not yet been acknowledged. |
| Rejected | X | The job did not start because the requirements of your job did not match the resources on the target machine or because the user does not have a valid ID on the target machine. |
| Submission Error | SX | The job cannot start due to a submission error. Please notify the Bluedawg administration team if you encounter this error. |
| Terminated | TX | The job was terminated, presumably by means beyond LoadLeveler's control. Please notify the Bluedawg administration team if you encounter this error. |
| Vacated | V | The started job did not complete. The job will be scheduled again provided that the job may be rescheduled. |
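
When reading job listings, the abbreviations in the table above can be decoded with a small lookup. The following is a minimal sketch: the function name `ll_state_name` is hypothetical, and it covers only the states visible in these rows, so any other abbreviation is reported as unknown.

```shell
#!/bin/sh
# Hypothetical helper: maps the status abbreviations listed in the table
# to their state names. Abbreviations not in the table yield "unknown".
ll_state_name() {
    case "$1" in
        D)  echo "Deferred" ;;
        I)  echo "Idle" ;;
        NQ) echo "NotQueued" ;;
        NR) echo "Not Run" ;;
        P)  echo "Pending" ;;
        X)  echo "Rejected" ;;
        SX) echo "Submission Error" ;;
        TX) echo "Terminated" ;;
        V)  echo "Vacated" ;;
        *)  echo "unknown ($1)" ;;
    esac
}

ll_state_name NQ
ll_state_name SX
```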
@@ -319,18 +331,18 @@ This command will give you detailed job information.
@@ -346,83 +358,85 @@ interactive 105 105 04:46:24 00:00:26 660.9
@@ -436,7 +450,7 @@ All machines on the machine_list are present.