From b161f2e5dfc35336ef1a50c9e3882bfb218a4988 Mon Sep 17 00:00:00 2001 From: "Christopher J. Morrone" <morrone2@llnl.gov> Date: Thu, 24 Aug 2006 21:22:24 +0000 Subject: [PATCH] Command line options cleanup. --- doc/man/man1/slaunch.1 | 6 +- doc/man/man1/srun.1 | 622 ++++++++++++++++++++--------------------- src/salloc/opt.c | 376 +++---------------------- src/salloc/opt.h | 9 - src/salloc/salloc.c | 6 +- src/sbatch/opt.c | 423 +++------------------------- src/sbatch/opt.h | 12 - src/slaunch/opt.c | 42 +-- src/slaunch/opt.h | 1 - 9 files changed, 390 insertions(+), 1107 deletions(-) diff --git a/doc/man/man1/slaunch.1 b/doc/man/man1/slaunch.1 index e05e37edafe..165f8e840d5 100644 --- a/doc/man/man1/slaunch.1 +++ b/doc/man/man1/slaunch.1 @@ -13,10 +13,10 @@ slaunch launches a parallel application (a \fBjob step\fR in SLURM parlance) on .LP .TP \fB\-\-jobid\fR <\fIJOBID\fP> -The job allocation under which the parallel application should be launched. +The job allocation under which the parallel application should be launched. If slaunch is running under salloc or a batch script, slaunch can automatically determint the jobid from the SLURM_JOB_ID environment variable. Otherwise, you will need to tell slaunch which job allocation to use. .TP -\fB\-n\fR, \fB\-\-ntasks\fR[=]<\fInumber\fR> -Specify the number of processes to launch. The default is one process per node, but note that the \-c parameter will change this default. +\fB\-n\fR, \fB\-\-tasks\fR[=]<\fInumber\fR> +Specify the number of processes to launch. The default is one process per node. .TP \fB\-N\fR, \fB\-\-nodes\fR[=]<\fInumber\fR> Specify the number of nodes to be used by this job step. By default, diff --git a/doc/man/man1/srun.1 b/doc/man/man1/srun.1 index 402d3695672..0c6732195d3 100644 --- a/doc/man/man1/srun.1 +++ b/doc/man/man1/srun.1 @@ -1,31 +1,31 @@ -\." $Id$ -.\" -.TH SRUN "1" "June 2006" "srun 1.2" "slurm components" +.\" \." 
$Id$ +.\" +.TH "SRUN" "1" "June 2006" "srun 1.2" "slurm components" .SH "NAME" srun \- run parallel jobs -.SH SYNOPSIS +.SH "SYNOPSIS" .B srun [\fIOPTIONS\fR...] \fIexecutable \fR[\fIargs\fR...] -.br +.br .B srun \-\-batch [\fIOPTIONS\fR...] job_script -.br +.br .B srun \-\-allocate [\fIOPTIONS\fR...] [job_script] -.br +.br .B srun \-\-attach=jobid -.SH DESCRIPTION +.SH "DESCRIPTION" Allocate resources and optionally initiate parallel jobs on clusters managed by SLURM. -.TP +.TP parallel run options -.TP +.TP \fB\-n\fR, \fB\-\-ntasks\fR=\fIntasks\fR Specify the number of processes to run. Request that \fBsrun\fR allocate \fIntasks\fR processes. The default is one process per node, but note that the \fB\-c\fR parameter will change this default. -.TP +.TP \fB\-c\fR, \fB\-\-cpus\-per\-task\fR=\fIncpus\fR Request that \fIncpus\fR be allocated \fBper process\fR. This may be useful if the job is multithreaded and requires more than one cpu @@ -33,7 +33,7 @@ per task for optimal performance. The default is one cpu per process. If \fB\-c\fR is specified without \fB\-n\fR as many tasks will be allocated per node as possible while satisfying the \fB\-c\fR restriction. -.TP +.TP \fB\-N\fR, \fB\-\-nodes\fR=\fIminnodes\fR[\-\fImaxnodes\fR] Request that a minimum of \fIminnodes\fR nodes be allocated to this job. The scheduler may decide to launch the job on more than \fIminnodes\fR nodes. @@ -49,11 +49,11 @@ allocated to the job. See the \fBENVIRONMENT VARIABLES \fR section for more information. If \fB\-N\fR is not specified, the default behaviour is to allocate enough nodes to satisfy the requirements of the \fB\-n\fR and \fB\-c\fR options. -.TP +.TP \fB\-r\fR, \fB\-\-relative\fR=\fIn\fR Run a job step relative to node \fIn\fR of the current allocation. This option may be used to spread several job steps out among the -nodes of the current job. If \fB-r\fR is used, the current job +nodes of the current job. 
If \fB\-r\fR is used, the current job step will begin at node \fIn\fR of the allocated nodelist, where the first node is considered node 0. The \fB\-r\fR option is not permitted along with \fB\-w\fR or \fB\-x\fR, and will be silently @@ -62,28 +62,28 @@ SLURM_JOBID is not set). The default for \fIn\fR is 0. If the value of \fB\-\-nodes\fR exceeds the number of nodes identified with the \fB\-\-relative\fR option, a warning message will be printed and the \fB\-\-relative\fR option will take precedence. -.TP +.TP \fB\-p\fR, \fB\-\-partition\fR=\fIpartition\fR Request resources from partition "\fIpartition\fR." Partitions are created by the slurm administrator, who also identify one of those partitions as the default. -.TP +.TP \fB\-P\fR, \fB\-\-dependency\fR=\fIjobid\fR Defer initiation of this job until the specified jobid has completed execution. Many jobs can share the same dependency and these jobs may belong to different users. The value may be changed after job submission using the \fBscontrol\fR command. -.TP +.TP \fB\-\-nice\fR[=\fIadjustment]\fR Run the job with an adjusted scheduling priority. With no adjustment value the scheduling priority is decreased -by 100. The adjustment range is from -10000 (highest priority) +by 100. The adjustment range is from \-10000 (highest priority) to 10000 (lowest priority). Only privileged users can specify a negative adjustment. NOTE: This option is presently ignored if \fISchedulerType=sched/maui\fR. -.TP +.TP \fB\-\-multi\-prog\fR Run a job with different programs and different arguments for each task. In this case, the executable program specified is @@ -91,33 +91,33 @@ actually a configuration file specifying the executable and arguments for each task. See \fBMULTIPLE PROGRAM CONFIGURATION\fR below for details on the configuration file contents. -.TP +.TP \fB\-\-begin\fR=\fItime\fR Defer initiation of this job until the specified time. 
It accepts times of the form \fIHH:MM:SS\fR to run a job at a specific time of day (seconds are optional). (If that time is already past, the next day is assumed.) You may also specify \fImidnight\fR, \fInoon\fR, or -\fIteatime\fR (4pm) and you can have a time-of-day suffixed +\fIteatime\fR (4pm) and you can have a time\-of\-day suffixed with \fIAM\fR or \fIPM\fR for running in the morning or the evening. You can also say what day the job will be run, by giving -a date in the form \fImonth-name\fR day with an optional year, +a date in the form \fImonth\-name\fR day with an optional year, or giving a date of the form \fIMMDDYY\fR or \fIMM/DD/YY\fR or \fIDD.MM.YY\fR. You can also -give times like \fInow + count time-units\fR, where the time-units +give times like \fInow + count time\-units\fR, where the time\-units can be \fIminutes\fR, \fIhours\fR, \fIdays\fR, or \fIweeks\fR and you can tell SLURM to run the job today with the keyword \fItoday\fR and to run the job tomorrow with the keyword \fItomorrow\fR. The value may be changed after job submission using the \fBscontrol\fR command. -.TP +.TP \fB\-U\fR, \fB\-\-account\fR=\fIaccount\fR Change resource use by this job to specified account. The \fIaccount\fR is an arbitrary string. The may be changed after job submission using the \fBscontrol\fR command. -.TP +.TP \fB\-t\fR, \fB\-\-time\fR=\fIminutes\fR Establish a time limit to terminate the job after the specified number of minutes. If the job's time limit exceeds the partition's time limit, the @@ -126,18 +126,18 @@ time limit. When the time limit is reached, the job's processes are sent SIGTERM followed by SIGKILL. The interval between signals is specified by the SLURM configuration parameter \fBKillWait\fR. A time limit of 0 minutes indicates that an infinite timelimit should be used. -.TP +.TP \fB\-D\fR, \fB\-\-chdir\fR=\fIpath\fR have the remote processes do a chdir to \fIpath\fR before beginning execution. 
The default is to chdir to the current working directory of the \fBsrun\fR process. -.TP +.TP \fB\-I\fR, \fB\-\-immediate\fR exit if resources are not immediately available. By default, \fB\-\-immediate\fR is off, and .B srun will block until resources become available. -.TP +.TP \fB\-k\fR, \fB\-\-no\-kill\fR Do not automatically terminate a job of one of the nodes it has been allocated fails. This option is only recognized on a job allocation, @@ -148,14 +148,14 @@ but subsequent job steps may be run if this option is specified. The default action is to terminate job upon node failure. Note that \fB\-\-batch\fR jobs will be re\-queued if a node failure occurs in the process of initiating it. -.TP +.TP \fB\-K\fR, \fB\-\-kill\-on\-bad\-exit\fR -Terminate a job if any task exits with a non-zero exit code. -.TP +Terminate a job if any task exits with a non\-zero exit code. +.TP \fB\-s\fR, \fB\-\-share\fR The job can share nodes with other running jobs. This may result in faster job initiation and higher system utilization, but lower application performance. -.TP +.TP \fB\-O\fR, \fB\-\-overcommit\fR overcommit resources. Normally, .B srun @@ -164,34 +164,34 @@ will not allocate more than one process per cpu. By specifying per cpu. However no more than \fBMAX_TASKS_PER_NODE\fR tasks are permitted to execute per node. ./"NOTE: Do not document feature until user release mechanism is available. -./".TP -./"-H, --hold +./".TP +./"\-H, \-\-hold ./"Specify the job is to be submitted in a held state (priority of zero). ./"A held job can now be released using scontrol to reset its priority. -.TP +.TP \fB\-T\fR, \fB\-\-threads\fR=\fInthreads\fR Request that .B srun use \fInthreads\fR to initiate and control the parallel job. The default value is the smaller of 32 or the number of nodes allocated. -.TP +.TP \fB\-l\fR, \fB\-\-label\fR prepend task number to lines of stdout/err. 
Normally, stdout and stderr -from remote tasks is line-buffered directly to the stdout and stderr of +from remote tasks is line\-buffered directly to the stdout and stderr of .B srun The \fB\-\-label\fR option will prepend lines of output with the remote task id. -.TP -\fB-u\fR, \fB\-\-unbuffered\fR +.TP +\fB\-u\fR, \fB\-\-unbuffered\fR do not line buffer stdout from remote tasks. This option cannot be used with \fI\-\-label\fR. -.TP +.TP \fB\-m\fR, \fB\-\-distribution\fR=(\fIblock\fR|\fIcyclic\fR|\fIhostfile\fR) Specify an alternate distribution method for remote processes. .RS -.TP +.TP .B block -The block method of distribution will allocate processes in-order to +The block method of distribution will allocate processes in\-order to the cpus on a node. If the number of processes exceeds the number of cpus on all of the nodes in the allocation then all nodes will be utilized. For example, consider an allocation of three nodes each with @@ -200,67 +200,67 @@ those processes to the nodes with processes one and two on the first node, process three on the second node, and process four on the third node. Block distribution is the default behavior if the number of tasks exceeds the number of nodes requested. -.TP +.TP .B cyclic -The cyclic method distributes processes in a round-robin fashion across +The cyclic method distributes processes in a round\-robin fashion across the allocated nodes. That is, process one will be allocated to the first node, process two to the second, and so on. This is the default behavior if the number of tasks is no larger than the number of nodes requested. -.TP +.TP .B hostfile -The hostfile method of distribution will allocate processes in-order as +The hostfile method of distribution will allocate processes in\-order as listed in file designated by the environment variable SLURM_HOSTFILE. If this variable is listed it will over ride any other method specified. If not set the method will default to block. 
.RE -.TP +.TP \fB\-J\fR, \fB\-\-job\-name\fR=\fIjobname\fR Specify a name for the job. The specified name will appear along with the job id number when querying running jobs on the system. The default is the supplied \fBexecutable\fR program's name. -.TP +.TP \fB\-\-mpi\fR=\fImpi_type\fR Identify the type of MPI to be used. May result in unique initiation procedures. .RS -.TP +.TP .B list Lists avaliable mpi types to choose from. -.TP +.TP .B lam Initiates one 'lamd' process per node and establishes necessary environment variables for LAM/MPI. -.TP +.TP .B mpich\-gm For use with Myrinet. -.TP +.TP .B mvapich For use with Infiniband. -.TP +.TP .B none No special MPI processing. This is the default and works with many other versions of MPI. .RE -.TP +.TP \fB\-\-ctrl\-comm\-ifhn\fR=\fIaddr\fR Specify the address or hostname to be used for PMI communications only (task communication and synchronization primitives for MPCIH2). Defaults to hostname (response from getnodename function). Use of this is required if a DNS lookup can not be performed on the hostname or if that address is blocked from the compute nodes. -.TP +.TP \fB\-\-jobid\fR=\fIid\fR Initiate a job step under an already allocated job with job id \fIid\fR. Using this option will cause \fBsrun\fR to behave exactly as if the SLURM_JOBID environment variable was set. -.TP +.TP \fB\-\-no\-requeue\fR Specifies that the batch job is not requeue. Setting this option will prevent system administrators from being able to restart the job (for example, after a scheduled downtime). When a job is requeued, the batch script is initiated from its beginning. This option is only applicable to batch job submission (see \fB\-\-batch\fR). -.TP +.TP \fB\-o\fR, \fB\-\-output\fR=\fImode\fR Specify the mode for stdout redirection. By default in interactive mode, .B srun @@ -269,11 +269,11 @@ the attached terminal. With \fB\-\-output\fR stdout may be redirected to a file, to one file per task, or to /dev/null. 
See section \fBIO Redirection\fR below for the various forms of \fImode\fR. If the specified file already exists, it will be overwritten. -.br +.br If \fB\-\-error\fR is not also specified on the command line, both stdout and stderr will directed to the file specified by \fB\-\-output\fR. -.TP +.TP \fB\-i\fR, \fB\-\-input\fR=\fImode\fR Specify how stdin is to redirected. By default, .B srun @@ -282,7 +282,7 @@ below for more options. For OS X, the poll() function does not support stdin, so input from a terminal is not possible. -.TP +.TP \fB\-e\fR, \fB\-\-error\fR=\fImode\fR Specify how stderr is to be redirected. By default in interactive mode, .B srun @@ -291,7 +291,7 @@ redirects stderr to the same file as stdout, if one is specified. The redirected to different locations. See \fBIO Redirection\fR below for more options. If the specified file already exists, it will be overwritten. -.TP +.TP \fB\-b\fR, \fB\-\-batch\fR Submit in "batch mode." \fBsrun\fR will make a copy of the \fIexecutable\fR file (a script) and submit the request for execution when resouces are @@ -301,8 +301,8 @@ job and must contain \fBsrun\fR commands to initiate parallel tasks. stdin will be redirected from /dev/null, stdout and stderr will be redirected to a file (default is \fIjobname\fR.out or \fIjobid\fR.out in current working directory, see \fB\-o\fR for other IO options). -Note that if the slurm daemons are cold-started, jobid values will be -reused. Plan accordingly to avoid over-writing output and error files. +Note that if the slurm daemons are cold\-started, jobid values will be +reused. Plan accordingly to avoid over\-writing output and error files. \fIexecutable\fR must be specified using either a fully qualified pathname or its pathname will be relative to the current working directory. The search path will not be used to locate the file. \fIexecutable\fR @@ -316,29 +316,29 @@ the option with #SLURM. Multiple options can be on one line or multiple lines. i.e. 
.br -#SLURM -N 2 -n 2 +#SLURM \-N 2 \-n 2 +.br +#SLURM \-\-mpi=lam .br -#SLURM --mpi=lam -.br This is run the script on 2 nodes, with 2 procs with mpi type lam. All commandline options are able to be set inside the script with the exception of the mode (which has already been set since to run a batch script you are in batch mode). -.br +.br Options on the command line take precedence over options in the batch script, which in turn take precedence over exiting environmement variables. -.TP +.TP \fB\-v\fR, \fB\-\-verbose\fR -verbose operation. Multiple \fB-v\fR's will further increase the verbosity of +verbose operation. Multiple \fB\-v\fR's will further increase the verbosity of \fBsrun\fR. By default only errors will be displayed. -.TP -\fB\-d\fR, \fB\-\-slurmd-debug\fR=\fIlevel\fR +.TP +\fB\-d\fR, \fB\-\-slurmd\-debug\fR=\fIlevel\fR Specify a debug level for slurmd(8). \fIlevel\fR may be an integer value between 0 [quiet, only errors are displayed] and 4 [verbose operation]. The slurmd debug information is copied onto the stderr of the job. By default only errors are displayed. -.TP +.TP \fB\-W\fR, \fB\-\-wait\fR=\fIseconds\fR Specify how long to wait after the first task terminates before terminating all remaining tasks. A value of 0 indicates an unlimited wait (a warning will @@ -346,34 +346,34 @@ be issued after 60 seconds). The default value is set by the WaitTime parameter in the slurm configuration file (see \fBslurm.conf(5)\fR). This option can be useful to insure that a job is terminated in a timely fashion in the event that one or more tasks terminate prematurely. -.TP +.TP \fB\-q\fR, \fB\-\-quit\-on\-interrupt\fR -Quit immediately on single SIGINT (Ctrl-C). Use of this option +Quit immediately on single SIGINT (Ctrl\-C). 
Use of this option disables the status feature normally available when \fBsrun\fR receives -a single Ctrl-C and causes \fBsrun\fR to instead immediately terminate the +a single Ctrl\-C and causes \fBsrun\fR to instead immediately terminate the running job. -.TP +.TP \fB\-X\fR, \fB\-\-disable\-status\fR Disable the display of task status when srun receives a single SIGINT -(Ctrl-C). Instead immediately forward the SIGINT to the running job. -A second Ctrl-C in one second will forcibly terminate the job and +(Ctrl\-C). Instead immediately forward the SIGINT to the running job. +A second Ctrl\-C in one second will forcibly terminate the job and \fBsrun\fR will immediately exit. May also be set via the environment variable SLURM_DISABLE_STATUS. -.TP +.TP \fB\-Q\fR, \fB\-\-quiet\fR Quiet operation. Suppress informational messages. Errors will still be displayed. -.TP +.TP \fB\-\-mail\-type\fR=\fItype\fR Notify user by email when certain event types occur. Valid \fItype\fR values are BEGIN, END, FAIL, ALL (any state change). The user to be notified is indicated with \fB\-\-mail\-user\fR. -.TP +.TP \fB\-\-mail\-user\fR=\fIuser\fR User to receive email notification of state changes as defined by \fB\-\-mail\-type\fR. The default value is the submitting user. -.TP +.TP \fB\-\-uid\fR=\fIuser\fR Attempt to submit and/or run a job as \fIuser\fR instead of the invoking user id. The invoking user's credentials will be used @@ -382,38 +382,38 @@ may use this option to run jobs as a normal user in a RootOnly partition for example. If run as root, \fBsrun\fR will drop its permissions to the uid specified after node allocation is successful. \fIuser\fR may be the user name or numerical user ID. -.TP +.TP \fB\-\-gid\fR=\fIgroup\fR If \fBsrun\fR is run as root, and the \fB\-\-gid\fR option is used, submit the job with \fIgroup\fR's group access permissions. \fIgroup\fR may be the group name or the numerical group ID. 
-.TP +.TP \fB\-\-core\fR=\fItype\fR Adjust corefile format for parallel job. If possible, srun will set up the environment for the job such that a corefile format other than full core dumps is enabled. If run with type = "list", srun will print a list of supported corefile format types to stdout and exit. -.TP +.TP \fB\-\-propagate\fR[=\fIrlimits\fR] Allows users to specify which of the modifiable (soft) resource limits to propagate to the compute nodes and apply to their jobs. If \fIrlimits\fR is not specified, then all resource limits will be propagated. -.TP +.TP \fB\-\-prolog\fR=\fIexecutable\fR \fBsrun\fR will run \fIexecutable\fR just before launching the job step. The command line arguments for \fIexecutable\fR will be the command and arguments of the job step. If \fIexecutable\fR is "none", then no prolog will be run. This parameter overrides the SrunProlog parameter in slurm.conf. -.TP +.TP \fB\-\-epilog\fR=\fIexecutable\fR \fBsrun\fR will run \fIexecutable\fR just after the job step completes. The command line arguments for \fIexecutable\fR will be the command and arguments of the job step. If \fIexecutable\fR is "none", then no epilog will be run. This parameter overrides the SrunEpilog parameter in slurm.conf. -.TP +.TP \fB\-\-task\-prolog\fR=\fIexecutable\fR The \fBslurmd\fR daemon will run \fIexecutable\fR just before launching each task. This will be executed after any TaskProlog parameter @@ -423,28 +423,28 @@ available to identify the process ID of the task being started. Standard output from this program of the form "export NAME=value" will be used to set environment variables for the task being spawned. -.TP +.TP \fB\-\-task\-epilog\fR=\fIexecutable\fR The \fBslurmd\fR daemon will run \fIexecutable\fR just after each task terminates. This will be before after any TaskEpilog parameter -in slurm.conf is executed. This is meant to be a very short-lived +in slurm.conf is executed. This is meant to be a very short\-lived program. 
If it fails to terminate within a few seconds, it will be killed along with any descendant processes. -.PP +.PP Allocate options: -.TP +.TP \fB\-A\fR, \fB\-\-allocate\fR allocate resources and spawn a shell. When \fB\-\-allocate\fR is specified to \fBsrun\fR, no remote tasks are started. Instead a subshell is started that has access to the allocated resources. Multiple jobs can then be run on the same cpus from within this subshell. See \fBAllocate Mode\fR below. -.TP +.TP \fB\-\-no\-shell\fR immediately exit after allocating resources instead of spawning a shell when used with the \fB\-A\fR, \fB\-\-allocate\fR option. -.PP +.PP Attach to running job: -.TP +.TP \fB\-a\fR, \fB\-\-attach\fR=\fIid\fR This option will attach \fBsrun\fR to a running job with job id = \fIid\fR. Provided that the calling user @@ -452,25 +452,25 @@ has access to that running job, stdout and stderr will be redirected to the current session (assuming that the tasks' stdout and stderr are not connected directly to files). stdin is not connected to the remote tasks, and signals are not forwarded unless the \fB\-\-join\fR parameter is also specified. -.TP +.TP \fB\-j\fR, \fB\-\-join\fR Used in conjunction with \fB\-\-attach\fR to specify that stdin should also be connected to the remote tasks (assuming that the remote tasks' stdin are not directly connected to files), and signals sent to \fBsrun\fR will be forwarded to the remote tasks. -.PP +.PP Constraint Options. The following options all put constraints on the nodes that may be considered for the job: -.TP +.TP \fB\-\-mincpus\fR=\fIn\fR Specify minimum number of cpus per node. -.TP +.TP \fB\-\-mem\fR=\fIMB\fR Specify a minimum amount of real memory. -.TP +.TP \fB\-\-tmp\fR=\fIMB\fR Specify a minimum amount of temporary disk space. -.TP +.TP \fB\-C\fR, \fB\-\-constraint\fR=\fIlist\fR Specify a list of constraints. 
The constraints are features that have been assigned to the nodes by @@ -481,49 +481,49 @@ For example: \fB\-\-constraint="opteron&video"\fR or \fB\-\-constraint="fast|faster"\fR. If no nodes have the requested features, then the job will be rejected by the slurm job manager. -.TP +.TP \fB\-\-contiguous\fR Demand a contiguous range of nodes. The default is "yes". Specify ---contiguous=no if a contiguous range of nodes is not a constraint. -.TP +\-\-contiguous=no if a contiguous range of nodes is not a constraint. +.TP \fB\-w\fR, \fB\-\-nodelist\fR=\fIhost1,host2,...\fR or \fIfilename\fR Request a specific list of hosts. The job will contain \fIat least\fR -these hosts. The list may be specified as a comma-separated list of -hosts, a range of hosts (host[1-5,7,...] for example), or a filename. +these hosts. The list may be specified as a comma\-separated list of +hosts, a range of hosts (host[1\-5,7,...] for example), or a filename. The host list will be assumed to be a filename if it contains a "/" character. -.TP +.TP \fB\-x\fR, \fB\-\-exclude\fR=\fIhost1,host2,...\fR or \fIfilename\fR Request that a specific list of hosts not be included in the resources allocated to this job. The host list will be assumed to be a filename if it contains a "/"character. -.PP -Affinity/Multi-core Options (when the task/affinity or task/numa +.PP +Affinity/Multi\-core Options (when the task/affinity or task/numa plugin is enabled): -.TP +.TP \fB\-\-cpu_bind\fR=[{\fIquiet,verbose\fR},]\fItype\fR Bind tasks to CPUs .RS -.TP +.TP .B q[uiet], quietly bind before task runs (default) -.TP +.TP .B v[erbose], verbosely report binding before task runs -.TP +.TP .B no[ne] don't bind tasks to CPUs (default) -.TP +.TP .B rank bind by task rank -.TP +.TP .B map_cpu:<list> bind by mapping CPU IDs to tasks as specified where <list> is <cpuid1>,<cpuid2>,...<cpuidN>. CPU IDs are interpreted as decimal values unless they are preceded with '0x' in which case they interpreted as hexadecimal values. 
-.TP +.TP .B mask_cpu:<list> bind by setting CPU masks on tasks as specified where <list> is <mask1>,<mask2>,...<maskN>. @@ -534,24 +534,24 @@ preceded with an optional '0x'. To have SLURM always report on the selected CPU binding for all srun commands executed in a shell, you can also enable verbose mode separately from the command line with: -.PP -.nf +.PP +.nf setenv SLURM_CPU_BIND verbose -.fi -.PP +.fi +.PP SLURM_CPU_BIND will not propagate into the tasks environment (binding by default only affects the first srun). To propagate \-\-cpu_bind to successive srun commands, first do the following in each task: -.PP -.nf +.PP +.nf setenv SLURM_CPU_BIND \\ ${SLURM_CPU_BIND_VERBOSE},${SLURM_CPU_BIND_TYPE}${SLURM_CPU_BIND_LIST} -.fi +.fi -.PP -Affinity/Multi-core Options (when the task/affinity plugin is enabled and +.PP +Affinity/Multi\-core Options (when the task/affinity plugin is enabled and the NUMA memory functions are available): -.TP +.TP \fB\-\-mem_bind\fR=[{\fIquiet,verbose\fR},]\fItype\fR Bind tasks to memory. \fBNote that the resolution of CPU and memory binding may differ on some architectures.\fR For example, CPU binding may be performed @@ -563,29 +563,29 @@ If you want greater control, try running a simple test code with the options "\-\-cpu_bind=verbose,none \-\-mem_bind=verbose,none" to determine the specific configuration. .RS -.TP +.TP .B q[uiet], quietly bind before task runs (default) -.TP +.TP .B v[erbose], verbosely report binding before task runs -.TP +.TP .B no[ne] don't bind tasks to memory (default) -.TP +.TP .B rank bind by task rank (not recommended) -.TP +.TP .B local Use memory local to the processor in use -.TP +.TP .B map_mem:<list> bind by mapping a node's memory to tasks as specified where <list> is <cpuid1>,<cpuid2>,...<cpuidN>. 
CPU IDs are interpreted as decimal values unless they are preceded with '0x' in which case they interpreted as hexadecimal values (not recommended) -.TP +.TP .B mask_mem:<list> bind by setting memory masks on tasks as specified where <list> is <mask1>,<mask2>,...<maskN>. @@ -596,31 +596,31 @@ preceded with an optional '0x' (not recommended) To have SLURM always report on the selected memory binding for all srun commands executed in a shell, you can also enable verbose mode separately from the command line with: -.PP -.PP -.nf +.PP +.PP +.nf setenv SLURM_MEM_BIND verbose -.fi -.PP +.fi +.PP SLURM_MEM_BIND will not propagate into the tasks environment (binding by default only affects the first srun). To propagate \-\-mem_bind to successive srun commands, first do the following in each task: -.PP -.nf +.PP +.nf setenv SLURM_MEM_BIND \\ ${SLURM_MEM_BIND_VERBOSE},${SLURM_MEM_BIND_TYPE}${SLURM_MEM_BIND_LIST} -.fi -.PP +.fi +.PP See the \fBENVIRONMENT VARIABLES\fR section for a more detailed description of the individual SLURM_CPU_BIND* and SLURM_MEM_BIND* variables. -.PP +.PP The following options support AIX systems, but may be applicable to other systems as well. Since POE is used to launch tasks, these options are not normally used or are specified using the \fBSLURM_NETWORK\fR environment variable. -.TP +.TP \fB\-\-network\fR=\fItype\fR Specify the communication protocol to be used. The interpretation of \fItype\fR is system dependent. @@ -631,46 +631,46 @@ comma\-separated and case insensitive types are recongnized: IBM systems see \fIpoe\fR documenation on the environment variables \fBMP_EUIDEVICE\fR and \fBMP_USE_BULK_XFER\fR. -.PP +.PP The following options support Blue Gene systems, but may be applicable to other systems as well. -.TP +.TP \fB\-g\fR, fB\-\-geometry\fR=\fIXxYxZ\fR Specify the geometry requirements for the job. The three numbers represent the required geometry giving dimensions in the X, Y and Z directions. 
For example "\-\-geometry=2x3x4", specifies a block of nodes having 2 x 3 x 4 = 24 nodes (actually base partions on Blue Gene). -.TP +.TP \fB\-\-conn\-type\fR=\fItype\fR Require the partition connection type to be of a certain type. On Blue Gene the acceptable of \fItype\fR are MESH, TORUS and NAV. If NAV, or if not set, then SLURM will try to fit a TORUS else MESH. You should not normally set this option. SLURM will normally allocate a TORUS if possible for a given geometry. -.TP -\fB\-R\fR, \fB\-\-no-rotate\fR +.TP +\fB\-R\fR, \fB\-\-no\-rotate\fR Disables rotation of the job's requested geometry in order to fit an appropriate partition. By default the specified geometry can rotate in three dimensions. -.PP +.PP Help options -.TP +.TP \fB\-\-help\fR Display verbose help message and exit. -.TP +.TP \fB\-\-usage\fR Display brief help message and exit. -.PP +.PP Other options -.TP +.TP \fB\-V\fR, \fB\-\-version\fR Display version information and exit. -.PP -Unless the \fB\-a\fR (\fB\-\-attach\fR) or \fB-A\fR (\fB\-\-allocate\fR) +.PP +Unless the \fB\-a\fR (\fB\-\-attach\fR) or \fB\-A\fR (\fB\-\-allocate\fR) options are specified (see \fBAllocate mode\fR and \fBAttaching to jobs\fR below), .B srun @@ -681,33 +681,33 @@ will block until the resources are free to run the job. If the \fB\-I\fR (\fB\-\-immediate\fR) option is specified .B srun will terminate if resources are not immediately available. -.PP +.PP When initiating remote processes .B srun will propagate the current working directory, unless \fB\-\-chdir\fR=\fIpath\fR is specified, in which case \fIpath\fR will become the working directory for the remote processes. -.PP -The \fB-n\fB, \fB-c\fR, and \fB-N\fR options control how CPUs and +.PP +The \fB\-n\fB, \fB\-c\fR, and \fB\-N\fR options control how CPUs and nodes will be allocated to the job. When specifying only the number -of processes to run with \fB-n\fR, a default of one CPU per process -is allocated. 
By specifying the number of CPUs required per task (\fB-c\fR), +of processes to run with \fB\-n\fR, a default of one CPU per process +is allocated. By specifying the number of CPUs required per task (\fB\-c\fR), more than one CPU may be allocated per process. If the number of nodes -is specified with \fB-N\fR, +is specified with \fB\-N\fR, .B srun will attempt to allocate \fIat least\fR the number of nodes specified. -.PP +.PP Combinations of the above three options may be used to change how processes are distributed across nodes and cpus. For instance, by specifying both the number of processes and number of nodes on which to run, the number of processes per node is implied. However, if the number of CPUs -per process is more important then number of processes (\fB-n\fR) and the -number of CPUs per process (\fB-c\fR) should be specified. -.PP +per process is more important then number of processes (\fB\-n\fR) and the +number of CPUs per process (\fB\-c\fR) should be specified. +.PP .B srun will refuse to allocate more than one process per CPU unless \fB\-\-overcommit\fR (\fB\-O\fR) is also specified. -.PP +.PP .B srun will attempt to meet the above specifications "at a minimum." That is, if 16 nodes are requested for 32 processes, and some nodes do not have @@ -716,9 +716,9 @@ demand for CPUs. In other words, a \fIminimum\fR of 16 nodes are being requested. However, if 16 nodes are requested for 15 processes, .B srun will consider this an error, as 15 processes cannot run across 16 nodes. -.PP +.PP .B "IO Redirection" -.PP +.PP By default stdout and stderr will be redirected from all tasks to the stdout and stderr of .B srun @@ -736,11 +736,11 @@ for these options are stdout stderr is redirected from all tasks to srun. stdin is broadcast to all remote tasks. (This is the default behavior) -.TP +.TP \fBnone\fR stdout and stderr is not received from any task. stdin is not sent to any task (stdin is closed). 
-.TP +.TP \fItaskid\fR stdout and/or stderr are redirected from only the task with relative id equal to \fItaskid\fR, where 0 <= \fItaskid\fR <= \fIntasks\fR, @@ -748,21 +748,21 @@ where \fIntasks\fR is the total number of tasks in the current job step. stdin is redirected from the stdin of .B srun to this same task. -.TP +.TP \fIfilename\fR .B srun will redirect stdout and/or stderr to the named file from all tasks. stdin will be redirected from the named file and broadcast to all tasks in the job. If the job is submitted in batch mode using the -.B -b +.B \-b or -.B --batch +.B \-\-batch option, \fIfilename\fR refers to a path on each of the nodes on which the job runs. Otherwise \fIfilename\fR refers to a path on the host that runs \fBsrun\fR. Depending on the cluster's file system layout, this may result in the output appearing in different places depending on whether the job is run in batch mode. -.TP +.TP format string .B srun allows for a format string to be used to generate the named IO file @@ -772,82 +772,82 @@ unique to a given jobid, stepid, node, or task. In each case, the appropriate number of files are opened and associated with the corresponding tasks. .RS 10 -.TP +.TP %J jobid.stepid of the running job. (e.g. "128.0") -.TP +.TP %j jobid of the running job. -.TP +.TP %s stepid of the running job. -.TP +.TP %N short hostname. This will create a separate IO file per node. -.TP +.TP %n Node identifier relative to current job (e.g. "0" is the first node of the running job) This will create a separate IO file per node. -.TP +.TP %t task identifier (rank) relative to current job. This will create a separate IO file per task. -.PP +.PP A number placed between the percent character and format specifier may be -used to zero-pad the result in the IO filename. This number is ignored if -the format specifier corresponds to non-numeric data (%N for example). +used to zero\-pad the result in the IO filename. 
This number is ignored if +the format specifier corresponds to non\-numeric data (%N for example). Some examples of how the format string may be used for a 4 task job step with a Job ID of 128 and step id of 0 are included below: .TP 15 job%J.out job128.0.out -.TP +.TP job%4j.out job0128.out -.TP -job%j-%2t.out -job128-00.out, job128-01.out, ... -.PP -.RS -10 -.PP +.TP +job%j\-%2t.out +job128\-00.out, job128\-01.out, ... +.PP +.RS \-10 +.PP .B "Allocate Mode" -.PP +.PP When the allocate option is specified (\fB\-A\fR, \fB\-\-allocate\fR) \fBsrun\fR will not initiate any remote processes after acquiring resources. Instead, \fBsrun\fR will spawn a subshell which has access to the acquired resources. Subsequent instances of \fBsrun\fR from within this subshell will then run on these resources. -.PP +.PP If the name of a script is specified on the commandline with \fB\-\-allocate\fR, the spawned shell will run the specified script. Resources allocated in this way will only be freed when the subshell terminates. -.PP +.PP .B "Attaching to a running job" -.PP -Use of the \fB-a\fR \fIjobid\fR (or \fB\-\-attach\fR) option allows +.PP +Use of the \fB\-a\fR \fIjobid\fR (or \fB\-\-attach\fR) option allows \fBsrun\fR to reattach to a running job, receiving stdout and stderr from the job and forwarding signals to the job, just as if the current session of \fBsrun\fR had started the job. (stdin, however, cannot be forwarded to the job). -.PP +.PP There are two ways to reattach to a running job. The default method -is to attach to the current job read-only. In this case, +is to attach to the current job read\-only. In this case, stdout and stderr are duplicated to the attaching \fBsrun\fR, but signals are not forwarded to the remote processes (A single -Ctrl-C will detach this read-only \fBsrun\fR from the job). If -the \fB-j\fR (\fB\-\-join\fR) option is is also specified, +Ctrl\-C will detach this read\-only \fBsrun\fR from the job). 
If +the \fB\-j\fR (\fB\-\-join\fR) option is is also specified, \fBsrun\fR "joins" the running job, and is able to forward signals, connects stdin, and acts for the most part much like the \fBsrun\fR process that initiated the job. -.PP +.PP Node and CPU selection options do not make sense when specifying -\fB\-\-attach\fR, and it is an error to use \fB-n\fR, \fB-c\fR, -or \fB-N\fR in attach mode. -.PP +\fB\-\-attach\fR, and it is an error to use \fB\-n\fR, \fB\-c\fR, +or \fB\-N\fR in attach mode. +.PP .SH "ENVIRONMENT VARIABLES" -.PP +.PP Some .B srun options may be set via environment variables. These environment @@ -856,147 +856,147 @@ variables, along with their corresponding options, are listed below. .TP 22 \fBSLURM_CONF\fR The location of the SLURM configuration file. -.TP +.TP \fBSLURM_ACCOUNT\fR \fB\-U, \-\-account\fR=\fIaccount\fR -.TP +.TP \fBSLURM_CPU_BIND\fR \fB\-\-cpu_bind\fR=\fItype\fR -.TP +.TP \fBSLURM_CPUS_PER_TASK\fR \fB\-c, \-\-ncpus\-per\-task\fR=\fIn\fR -.TP +.TP \fBSLURM_CONN_TYPE\fR \fB\-\-conn\-type\fR=(\fImesh|nav|torus\fR) -.TP +.TP \fBSLURM_CORE_FORMAT\fR \fB\-\-core\fR=\fIformat\fR -.TP +.TP \fBSLURM_DEBUG\fR \fB\-v, \-\-verbose\fR -.TP +.TP \fBSLURMD_DEBUG\fR -\fB\-d, \-\-slurmd-debug\fR -.TP +\fB\-d, \-\-slurmd\-debug\fR +.TP \fBSLURM_DISTRIBUTION\fR \fB\-m, \-\-distribution\fR=(\fIblock|cyclic|hostfile\fR) -.TP +.TP \fBSLURM_GEOMETRY\fR \fB\-g, \-\-geometry\fR=\fIX,Y,Z\fR -.TP +.TP \fBSLURM_LABELIO\fR -\fB-l, --label\fR -.TP +\fB\-l, \-\-label\fR +.TP \fBSLURM_MEM_BIND\fR \fB\-\-mem_bind\fR=\fItype\fR -.TP +.TP \fBSLURM_NETWORK\fR \fB\-\-network\fR=\fItype\fR -.TP +.TP \fBSLURM_NNODES\fR -\fB\-N, \-\-nodes\fR=(\fIn|min-max\fR) -.TP +\fB\-N, \-\-nodes\fR=(\fIn|min\-max\fR) +.TP \fBSLURM_NO_REQUEUE\fR \fB\-\-no\-requeue\fR -.TP +.TP \fBSLURM_NO_ROTATE\fR \fB\-\-no\-rotate\fR -.TP +.TP \fBSLURM_NPROCS\fR \fB\-n, \-\-ntasks\fR=\fIn\fR -.TP +.TP \fBSLURM_OVERCOMMIT\fR \fB\-o, \-\-overcommit\fR -.TP +.TP \fBSLURM_PARTITION\fR -\fB\-p, 
--partition\fR=\fIpartition\fR -.TP +\fB\-p, \-\-partition\fR=\fIpartition\fR +.TP \fBSLURM_REMOTE_CWD\fR -\fB\-D, --chdir=\fR=\fIdir\fR -.TP +\fB\-D, \-\-chdir=\fR=\fIdir\fR +.TP \fBSLURM_SRUN_COMM_IFHN\fR \fB\-\-ctrl\-comm\-ifhn\fR=\fIaddr\fR -.TP +.TP \fBSLURM_STDERRMODE\fR \fB\-e, \-\-error\fR=\fImode\fR -.TP +.TP \fBSLURM_STDINMODE\fR \fB\-i, \-\-input\fR=\fImode\fR -.TP +.TP \fBSLURM_STDOUTMODE\fR \fB\-o, \-\-output\fR=\fImode\fR -.TP +.TP \fBSLURM_TASK_EPILOG\fR \fB\-\-task\-epilog\fR=\fIexecutable\fR -.TP +.TP \fBSLURM_TASK_PROLOG\fR \fB\-\-task\-prolog\fR=\fIexecutable\fR -.TP +.TP \fBSLURM_TIMELIMIT\fR \fB\-t, \-\-time\fR=\fIminutes\fR -.TP +.TP \fBSLURM_WAIT\fR \fB\-W, \-\-wait\fR=\fIseconds\fR -.TP +.TP \fBSLURM_DISABLE_STATUS\fR -\fB\-X, \-\-disable-status\fR -.PP +\fB\-X, \-\-disable\-status\fR +.PP Additionally, .B srun will set some environment variables in the environment of the executing tasks on the remote compute nodes. These environment variables are: -.TP +.TP \fBSLURM_CPU_BIND_VERBOSE\fR \-\-cpu_bind verbosity (quiet,verbose). -.TP +.TP \fBSLURM_CPU_BIND_TYPE\fR \-\-cpu_bind type (none,rank,map_cpu:,mask_cpu:) -.TP +.TP \fBSLURM_CPU_BIND_LIST\fR \-\-cpu_bind map or mask list (<list of IDs or masks for this node>) -.TP +.TP \fBSLURM_CPUS_ON_NODE\fR Count of processors available to the job on this node -.TP +.TP \fBSLURM_JOBID\fR Job id of the executing job -.TP +.TP \fBSLURM_LAUNCH_NODE_IPADDR\fR IP adddress of the node from which the task launch was initiated (where the srun command ran from) -.TP +.TP \fBSLURM_LOCALID\fR Node local task ID for the process within a job -.TP +.TP \fBSLURM_MEM_BIND_VERBOSE\fR \-\-mem_bind verbosity (quiet,verbose). 
-.TP +.TP \fBSLURM_MEM_BIND_TYPE\fR \-\-mem_bind type (none,rank,map_mem:,mask_mem:) -.TP +.TP \fBSLURM_MEM_BIND_LIST\fR \-\-mem_bind map or mask list (<list of IDs or masks for this node>) -.TP +.TP \fBSLURM_NNODES\fR Total number of nodes in the job's resource allocation -.TP +.TP \fBSLURM_NODEID\fR The relative node ID of the current node -.TP +.TP \fBSLURM_NODELIST\fR List of nodes allocated to the job -.TP +.TP \fBSLURM_NPROCS\fR Total number of processes in the current job -.TP +.TP \fBSLURM_PROCID\fR The MPI rank (or relative process ID) of the current process -.TP +.TP \fBSLURM_TASKS_PER_NODE\fR Number of tasks to be initiated on each node. Values are comma separated and in the same order as SLURM_NODELIST. @@ -1005,26 +1005,26 @@ count, that count is followed by "(x#)" where "#" is the repetition count. For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the first three nodes will each execute three tasks and the fourth node will execute one task. -.TP +.TP \fBMPIRUN_PARTITION\fR The block name on Blue Gene systems only. -.TP +.TP \fBMPIRUN_NOALLOCATE\fR Do not allcate a block on Blue Gene systems only. -.TP +.TP \fBMPIRUN_NOFREE\fR Do not free a block on Blue Gene systems only. .SH "SIGNALS AND ESCAPE SEQUENCES" Signals sent to the \fBsrun\fR command are automatically forwarded to the tasks it is controlling with a few exceptions. The escape sequence -\fB<control-c>\fR will report the state of all tasks associated with -the \fBsrun\fR command. If \fB<control-c>\fR is entered twice within +\fB<control\-c>\fR will report the state of all tasks associated with +the \fBsrun\fR command. If \fB<control\-c>\fR is entered twice within one second, then the associated SIGINT signal will be sent to all tasks. -If a third \fB<control-c>\fR is received, the job will be forcefully +If a third \fB<control\-c>\fR is received, the job will be forcefully terminated without waiting for remote tasks to exit. -The escape sequence \fB<control-z>\fR is presently ignored. 
Our intent +The escape sequence \fB<control\-z>\fR is presently ignored. Our intent is for this put the \fBsrun\fR command into a mode where various special actions may be invoked. @@ -1045,10 +1045,10 @@ The \fBmpirun\fR command may need to be provided with information on its command line identifying the resources to be used. The installer of the MPICH software may configure it to perform these steps automatically. At worst, you must specify two parameters: -.TP +.TP \fB\-np SLURM_NPROCS\fR number of processors to run on -.TP +.TP \fB\-machinefile <machinefile>\fR list of computers on which to execute. This list can be constructed executing the command \fBsrun /bin/hostname\fR and writing its standard @@ -1058,39 +1058,39 @@ output to the desired file. Execute \fBmpirun \-\-help\fR for more options. Comments in the configuration file must have a "#" in collumn one. The configuration file contains the following fields separated by white space: -.TP +.TP Task rank One or more task ranks to use this configuration. Multiple values may be comma separated. Ranges may be indicated with two numbers separated with a '\-'. To indicate all tasks, specify a rank of '*' (in which case you probably should not be using this option). -.TP +.TP Executable The name of the program to execute. May be fully qualified pathname if desired. -.TP +.TP Arguments Program arguments. The expression "%t" will be replaced with the task's number. The expression "%o" will be replaced with the task's offset within -this range (e.g. a configured task rank value of "1-5" would -have offset values of "0-4"). +this range (e.g. a configured task rank value of "1\-5" would +have offset values of "0\-4"). Single quotes may be used to avoid having the enclosed values interpretted. This field is optional. 
-.PP +.PP For example: -.nf +.nf ################################################################### # srun multiple program configuration file # -# srun -n8 -l --multi-prog silly.conf +# srun \-n8 \-l \-\-multi\-prog silly.conf ################################################################### -4-6 hostname +4\-6 hostname 1,7 echo task:%t -0,2-3 echo offset:%o +0,2\-3 echo offset:%o -$ srun -n8 -l --multi-prog silly.conf +$ srun \-n8 \-l \-\-multi\-prog silly.conf 0: offset:0 1: task:1 2: offset:1 @@ -1100,7 +1100,7 @@ $ srun -n8 -l --multi-prog silly.conf 6: linux17.llnl.gov 7: task:7 -.fi +.fi .SH "EXAMPLES" @@ -1110,7 +1110,7 @@ in eight tasks. At least eight processors will be allocated to the job the request. The output of each task will be proceeded with its task number. (The machine "dev" in the example below has a total of two CPUs per node) -.nf +.nf > srun \-n8 \-l hostname 0: dev0 @@ -1122,8 +1122,8 @@ the request. The output of each task will be proceeded with its task number. 6: dev3 7: dev3 -.fi -.PP +.fi +.PP This example demonstrates how one might submit a script for later execution (batch mode). The script will be initiated when resources are available and no higher priority job is pending for the same @@ -1132,7 +1132,7 @@ implicit. Note that the script executes on one node. For the script to utilize all allocated nodes, it must execute the \fBsrun\fR command or an MPI program. -.nf +.nf > cat test.sh #!/bin/sh @@ -1142,63 +1142,63 @@ srun \-l hostname > srun \-N4 \-b test.sh srun: jobid 42 submitted -.fi -.PP +.fi +.PP The output of test.sh would be found in the default output file -"slurm-42.out." -.PP -The srun \fB-r\fR option is used within a job script +"slurm\-42.out." +.PP +The srun \fB\-r\fR option is used within a job script to run two job steps on disjoint nodes in the following example. The script is run using allocate mode instead of as a batch job in this case. 
-.nf +.nf > cat test.sh #!/bin/sh echo $SLURM_NODELIST -srun -lN2 -r2 hostname -srun -lN2 hostname +srun \-lN2 \-r2 hostname +srun \-lN2 hostname -> srun -A -N4 test.sh -dev[7-10] +> srun \-A \-N4 test.sh +dev[7\-10] 0: dev9 1: dev10 0: dev7 1: dev8 -.fi -.PP +.fi +.PP The follwing script runs two job steps in parallel within an allocated set of nodes. -.nf +.nf > cat test.sh #!/bin/bash -srun -lN2 -n4 -r 2 sleep 60 & -srun -lN2 -r 0 sleep 60 & +srun \-lN2 \-n4 \-r 2 sleep 60 & +srun \-lN2 \-r 0 sleep 60 & sleep 1 squeue -squeue -s +squeue \-s wait -> srun -A -N4 test.sh +> srun \-A \-N4 test.sh JOBID PARTITION NAME USER ST TIME NODES NODELIST - 65641 batch test.sh grondo R 0:01 4 dev[7-10] + 65641 batch test.sh grondo R 0:01 4 dev[7\-10] STEPID PARTITION USER TIME NODELIST -65641.0 batch grondo 0:01 dev[7-8] -65641.1 batch grondo 0:01 dev[9-10] +65641.0 batch grondo 0:01 dev[7\-8] +65641.1 batch grondo 0:01 dev[9\-10] -.fi -.PP +.fi +.PP This example demonstrates how one executes a simple MPICH job. We use \fBsrun\fR to build a list of machines (nodes) to be used by \fBmpirun\fR in its required format. A sample command line and the script to be executed follow. -.nf +.nf > cat test.sh #!/bin/sh @@ -1207,24 +1207,24 @@ MACHINEFILE="nodes.$SLURM_JOBID" # Generate Machinefile for mpich such that hosts are in the same # order as if run via srun # -srun -l /bin/hostname | sort -n | awk '{print $2}' > $MACHINEFILE +srun \-l /bin/hostname | sort \-n | awk '{print $2}' > $MACHINEFILE # Run using generated Machine file: -mpirun -np $SLURM_NPROCS -machinefile $MACHINEFILE mpi-app +mpirun \-np $SLURM_NPROCS \-machinefile $MACHINEFILE mpi\-app rm $MACHINEFILE -> srun -AN2 -n4 test.sh +> srun \-AN2 \-n4 test.sh .fi -.PP +.PP This simple example demonstrates the execution of different jobs on different nodes in the same srun. You can do this for any number of nodes or any number of jobs. The executables are placed on the nodes sited by the SLURM_NODEID env var. 
Starting at 0 and going to the number specified on the srun commandline. -.nf +.nf > cat test.sh case $SLURM_NODEID in @@ -1240,8 +1240,8 @@ is where I am running I am running on dev1 -.fi -.PP +.fi +.PP .SH "SEE ALSO" \fBscancel\fR(1), \fBscontrol\fR(1), \fBsqueue\fR(1), \fBslurm.conf\fR(5), \fBsched_setaffinity\fR(2), \fBnuma\fR(3) diff --git a/src/salloc/opt.c b/src/salloc/opt.c index c1c6d84888c..a9385592b37 100644 --- a/src/salloc/opt.c +++ b/src/salloc/opt.c @@ -80,22 +80,15 @@ #define OPT_DEBUG 0x03 #define OPT_DISTRIB 0x04 #define OPT_NODES 0x05 -#define OPT_OVERCOMMIT 0x06 #define OPT_CORE 0x07 #define OPT_CONN_TYPE 0x08 #define OPT_NO_ROTATE 0x0a #define OPT_GEOMETRY 0x0b -#define OPT_CPU_BIND 0x0d -#define OPT_MEM_BIND 0x0e #define OPT_BELL 0x0f #define OPT_NO_BELL 0x10 #define OPT_JOBID 0x11 /* generic getopt_long flags, integers and *not* valid characters */ -#define LONG_OPT_HELP 0x100 -#define LONG_OPT_USAGE 0x101 -#define LONG_OPT_XTO 0x102 -#define LONG_OPT_LAUNCH 0x103 #define LONG_OPT_TIMEO 0x104 #define LONG_OPT_TMP 0x106 #define LONG_OPT_MEM 0x107 @@ -104,7 +97,6 @@ #define LONG_OPT_UID 0x10a #define LONG_OPT_GID 0x10b #define LONG_OPT_CORE 0x10e -#define LONG_OPT_NOSHELL 0x10f #define LONG_OPT_DEBUG_TS 0x110 #define LONG_OPT_CONNTYPE 0x111 #define LONG_OPT_TEST_ONLY 0x113 @@ -115,8 +107,6 @@ #define LONG_OPT_MAIL_TYPE 0x11a #define LONG_OPT_MAIL_USER 0x11b #define LONG_OPT_NICE 0x11e -#define LONG_OPT_CPU_BIND 0x11f -#define LONG_OPT_MEM_BIND 0x120 #define LONG_OPT_NO_REQUEUE 0x123 #define LONG_OPT_BELL 0x124 #define LONG_OPT_NO_BELL 0x125 @@ -161,11 +151,7 @@ static void _usage(void); static bool _valid_node_list(char **node_list_pptr); static enum task_dist_states _verify_dist_type(const char *arg); static bool _verify_node_count(const char *arg, int *min, int *max); -static int _verify_cpu_bind(const char *arg, char **cpu_bind, - cpu_bind_type_t *cpu_bind_type); static int _verify_geometry(const char *arg, uint16_t *geometry); -static int 
_verify_mem_bind(const char *arg, char **mem_bind, - mem_bind_type_t *mem_bind_type); static int _verify_conn_type(const char *arg); /*---[ end forward declarations of static functions ]---------------------*/ @@ -309,203 +295,6 @@ static int _verify_geometry(const char *arg, uint16_t *geometry) return rc; } -/* - * verify cpu_bind arguments - * returns -1 on error, 0 otherwise - */ -static int _verify_cpu_bind(const char *arg, char **cpu_bind, - cpu_bind_type_t *cpu_bind_type) -{ - char *buf = xstrdup(arg); - char *pos = buf; - /* we support different launch policy names - * we also allow a verbose setting to be specified - * --cpu_bind=v - * --cpu_bind=rank,v - * --cpu_bind=rank - * --cpu_bind={MAP_CPU|MAP_MASK}:0,1,2,3,4 - */ - if (*pos) { - /* parse --cpu_bind command line arguments */ - bool fl_cpubind_verbose = 0; - char *cmd_line_affinity = NULL; - char *cmd_line_mapping = NULL; - char *mappos = strchr(pos,':'); - if (!mappos) { - mappos = strchr(pos,'='); - } - if (strncasecmp(pos, "quiet", 5) == 0) { - fl_cpubind_verbose=0; - pos+=5; - } else if (*pos=='q' || *pos=='Q') { - fl_cpubind_verbose=0; - pos++; - } - if (strncasecmp(pos, "verbose", 7) == 0) { - fl_cpubind_verbose=1; - pos+=7; - } else if (*pos=='v' || *pos=='V') { - fl_cpubind_verbose=1; - pos++; - } - if (*pos==',') { - pos++; - } - if (*pos) { - char *vpos=NULL; - cmd_line_affinity = pos; - if (((vpos=strstr(pos,",q")) !=0 ) || - ((vpos=strstr(pos,",Q")) !=0 )) { - *vpos='\0'; - fl_cpubind_verbose=0; - } - if (((vpos=strstr(pos,",v")) !=0 ) || - ((vpos=strstr(pos,",V")) !=0 )) { - *vpos='\0'; - fl_cpubind_verbose=1; - } - } - if (mappos) { - *mappos='\0'; - mappos++; - cmd_line_mapping=mappos; - } - - /* convert parsed command line args into interface */ - if (cmd_line_mapping) { - xfree(*cpu_bind); - *cpu_bind = xstrdup(cmd_line_mapping); - } - if (fl_cpubind_verbose) { - *cpu_bind_type |= CPU_BIND_VERBOSE; - } - if (cmd_line_affinity) { - *cpu_bind_type &= CPU_BIND_VERBOSE; /* clear any - * 
previous type */ - if ((strcasecmp(cmd_line_affinity, "no") == 0) || - (strcasecmp(cmd_line_affinity, "none") == 0)) { - *cpu_bind_type |= CPU_BIND_NONE; - } else if (strcasecmp(cmd_line_affinity, "rank") == 0) { - *cpu_bind_type |= CPU_BIND_RANK; - } else if ((strcasecmp(cmd_line_affinity, "map_cpu") == 0) || - (strcasecmp(cmd_line_affinity, "mapcpu") == 0)) { - *cpu_bind_type |= CPU_BIND_MAPCPU; - } else if ((strcasecmp(cmd_line_affinity, "mask_cpu") == 0) || - (strcasecmp(cmd_line_affinity, "maskcpu") == 0)) { - *cpu_bind_type |= CPU_BIND_MASKCPU; - } else { - error("unrecognized --cpu_bind argument \"%s\"", - cmd_line_affinity); - xfree(buf); - return 1; - } - } - } - - xfree(buf); - return 0; -} - -/* - * verify mem_bind arguments - * returns -1 on error, 0 otherwise - */ -static int _verify_mem_bind(const char *arg, char **mem_bind, - mem_bind_type_t *mem_bind_type) -{ - char *buf = xstrdup(arg); - char *pos = buf; - /* we support different launch policy names - * we also allow a verbose setting to be specified - * --mem_bind=v - * --mem_bind=rank,v - * --mem_bind=rank - * --mem_bind={MAP_CPU|MAP_MASK}:0,1,2,3,4 - */ - if (*pos) { - /* parse --mem_bind command line arguments */ - bool fl_membind_verbose = 0; - char *cmd_line_affinity = NULL; - char *cmd_line_mapping = NULL; - char *mappos = strchr(pos,':'); - if (!mappos) { - mappos = strchr(pos,'='); - } - if (strncasecmp(pos, "quiet", 5) == 0) { - fl_membind_verbose = 0; - pos+=5; - } else if (*pos=='q' || *pos=='Q') { - fl_membind_verbose = 0; - pos++; - } - if (strncasecmp(pos, "verbose", 7) == 0) { - fl_membind_verbose = 1; - pos+=7; - } else if (*pos=='v' || *pos=='V') { - fl_membind_verbose = 1; - pos++; - } - if (*pos==',') { - pos++; - } - if (*pos) { - char *vpos=NULL; - cmd_line_affinity = pos; - if (((vpos=strstr(pos,",q")) !=0 ) || - ((vpos=strstr(pos,",Q")) !=0 )) { - *vpos='\0'; - fl_membind_verbose = 0; - } - if (((vpos=strstr(pos,",v")) !=0 ) || - ((vpos=strstr(pos,",V")) !=0 )) { - 
*vpos='\0'; - fl_membind_verbose = 1; - } - } - if (mappos) { - *mappos='\0'; - mappos++; - cmd_line_mapping=mappos; - } - - /* convert parsed command line args into interface */ - if (cmd_line_mapping) { - xfree(*mem_bind); - *mem_bind = xstrdup(cmd_line_mapping); - } - if (fl_membind_verbose) { - *mem_bind_type |= MEM_BIND_VERBOSE; - } - if (cmd_line_affinity) { - *mem_bind_type &= MEM_BIND_VERBOSE; /* clear any - * previous type */ - if ((strcasecmp(cmd_line_affinity, "no") == 0) || - (strcasecmp(cmd_line_affinity, "none") == 0)) { - *mem_bind_type |= MEM_BIND_NONE; - } else if (strcasecmp(cmd_line_affinity, "rank") == 0) { - *mem_bind_type |= MEM_BIND_RANK; - } else if (strcasecmp(cmd_line_affinity, "local") == 0) { - *mem_bind_type |= MEM_BIND_LOCAL; - } else if ((strcasecmp(cmd_line_affinity, "map_mem") == 0) || - (strcasecmp(cmd_line_affinity, "mapmem") == 0)) { - *mem_bind_type |= MEM_BIND_MAPCPU; - } else if ((strcasecmp(cmd_line_affinity, "mask_mem") == 0) || - (strcasecmp(cmd_line_affinity, "maskmem") == 0)) { - *mem_bind_type |= MEM_BIND_MASKCPU; - } else { - error("unrecognized --mem_bind argument \"%s\"", - cmd_line_affinity); - xfree(buf); - return 1; - } - } - } - - xfree(buf); - return 0; -} - - /* Convert a string into a node count */ static int _str_to_nodes(const char *num_str, char **leftover) @@ -700,10 +489,6 @@ static void _opt_default() opt.min_nodes = 1; opt.max_nodes = 0; opt.nodes_set = false; - opt.cpu_bind_type = 0; - opt.cpu_bind = NULL; - opt.mem_bind_type = 0; - opt.mem_bind = NULL; opt.time_limit = -1; opt.partition = NULL; @@ -714,7 +499,6 @@ static void _opt_default() opt.distribution = SLURM_DIST_CYCLIC; - opt.overcommit = false; opt.share = false; opt.no_kill = false; opt.kill_command_signal = SIGTERM; @@ -722,11 +506,7 @@ static void _opt_default() opt.immediate = false; opt.no_requeue = false; - - opt.noshell = false; opt.max_wait = 0; - - opt.quit_on_intr = false; opt.test_only = false; opt.quiet = 0; @@ -743,8 +523,6 @@ 
static void _opt_default() opt.exclusive = false; opt.nodelist = NULL; opt.exc_nodes = NULL; - opt.max_launch_time = 120;/* 120 seconds to launch job */ - opt.max_exit_timeout= 60; /* Warn user 60 seconds after task exit */ opt.msg_timeout = 5; /* Default launch msg timeout */ for (i=0; i<SYSTEM_DIMENSIONS; i++) @@ -779,24 +557,21 @@ struct env_vars { }; env_vars_t env_vars[] = { - {"SLURM_ACCOUNT", OPT_STRING, &opt.account, NULL }, - {"SLURM_CPUS_PER_TASK", OPT_INT, &opt.cpus_per_task, &opt.cpus_set }, - {"SLURM_CONN_TYPE", OPT_CONN_TYPE, NULL, NULL }, - {"SLURM_CPU_BIND", OPT_CPU_BIND, NULL, NULL }, - {"SLURM_MEM_BIND", OPT_MEM_BIND, NULL, NULL }, - {"SLURM_DEBUG", OPT_DEBUG, NULL, NULL }, - {"SLURM_DISTRIBUTION", OPT_DISTRIB, NULL, NULL }, - {"SLURM_GEOMETRY", OPT_GEOMETRY, NULL, NULL }, - {"SLURM_IMMEDIATE", OPT_INT, &opt.immediate, NULL }, - {"SLURM_JOBID", OPT_JOBID, NULL, NULL }, - {"SLURM_NNODES", OPT_NODES, NULL, NULL }, - {"SLURM_NO_REQUEUE", OPT_INT, &opt.no_requeue, NULL }, - {"SLURM_NO_ROTATE", OPT_NO_ROTATE, NULL, NULL }, - {"SLURM_NPROCS", OPT_INT, &opt.nprocs, &opt.nprocs_set}, - {"SLURM_OVERCOMMIT", OPT_OVERCOMMIT, NULL, NULL }, - {"SLURM_PARTITION", OPT_STRING, &opt.partition, NULL }, - {"SLURM_TIMELIMIT", OPT_INT, &opt.time_limit, NULL }, - {"SLURM_WAIT", OPT_INT, &opt.max_wait, NULL }, + {"SALLOC_ACCOUNT", OPT_STRING, &opt.account, NULL }, + {"SALLOC_CPUS_PER_TASK", OPT_INT, &opt.cpus_per_task, &opt.cpus_set }, + {"SALLOC_CONN_TYPE", OPT_CONN_TYPE, NULL, NULL }, + {"SALLOC_DEBUG", OPT_DEBUG, NULL, NULL }, + {"SALLOC_DISTRIBUTION", OPT_DISTRIB, NULL, NULL }, + {"SALLOC_GEOMETRY", OPT_GEOMETRY, NULL, NULL }, + {"SALLOC_IMMEDIATE", OPT_INT, &opt.immediate, NULL }, + {"SALLOC_JOBID", OPT_JOBID, NULL, NULL }, + {"SALLOC_NNODES", OPT_NODES, NULL, NULL }, + {"SALLOC_NO_REQUEUE", OPT_INT, &opt.no_requeue, NULL }, + {"SALLOC_NO_ROTATE", OPT_NO_ROTATE, NULL, NULL }, + {"SALLOC_NPROCS", OPT_INT, &opt.nprocs, &opt.nprocs_set}, + {"SALLOC_PARTITION", 
OPT_STRING, &opt.partition, NULL }, + {"SALLOC_TIMELIMIT", OPT_INT, &opt.time_limit, NULL }, + {"SALLOC_WAIT", OPT_INT, &opt.max_wait, NULL }, {"SALLOC_BELL", OPT_BELL, NULL, NULL }, {"SALLOC_NO_BELL", OPT_NO_BELL, NULL, NULL }, {NULL, 0, NULL, NULL} @@ -862,18 +637,6 @@ _process_env_var(env_vars_t *e, const char *val) opt.distribution = dt; break; - case OPT_CPU_BIND: - if (_verify_cpu_bind(val, &opt.cpu_bind, - &opt.cpu_bind_type)) - exit(1); - break; - - case OPT_MEM_BIND: - if (_verify_mem_bind(val, &opt.mem_bind, - &opt.mem_bind_type)) - exit(1); - break; - case OPT_NODES: opt.nodes_set = _verify_node_count( val, &opt.min_nodes, @@ -883,10 +646,6 @@ _process_env_var(env_vars_t *e, const char *val) } break; - case OPT_OVERCOMMIT: - opt.overcommit = true; - break; - case OPT_CONN_TYPE: opt.conn_type = _verify_conn_type(val); break; @@ -951,6 +710,7 @@ void set_options(const int argc, char **argv) {"constraint", required_argument, 0, 'C'}, {"dependency", required_argument, 0, 'D'}, {"geometry", required_argument, 0, 'g'}, + {"help", no_argument, 0, 'h'}, {"hold", no_argument, 0, 'H'}, {"immediate", no_argument, 0, 'I'}, {"job-name", required_argument, 0, 'J'}, @@ -958,13 +718,12 @@ void set_options(const int argc, char **argv) {"kill-command", optional_argument, 0, 'K'}, {"distribution", required_argument, 0, 'm'}, {"nodes", required_argument, 0, 'N'}, - {"overcommit", no_argument, 0, 'O'}, {"partition", required_argument, 0, 'p'}, - {"quit-on-interrupt", no_argument, 0, 'q'}, - {"quiet", no_argument, 0, 'Q'}, + {"quiet", no_argument, 0, 'q'}, {"no-rotate", no_argument, 0, 'R'}, {"share", no_argument, 0, 's'}, {"time", required_argument, 0, 't'}, + {"usage", no_argument, 0, 'u'}, {"account", required_argument, 0, 'U'}, {"verbose", no_argument, 0, 'v'}, {"version", no_argument, 0, 'V'}, @@ -973,20 +732,13 @@ void set_options(const int argc, char **argv) {"exclude", required_argument, 0, 'x'}, {"contiguous", no_argument, 0, LONG_OPT_CONT}, {"exclusive", 
no_argument, 0, LONG_OPT_EXCLUSIVE}, - {"cpu_bind", required_argument, 0, LONG_OPT_CPU_BIND}, - {"mem_bind", required_argument, 0, LONG_OPT_MEM_BIND}, {"mincpus", required_argument, 0, LONG_OPT_MINCPU}, {"mem", required_argument, 0, LONG_OPT_MEM}, - {"no-shell", no_argument, 0, LONG_OPT_NOSHELL}, {"tmp", required_argument, 0, LONG_OPT_TMP}, {"msg-timeout", required_argument, 0, LONG_OPT_TIMEO}, - {"max-launch-time", required_argument, 0, LONG_OPT_LAUNCH}, - {"max-exit-timeout", required_argument, 0, LONG_OPT_XTO}, {"uid", required_argument, 0, LONG_OPT_UID}, {"gid", required_argument, 0, LONG_OPT_GID}, {"debugger-test", no_argument, 0, LONG_OPT_DEBUG_TS}, - {"help", no_argument, 0, LONG_OPT_HELP}, - {"usage", no_argument, 0, LONG_OPT_USAGE}, {"conn-type", required_argument, 0, LONG_OPT_CONNTYPE}, {"test-only", no_argument, 0, LONG_OPT_TEST_ONLY}, {"network", required_argument, 0, LONG_OPT_NETWORK}, @@ -1001,7 +753,7 @@ void set_options(const int argc, char **argv) {"jobid", required_argument, 0, LONG_OPT_JOBID}, {NULL, 0, 0, 0} }; - char *opt_string = "+a:c:C:D:g:HIJ:kK::m:n:N:Op:qQR:st:U:vVw:W:x:"; + char *opt_string = "+a:c:C:D:g:hHIJ:kK::m:n:N:p:qR:st:uU:vVw:W:x:"; opt.progname = xbasename(argv[0]); optind = 0; @@ -1030,6 +782,9 @@ void set_options(const int argc, char **argv) if (_verify_geometry(optarg, opt.geometry)) exit(1); break; + case 'h': + _help(); + exit(0); case 'H': opt.hold = true; break; @@ -1074,17 +829,11 @@ void set_options(const int argc, char **argv) exit(1); } break; - case 'O': - opt.overcommit = true; - break; case 'p': xfree(opt.partition); opt.partition = xstrdup(optarg); break; case 'q': - opt.quit_on_intr = true; - break; - case 'Q': opt.quiet++; break; case 'R': @@ -1096,6 +845,9 @@ void set_options(const int argc, char **argv) case 't': opt.time_limit = _get_int(optarg, "time"); break; + case 'u': + _usage(); + exit(0); case 'U': xfree(opt.account); opt.account = xstrdup(optarg); @@ -1134,16 +886,6 @@ void set_options(const int argc, 
char **argv) case LONG_OPT_EXCLUSIVE: opt.exclusive = true; break; - case LONG_OPT_CPU_BIND: - if (_verify_cpu_bind(optarg, &opt.cpu_bind, - &opt.cpu_bind_type)) - exit(1); - break; - case LONG_OPT_MEM_BIND: - if (_verify_mem_bind(optarg, &opt.mem_bind, - &opt.mem_bind_type)) - exit(1); - break; case LONG_OPT_MINCPU: opt.mincpus = _get_int(optarg, "mincpus"); break; @@ -1155,9 +897,6 @@ void set_options(const int argc, char **argv) exit(1); } break; - case LONG_OPT_NOSHELL: - opt.noshell = true; - break; case LONG_OPT_TMP: opt.tmpdisk = _to_bytes(optarg); if (opt.tmpdisk < 0) { @@ -1169,14 +908,6 @@ void set_options(const int argc, char **argv) opt.msg_timeout = _get_int(optarg, "msg-timeout"); break; - case LONG_OPT_LAUNCH: - opt.max_launch_time = - _get_int(optarg, "max-launch-time"); - break; - case LONG_OPT_XTO: - opt.max_exit_timeout = - _get_int(optarg, "max-exit-timeout"); - break; case LONG_OPT_UID: opt.euid = uid_from_string (optarg); if (opt.euid == (uid_t) -1) @@ -1187,12 +918,6 @@ void set_options(const int argc, char **argv) if (opt.egid == (gid_t) -1) fatal ("--gid=\"%s\" invalid", optarg); break; - case LONG_OPT_HELP: - _help(); - exit(0); - case LONG_OPT_USAGE: - _usage(); - exit(0); case LONG_OPT_CONNTYPE: opt.conn_type = _verify_conn_type(optarg); break; @@ -1300,7 +1025,7 @@ static bool _opt_verify(void) bool verified = true; if (opt.quiet && opt.verbose) { - error ("don't specify both --verbose (-v) and --quiet (-Q)"); + error ("don't specify both --verbose (-v) and --quiet (-q)"); verified = false; } @@ -1360,12 +1085,6 @@ static bool _opt_verify(void) } /* else if (opt.nprocs_set && !opt.nodes_set) */ - /* - * --wait always overrides hidden max_exit_timeout - */ - if (opt.max_wait) - opt.max_exit_timeout = opt.max_wait; - if (opt.time_limit == 0) opt.time_limit = INFINITE; @@ -1570,14 +1289,9 @@ static void _opt_list() if (opt.jobid != NO_VAL) info("jobid : %u", opt.jobid); info("distribution : %s", format_task_dist_states(opt.distribution)); 
- info("cpu_bind : %s", - opt.cpu_bind == NULL ? "default" : opt.cpu_bind); - info("mem_bind : %s", - opt.mem_bind == NULL ? "default" : opt.mem_bind); info("verbose : %d", opt.verbose); info("immediate : %s", tf_(opt.immediate)); info("no-requeue : %s", tf_(opt.no_requeue)); - info("overcommit : %s", tf_(opt.overcommit)); if (opt.time_limit == INFINITE) info("time_limit : INFINITE"); else @@ -1620,14 +1334,13 @@ static void _usage(void) printf( "Usage: salloc [-N numnodes|[min nodes]-[max nodes]] [-n num-processors]\n" " [[-c cpus-per-node] [-r n] [-p partition] [--hold] [-t minutes]\n" -" [--immediate] [--overcommit] [--no-kill]\n" +" [--immediate] [--no-kill]\n" " [--share] [-m dist] [-J jobname] [--jobid=id]\n" " [--verbose]\n" " [-W sec]\n" " [--contiguous] [--mincpus=n] [--mem=MB] [--tmp=MB] [-C list]\n" " [--mpi=type] [--account=name] [--dependency=jobid]\n" -" [--kill-on-bad-exit] [--propagate[=rlimits] ]\n" -" [--cpu_bind=...] [--mem_bind=...]\n" +" [--propagate[=rlimits] ]\n" #ifdef HAVE_BG /* Blue gene specific options */ " [--geometry=XxYxZ] [--conn-type=type] [--no-rotate]\n" #endif @@ -1649,7 +1362,6 @@ static void _help(void) " -H, --hold submit job in held state\n" " -t, --time=minutes time limit\n" " -I, --immediate exit if resources are not immediately available\n" -" -O, --overcommit overcommit resources\n" " -k, --no-kill do not kill job on node failure\n" " -s, --share share nodes with other jobs\n" " -m, --distribution=type distribution method for processes to nodes\n" @@ -1659,9 +1371,8 @@ static void _help(void) " --mpi=type type of MPI being used\n" " -W, --wait=sec seconds to wait for allocation if not\n" " immediately available\n" -" -q, --quit-on-interrupt quit on single Ctrl-C\n" " -v, --verbose verbose mode (multiple -v's increase verbosity)\n" -" -Q, --quiet quiet mode (suppress informational messages)\n" +" -q, --quiet quiet mode (suppress informational messages)\n" " -P, --dependency=jobid defer job until specified jobid 
completes\n" " --nice[=value] decrease secheduling priority by value\n" " -U, --account=name charge job to specified account\n" @@ -1671,7 +1382,6 @@ static void _help(void) " --mail-type=type notify on state change: BEGIN, END, FAIL or ALL\n" " --mail-user=user who to send email notification for job state changes\n" " --no-requeue if set, do not permit the job to be requeued\n" -" --no-shell don't spawn shell in allocate mode\n" "\n" "Constraint options:\n" " --mincpus=n minimum number of cpus per node\n" @@ -1681,32 +1391,10 @@ static void _help(void) " -C, --constraint=list specify a list of constraints\n" " -w, --nodelist=hosts... request a specific list of hosts\n" " -x, --exclude=hosts... exclude a specific list of hosts\n" -" -Z, --no-allocate don't allocate nodes (must supply -w)\n" "\n" "Consumable resources related options:\n" " --exclusive allocate nodes in exclusive mode when\n" -" cpu consumable resource is enabled\n" -"\n" -"Affinity/Multi-core options: (when the task/affinity plugin is enabled)\n" -" --cpu_bind= Bind tasks to CPUs\n" -" q[uiet], quietly bind before task runs (default)\n" -" v[erbose], verbosely report binding before task runs\n" -" no[ne] don't bind tasks to CPUs (default)\n" -" rank bind by task rank\n" -" map_cpu:<list> bind by mapping CPU IDs to tasks as specified\n" -" where <list> is <cpuid1>,<cpuid2>,...<cpuidN>\n" -" mask_cpu:<list> bind by setting CPU masks on tasks as specified\n" -" where <list> is <mask1>,<mask2>,...<maskN>\n" -" --mem_bind= Bind tasks to memory\n" -" q[uiet], quietly bind before task runs (default)\n" -" v[erbose], verbosely report binding before task runs\n" -" no[ne] don't bind tasks to memory (default)\n" -" rank bind by task rank\n" -" local bind to memory local to processor\n" -" map_mem:<list> bind by mapping memory of CPU IDs to tasks as specified\n" -" where <list> is <cpuid1>,<cpuid2>,...<cpuidN>\n" -" mask_mem:<list> bind by setting menory of CPU masks on tasks as specified\n" -" where <list> is 
<mask1>,<mask2>,...<maskN>\n"); +" cpu consumable resource is enabled\n"); printf("\n"); @@ -1726,8 +1414,8 @@ static void _help(void) "\n" #endif "Help options:\n" -" --help show this help message\n" -" --usage display brief usage message\n" +" -h, --help show this help message\n" +" -u, --usage display brief usage message\n" "\n" "Other options:\n" " -V, --version output version information and exit\n" diff --git a/src/salloc/opt.h b/src/salloc/opt.h index d16f84840ed..4bfd9f67d3b 100644 --- a/src/salloc/opt.h +++ b/src/salloc/opt.h @@ -66,10 +66,6 @@ typedef struct salloc_options { bool cpus_set; /* true if cpus_per_task explicitly set */ int min_nodes; /* --nodes=n, -N n */ int max_nodes; /* --nodes=x-n, -N x-n */ - cpu_bind_type_t cpu_bind_type; /* --cpu_bind= */ - char *cpu_bind; /* binding map for map/mask_cpu */ - mem_bind_type_t mem_bind_type; /* --mem_bind= */ - char *mem_bind; /* binding map for map/mask_mem */ bool nodes_set; /* true if nodes explicitly set */ int time_limit; /* --time, -t */ char *partition; /* --partition=n, -p n */ @@ -84,15 +80,12 @@ typedef struct salloc_options { int immediate; /* -i, --immediate */ bool hold; /* --hold, -H */ - bool noshell; /* --noshell */ - bool overcommit; /* --overcommit, -O */ bool no_kill; /* --no-kill, -k */ int kill_command_signal;/* --kill-command, -K */ bool kill_command_signal_set; bool no_requeue; /* --no-requeue */ bool share; /* --share, -s */ int max_wait; /* --wait, -W */ - bool quit_on_intr; /* --quit-on-interrupt, -q */ int quiet; int verbose; bool test_only; /* --test-only */ @@ -106,8 +99,6 @@ typedef struct salloc_options { bool contiguous; /* --contiguous */ char *nodelist; /* --nodelist=node1,node2,... */ char *exc_nodes; /* --exclude=node1,node2,... 
-x */ - int max_launch_time; /* Undocumented */ - int max_exit_timeout; /* Undocumented */ int msg_timeout; /* Undocumented */ char *network; /* --network= */ bool exclusive; /* --exclusive */ diff --git a/src/salloc/salloc.c b/src/salloc/salloc.c index 738ed064f43..5af47959ddb 100644 --- a/src/salloc/salloc.c +++ b/src/salloc/salloc.c @@ -272,11 +272,7 @@ static int fill_job_desc_from_opts(job_desc_msg_t *desc) desc->min_memory = opt.realmem; if (opt.tmpdisk > -1) desc->min_tmp_disk = opt.tmpdisk; - if (opt.overcommit) { - desc->num_procs = opt.min_nodes; - desc->overcommit = opt.overcommit; - } else - desc->num_procs = opt.nprocs * opt.cpus_per_task; + desc->num_procs = opt.nprocs * opt.cpus_per_task; if (opt.nprocs_set) desc->num_tasks = opt.nprocs; if (opt.cpus_set) diff --git a/src/sbatch/opt.c b/src/sbatch/opt.c index 75b0defd4cc..bb953021f4b 100644 --- a/src/sbatch/opt.c +++ b/src/sbatch/opt.c @@ -84,14 +84,9 @@ #define OPT_NO_ROTATE 0x0a #define OPT_GEOMETRY 0x0b #define OPT_MPI 0x0c -#define OPT_CPU_BIND 0x0d -#define OPT_MEM_BIND 0x0e #define OPT_MULTI 0x0f /* generic getopt_long flags, integers and *not* valid characters */ -#define LONG_OPT_USAGE 0x101 -#define LONG_OPT_XTO 0x102 -#define LONG_OPT_LAUNCH 0x103 #define LONG_OPT_TIMEO 0x104 #define LONG_OPT_JOBID 0x105 #define LONG_OPT_TMP 0x106 @@ -102,7 +97,6 @@ #define LONG_OPT_GID 0x10b #define LONG_OPT_MPI 0x10c #define LONG_OPT_CORE 0x10e -#define LONG_OPT_NOSHELL 0x10f #define LONG_OPT_DEBUG_TS 0x110 #define LONG_OPT_CONNTYPE 0x111 #define LONG_OPT_TEST_ONLY 0x113 @@ -115,9 +109,6 @@ #define LONG_OPT_TASK_PROLOG 0x11c #define LONG_OPT_TASK_EPILOG 0x11d #define LONG_OPT_NICE 0x11e -#define LONG_OPT_CPU_BIND 0x11f -#define LONG_OPT_MEM_BIND 0x120 -#define LONG_OPT_CTRL_COMM_IFHN 0x121 #define LONG_OPT_NO_REQUEUE 0x123 /*---- global variables, defined in opt.h ----*/ @@ -168,11 +159,7 @@ static void _usage(void); static bool _valid_node_list(char **node_list_pptr); static enum task_dist_states 
_verify_dist_type(const char *arg); static bool _verify_node_count(const char *arg, int *min, int *max); -static int _verify_cpu_bind(const char *arg, char **cpu_bind, - cpu_bind_type_t *cpu_bind_type); static int _verify_geometry(const char *arg, uint16_t *geometry); -static int _verify_mem_bind(const char *arg, char **mem_bind, - mem_bind_type_t *mem_bind_type); static int _verify_conn_type(const char *arg); static char *_fullpath(const char *filename); static void _set_options(int argc, char **argv); @@ -300,202 +287,6 @@ static int _verify_geometry(const char *arg, uint16_t *geometry) return rc; } -/* - * verify cpu_bind arguments - * returns -1 on error, 0 otherwise - */ -static int _verify_cpu_bind(const char *arg, char **cpu_bind, - cpu_bind_type_t *cpu_bind_type) -{ - char *buf = xstrdup(arg); - char *pos = buf; - /* we support different launch policy names - * we also allow a verbose setting to be specified - * --cpu_bind=v - * --cpu_bind=rank,v - * --cpu_bind=rank - * --cpu_bind={MAP_CPU|MAP_MASK}:0,1,2,3,4 - */ - if (*pos) { - /* parse --cpu_bind command line arguments */ - bool fl_cpubind_verbose = 0; - char *cmd_line_affinity = NULL; - char *cmd_line_mapping = NULL; - char *mappos = strchr(pos,':'); - if (!mappos) { - mappos = strchr(pos,'='); - } - if (strncasecmp(pos, "quiet", 5) == 0) { - fl_cpubind_verbose=0; - pos+=5; - } else if (*pos=='q' || *pos=='Q') { - fl_cpubind_verbose=0; - pos++; - } - if (strncasecmp(pos, "verbose", 7) == 0) { - fl_cpubind_verbose=1; - pos+=7; - } else if (*pos=='v' || *pos=='V') { - fl_cpubind_verbose=1; - pos++; - } - if (*pos==',') { - pos++; - } - if (*pos) { - char *vpos=NULL; - cmd_line_affinity = pos; - if (((vpos=strstr(pos,",q")) !=0 ) || - ((vpos=strstr(pos,",Q")) !=0 )) { - *vpos='\0'; - fl_cpubind_verbose=0; - } - if (((vpos=strstr(pos,",v")) !=0 ) || - ((vpos=strstr(pos,",V")) !=0 )) { - *vpos='\0'; - fl_cpubind_verbose=1; - } - } - if (mappos) { - *mappos='\0'; - mappos++; - cmd_line_mapping=mappos; - } - - 
/* convert parsed command line args into interface */ - if (cmd_line_mapping) { - xfree(*cpu_bind); - *cpu_bind = xstrdup(cmd_line_mapping); - } - if (fl_cpubind_verbose) { - *cpu_bind_type |= CPU_BIND_VERBOSE; - } - if (cmd_line_affinity) { - *cpu_bind_type &= CPU_BIND_VERBOSE; /* clear any - * previous type */ - if ((strcasecmp(cmd_line_affinity, "no") == 0) || - (strcasecmp(cmd_line_affinity, "none") == 0)) { - *cpu_bind_type |= CPU_BIND_NONE; - } else if (strcasecmp(cmd_line_affinity, "rank") == 0) { - *cpu_bind_type |= CPU_BIND_RANK; - } else if ((strcasecmp(cmd_line_affinity, "map_cpu") == 0) || - (strcasecmp(cmd_line_affinity, "mapcpu") == 0)) { - *cpu_bind_type |= CPU_BIND_MAPCPU; - } else if ((strcasecmp(cmd_line_affinity, "mask_cpu") == 0) || - (strcasecmp(cmd_line_affinity, "maskcpu") == 0)) { - *cpu_bind_type |= CPU_BIND_MASKCPU; - } else { - error("unrecognized --cpu_bind argument \"%s\"", - cmd_line_affinity); - xfree(buf); - return 1; - } - } - } - - xfree(buf); - return 0; -} - -/* - * verify mem_bind arguments - * returns -1 on error, 0 otherwise - */ -static int _verify_mem_bind(const char *arg, char **mem_bind, - mem_bind_type_t *mem_bind_type) -{ - char *buf = xstrdup(arg); - char *pos = buf; - /* we support different launch policy names - * we also allow a verbose setting to be specified - * --mem_bind=v - * --mem_bind=rank,v - * --mem_bind=rank - * --mem_bind={MAP_CPU|MAP_MASK}:0,1,2,3,4 - */ - if (*pos) { - /* parse --mem_bind command line arguments */ - bool fl_membind_verbose = 0; - char *cmd_line_affinity = NULL; - char *cmd_line_mapping = NULL; - char *mappos = strchr(pos,':'); - if (!mappos) { - mappos = strchr(pos,'='); - } - if (strncasecmp(pos, "quiet", 5) == 0) { - fl_membind_verbose = 0; - pos+=5; - } else if (*pos=='q' || *pos=='Q') { - fl_membind_verbose = 0; - pos++; - } - if (strncasecmp(pos, "verbose", 7) == 0) { - fl_membind_verbose = 1; - pos+=7; - } else if (*pos=='v' || *pos=='V') { - fl_membind_verbose = 1; - pos++; - } - 
if (*pos==',') { - pos++; - } - if (*pos) { - char *vpos=NULL; - cmd_line_affinity = pos; - if (((vpos=strstr(pos,",q")) !=0 ) || - ((vpos=strstr(pos,",Q")) !=0 )) { - *vpos='\0'; - fl_membind_verbose = 0; - } - if (((vpos=strstr(pos,",v")) !=0 ) || - ((vpos=strstr(pos,",V")) !=0 )) { - *vpos='\0'; - fl_membind_verbose = 1; - } - } - if (mappos) { - *mappos='\0'; - mappos++; - cmd_line_mapping=mappos; - } - - /* convert parsed command line args into interface */ - if (cmd_line_mapping) { - xfree(*mem_bind); - *mem_bind = xstrdup(cmd_line_mapping); - } - if (fl_membind_verbose) { - *mem_bind_type |= MEM_BIND_VERBOSE; - } - if (cmd_line_affinity) { - *mem_bind_type &= MEM_BIND_VERBOSE; /* clear any - * previous type */ - if ((strcasecmp(cmd_line_affinity, "no") == 0) || - (strcasecmp(cmd_line_affinity, "none") == 0)) { - *mem_bind_type |= MEM_BIND_NONE; - } else if (strcasecmp(cmd_line_affinity, "rank") == 0) { - *mem_bind_type |= MEM_BIND_RANK; - } else if (strcasecmp(cmd_line_affinity, "local") == 0) { - *mem_bind_type |= MEM_BIND_LOCAL; - } else if ((strcasecmp(cmd_line_affinity, "map_mem") == 0) || - (strcasecmp(cmd_line_affinity, "mapmem") == 0)) { - *mem_bind_type |= MEM_BIND_MAPCPU; - } else if ((strcasecmp(cmd_line_affinity, "mask_mem") == 0) || - (strcasecmp(cmd_line_affinity, "maskmem") == 0)) { - *mem_bind_type |= MEM_BIND_MASKCPU; - } else { - error("unrecognized --mem_bind argument \"%s\"", - cmd_line_affinity); - xfree(buf); - return 1; - } - } - } - - xfree(buf); - return 0; -} - /* * verify that a node count in arg is of a known form (count or min-max) * OUT min, max specified minimum and maximum node counts @@ -661,10 +452,6 @@ static void _opt_default() opt.min_nodes = 1; opt.max_nodes = 0; opt.nodes_set = false; - opt.cpu_bind_type = 0; - opt.cpu_bind = NULL; - opt.mem_bind_type = 0; - opt.mem_bind = NULL; opt.time_limit = -1; opt.partition = NULL; @@ -678,16 +465,10 @@ static void _opt_default() opt.share = false; opt.no_kill = false; - 
opt.kill_bad_exit = false; opt.immediate = false; opt.no_requeue = false; - - opt.noshell = false; opt.max_wait = slurm_get_wait_time(); - - opt.quit_on_intr = false; - opt.disable_status = false; opt.test_only = false; opt.quiet = 0; @@ -704,8 +485,6 @@ static void _opt_default() opt.exclusive = false; opt.nodelist = NULL; opt.exc_nodes = NULL; - opt.max_launch_time = 120;/* 120 seconds to launch job */ - opt.max_exit_timeout= 60; /* Warn user 60 seconds after task exit */ opt.msg_timeout = 5; /* Default launch msg timeout */ for (i=0; i<SYSTEM_DIMENSIONS; i++) @@ -724,9 +503,6 @@ static void _opt_default() opt.ifname = NULL; opt.ofname = NULL; opt.efname = NULL; - - opt.ctrl_comm_ifhn = xshort_hostname(); - } /*---[ env var processing ]-----------------------------------------------*/ @@ -748,30 +524,23 @@ struct env_vars { }; env_vars_t env_vars[] = { - {"SLURM_ACCOUNT", OPT_STRING, &opt.account, NULL }, - {"SLURM_CPUS_PER_TASK", OPT_INT, &opt.cpus_per_task, &opt.cpus_set }, - {"SLURM_CONN_TYPE", OPT_CONN_TYPE, NULL, NULL }, - {"SLURM_CPU_BIND", OPT_CPU_BIND, NULL, NULL }, - {"SLURM_MEM_BIND", OPT_MEM_BIND, NULL, NULL }, - {"SLURM_DEBUG", OPT_DEBUG, NULL, NULL }, - {"SLURM_DISTRIBUTION", OPT_DISTRIB, NULL, NULL }, - {"SLURM_GEOMETRY", OPT_GEOMETRY, NULL, NULL }, - {"SLURM_IMMEDIATE", OPT_INT, &opt.immediate, NULL }, - {"SLURM_JOBID", OPT_INT, &opt.jobid, NULL }, - {"SLURM_KILL_BAD_EXIT", OPT_INT, &opt.kill_bad_exit, NULL }, - {"SLURM_NNODES", OPT_NODES, NULL, NULL }, - {"SLURM_NO_REQUEUE", OPT_INT, &opt.no_requeue, NULL }, - {"SLURM_NO_ROTATE", OPT_NO_ROTATE, NULL, NULL }, - {"SLURM_NPROCS", OPT_INT, &opt.nprocs, &opt.nprocs_set}, - {"SLURM_PARTITION", OPT_STRING, &opt.partition, NULL }, - {"SLURM_REMOTE_CWD", OPT_STRING, &opt.cwd, NULL }, - {"SLURM_TIMELIMIT", OPT_INT, &opt.time_limit, NULL }, - {"SLURM_WAIT", OPT_INT, &opt.max_wait, NULL }, - {"SLURM_DISABLE_STATUS",OPT_INT, &opt.disable_status,NULL }, - {"SLURM_MPI_TYPE", OPT_MPI, NULL, NULL }, - 
{"SLURM_SRUN_COMM_IFHN",OPT_STRING, &opt.ctrl_comm_ifhn,NULL }, - {"SLURM_SRUN_MULTI", OPT_MULTI, NULL, NULL }, - + {"SBATCH_ACCOUNT", OPT_STRING, &opt.account, NULL }, + {"SBATCH_CPUS_PER_TASK", OPT_INT, &opt.cpus_per_task, &opt.cpus_set }, + {"SBATCH_CONN_TYPE", OPT_CONN_TYPE, NULL, NULL }, + {"SBATCH_DEBUG", OPT_DEBUG, NULL, NULL }, + {"SBATCH_DISTRIBUTION", OPT_DISTRIB, NULL, NULL }, + {"SBATCH_GEOMETRY", OPT_GEOMETRY, NULL, NULL }, + {"SBATCH_IMMEDIATE", OPT_INT, &opt.immediate, NULL }, + {"SBATCH_JOBID", OPT_INT, &opt.jobid, NULL }, + {"SBATCH_NNODES", OPT_NODES, NULL, NULL }, + {"SBATCH_NO_REQUEUE", OPT_INT, &opt.no_requeue, NULL }, + {"SBATCH_NO_ROTATE", OPT_NO_ROTATE, NULL, NULL }, + {"SBATCH_NPROCS", OPT_INT, &opt.nprocs, &opt.nprocs_set}, + {"SBATCH_PARTITION", OPT_STRING, &opt.partition, NULL }, + {"SBATCH_REMOTE_CWD", OPT_STRING, &opt.cwd, NULL }, + {"SBATCH_TIMELIMIT", OPT_INT, &opt.time_limit, NULL }, + {"SBATCH_WAIT", OPT_INT, &opt.max_wait, NULL }, + {"SBATCH_MPI_TYPE", OPT_MPI, NULL, NULL }, {NULL, 0, NULL, NULL} }; @@ -834,18 +603,6 @@ _process_env_var(env_vars_t *e, const char *val) opt.distribution = dt; break; - case OPT_CPU_BIND: - if (_verify_cpu_bind(val, &opt.cpu_bind, - &opt.cpu_bind_type)) - exit(1); - break; - - case OPT_MEM_BIND: - if (_verify_mem_bind(val, &opt.mem_bind, - &opt.mem_bind_type)) - exit(1); - break; - case OPT_NODES: opt.nodes_set = _verify_node_count( val, &opt.min_nodes, @@ -901,44 +658,35 @@ static struct option long_options[] = { {"immediate", no_argument, 0, 'I'}, {"job-name", required_argument, 0, 'J'}, {"no-kill", no_argument, 0, 'k'}, - {"kill-on-bad-exit", no_argument, 0, 'K'}, {"distribution", required_argument, 0, 'm'}, {"ntasks", required_argument, 0, 'n'}, {"nodes", required_argument, 0, 'N'}, {"output", required_argument, 0, 'o'}, {"partition", required_argument, 0, 'p'}, {"dependency", required_argument, 0, 'P'}, - {"quit-on-interrupt", no_argument, 0, 'q'}, - {"quiet", no_argument, 0, 'Q'}, + {"quiet", 
no_argument, 0, 'q'}, {"relative", required_argument, 0, 'r'}, {"no-rotate", no_argument, 0, 'R'}, {"share", no_argument, 0, 's'}, {"time", required_argument, 0, 't'}, + {"usage", no_argument, 0, 'u'}, {"account", required_argument, 0, 'U'}, {"verbose", no_argument, 0, 'v'}, {"version", no_argument, 0, 'V'}, {"nodelist", required_argument, 0, 'w'}, {"wait", required_argument, 0, 'W'}, {"exclude", required_argument, 0, 'x'}, - {"disable-status", no_argument, 0, 'X'}, - {"no-allocate", no_argument, 0, 'Z'}, {"contiguous", no_argument, 0, LONG_OPT_CONT}, {"exclusive", no_argument, 0, LONG_OPT_EXCLUSIVE}, - {"cpu_bind", required_argument, 0, LONG_OPT_CPU_BIND}, - {"mem_bind", required_argument, 0, LONG_OPT_MEM_BIND}, {"mincpus", required_argument, 0, LONG_OPT_MINCPU}, {"mem", required_argument, 0, LONG_OPT_MEM}, {"mpi", required_argument, 0, LONG_OPT_MPI}, - {"no-shell", no_argument, 0, LONG_OPT_NOSHELL}, {"tmp", required_argument, 0, LONG_OPT_TMP}, {"jobid", required_argument, 0, LONG_OPT_JOBID}, {"msg-timeout", required_argument, 0, LONG_OPT_TIMEO}, - {"max-launch-time", required_argument, 0, LONG_OPT_LAUNCH}, - {"max-exit-timeout", required_argument, 0, LONG_OPT_XTO}, {"uid", required_argument, 0, LONG_OPT_UID}, {"gid", required_argument, 0, LONG_OPT_GID}, {"debugger-test", no_argument, 0, LONG_OPT_DEBUG_TS}, - {"usage", no_argument, 0, LONG_OPT_USAGE}, {"conn-type", required_argument, 0, LONG_OPT_CONNTYPE}, {"test-only", no_argument, 0, LONG_OPT_TEST_ONLY}, {"network", required_argument, 0, LONG_OPT_NETWORK}, @@ -949,13 +697,12 @@ static struct option long_options[] = { {"task-prolog", required_argument, 0, LONG_OPT_TASK_PROLOG}, {"task-epilog", required_argument, 0, LONG_OPT_TASK_EPILOG}, {"nice", optional_argument, 0, LONG_OPT_NICE}, - {"ctrl-comm-ifhn", required_argument, 0, LONG_OPT_CTRL_COMM_IFHN}, {"no-requeue", no_argument, 0, LONG_OPT_NO_REQUEUE}, {NULL, 0, 0, 0} }; static char *opt_string = - "+a:c:C:D:e:g:hHi:IJ:kKm:n:N:o:Op:P:qQr:R:st:U:vVw:W:x:XZ"; + 
"+a:c:C:D:e:g:hHi:IJ:km:n:N:o:Op:P:qr:R:st:uU:vVw:W:x:"; /* @@ -996,9 +743,12 @@ char *process_options_first_pass(int argc, char **argv) _help(); exit(0); break; - case 'Q': + case 'q': opt.quiet++; break; + case 'u': + _usage(); + exit(0); case 'v': opt.verbose++; break; @@ -1006,9 +756,6 @@ char *process_options_first_pass(int argc, char **argv) _print_version(); exit(0); break; - case LONG_OPT_USAGE: - _usage(); - exit(0); default: /* will be parsed in second pass function */ break; @@ -1200,7 +947,7 @@ static void _opt_batch_script(const void *body, int size) int skipped = 0; int i; - /* getopt_long skips over the first argument, so fill it in blank */ + /* getopt_long skips over the first argument, so fill it in */ argc = 1; argv = xmalloc(sizeof(char *)); argv[0] = "sbatch"; @@ -1234,8 +981,7 @@ static void _opt_batch_script(const void *body, int size) static void _set_options(int argc, char **argv) { int opt_char, option_index = 0; - static bool set_cwd=false, set_name=false; - struct utsname name; + static bool set_cwd=false; optind = 0; while((opt_char = getopt_long(argc, argv, opt_string, @@ -1286,16 +1032,12 @@ static void _set_options(int argc, char **argv) opt.immediate = true; break; case 'J': - set_name = true; xfree(opt.job_name); opt.job_name = xstrdup(optarg); break; case 'k': opt.no_kill = true; break; - case 'K': - opt.kill_bad_exit = true; - break; case 'm': opt.distribution = _verify_dist_type(optarg); if (opt.distribution == -1) { @@ -1335,9 +1077,6 @@ static void _set_options(int argc, char **argv) opt.dependency = _get_int(optarg, "dependency"); break; case 'q': - opt.quit_on_intr = true; - break; - case 'Q': opt.quiet++; break; case 'r': @@ -1353,6 +1092,9 @@ static void _set_options(int argc, char **argv) case 't': opt.time_limit = _get_int(optarg, "time"); break; + case 'u': + _usage(); + exit(0); case 'U': xfree(opt.account); opt.account = xstrdup(optarg); @@ -1385,31 +1127,12 @@ static void _set_options(int argc, char **argv) if 
(!_valid_node_list(&opt.exc_nodes)) exit(1); break; - case 'X': - opt.disable_status = true; - break; - case 'Z': - opt.no_alloc = true; - uname(&name); - if (strcasecmp(name.sysname, "AIX") == 0) - opt.network = xstrdup("ip"); - break; case LONG_OPT_CONT: opt.contiguous = true; break; case LONG_OPT_EXCLUSIVE: opt.exclusive = true; break; - case LONG_OPT_CPU_BIND: - if (_verify_cpu_bind(optarg, &opt.cpu_bind, - &opt.cpu_bind_type)) - exit(1); - break; - case LONG_OPT_MEM_BIND: - if (_verify_mem_bind(optarg, &opt.mem_bind, - &opt.mem_bind_type)) - exit(1); - break; case LONG_OPT_MINCPU: opt.mincpus = _get_int(optarg, "mincpus"); break; @@ -1428,9 +1151,6 @@ static void _set_options(int argc, char **argv) optarg); } break; - case LONG_OPT_NOSHELL: - opt.noshell = true; - break; case LONG_OPT_TMP: opt.tmpdisk = _to_bytes(optarg); if (opt.tmpdisk < 0) { @@ -1446,14 +1166,6 @@ static void _set_options(int argc, char **argv) opt.msg_timeout = _get_int(optarg, "msg-timeout"); break; - case LONG_OPT_LAUNCH: - opt.max_launch_time = - _get_int(optarg, "max-launch-time"); - break; - case LONG_OPT_XTO: - opt.max_exit_timeout = - _get_int(optarg, "max-exit-timeout"); - break; case LONG_OPT_UID: opt.euid = uid_from_string (optarg); if (opt.euid == (uid_t) -1) @@ -1464,9 +1176,6 @@ static void _set_options(int argc, char **argv) if (opt.egid == (gid_t) -1) fatal ("--gid=\"%s\" invalid", optarg); break; - case LONG_OPT_USAGE: - _usage(); - exit(0); case LONG_OPT_CONNTYPE: opt.conn_type = _verify_conn_type(optarg); break; @@ -1516,10 +1225,6 @@ static void _set_options(int argc, char **argv) exit(1); } break; - case LONG_OPT_CTRL_COMM_IFHN: - xfree(opt.ctrl_comm_ifhn); - opt.ctrl_comm_ifhn = xstrdup(optarg); - break; case LONG_OPT_NO_REQUEUE: opt.no_requeue = true; break; @@ -1543,22 +1248,7 @@ static bool _opt_verify(void) bool verified = true; if (opt.quiet && opt.verbose) { - error ("don't specify both --verbose (-v) and --quiet (-Q)"); - verified = false; - } - - if 
(opt.no_alloc && !opt.nodelist) { - error("must specify a node list with -Z, --no-allocate."); - verified = false; - } - - if (opt.no_alloc && opt.exc_nodes) { - error("can not specify --exclude list with -Z, --no-allocate."); - verified = false; - } - - if (opt.no_alloc && opt.relative) { - error("do not specify -r,--relative with -Z,--no-allocate."); + error ("don't specify both --verbose (-v) and --quiet (-q)"); verified = false; } @@ -1618,12 +1308,6 @@ static bool _opt_verify(void) } /* else if (opt.nprocs_set && !opt.nodes_set) */ - /* - * --wait always overrides hidden max_exit_timeout - */ - if (opt.max_wait) - opt.max_exit_timeout = opt.max_wait; - if (opt.time_limit == 0) opt.time_limit = INFINITE; @@ -1897,10 +1581,6 @@ static void _opt_list() opt.partition == NULL ? "default" : opt.partition); info("job name : `%s'", opt.job_name); info("distribution : %s", format_task_dist_states(opt.distribution)); - info("cpu_bind : %s", - opt.cpu_bind == NULL ? "default" : opt.cpu_bind); - info("mem_bind : %s", - opt.mem_bind == NULL ? "default" : opt.mem_bind); info("verbose : %d", opt.verbose); info("immediate : %s", tf_(opt.immediate)); info("no-requeue : %s", tf_(opt.no_requeue)); @@ -1937,7 +1617,6 @@ static void _opt_list() info("mail_user : %s", opt.mail_user); info("task_prolog : %s", opt.task_prolog); info("task_epilog : %s", opt.task_epilog); - info("ctrl_comm_ifhn : %s", opt.ctrl_comm_ifhn); str = print_commandline(); info("remote command : `%s'", str); xfree(str); @@ -1956,14 +1635,13 @@ static void _usage(void) " [-W sec]\n" " [--contiguous] [--mincpus=n] [--mem=MB] [--tmp=MB] [-C list]\n" " [--mpi=type] [--account=name] [--dependency=jobid]\n" -" [--kill-on-bad-exit] [--propagate[=rlimits] ]\n" -" [--cpu_bind=...] 
[--mem_bind=...]\n" +" [--propagate[=rlimits] ]\n" #ifdef HAVE_BG /* Blue gene specific options */ " [--geometry=XxYxZ] [--conn-type=type] [--no-rotate]\n" #endif " [--mail-type=type] [--mail-user=user][--nice[=value]]\n" " [--task-prolog=fname] [--task-epilog=fname]\n" -" [--ctrl-comm-ifhn=addr] [--no-requeue]\n" +" [--no-requeue]\n" " [-w hosts...] [-x hosts...] executable [args...]\n"); } @@ -1986,8 +1664,6 @@ static void _help(void) " -D, --chdir=path change remote current working directory\n" " -I, --immediate exit if resources are not immediately available\n" " -k, --no-kill do not kill job on node failure\n" -" -K, --kill-on-bad-exit kill the job if any task terminates with a\n" -" non-zero exit code\n" " -s, --share share nodes with other jobs\n" " -m, --distribution=type distribution method for processes to nodes\n" " (type = block|cyclic|hostfile)\n" @@ -1996,10 +1672,8 @@ static void _help(void) " --mpi=type type of MPI being used\n" " -W, --wait=sec seconds to wait after first task exits\n" " before killing job\n" -" -q, --quit-on-interrupt quit on single Ctrl-C\n" -" -X, --disable-status Disable Ctrl-C status feature\n" " -v, --verbose verbose mode (multiple -v's increase verbosity)\n" -" -Q, --quiet quiet mode (suppress informational messages)\n" +" -q, --quiet quiet mode (suppress informational messages)\n" " -d, --slurmd-debug=level slurmd debug level\n" " -P, --dependency=jobid defer job until specified jobid completes\n" " --nice[=value] decrease secheduling priority by value\n" @@ -2011,9 +1685,7 @@ static void _help(void) " --begin=time defer job until HH:MM DD/MM/YY\n" " --mail-type=type notify on state change: BEGIN, END, FAIL or ALL\n" " --mail-user=user who to send email notification for job state changes\n" -" --ctrl-comm-ifhn=addr interface hostname for PMI commaunications from srun\n" " --no-requeue if set, do not permit the job to be requeued\n" -" --no-shell don't spawn shell in allocate mode\n" "\n" "Constraint options:\n" " 
--mincpus=n minimum number of cpus per node\n" @@ -2023,33 +1695,10 @@ static void _help(void) " -C, --constraint=list specify a list of constraints\n" " -w, --nodelist=hosts... request a specific list of hosts\n" " -x, --exclude=hosts... exclude a specific list of hosts\n" -" -Z, --no-allocate don't allocate nodes (must supply -w)\n" "\n" "Consumable resources related options:\n" " --exclusive allocate nodes in exclusive mode when\n" -" cpu consumable resource is enabled\n" -"\n" -"Affinity/Multi-core options: (when the task/affinity plugin is enabled)\n" -" --cpu_bind= Bind tasks to CPUs\n" -" q[uiet], quietly bind before task runs (default)\n" -" v[erbose], verbosely report binding before task runs\n" -" no[ne] don't bind tasks to CPUs (default)\n" -" rank bind by task rank\n" -" map_cpu:<list> bind by mapping CPU IDs to tasks as specified\n" -" where <list> is <cpuid1>,<cpuid2>,...<cpuidN>\n" -" mask_cpu:<list> bind by setting CPU masks on tasks as specified\n" -" where <list> is <mask1>,<mask2>,...<maskN>\n" -" --mem_bind= Bind tasks to memory\n" -" q[uiet], quietly bind before task runs (default)\n" -" v[erbose], verbosely report binding before task runs\n" -" no[ne] don't bind tasks to memory (default)\n" -" rank bind by task rank\n" -" local bind to memory local to processor\n" -" map_mem:<list> bind by mapping memory of CPU IDs to tasks as specified\n" -" where <list> is <cpuid1>,<cpuid2>,...<cpuidN>\n" -" mask_mem:<list> bind by setting menory of CPU masks on tasks as specified\n" -" where <list> is <mask1>,<mask2>,...<maskN>\n"); - +" cpu consumable resource is enabled\n"); printf("\n"); printf( @@ -2068,8 +1717,8 @@ static void _help(void) "\n" #endif "Help options:\n" -" --help show this help message\n" -" --usage display brief usage message\n" +" -h, --help show this help message\n" +" -u, --usage display brief usage message\n" "\n" "Other options:\n" " -V, --version output version information and exit\n" diff --git a/src/sbatch/opt.h 
b/src/sbatch/opt.h index 4dab126568c..cc604068588 100644 --- a/src/sbatch/opt.h +++ b/src/sbatch/opt.h @@ -71,10 +71,6 @@ typedef struct sbatch_options { bool cpus_set; /* true if cpus_per_task explicitly set */ int min_nodes; /* --nodes=n, -N n */ int max_nodes; /* --nodes=x-n, -N x-n */ - cpu_bind_type_t cpu_bind_type; /* --cpu_bind= */ - char *cpu_bind; /* binding map for map/mask_cpu */ - mem_bind_type_t mem_bind_type; /* --mem_bind= */ - char *mem_bind; /* binding map for map/mask_mem */ bool nodes_set; /* true if nodes explicitly set */ int time_limit; /* --time, -t */ char *partition; /* --partition=n, -p n */ @@ -91,14 +87,10 @@ typedef struct sbatch_options { int immediate; /* -i, --immediate */ bool hold; /* --hold, -H */ - bool noshell; /* --noshell */ bool no_kill; /* --no-kill, -k */ - bool kill_bad_exit; /* --kill-on-bad-exit, -K */ bool no_requeue; /* --no-requeue */ bool share; /* --share, -s */ int max_wait; /* --wait, -W */ - bool quit_on_intr; /* --quit-on-interrupt, -q */ - bool disable_status; /* --disable-status, -X */ int quiet; int verbose; bool test_only; /* --test-only */ @@ -115,9 +107,6 @@ typedef struct sbatch_options { char *nodelist; /* --nodelist=node1,node2,... */ char *exc_nodes; /* --exclude=node1,node2,... 
-x */ char *relative; /* --relative -r N */ - bool no_alloc; /* --no-allocate, -Z */ - int max_launch_time; /* Undocumented */ - int max_exit_timeout; /* Undocumented */ int msg_timeout; /* Undocumented */ char *network; /* --network= */ bool exclusive; /* --exclusive */ @@ -128,7 +117,6 @@ typedef struct sbatch_options { time_t begin; /* --begin */ uint16_t mail_type; /* --mail-type */ char *mail_user; /* --mail-user */ - char *ctrl_comm_ifhn; /* --ctrl-comm-ifhn */ char *ifname; /* input file name */ char *ofname; /* output file name */ char *efname; /* error file name */ diff --git a/src/slaunch/opt.c b/src/slaunch/opt.c index dd911fc8818..99989682949 100644 --- a/src/slaunch/opt.c +++ b/src/slaunch/opt.c @@ -623,8 +623,6 @@ static void _opt_default() opt.max_launch_time = 120; opt.msg_timeout = 15; } - - opt.no_alloc = false; } /*---[ env var processing ]-----------------------------------------------*/ @@ -656,13 +654,8 @@ env_vars_t env_vars[] = { {"SLAUNCH_DISTRIBUTION", OPT_DISTRIB, NULL, NULL }, {"SLAUNCH_KILL_BAD_EXIT",OPT_INT, &opt.kill_bad_exit, NULL }, {"SLAUNCH_LABELIO", OPT_INT, &opt.labelio, NULL }, - {"SLAUNCH_NUM_NODES", OPT_INT, &opt.num_nodes, &opt.num_nodes_set}, - {"SLAUNCH_NPROCS", OPT_INT, &opt.num_tasks, &opt.num_tasks_set}, {"SLAUNCH_OVERCOMMIT", OPT_OVERCOMMIT,NULL, NULL }, {"SLAUNCH_REMOTE_CWD", OPT_STRING, &opt.cwd, NULL }, - {"SLAUNCH_STDERRMODE", OPT_STRING, &opt.local_efname, NULL }, - {"SLAUNCH_STDINMODE", OPT_STRING, &opt.local_ifname, NULL }, - {"SLAUNCH_STDOUTMODE", OPT_STRING, &opt.local_ofname, NULL }, {"SLAUNCH_TIMELIMIT", OPT_INT, &opt.time_limit, NULL }, {"SLAUNCH_WAIT", OPT_INT, &opt.max_wait, NULL }, {"SLAUNCH_MPI_TYPE", OPT_MPI, NULL, NULL }, @@ -838,22 +831,21 @@ void set_options(const int argc, char **argv) {"label", no_argument, 0, 'l'}, {"nodelist-byid", required_argument, 0, 'L'}, {"distribution", required_argument, 0, 'm'}, - {"ntasks", required_argument, 0, 'n'}, + {"tasks", required_argument, 0, 'n'}, {"nodes", 
required_argument, 0, 'N'}, {"local-output", required_argument, 0, 'o'}, {"remote-output", required_argument, 0, 'O'}, {"overcommit", no_argument, 0, 'C'}, - {"quiet", no_argument, 0, 'q'}, + {"quiet", no_argument, 0, 'q'}, {"relative", required_argument, 0, 'r'}, {"time", required_argument, 0, 't'}, {"unbuffered", no_argument, 0, 'u'}, {"task-layout-byid", required_argument, 0, 'T'}, {"verbose", no_argument, 0, 'v'}, {"version", no_argument, 0, 'V'}, - {"nodelist", required_argument, 0, 'w'}, + {"nodelist-byname", required_argument, 0, 'w'}, {"wait", required_argument, 0, 'W'}, {"task-layout-byname", required_argument, 0, 'Y'}, - {"no-allocate", no_argument, 0, 'Z'}, {"cpu_bind", required_argument, 0, LONG_OPT_CPU_BIND}, {"mem_bind", required_argument, 0, LONG_OPT_MEM_BIND}, {"core", required_argument, 0, LONG_OPT_CORE}, @@ -877,8 +869,8 @@ void set_options(const int argc, char **argv) {"pmi-threads", required_argument, 0, LONG_OPT_PMI_THREADS}, {NULL, 0, 0, 0} }; - char *opt_string = "+c:Cd:D:e:E:F:hi:I:J:kKlL:m:n:N:" - "o:O:qr:t:T:uvVw:W:Y:Z"; + char *opt_string = + "+c:Cd:D:e:E:F:hi:I:J:kKlL:m:n:N:o:O:qr:t:T:uvVw:W:Y:"; struct option *optz = spank_option_table_create (long_options); @@ -1048,12 +1040,6 @@ void set_options(const int argc, char **argv) opt.task_layout = xstrdup(optarg); opt.task_layout_byname_set = true; break; - case 'Z': - opt.no_alloc = true; - uname(&name); - if (strcasecmp(name.sysname, "AIX") == 0) - opt.network = xstrdup("ip"); - break; case LONG_OPT_CPU_BIND: if (_verify_cpu_bind(optarg, &opt.cpu_bind, &opt.cpu_bind_type)) @@ -1619,7 +1605,6 @@ static bool _opt_verify(void) } else if (opt.num_tasks > hostlist_count(task_l)) { error("Asked for more tasks (%d) than listed" " in the task layout (%d)", - opt.num_tasks, hostlist_count(task_l)); verified = false; } else { @@ -1679,22 +1664,11 @@ static bool _opt_verify(void) } if (opt.quiet && opt.verbose) { - error ("don't specify both --verbose (-v) and --quiet (-Q)"); - verified = false; - } 
- - if (opt.no_alloc && !opt.nodelist) { - error("must specify a node list with -Z/--no-allocate."); + error ("don't specify both --verbose (-v) and --quiet (-q)"); verified = false; } if (opt.relative_set) { - if (opt.no_alloc) { - error("-r/--relative not allowed with" - " -Z/--no-allocate."); - verified = false; - } - if (opt.nodelist != NULL) { error("-r/--relative not allowed with" " -w/--nodelist."); @@ -1991,7 +1965,7 @@ static void _help(void) " -W, --wait=sec seconds to wait after first task exits\n" " before killing job\n" " -v, --verbose verbose mode (multiple -v's increase verbosity)\n" -" -Q, --quiet quiet mode (suppress informational messages)\n" +" -q, --quiet quiet mode (suppress informational messages)\n" " -d, --slurmd-debug=level slurmd debug level\n" " --core=type change default corefile format type\n" " (type=\"list\" to list of valid formats)\n" @@ -2004,9 +1978,7 @@ static void _help(void) " --ctrl-comm-ifhn=addr interface hostname for PMI commaunications from slaunch\n" " --multi-prog if set the program name specified is the\n" " configuration specificaiton for multiple programs\n" -"\n" " -w, --nodelist=hosts... request a specific list of hosts\n" -" -Z, --no-allocate don't allocate nodes (must supply -w)\n" "\n" "Affinity/Multi-core options: (when the task/affinity plugin is enabled)\n" " --cpu_bind= Bind tasks to CPUs\n" diff --git a/src/slaunch/opt.h b/src/slaunch/opt.h index 01645498b91..72573a9ab42 100644 --- a/src/slaunch/opt.h +++ b/src/slaunch/opt.h @@ -117,7 +117,6 @@ typedef struct slaunch_options { bool task_layout_file_set; int relative; /* --relative -r N */ bool relative_set; /* true if --relative set explicitly */ - bool no_alloc; /* --no-allocate, -Z */ int max_launch_time; /* Undocumented */ int max_exit_timeout; /* Undocumented */ int msg_timeout; /* Undocumented */ -- GitLab
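
Note on the option-table changes above: the patch moves --quiet from -Q to -q and gives --usage the short form -u by editing the getopt_long() table, the short-option string, and the matching switch cases together. The sketch below shows that pattern in isolation. It is a minimal illustration, not code from the SLURM tree: `struct demo_opts`, `demo_parse()` and `demo_verify()` are hypothetical names, and resetting `optind` to 0 (as `_set_options()` does in the patch) assumes glibc's getopt behavior.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, trimmed-down option state; not the real sbatch/salloc struct. */
struct demo_opts {
	int quiet;	/* -q, --quiet   (repeatable) */
	int verbose;	/* -v, --verbose (repeatable) */
	bool usage;	/* -u, --usage */
};

/* Parse argv into *o. Returns 0 on success, -1 on an unrecognized option. */
static int demo_parse(int argc, char **argv, struct demo_opts *o)
{
	/* Each long option maps to the same value as its short form. */
	static const struct option long_options[] = {
		{"quiet",   no_argument, 0, 'q'},
		{"usage",   no_argument, 0, 'u'},
		{"verbose", no_argument, 0, 'v'},
		{NULL, 0, 0, 0}
	};
	int c;

	o->quiet = 0;
	o->verbose = 0;
	o->usage = false;

	optind = 0;	/* assumption: glibc re-initializes the scan on 0 */
	opterr = 0;	/* suppress getopt's own error messages */
	while ((c = getopt_long(argc, argv, "+quv", long_options, NULL)) != -1) {
		switch (c) {
		case 'q':
			o->quiet++;
			break;
		case 'u':
			o->usage = true;
			break;
		case 'v':
			o->verbose++;
			break;
		default:	/* '?' for anything not in the table */
			return -1;
		}
	}
	return 0;
}

/* Mirrors the quiet/verbose conflict check in _opt_verify(). */
static bool demo_verify(const struct demo_opts *o)
{
	if (o->quiet && o->verbose) {
		fprintf(stderr,
			"don't specify both --verbose (-v) and --quiet (-q)\n");
		return false;
	}
	return true;
}
```

Because the long and short spellings share one value, a rename such as -Q to -q only stays consistent if the table entry, the option string, the switch case, and the help and error text are all changed in the same commit, which is what the hunks above do.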