diff --git a/doc/html/heterogeneous_jobs.shtml b/doc/html/heterogeneous_jobs.shtml
index 8536709da6e2561c2aa9e5d6f98a2db69101956c..498d4946e377f561fd4941fb387aa2f4eeb76bf4 100644
--- a/doc/html/heterogeneous_jobs.shtml
+++ b/doc/html/heterogeneous_jobs.shtml
@@ -331,6 +331,17 @@ P0 1: nid00012
 P1 0: Wed Jul 5 16:23:07 MDT 2017
 </pre>
 
+<p>If multiple srun commands are executed concurrently, this may result in resource
+contention (e.g. memory limits preventing some job step components from being
+allocated resources because of two srun commands executing at the same time).
+If the srun --pack-group option is used to create multiple job steps (for the
+different components of a heterogeneous job), those job steps will be created
+sequentially.
+When multiple srun commands execute at the same time, this may result in some
+step allocations taking place, while others are delayed.
+Only after all job step allocations have been granted will the application
+be launched.</p>
+
 <h2><a name="env_var">Environment Variables</a></h2>
 
 <p>Slurm environment variables will be set independently for each component of
@@ -507,6 +518,6 @@ especially other heterogeneous jobs.</p>
 
 <p class="footer"><a href="#top">top</a></p>
 
-<p style="text-align:center;">Last modified 16 August 2017</p>
+<p style="text-align:center;">Last modified 17 August 2017</p>
 
 <!--#include virtual="footer.txt"-->
diff --git a/src/plugins/launch/slurm/launch_slurm.c b/src/plugins/launch/slurm/launch_slurm.c
index addb8106599f014809ab1fef802194af1c70bef7..d51623beff11c46665df55689f820540abdd7edb 100644
--- a/src/plugins/launch/slurm/launch_slurm.c
+++ b/src/plugins/launch/slurm/launch_slurm.c
@@ -798,7 +798,6 @@ extern int launch_p_step_wait(srun_job_t *job, bool got_alloc, opt_t *opt_local)
 {
 	int rc = 0;
 
-//FIXME-PACK: should we create multiple steps in a single RPC or use threads?
 	slurm_step_launch_wait_finish(job->step_ctx);
 	if ((MPIR_being_debugged == 0) && retry_step_begin &&
 	    (retry_step_cnt < MAX_STEP_RETRIES)) {