Cray PMI refinements
Refine commit 5f89223f based upon feedback from David Gloe:

* It's not only MPI jobs, but anything that uses PMI. That includes MPI, shmem, etc., so you may want to reword the error message.
* I added the terminated flag because if multiple tasks on a node exit, you would get an error message from each of them. That reduces it to one error message per node. Cray bug 810310 prompted that change.
* Since we're now relying on --kill-on-bad-exit, I think we should update the Cray slurm.conf template to default to 1 (set KillOnBadExit=1 in contribs/cray/slurm.conf.template).

bug 1171
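
The effect of the terminated flag described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: `struct node_state`, `report_task_exit`, the message text, and the use of a nonzero exit code as the trigger are all hypothetical stand-ins chosen for the example.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-node state; not the actual Slurm data structure. */
struct node_state {
    bool terminated;    /* set once the first bad exit has been reported */
    int  errors_logged; /* number of messages emitted, for illustration */
};

/*
 * Report a task exit on this node.
 * Returns true only when an error message was actually emitted.
 * Assumption: a nonzero exit code stands in for "abnormal exit".
 */
static bool report_task_exit(struct node_state *ns, int task_id, int exit_code)
{
    if (exit_code == 0)
        return false;   /* clean exit, nothing to report */

    if (ns->terminated)
        return false;   /* this node already reported a failure */

    ns->terminated = true;
    ns->errors_logged++;
    fprintf(stderr,
            "task %d exited with code %d; terminating PMI application\n",
            task_id, exit_code);
    return true;
}
```

With this shape, the first failing task on a node produces the message and every later failing task on the same node is silent, which matches the "one error message per node" behavior the feedback asks for.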