salloc: add support for Cray
This adds support for running salloc on a local Cray system, with node sharing disabled (it is still not supported on XT/XE).

It further disables running salloc within salloc. Since Cray uses process group / PAGG IDs to track its reservations, running salloc from within salloc invariably leads to an ALPS resource allocation error.

Thirdly, it disables Cray node allocation on non-Cray systems, since this requires that the host on which salloc spawns the shell process is capable of Cray task launch. If it is not, the remote slurmctld will reserve the requested nodes, but the local host running salloc will neither be able to confirm the ALPS reservation (due to the absence of a local apbasil command) nor be able to run jobs on the compute nodes.

To distinguish this case from general task launch (we use a frontend host where salloc could end up running jobs on different clusters, depending on the value exported via $SLURM_CONF), the following conditions are tested:
 * Cray build support has been enabled (HAVE_CRAY);
 * the loaded slurm.conf uses select/cray (required on Cray hosts);
 * the local host does not have support for apbasil (HAVE_NATIVE_CRAY undefined).

Since the 'apbasil' command is only available on native Cray systems, this combination of conditions seems sufficient to prevent accidentally using salloc on a host which does not support it. (For sbatch the case is different, since the job script runs on the remote host.)

11_salloc.diff done with minor change for Cray emulation
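
As a rough illustration of the three-part test described above, the check could look something like the sketch below. This is not the code from the patch: the function names and the way the select plugin name is passed in are assumptions made for the example. Only HAVE_CRAY, HAVE_NATIVE_CRAY and the "select/cray" plugin name come from the description.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (name and signature are assumptions, not from the
 * patch). Returns true when salloc should refuse to allocate Cray nodes:
 *   - Cray build support is enabled,
 *   - the loaded slurm.conf selects the select/cray plugin, and
 *   - the local host has no native Cray support (no apbasil).
 */
bool cray_alloc_unsupported(bool have_cray, bool have_native_cray,
                            const char *select_type)
{
    if (!have_cray)
        return false;   /* not a Cray build: nothing to guard against */
    if (have_native_cray)
        return false;   /* apbasil is available locally */
    if (select_type == NULL)
        return false;   /* no select plugin configured */
    return strcmp(select_type, "select/cray") == 0;
}

/*
 * Hypothetical wrapper that maps the build-time macros named in the
 * description onto the boolean arguments of the helper above.
 */
bool salloc_must_refuse(const char *select_type)
{
#ifdef HAVE_CRAY
    const bool have_cray = true;
#else
    const bool have_cray = false;
#endif
#ifdef HAVE_NATIVE_CRAY
    const bool have_native_cray = true;
#else
    const bool have_native_cray = false;
#endif
    return cray_alloc_unsupported(have_cray, have_native_cray, select_type);
}
```

Keeping the decision in a function that takes plain booleans makes each combination of conditions easy to exercise without rebuilding with different configure options; the wrapper is the only part that depends on the build.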