    Transfer GPU file information to slurmstepd · bccf0f85
    Morris Jette authored
    Add logic to cache GPU file information (bitmap index mapping to device
    file number) in the slurmd daemon and transfer that information to the
    slurmstepd whenever a job step is initiated. This is needed to set the
    appropriate CUDA_VISIBLE_DEVICES environment variable value when the
    devices are not in strict numeric order (e.g. some GPUs are skipped).
    Based upon work by Nicolas Bigaouette.
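
The mapping described above can be illustrated with a minimal sketch. It is not the code from this commit: the function name `build_visible_devices`, the use of a 64-bit integer as the allocation bitmap, and the `dev_num` table are all assumptions made for illustration. The table plays the role of the cached GPU file information, where entry `i` holds the device file number for bitmap index `i`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not Slurm's actual code: build a
 * CUDA_VISIBLE_DEVICES value from a bitmap of allocated GPUs and a
 * cached table that maps each bitmap index to its device file number
 * (the N in a device file such as /dev/nvidiaN).
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int build_visible_devices(uint64_t alloc_bitmap,
                                 const int *dev_num, int dev_cnt,
                                 char *buf, size_t buf_len)
{
    size_t used = 0;

    assert(buf != NULL && buf_len > 0);
    buf[0] = '\0';

    for (int i = 0; i < dev_cnt && i < 64; i++) {
        if (!(alloc_bitmap & (UINT64_C(1) << i)))
            continue;   /* GPU at bitmap index i is not allocated */

        /* Emit the device file number, not the bitmap index. */
        int n = snprintf(buf + used, buf_len - used, "%s%d",
                         used ? "," : "", dev_num[i]);
        if (n < 0 || (size_t)n >= buf_len - used)
            return -1;  /* output would be truncated */
        used += (size_t)n;
    }
    return 0;
}
```

For example, on a node whose GPU device files are numbered 0, 2 and 3 (number 1 is skipped), the table would be `{0, 2, 3}`. A step allocated bitmap indexes 1 and 2 would get the value `2,3`, whereas using the bitmap indexes directly would give `1,2` and point at a device number that does not exist.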