  Mar 02, 2012
    • Mods in priority/multifactor for prio=1 · b223af49
      Morris Jette authored
      In SLURM version 2.4, we now schedule jobs at priority=1 and no longer treat
      it as a special case.
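The behavioral change above can be illustrated with a minimal sketch. This is not slurmctld's actual code; the function name is hypothetical. In SLURM, priority 0 means a held job, and after this change priority 1 is no longer singled out: any nonzero priority is eligible for scheduling.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sketch (hypothetical name, not slurmctld code):
 * only priority 0 (held) remains a special case; priority 1 jobs
 * are now scheduled like any other job. */
static bool job_is_schedulable(unsigned int priority)
{
	return priority != 0;	/* 0 = held; 1 and above are eligible */
}
```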
    • Cosmetic mods to priority logic · 0810353e
      Morris Jette authored
    • Merge branch 'slurm-2.3' · ec372e00
      Morris Jette authored
    • cray/srun wrapper, don't use aprun -q by default · ea9adc17
      Morris Jette authored
      In cray/srun wrapper, only include aprun "-q" option when srun "--quiet"
      option is used.
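The option mapping described above can be sketched as follows. This is a simplified illustration, not the wrapper's actual source; the function name is hypothetical. The point is that aprun's `-q` (quiet) flag is emitted only when the user passed srun's `--quiet`/`-Q` option, so aprun's messages are no longer suppressed by default.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch of the wrapper's option translation: return the
 * aprun quiet flag only if srun was invoked with --quiet or -Q. */
static const char *aprun_quiet_opt(int argc, char **argv)
{
	for (int i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--quiet") || !strcmp(argv[i], "-Q"))
			return "-q";	/* map srun --quiet to aprun -q */
	}
	return "";	/* default: no -q, aprun output is shown */
}
```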
    • Change a slurmd msg from info() to debug() · 73f915bf
      Morris Jette authored
    • Merge branch 'slurm-2.3' · c06064bc
      Morris Jette authored
    • Fix for possible SEGV · ed56303c
      Morris Jette authored
      Here's what seems to have happened:

      - A job was pending, waiting for resources.
      - slurm.conf was changed to remove some nodes, and a scontrol reconfigure was done.
      - As a result of the reconfigure, the pending job became non-runnable due to "Requested node configuration is not available". The scheduler set the job state to JOB_FAILED and called delete_job_details.
      - scontrol reconfigure was done again.
      - read_slurm_conf called _restore_job_dependencies.
      - _restore_job_dependencies called build_feature_list for each job in the job list.
      - When build_feature_list tried to reference the now-deleted job details for the failed job, it got a segmentation fault.

      The problem was reported by a customer on Slurm 2.2.7. I have not been able to reproduce it on 2.4.0-pre3, although the relevant code looks the same. There may be a timing window. The attached patch attempts to fix the problem by adding a check to _restore_job_dependencies: if the job state is JOB_FAILED, the job is skipped.

      Regards,
      Martin

      This is an alternative solution to bug316980fix.patch
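The guard described in this commit can be sketched as below. The struct layout and names here are simplified stand-ins for slurmctld's actual job record, not the real patch: the key idea is that a JOB_FAILED job has had its details freed by delete_job_details, so _restore_job_dependencies must skip it rather than dereference a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for slurmctld's job record and state codes. */
enum job_state { JOB_PENDING, JOB_FAILED };

struct job_details { const char *features; };

struct job_record {
	enum job_state state;
	struct job_details *details;	/* freed by delete_job_details() on failure */
};

/* Sketch of the added check: skip jobs in JOB_FAILED state, whose
 * details have already been deleted, instead of letting
 * build_feature_list() dereference them. Returns 1 if processed,
 * 0 if skipped. */
static int restore_one_job(struct job_record *job)
{
	if (job->state == JOB_FAILED)
		return 0;		/* details already freed; do not touch */
	assert(job->details);		/* safe to rebuild the feature list */
	/* build_feature_list(job) would run here */
	return 1;
}
```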
  Feb 25, 2012
    • Print negative time as "INVALID" · 131ff55e
      Morris Jette authored
      If a time value to be printed (e.g. job run time) is negative, then
      print the value as "INVALID" rather than with negative numbers
      (e.g. "-123--12:-12:-12").
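The fix amounts to a guard at the top of the time formatter. A minimal sketch, with an illustrative function name rather than SLURM's actual formatter: negative inputs short-circuit to "INVALID" before the day/hour/minute/second fields are computed, which is what previously produced strings like "-123--12:-12:-12".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sketch (illustrative name): format seconds as [D-]HH:MM:SS, but
 * print "INVALID" for negative values instead of emitting a string
 * full of negative fields. */
static void secs2time_str(long secs, char *buf, size_t len)
{
	if (secs < 0) {
		snprintf(buf, len, "INVALID");
		return;
	}
	long days = secs / 86400;
	long hrs  = (secs % 86400) / 3600;
	long mins = (secs % 3600) / 60;
	if (days)
		snprintf(buf, len, "%ld-%02ld:%02ld:%02ld",
			 days, hrs, mins, secs % 60);
	else
		snprintf(buf, len, "%02ld:%02ld:%02ld",
			 hrs, mins, secs % 60);
}
```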