    42081d87
    retry slurm.conf file · 42081d87
    Morris Jette authored
    Add logic to sleep and retry if slurm.conf can't be read.
    Without this, the slurmd daemons may die and when the SlurmdTimeout
    is reached, the nodes will be marked DOWN and their jobs will be
    killed.
    In the long term, it would be better to exit only if the read fails
    on program startup, and to have the daemons keep running with the old
    configuration if it fails on reconfiguration, but I don't have time to
    do that work now.
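
The sleep-and-retry behaviour described above could look roughly like the sketch below. This is not the code from this commit or from Slurm itself: `read_conf_with_retry`, `conf_reader_fn`, `try_open_conf`, and `flaky_reader` are hypothetical names invented for illustration, and the attempt cap is an assumption (the commit message does not say whether the retry is bounded).

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical callback type: returns 0 on success, nonzero on failure. */
typedef int (*conf_reader_fn)(const char *path);

/*
 * Call read_fn(path) until it succeeds, sleeping retry_secs seconds
 * between attempts instead of exiting on the first failure.
 * max_attempts <= 0 means "retry forever" (an assumption of this sketch).
 * Returns the number of attempts used on success, or -1 after giving up.
 */
int read_conf_with_retry(conf_reader_fn read_fn, const char *path,
                         unsigned int retry_secs, int max_attempts)
{
    assert(read_fn != NULL);

    for (int attempt = 1; ; attempt++) {
        if (read_fn(path) == 0)
            return attempt;

        if (max_attempts > 0 && attempt >= max_attempts)
            return -1;

        fprintf(stderr, "error: cannot read %s (attempt %d), "
                "retrying in %u s\n", path, attempt, retry_secs);
        sleep(retry_secs);
    }
}

/* Example reader: succeeds if the file can be opened for reading. */
int try_open_conf(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    fclose(fp);
    return 0;
}

/*
 * Stand-in reader that simulates a temporarily unavailable file:
 * the first two calls fail, every later call succeeds.
 */
int flaky_reader(const char *path)
{
    static int calls = 0;

    (void) path;
    return (++calls < 3) ? -1 : 0;
}
```

A daemon would call something like `read_conf_with_retry(try_open_conf, path, 5, 0)` so that a briefly unreadable configuration file delays it rather than killing it. The five-second interval is only an example value.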