Commit eeec6cd2 authored by Carlos Tripiana Montes, committed by Danny Auble

job_container/tmpfs: add functionality to restore NSs state after restart

container_p_restore now gets the list of running jobs from the spool dir
with stepd_available.

Then it iterates over the basepath entries and, for those that seem to
have been a mount point (they have a .ns file), tries to mount them again.

If the mount succeeds (it must) and the job for this mount point is dead,
it releases the resources and tries to delete the files. Note that the
removal can fail if a resource is leaked. Such failures are resolved once
slurmd starts after a HW reboot (no kernel leaks).
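
The restore pass described above can be sketched roughly as follows. This is
an illustration only, not the code from this commit: the names
classify_entry, scan_basepath and ns_action_t are hypothetical, the
<basepath>/<jobid>/.ns layout is an assumption, and the list of running jobs
is passed in as a plain array instead of being obtained from
stepd_available. The actual remount and cleanup are left out because they
need root privileges; the sketch only decides what would be done per entry.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Hypothetical outcome for one basepath entry. */
typedef enum { NS_SKIP, NS_KEEP, NS_CLEANUP } ns_action_t;

/* True if job_id appears in the caller-supplied list of running jobs. */
static bool job_is_running(unsigned long job_id,
                           const unsigned long *running, size_t n_running)
{
	for (size_t i = 0; i < n_running; i++)
		if (running[i] == job_id)
			return true;
	return false;
}

/*
 * Decide what to do with one entry of basepath.
 * Assumption: entries are named after the job id and a former mount
 * point is recognised by the presence of "<basepath>/<jobid>/.ns".
 */
static ns_action_t classify_entry(const char *basepath, const char *name,
                                  const unsigned long *running,
                                  size_t n_running)
{
	char ns_path[4096];
	struct stat st;
	char *end = NULL;
	unsigned long job_id;
	int len;

	assert(basepath && name);

	/* Skip ".", ".." and anything that is not a plain number. */
	if (!isdigit((unsigned char)name[0]))
		return NS_SKIP;
	job_id = strtoul(name, &end, 10);
	if (*end != '\0')
		return NS_SKIP;

	len = snprintf(ns_path, sizeof(ns_path), "%s/%s/.ns", basepath, name);
	if (len < 0 || (size_t)len >= sizeof(ns_path))
		return NS_SKIP;

	/* No .ns file: this entry was never a mount point. */
	if (stat(ns_path, &st) != 0)
		return NS_SKIP;

	/* A real plugin would try to mount the entry again at this point. */
	return job_is_running(job_id, running, n_running) ? NS_KEEP
	                                                  : NS_CLEANUP;
}

/*
 * Walk basepath and count how many entries belong to live jobs and how
 * many belong to dead jobs and would be released and removed.
 * Returns 0 on success, -1 if basepath cannot be opened.
 */
static int scan_basepath(const char *basepath,
                         const unsigned long *running, size_t n_running,
                         int *n_keep, int *n_cleanup)
{
	DIR *dp;
	struct dirent *ent;

	assert(n_keep && n_cleanup);
	*n_keep = *n_cleanup = 0;

	if (!(dp = opendir(basepath)))
		return -1;
	while ((ent = readdir(dp))) {
		switch (classify_entry(basepath, ent->d_name,
		                       running, n_running)) {
		case NS_KEEP:
			(*n_keep)++;
			break;
		case NS_CLEANUP:
			(*n_cleanup)++;
			break;
		case NS_SKIP:
			break;
		}
	}
	closedir(dp);
	return 0;
}
```

A caller would pass the configured basepath and the job ids it considers
alive to scan_basepath. Entries classified as NS_CLEANUP correspond to the
"job is dead" case, where the removal may still fail if a resource is leaked.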

Bug 11093
parent c530b15e