- Nov 13, 2015
-
-
Danny Auble authored
-
Danny Auble authored
tree.
-
Danny Auble authored
-
Morris Jette authored
Previously only SlurmUser could do so.
-
Danny Auble authored
-
Tim Wickberg authored
Bug 2006
-
Brian Christiansen authored
Bug 2006
-
Ryan Cox authored
-
Danny Auble authored
cgroup system for the extern step. This needs to happen for accounting purposes, but also makes things simpler.
-
Danny Auble authored
step.
-
Morris Jette authored
-
- Nov 12, 2015
-
-
-
Mark Roberts authored
-
Morris Jette authored
Test if job is pending before looking at burst buffer specs when building the job queue. This should result in a slight speedup in the job scheduling logic.
-
Alejandro Sanchez authored
-
Morris Jette authored
-
Morris Jette authored
Previously only supported by SlurmUser and root.
-
- Nov 11, 2015
-
-
Morris Jette authored
Previously only reserved space for one task of pending job array.
-
-
Morris Jette authored
Support taking a node out of FUTURE state with the "scontrol reconfig" command. Previous logic would keep the node in FUTURE state if that was the original configuration when slurmctld started. If a job was running on the node, it will stay running, but the node may not be visible.
-
David Bigagli authored
-
Morris Jette authored
Make SLURM_ARRAY_TASK_MIN, SLURM_ARRAY_TASK_MAX, and SLURM_ARRAY_TASK_STEP environment variables available to PrologSlurmctld and EpilogSlurmctld.
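As an illustration of how a PrologSlurmctld or EpilogSlurmctld script might consume these variables, the sketch below expands them into a list of task indices. It assumes the array was submitted as a regular range (for example `--array=0-10:2`); the helper name `array_task_ids` is hypothetical and not part of Slurm.

```python
import os


def array_task_ids(env=None):
    """Return the task indices described by the SLURM_ARRAY_TASK_* variables.

    Hypothetical helper for illustration only; assumes a regular
    min-max:step range. Returns an empty list when the variables are
    absent, i.e. the job is not part of a job array.
    """
    env = os.environ if env is None else env
    try:
        first = int(env["SLURM_ARRAY_TASK_MIN"])
        last = int(env["SLURM_ARRAY_TASK_MAX"])
    except KeyError:
        return []
    # Fall back to a step of 1 if the step variable is not set.
    step = int(env.get("SLURM_ARRAY_TASK_STEP", "1"))
    return list(range(first, last + 1, step))
```

For arrays given as an irregular list of indices, the min/max/step triple does not describe every task, so a script like this would need another source for the exact index set.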
-
- Nov 10, 2015
-
-
Hongjia Cao authored
-
Danny Auble authored
We needed to send a finish from each node in the step whether it had any activity or not. This way the controller knew things were done on the node and the data was sent to the database. Bug 2097
-
Danny Auble authored
-
Danny Auble authored
to get the batch step, no real code change outside of using strcasecmp instead of strcmp.
-
Morris Jette authored
Burst_buffer/cray: Don't stall scheduling of other jobs while a stage-in is in progress. bug 2114
-
Morris Jette authored
Fix to purge terminated jobs with burst buffer errors. bug 2123
-
- Nov 09, 2015
-
-
Morris Jette authored
The prolog_running counter can now exceed 1. New logic raises limit from 1 to 4 before preventing job recovery on restart.
-
David Bigagli authored
-
Thomas Cadeau authored
-
David Bigagli authored
the error happened.
-
- Nov 07, 2015
-
-
Morris Jette authored
Correct preservation of job ID. This affects emulation mode only. bug 2113
-
Morris Jette authored
Added burst_buffer.conf flag parameter of "TeardownFailure" which will teardown and remove a burst buffer after failed stage-in or stage-out. By default, the buffer will be preserved for analysis and manual teardown. bug 2116
-
- Nov 06, 2015
-
-
Morris Jette authored
If a stage-out fails, Slurm leaves the burst buffer in place and logs something like "error: bb_set_use_time: job 98 with allocated burst buffers not found" every minute thereafter. This changes the logic to only log the event one time. bug 2112
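The general "log an event only once" pattern described here can be sketched as follows. This is not Slurm's C implementation, only a minimal Python illustration; the function name `report_missing_buffer`, the logger name, and the module-level set are all assumptions made for the example.

```python
import logging

log = logging.getLogger("bb_example")

# Job IDs whose missing-buffer error has already been reported.
_reported_jobs = set()


def report_missing_buffer(job_id):
    """Log the error the first time a job ID is seen.

    Returns True if a message was logged, False if it was suppressed
    because the same job was already reported.
    """
    if job_id in _reported_jobs:
        return False
    _reported_jobs.add(job_id)
    log.error(
        "bb_set_use_time: job %d with allocated burst buffers not found",
        job_id,
    )
    return True
```

A periodic check that calls `report_missing_buffer(98)` every minute would then emit the message once rather than on every pass.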
-
Morris Jette authored
This is a revision to commit e21b666c which did not fix the problem for all configurations. bug 2086
-
David Bigagli authored
-
Alejandro Sanchez authored
-
Danny Auble authored
Bug 2106. The calculation was being performed only for cpus and gres, not for memory or nodes.
-
Morris Jette authored
bug 2086
-