- Aug 02, 2017
-
-
Morris Jette authored
-
- Aug 01, 2017
-
-
Morris Jette authored
Without this change, each component would generate a separate email at job begin, end, etc.
-
Morris Jette authored
-
Morris Jette authored
If the pack job allocation partially failed, properly handle accounting and deallocation of burst buffer. Note, this should rarely happen.
-
- Jul 29, 2017
-
-
Morris Jette authored
-
- Jul 28, 2017
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
If a pack job is only partially allocated resources (likely due to limits), deallocate resources from those components which have been started and requeue them.
-
Morris Jette authored
Previous logic could corrupt a pack job's tres_alloc_count array, causing slurmctld's accounting module to abort.
-
Morris Jette authored
-
Morris Jette authored
Perform limit check on heterogeneous job as a whole at submit time to reject jobs that will never be able to run. Accepting pack jobs that can never start will have a significant effect on scheduling in general (blocking the queue).
-
- Jul 27, 2017
-
-
Morris Jette authored
This change adds a new function and moves some logic around so that limits can be tested on a pack job as a whole (that logic still needs to be developed).
-
- Jul 26, 2017
-
-
Morris Jette authored
-
Morris Jette authored
This should never happen, but if we start some pack job components and for unexpected reasons fail to start others at the same time, the components that remain pending will be able to start at a later time so long as each of the other components either (1) can start at the same time or (2) has already been started.
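The rule in this commit message can be sketched as a simple predicate. This is only an illustration of the stated condition, not Slurm's implementation: the `Component` class, its fields, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Component:
    """Hypothetical pack job component; field names are illustrative only."""
    started: bool        # component has already been started
    can_start_now: bool  # component could be started in this scheduling pass


def pending_components_may_start(components: Iterable[Component]) -> bool:
    """Return True if every component is already started or can start now."""
    return all(c.started or c.can_start_now for c in components)
```

Under this sketch, a pack job with one running component and one startable pending component passes the check, while a pending component that cannot start blocks the remaining pending components.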
-
Morris Jette authored
-
- Jul 25, 2017
-
-
Morris Jette authored
Adds association and QOS limits for the pack job as a whole.
-
Morris Jette authored
Don't requeue a batch pack job component that is not found on node zero of the allocation. Only the first pack job component is expected to have a running script.
-
Morris Jette authored
Clear a job's "wait reason" value of "BeginTime" after that time has passed. Previously a reason of "BeginTime" could be reported long after the job's requested begin time had passed (for so long as the current reason is "None").
-
Morris Jette authored
-
- Jul 24, 2017
-
-
Morris Jette authored
Add support to sched/backfill for concurrent allocation of all pack job components including support of --time-min option.
-
Isaac Hartung authored
-
Isaac Hartung authored
-
- Jul 22, 2017
-
-
Morris Jette authored
-
- Jul 21, 2017
-
-
Morris Jette authored
Don't try to launch pack job component ID != 0. Make pack job batch test38.2 more robust. Add completion time data to backfill data structure to support deadline and min-time options.
-
Morris Jette authored
-
- Jul 20, 2017
-
-
Morris Jette authored
This is a work in progress, not ready for use yet.
-
- Jul 19, 2017
-
-
Morris Jette authored
This removes several define statements with different names in various functions
-
Morris Jette authored
-
Morris Jette authored
Fix for possible slurmctld abort with use of salloc/sbatch/srun --gres-flags=enforce-binding option. bug 4008
-
Morris Jette authored
Update from commit b40bd8d3
-
Morris Jette authored
-
Brian Christiansen authored
Clarify --immediate option.
-
- Jul 18, 2017
-
-
Morris Jette authored
-
Morris Jette authored
-
Dominik Bartkiewicz authored
By removing the real locks we can get into a race condition where the prolog starts and finishes before we get here, and then we end up waiting forever. Making the mutex static seemed to help in many cases, but didn't completely close the window. Changing slurm_cond_wait to slurm_cond_timedwait fixed the scenario where we would hit the window, without degrading the performance gain the original commit provides. There were also spots where, if the job or step didn't exist, the condition variable would not be signaled, providing another place where this could get stuck without starting the job. Fix regression from commit 52ce3ff0. Bug 3977.
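The general pattern behind this fix (a timed wait that re-checks its predicate, so a missed wakeup cannot block forever) can be sketched as follows. This is a Python illustration using `threading.Condition`, not the Slurm C code; the `PrologTracker` class and its method names are hypothetical.

```python
import threading
import time


class PrologTracker:
    """Hypothetical stand-in for the state a waiter checks; not Slurm code."""

    def __init__(self):
        self._cond = threading.Condition()
        self.prolog_done = False

    def finish_prolog_without_signal(self):
        # Simulates the race window: the state changes but no waiter is
        # notified (for example, because the prolog finished too early).
        self.prolog_done = True

    def wait_for_prolog(self, poll_interval=0.05, max_wait=2.0):
        """Wait until prolog_done is set, waking periodically to re-check.

        Returns True if the prolog finished, False if max_wait elapsed.
        """
        deadline = time.monotonic() + max_wait
        with self._cond:
            while not self.prolog_done:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return False
                # Timed wait: even if no notify ever arrives, we wake up
                # after at most poll_interval and re-test the predicate.
                self._cond.wait(timeout=min(poll_interval, remaining))
            return True
```

With an untimed wait, the waiter in this sketch would hang whenever the state changed without a notification. The timed variant trades a small polling cost for a guarantee that the predicate is eventually re-examined.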
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Fix for debugger setup bug introduced in commit f1110568
-
Morris Jette authored
-
- Jul 17, 2017
-
-
Morris Jette authored
Avoid interleaving labels and output from various components of a pack job
-