- Jun 16, 2014
-
-
Morris Jette authored
-
- Jun 14, 2014
-
-
jette authored
If FastSchedule=0 is configured and some nodes have not registered for service (so we do not know their actual resource counts), then leave the job pending rather than rejecting it without knowing if it can run later (when the node registers and we know its specs). bug 872
-
jette authored
-
- Jun 13, 2014
- Jun 12, 2014
-
-
Morris Jette authored
For "scontrol --details show job", report the correct CPU_IDs when there are multiple threads per core (we are translating a core bitmap to CPU IDs). This is an enhancement of commit 83d626ca so the node table is only loaded once for the entire job table. bug 850
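The translation from a core bitmap to CPU IDs mentioned above can be sketched as follows. This is an illustration only, not the Slurm code: it assumes that core `c` on a node with `T` threads per core owns the consecutive CPU IDs `c*T` through `c*T + T - 1`, and the function names `core_bitmap_to_cpu_ids` and `format_ranges` are hypothetical.

```python
def core_bitmap_to_cpu_ids(core_bitmap, threads_per_core):
    """Expand a per-core allocation bitmap into a list of CPU IDs.

    Assumption (not taken from the commit): core c owns the CPU IDs
    c*threads_per_core .. c*threads_per_core + threads_per_core - 1.
    """
    cpu_ids = []
    for core, allocated in enumerate(core_bitmap):
        if allocated:
            first = core * threads_per_core
            cpu_ids.extend(range(first, first + threads_per_core))
    return cpu_ids


def format_ranges(ids):
    """Render a sorted list of integers as a compact range string."""
    parts = []
    start = prev = None
    for i in ids:
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = i
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


# Cores 0 and 2 allocated, two threads per core.
cpu_ids = core_bitmap_to_cpu_ids([1, 0, 1, 0], threads_per_core=2)
cpu_id_string = format_ranges(cpu_ids)
```

With one thread per core the CPU IDs equal the core indexes, which is why the error only showed up on nodes with multiple threads per core.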
-
Martin Perry authored
Correct the record of CPU_IDs allocated to a job if there is more than one CPU per core.
-
Morris Jette authored
If job requests --exclusive then do not use nodes which have any cores in an advanced reservation. This also prevents the case where nodes can be shared by other jobs.
-
Morris Jette authored
Unless the _DEBUG flag is set in the plugin, disable some logging that would be very slow.
-
Morris Jette authored
If job requests --exclusive then do not use nodes which have any cores in an advanced reservation. Previously the job would be allocated all of the cores outside of the advanced reservation.
-
Morris Jette authored
Correct support for partition with Shared=YES configuration. Previous logic would share resources for jobs by default (i.e. if user did not explicitly request --exclusive). bug 758
-
Jens Dreger authored
-
Morris Jette authored
This reordering of some code results in cleaner logic.
-
Morris Jette authored
Collapse the scheduling table when possible to reduce the number of time slots to check for pending jobs. This should improve performance considerably.
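One way to picture collapsing a scheduling table is to merge back-to-back time slots that describe the same resources. The sketch below is an assumption about the general idea, not the actual Slurm data structure: slots are modeled as hypothetical `(start, end, nodes)` tuples sorted by start time, and `collapse_slots` is an illustrative name.

```python
def collapse_slots(slots):
    """Merge adjacent time slots that have identical available nodes.

    `slots` is a list of (start, end, nodes) tuples sorted by start time,
    where `nodes` is a frozenset of node names (a hypothetical layout).
    """
    collapsed = []
    for start, end, nodes in slots:
        if collapsed and collapsed[-1][1] == start and collapsed[-1][2] == nodes:
            # Same resources and contiguous in time: extend the previous slot.
            prev_start = collapsed[-1][0]
            collapsed[-1] = (prev_start, end, nodes)
        else:
            collapsed.append((start, end, nodes))
    return collapsed


table = [
    (0, 10, frozenset({"n1", "n2"})),
    (10, 20, frozenset({"n1", "n2"})),
    (20, 30, frozenset({"n1"})),
]
collapsed_table = collapse_slots(table)
```

Fewer slots mean fewer start times to evaluate for each pending job, which is where the performance gain would come from in this model.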
-
Morris Jette authored
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
Previous logic was sometimes building an incomplete map.
-
- Jun 11, 2014
-
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
When a decision is made to start a job, if for some reason that job's start failed, the backfill scheduler would previously just exit. With this change, it logs the event and reserves the resources expected to be used and continues down the job queue.
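The behavior change described above can be sketched as a loop that no longer stops at the first failed start. This is a minimal illustration, not the backfill plugin itself; `backfill_pass` and the `try_start`, `reserve`, and `log` callbacks are hypothetical stand-ins.

```python
def backfill_pass(queue, try_start, reserve, log):
    """Walk the job queue; on a failed start, log, reserve, and continue.

    All callbacks are hypothetical:
      try_start(job) -> bool   attempt to start the job
      reserve(job)             hold the resources the job was expected to use
      log(message)             record the event
    """
    started = []
    for job in queue:
        if try_start(job):
            started.append(job)
        else:
            log(f"failed to start job {job}")
            reserve(job)
            # Previously the scheduler would exit here; now it keeps going.
    return started


reserved, messages = [], []
started_jobs = backfill_pass(
    queue=[101, 102, 103],
    try_start=lambda job: job != 102,  # pretend job 102 fails to start
    reserve=reserved.append,
    log=messages.append,
)
```

Reserving the resources of the failed job keeps later jobs in the queue from being planned onto resources that the failed job is still expected to use.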
-
Morris Jette authored
This change prevents creation of some back-to-back records with the same resources, but different times.
-
Morris Jette authored
No change in logic
-
Morris Jette authored
Improved logging of backfill scheduling actions. Better handling of backfill_resolution logic to avoid creating some records that are not needed. Avoid creating some backfill scheduling maps with zero duration. The net effect should be slightly improved performance with no significant difference in action.
-
Morris Jette authored
Update slurm.conf man page for DebugFlag BackfillMap. This should be considered part of commit 3c2bffb6
-
Morris Jette authored
Add DebugFlag of BackfillMap. Previously a DebugFlag value of Backfill logged information about what it was doing plus a map of expected resource use in the future. Now that very verbose resource use map is only logged with a DebugFlag value of BackfillMap.
-
Morris Jette authored
Log not only the count of jobs tested since the last time locks were released, but also the total job count since the backfill scheduler started.
-
Morris Jette authored
-
Morris Jette authored
Remove duplicate backfill scheduling tests. For example there is no need to test if a job can be started if the only difference from the previous test involves nodes in other partitions that can not be used by the job we are trying to start.
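The idea of skipping duplicate tests can be sketched by comparing only the nodes a job could actually use. This is an illustration under assumptions, not the scheduler's code: availability is modeled as a sequence of hypothetical node-name sets, and `usable_nodes` and `count_tests` are made-up names.

```python
def usable_nodes(available, partition_nodes):
    """Restrict the available nodes to those in the job's partition."""
    return frozenset(available) & frozenset(partition_nodes)


def count_tests(availability_steps, partition_nodes):
    """Count start tests, skipping steps that change nothing for this job.

    A step is skipped when its usable node set matches the one from the
    previous test, e.g. when only nodes in other partitions changed.
    """
    tests = 0
    last_tested = None
    for available in availability_steps:
        usable = usable_nodes(available, partition_nodes)
        if usable == last_tested:
            continue  # identical from this job's point of view
        tests += 1
        last_tested = usable
    return tests


steps = [{"n1"}, {"n1", "x1"}, {"n1", "n2", "x1"}]
tests_run = count_tests(steps, partition_nodes={"n1", "n2"})
```

Here the second step only adds `x1`, a node outside the job's partition, so it does not trigger another test.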
-
- Jun 10, 2014
-
-
Morris Jette authored
The backfill scheduler was always reporting the time that a job was being considered as NOW rather than the time that was really being considered.
-
David Bigagli authored
decreases and total is less than in use.
-
Danny Auble authored
-
Morris Jette authored
Improve how failures in slurmd/slurmstepd communications are logged.
-
- Jun 09, 2014
-
-
Morris Jette authored
Mail messages for job array events now print the job ID using the format "#_# (#)" rather than just the internal job ID.
-
David Bigagli authored
-
Morris Jette authored
This will help limit damage from two active primary slurmctld (split brain problem).
-
- Jun 07, 2014
-
-
David Bigagli authored
it is already running.
-
Morris Jette authored
Duplicate triggers are not allowed
-
Morris Jette authored
Job profiling leaves a file open
-