- Mar 02, 2014
-
-
jette authored
Add support for the SchedulerParameters value bf_max_job_start, which limits the total number of jobs that can be started in a single iteration of the backfill scheduler. bug 607
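
The effect of such a cap can be illustrated with a small sketch. This is not the Slurm implementation: the function name, the `can_start` callback, and the convention that 0 means "no limit" are assumptions made for the example.

```python
def backfill_iteration(pending_jobs, can_start, bf_max_job_start=0):
    """Return the jobs started in one simulated backfill pass.

    pending_jobs: jobs in priority order (any hashable values).
    can_start: hypothetical callback that says whether a job fits right now.
    bf_max_job_start: cap on starts per pass; 0 is treated as "no limit"
    (an assumption for this sketch).
    """
    started = []
    for job in pending_jobs:
        # Stop the pass as soon as the cap is reached.
        if bf_max_job_start and len(started) >= bf_max_job_start:
            break
        if can_start(job):
            started.append(job)
    return started
```

With a cap of 2 and four startable jobs, only the first two are started in that pass; the rest wait for the next iteration.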
-
- Feb 27, 2014
-
-
Morris Jette authored
bug 607
-
Danny Auble authored
-
Danny Auble authored
-
- Feb 26, 2014
-
-
Danny Auble authored
-
Danny Auble authored
-
- Feb 25, 2014
-
-
Danny Auble authored
placed.
-
David Bigagli authored
-
- Feb 21, 2014
-
-
Danny Auble authored
count.
-
Danny Auble authored
nodes.
-
Danny Auble authored
-
- Feb 20, 2014
-
-
Morris Jette authored
If a job requires specific nodes and cannot run because those nodes are busy, the main scheduling loop now blocks only those specific nodes rather than the entire queue/partition. bug 595
-
Morris Jette authored
-
- Feb 19, 2014
-
-
David Bigagli authored
is not a corresponding association.
-
David Bigagli authored
is not a corresponding association.
-
- Feb 14, 2014
-
-
David Bigagli authored
-
Danny Auble authored
needed to forward a message, the slurmd would core dump.
-
- Feb 13, 2014
-
-
David Bigagli authored
describing that jobs must be drained from the cluster before deploying any checkpoint plugin.
-
- Feb 12, 2014
-
-
Morris Jette authored
Properly enforce a job's cpus-per-task option when a job's allocation is constrained on some nodes by the mem-per-cpu option. bug 590
-
- Feb 10, 2014
-
-
Morris Jette authored
-
- Feb 09, 2014
-
-
Moe Jette authored
-
- Feb 08, 2014
-
-
Danny Auble authored
-
Danny Auble authored
-
- Feb 07, 2014
-
-
Morris Jette authored
bug 586
-
- Feb 05, 2014
-
-
Danny Auble authored
-
Dominik Bartkiewicz authored
Set GPU_DEVICE_ORDINAL environment variable.
-
Danny Auble authored
-
- Feb 04, 2014
-
-
Morris Jette authored
The previous logic tried to pick a specific node count, which caused problems on heterogeneous systems. This change largely reverts commit a270417b.
-
Danny Auble authored
-
- Feb 03, 2014
-
-
Danny Auble authored
-
- Jan 31, 2014
-
-
David Bigagli authored
-
Danny Auble authored
For example, salloc -n32 does not request a node count. With the previous code, if this request used 4 nodes and only 1 was left in GrpNodes, the job would run without complaint, because the limits were checked before the number of nodes was selected. The limits are now checked after node selection, so the limits on nodes, CPUs, and memory are always enforced.
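
The ordering problem described here can be sketched as follows. The names (`Request`, `select_nodes`, the two check functions) and the 8-CPUs-per-node figure are hypothetical and only illustrate the idea; they are not taken from the Slurm source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    tasks: int
    nodes: Optional[int] = None  # None: the user gave no node count (e.g. salloc -n32)


def select_nodes(req: Request, cpus_per_node: int) -> int:
    """Hypothetical selector: use the requested count, else the fewest nodes that fit."""
    if req.nodes is not None:
        return req.nodes
    return -(-req.tasks // cpus_per_node)  # ceiling division


def check_before_selection(req: Request, grp_nodes_left: int) -> bool:
    # Old ordering: without a requested node count there is nothing to compare,
    # so the request passes the limit check.
    return req.nodes is None or req.nodes <= grp_nodes_left


def check_after_selection(req: Request, grp_nodes_left: int, cpus_per_node: int) -> bool:
    # New ordering: compare the node count that was actually selected.
    return select_nodes(req, cpus_per_node) <= grp_nodes_left
```

Under these assumptions, a 32-task request on 8-CPU nodes selects 4 nodes: it passes the early check with 1 node left in the group limit, but fails the check performed after selection.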
-
Morris Jette authored
Fix step allocation when some CPUs are not available due to memory limits. This happens when one step is active and using memory that blocks the scheduling of another step on a portion of the CPUs needed. The new step is now delayed rather than aborting with "Requested node configuration is not available". bug 577
-
- Jan 28, 2014
-
-
Danny Auble authored
based on ionode count correctly on slurmctld restart.
-
- Jan 23, 2014
-
-
Danny Auble authored
connect in a loop instead of producing a fatal.
-
Danny Auble authored
-
- Jan 21, 2014
-
-
David Bigagli authored
-
David Bigagli authored
This reverts commit 2fa28eb6. Conflicts: NEWS
-
- Jan 18, 2014
-
-
David Bigagli authored
data correctly accumulating differences between sampling intervals. Fix the data structure mismatch between acct_gather_filesystem_lustre.c and slurm_jobacct_gather.h which caused the hdf5 plugin to log incorrect data.
-
- Jan 16, 2014
-
-
David Bigagli authored
the srun help.
-