- Apr 09, 2013
  Danny Auble authored
  Morris Jette authored
    Fix for bug 258
- Apr 02, 2013
  Danny Auble authored
    never looked at to determine eligibility of backfillable job.
  Morris Jette authored
  Danny Auble authored
    and when reading in state from DB2 we find a block that can't be created. You can now do a clean start to get rid of the bad block.
  Danny Auble authored
    the slurmctld there were software errors on some nodes.
  Danny Auble authored
    without it still existing there. This is extremely rare.
  Danny Auble authored
    a pending job on it we don't kill the job.
  Danny Auble authored
    while it was free cnodes would go into software error and kill the job.
- Apr 01, 2013
  Morris Jette authored
    Fix for bug 224
- Mar 29, 2013
  Danny Auble authored
  Danny Auble authored
- Mar 27, 2013
  Jason Bacon authored
  Morris Jette authored
    Without this patch, when the slurmd cold starts or slurmstepd terminates abnormally, the job script file can be left around. Bug 243
  Morris Jette authored
    Previously such a job submitted to a DOWN partition would be queued. Bug 187
- Mar 26, 2013
  Danny Auble authored
  Danny Auble authored
    a reservation when it has the "Ignore_Jobs" flag set. Since jobs could run outside of the reservation on its nodes, without this you could have double time.
- Mar 25, 2013
  Morris Jette authored
    This is not applicable with launch/aprun
  Morris Jette authored
- Mar 22, 2013
  Morris Jette authored
    These changes are required so that select/cray can load select/linear, which is a bit more complex than the other select plugin structures. Export plugin_context_create and plugin_context_destroy symbols from libslurm.so. Correct a typo in the exported hostlist_sort symbol name. Define some functions in select/cray to avoid undefined symbols if the plugin is loaded via libslurm rather than from a slurm command (which has all of the required symbols).
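    The undefined-symbol failure is easy to reproduce outside Slurm: opening a shared object with RTLD_NOW forces every referenced symbol to resolve immediately. A minimal illustrative sketch (not the Slurm plugin loader; the plugin path is made up):

      /* build: cc -o symcheck symcheck.c -ldl */
      #include <dlfcn.h>
      #include <stdio.h>

      int main(void)
      {
          /* RTLD_NOW: fail right here if the plugin references a symbol that
           * only exists in a slurm command and not in libslurm.so */
          void *handle = dlopen("./select_cray.so", RTLD_NOW | RTLD_GLOBAL);

          if (!handle) {
              fprintf(stderr, "dlopen failed: %s\n", dlerror());
              return 1;
          }
          printf("all plugin symbols resolved\n");
          dlclose(handle);
          return 0;
      }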
- Mar 20, 2013
  Hongjia Cao authored
  Danny Auble authored
    cluster.
- Mar 19, 2013
  Morris Jette authored
  Morris Jette authored
  Morris Jette authored
- Mar 14, 2013
  Danny Auble authored
  Danny Auble authored
  Danny Auble authored
- Mar 13, 2013
  Morris Jette authored
    If a step requests more CPUs than are possible within the specified node count of the job allocation, return ESLURM_TOO_MANY_REQUESTED_CPUS rather than returning ESLURM_NODES_BUSY and retrying.
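    A minimal sketch of the intended check, assuming the step's node count and the allocation's CPUs per node are already known. The function and parameter names are hypothetical; only the error codes come from the commit message (they are defined in <slurm/slurm_errno.h>):

      #include <stdint.h>
      #include <slurm/slurm_errno.h>   /* SLURM_SUCCESS, ESLURM_* codes */

      /* Hypothetical helper, not the actual slurmctld function. */
      static int validate_step_cpu_count(uint32_t step_cpus, uint32_t step_nodes,
                                         uint32_t alloc_cpus_per_node)
      {
          uint32_t max_cpus = step_nodes * alloc_cpus_per_node;

          if (step_cpus > max_cpus)      /* request can never be satisfied   */
              return ESLURM_TOO_MANY_REQUESTED_CPUS;  /* permanent, no retry */

          return SLURM_SUCCESS;          /* otherwise schedule (or wait)     */
      }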
- Mar 12, 2013
  Morris Jette authored
- Mar 11, 2013
  Nathan Yee authored
    Without this change, when the sbatch --export option is used, many Slurm environment variables are not set unless explicitly exported.
  Danny Auble authored
- Mar 08, 2013
  Morris Jette authored
  Danny Auble authored
    success
- Mar 07, 2013
  jette authored
    This problem would affect systems in which specific GRES are associated with specific CPUs. One possible result is that the CPUs identified as usable could be inappropriate, and the job would be held when trying to lay out the tasks on CPUs (all done as part of the job allocation process). The other problem is that if multiple GRES are linked to specific CPUs, there was a CPU bitmap OR which should have been an AND, resulting in some CPUs being identified as usable, but not available to all GRES.
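    To see why OR over-reports, consider two GRES types each tied to a different set of CPUs. A stand-in example using plain integer masks instead of Slurm's bitstr_t bitmaps (illustration only, not the actual gres plugin code):

      #include <stdint.h>
      #include <stdio.h>

      int main(void)
      {
          /* One bit per CPU, one mask per GRES type. */
          uint64_t gres_cpu_mask[2] = {
              0x0F,   /* GRES A usable on CPUs 0-3 */
              0xF0,   /* GRES B usable on CPUs 4-7 */
          };
          uint64_t or_result  = 0;            /* old, buggy combination */
          uint64_t and_result = ~(uint64_t)0; /* corrected combination  */

          for (int i = 0; i < 2; i++) {
              or_result  |= gres_cpu_mask[i];
              and_result &= gres_cpu_mask[i];
          }

          /* OR claims CPUs 0-7 are usable, yet no single CPU serves both
           * GRES; AND (0x00 here) reports that correctly. */
          printf("OR=0x%02llx AND=0x%02llx\n",
                 (unsigned long long)or_result, (unsigned long long)and_result);
          return 0;
      }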
- Mar 06, 2013
  Danny Auble authored
    options in srun, and push that logic to salloc and sbatch. Bug 201
  Danny Auble authored
    and timeout in the runjob_mux trying to send in this situation. Bug 223
- Mar 04, 2013
  Morris Jette authored
    The original reservation data structure is deleted and its backup added to the reservation list, but jobs can retain a pointer to the original (now invalid) reservation data structure. Bug 250
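    The hazard is a classic dangling pointer. A rough sketch with hypothetical structure and field names (not the real slurmctld data structures):

      #include <stdlib.h>
      #include <string.h>

      struct resv { char name[64]; /* ... */ };
      struct job  { struct resv *resv_ptr; /* cached pointer into resv list */ };

      /* Buggy pattern: free the original and insert the backup copy, leaving
       * every job's resv_ptr pointing at freed memory. */
      static void update_resv_buggy(struct resv **list_slot, struct resv *backup)
      {
          free(*list_slot);          /* jobs still point here: dangling */
          *list_slot = backup;
      }

      /* Safer pattern: copy the backup's contents into the original record so
       * the jobs' cached pointers stay valid. */
      static void update_resv_fixed(struct resv *original, const struct resv *backup)
      {
          memcpy(original, backup, sizeof(*original));
      }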
  Alejandro Lucero Palau authored
- Mar 01, 2013
  Danny Auble authored