- Oct 22, 2013
-
-
Thomas Cadeau authored
If slurmd fails to get the IPMI value, force it to wait 1 second instead of querying the BMC again (part 3/4 of the patch). If IPMI init fails when slurmd forces an update of the value, the value should not be updated (part 4/4 of the patch). Parts 1/4 and 2/4 add a safeguard to IPMI init because the function can be called several times: it is forced to return SLURM_FAILURE if the first call failed, since the other calls will not do anything. bug 469
-
Morris Jette authored
Previously a node failure would always requeue the job
-
- Oct 21, 2013
-
-
Morris Jette authored
Restore the default behavior of allocating cores to jobs on a cyclic basis across the sockets unless SelectTypeParameters=CR_CORE_DEFAULT_DIST_BLOCK is set or the user specifies other distribution options. Reverts commit 7fcdc7e5. bug 466
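
As a rough illustration of the cyclic versus block core distribution mentioned in the entry above, here is a minimal Python sketch. It is a toy model under stated assumptions, not Slurm's actual selection code: the function name `allocate_cores`, its parameters, and the (socket, core) numbering are all hypothetical.

```python
def allocate_cores(num_sockets, cores_per_socket, num_cores, distribution="cyclic"):
    """Return the (socket, core) pairs a job would receive in this toy model.

    Hypothetical helper for illustration only; it assumes every core is idle.
    """
    if num_cores > num_sockets * cores_per_socket:
        raise ValueError("job asks for more cores than the node has")

    if distribution == "cyclic":
        # Take core 0 of every socket, then core 1 of every socket, and so on.
        order = [(s, c) for c in range(cores_per_socket) for s in range(num_sockets)]
    elif distribution == "block":
        # Fill socket 0 completely before moving on to socket 1.
        order = [(s, c) for s in range(num_sockets) for c in range(cores_per_socket)]
    else:
        raise ValueError("unknown distribution: " + distribution)

    return order[:num_cores]


if __name__ == "__main__":
    print("cyclic:", allocate_cores(2, 4, 4, "cyclic"))
    print("block: ", allocate_cores(2, 4, 4, "block"))
```

On a two-socket node with four cores per socket, a four-core job is spread over both sockets in the cyclic case and packed onto the first socket in the block case.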
-
- Oct 20, 2013
-
-
jette authored
Change Sockets to SocketsPerBoard and Procs to CPUs
-
jette authored
If the backfill scheduler relinquishes locks and the normal job scheduler starts a job that the backfill scheduler was actively working, the backfill scheduler will try to re-schedule that same job, possibly resulting in an invalid memory reference or other badness.
-
jette authored
-
- Oct 19, 2013
-
-
Morris Jette authored
Fix for --cpu_bind=map_cpu/mask_cpu/map_ldom/mask_ldom plus --mem_bind=map_mem/mask_mem options, broken in 2.6.2. See commit 718382da
-
Morris Jette authored
Expect was failing periodically due to apparent timing problems
-
David Bigagli authored
-
- Oct 18, 2013
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This message type: "Warning: Note very large processing time from schedule: usec=9467365 began=11:06:23.003" is reporting the end time as the began value
-
Danny Auble authored
-
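
The Oct 18 entry about the "very large processing time" warning says the end time was being reported as the began value. A minimal Python sketch of the intended relationship, assuming the began time is simply the end time minus the elapsed microseconds, is shown here. The function name `began_time` and the calendar date are hypothetical; only the usec and clock values come from the log message quoted in that entry.

```python
from datetime import datetime, timedelta


def began_time(end, elapsed_usec):
    """Recover the start timestamp from the end timestamp and elapsed microseconds."""
    return end - timedelta(microseconds=elapsed_usec)


if __name__ == "__main__":
    # The date is arbitrary; the log message only carries a time of day.
    end = datetime(2013, 10, 18, 11, 6, 23, 3000)  # 11:06:23.003
    began = began_time(end, 9467365)
    # Trim microseconds to milliseconds to match the log's format.
    print("began=" + began.strftime("%H:%M:%S.%f")[:-3])
```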
- Oct 17, 2013
-
-
Morris Jette authored
-
Morris Jette authored
This prevents premature re-sending of job kill RPC (e.g. "Resending TERMINATE_JOB request JobId=#")
-
Danny Auble authored
-
David Bigagli authored
-
- Oct 16, 2013
-
-
Chrysovalantis Paschoulas authored
-
jette authored
-
Morris Jette authored
If the default partition has shared=force, then each job is allocated whole nodes and core reservations tests are not valid
-
- Oct 15, 2013
-
-
Morris Jette authored
-
Trofinoff, Stephen authored
-
Martin Perry authored
-
Danny Auble authored
-
Filip Skalski authored
This fixes another error in job priority calculations
-
- Oct 14, 2013
-
-
Filip Skalski authored
-
Filip Skalski authored
-
Nathan Yee authored
-
jette authored
-
jette authored
The pending jobs will have their reservation info removed. bug 455
-
- Oct 11, 2013
-
-
Morris Jette authored
Increase the maximum number of hostlist ranges from 12k to 64k and use malloc to allocate memory rather than using the stack. bug 458
-
Morris Jette authored
Initiate jobs pending to run in a reservation as soon as the reservation becomes active. Partial fix for bug 455
-
Morris Jette authored
Revert commit 626be3ea. It was causing stack overflow and memory corruption
-
Martin Perry authored
-
Morris Jette authored
Previous logic only reported the un-reserved node map. New logging adds information about each job being tested and where/when it is scheduled to get resources.
-
Danny Auble authored
-
Danny Auble authored
slurm.conf when using the DBD.
-
- Oct 10, 2013
-
-
jette authored
Induced by bf_continue option and deleting a partition.
-
jette authored
-
Danny Auble authored
-