- Oct 24, 2013
-
-
Morris Jette authored
-
- Oct 22, 2013
-
-
Morris Jette authored
Add cgroup create retry logic for the case where one step is starting at the same time as another step is ending, so that the logic to create and delete cgroups overlaps. Bug 447.
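The race described above can be illustrated with a minimal sketch. It is not the code from this commit: the function name `create_with_retry`, the retry count, the delay, and the injectable `mkdir` parameter are all assumptions made for illustration. The idea is only that a create which fails because a concurrent cleanup removed a parent directory is retried a bounded number of times.

```python
import os
import time


def create_with_retry(path, retries=5, delay=0.01, mkdir=os.makedirs):
    """Create `path`, retrying when a concurrent removal makes a parent vanish.

    Returns the number of attempts used. All names and defaults here are
    hypothetical; this is an illustration, not Slurm's implementation.
    """
    for attempt in range(1, retries + 1):
        try:
            mkdir(path, exist_ok=True)
            return attempt
        except FileNotFoundError:
            # A parent directory disappeared mid-create (for example because
            # another step's cleanup ran). Give up on the last attempt,
            # otherwise wait briefly and try again.
            if attempt == retries:
                raise
            time.sleep(delay)
```

Bounding the number of retries keeps a persistent failure visible to the caller instead of looping forever.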
-
Dave Henseler authored
-
Morris Jette authored
I did the merge improperly
-
-
Morris Jette authored
If a node has GRES and multiple threads per core, the select/cons_res plugin can get stuck in an infinite loop. See bug 475. Contributed by: PREVOST Ludovic, NEC HPC Europe
-
Morris Jette authored
-
Morris Jette authored
-
Thomas Cadeau authored
If slurmd fails to get the IPMI value, then I propose forcing a 1 second wait instead of asking the BMC again (part 3/4 of the patch). If IPMI init fails when slurmd forces an update of the value, then we should not update the value (part 4/4 of the patch). Parts 1/4 and 2/4 add a safeguard in IPMI init because the function can be called several times. This forces a return of SLURM_FAILURE if the first call failed, since the other calls will not do anything. Bug 469.
-
Morris Jette authored
Previously a node failure would always requeue the job
-
- Oct 21, 2013
-
-
Morris Jette authored
Restore default behavior of allocating cores to jobs on a cyclic basis across the sockets unless SelectTypeParameters=CR_CORE_DEFAULT_DIST_BLOCK is set or the user specifies other distribution options. Reverts commit 7fcdc7e5. Bug 466.
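The difference between cyclic and block distribution can be shown with a simplified model. This is only a sketch under stated assumptions: the function `allocate_cores` and its parameters are hypothetical, and it ignores everything a real node selection plugin must consider (cores already in use, GRES, threads per core). It only shows the order in which cores are picked across sockets.

```python
def allocate_cores(sockets, cores_per_socket, count, cyclic=True):
    """Return `count` (socket, core) pairs in allocation order.

    Hypothetical model: cyclic walks round-robin across sockets, block
    fills one socket before moving to the next.
    """
    if count > sockets * cores_per_socket:
        raise ValueError("not enough cores on the node")
    if cyclic:
        # Core i goes to socket i mod sockets, so sockets are used in turn.
        return [(i % sockets, i // sockets) for i in range(count)]
    # Block: consecutive cores stay on the same socket until it is full.
    return [(i // cores_per_socket, i % cores_per_socket) for i in range(count)]
```

For a node with 2 sockets of 4 cores each and a 4-core request, the cyclic order is `[(0, 0), (1, 0), (0, 1), (1, 1)]`, while the block order is `[(0, 0), (0, 1), (0, 2), (0, 3)]`.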
-
- Oct 20, 2013
-
-
jette authored
Change Sockets to SocketsPerBoard and Procs to CPUs
-
jette authored
If the backfill scheduler relinquishes locks and the normal job scheduler starts a job that the backfill scheduler was actively working on, the backfill scheduler will try to re-schedule that same job, possibly resulting in an invalid memory reference or other badness.
-
jette authored
-
- Oct 19, 2013
-
-
Morris Jette authored
Fix for --cpu_bind=map_cpu/mask_cpu/map_ldom/mask_ldom plus --mem_bind=map_mem/mask_mem options, broken in 2.6.2. See commit 718382da
-
Morris Jette authored
Expect was failing periodically due to apparent timing problems
-
David Bigagli authored
-
- Oct 18, 2013
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This message type: "Warning: Note very large processing time from schedule: usec=9467365 began=11:06:23.003" is reporting the end time as the began value
-
Danny Auble authored
-
- Oct 17, 2013
-
-
Morris Jette authored
-
Morris Jette authored
This prevents premature re-sending of job kill RPC (e.g. "Resending TERMINATE_JOB request JobId=#")
-
Danny Auble authored
-
David Bigagli authored
-
- Oct 16, 2013
-
-
Chrysovalantis Paschoulas authored
-
jette authored
-
Morris Jette authored
If the default partition has shared=force, then each job is allocated whole nodes and core reservation tests are not valid
-
- Oct 15, 2013
-
-
Morris Jette authored
-
Trofinoff, Stephen authored
-
Martin Perry authored
-
Danny Auble authored
-
Filip Skalski authored
This fixes another error in job priority calculations
-
- Oct 14, 2013
-
-
Filip Skalski authored
-
Filip Skalski authored
-
Nathan Yee authored
-
jette authored
-
jette authored
The pending jobs will have their reservation info removed. Bug 455.
-
- Oct 11, 2013
-
-
Morris Jette authored
Increase maximum number of hostlist ranges from 12k to 64k and use malloc to allocate memory rather than using the stack. Bug 458.
-
Morris Jette authored
Initiate jobs pending to run in a reservation as soon as the reservation becomes active. Partial fix for bug 455
-