Commits · 802eb9aef029f85a68098c4bd4181967d3804146 · tud-zih-energy / Slurm

Oct 22, 2013

acct_gather_energy/ipmi - Add delay before retry on read error. · 802eb9ae

Thomas Cadeau authored 11 years ago

If slurmd fails to get an IPMI value, I propose forcing a 1-second wait instead of asking the BMC again immediately (part 3/4 of the patch).
If IPMI init fails when slurmd forces an update of the value, then we should not update the value (part 4/4 of the patch).
Parts 1/4 and 2/4 add a safeguard in IPMI init because the function can be called several times.
This forces a return of SLURM_FAILURE if the first call failed, since the other calls will not do anything.

bug 469

802eb9ae
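
The delay-before-retry idea in this commit can be illustrated with a minimal sketch. This is not the plugin's C code and does not use any IPMI library: `read_with_retry`, `read_sensor`, `retries`, and `delay_s` are hypothetical names chosen for illustration, and the retry count of 3 is an assumption (the commit only states the 1-second wait).

```python
import time
from typing import Callable, Optional


def read_with_retry(
    read_sensor: Callable[[], Optional[int]],
    retries: int = 3,            # assumed value, not taken from the commit
    delay_s: float = 1.0,        # the 1-second wait described in the commit
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Return a reading, pausing between failed attempts.

    `read_sensor` is a hypothetical callable that returns None on a read error.
    """
    for attempt in range(retries):
        value = read_sensor()
        if value is not None:
            return value
        if attempt < retries - 1:
            # Wait before asking again rather than retrying at once.
            sleep(delay_s)
    # All attempts failed: the caller should keep its previous value.
    return None
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real delays; returning `None` on total failure mirrors the commit's point that a failed read should not overwrite the stored value.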

Enforce JobRequeue configuration parameter · 351b1f50
Morris Jette authored 11 years ago

Previously a node failure would always requeue the job

351b1f50

Oct 21, 2013

select/cons_res - allocate cores cyclic across sockets · 0cbcba1a

Morris Jette authored 11 years ago

Restore default behavior of allocating cores to jobs on a cyclic basis
across the sockets unless SelectTypeParameters=CR_CORE_DEFAULT_DIST_BLOCK
or user specifies other distribution options.
Reverts commit 7fcdc7e5
bug 466

0cbcba1a
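
To make the difference between cyclic and block distribution concrete, here is a small sketch. It assumes an idle node where every core is free and ignores everything else the real select/cons_res plugin considers; `allocate_cores` and its parameters are hypothetical names, not Slurm functions.

```python
from typing import List, Tuple


def allocate_cores(
    sockets: int, cores_per_socket: int, count: int, cyclic: bool = True
) -> List[Tuple[int, int]]:
    """Return (socket, core) pairs for `count` cores on an idle node."""
    if count > sockets * cores_per_socket:
        raise ValueError("not enough cores on the node")
    if cyclic:
        # Cyclic: rotate across sockets, taking one core from each in turn.
        return [(i % sockets, i // sockets) for i in range(count)]
    # Block: fill one socket completely before moving to the next.
    return [(i // cores_per_socket, i % cores_per_socket) for i in range(count)]
```

For a node with 2 sockets of 4 cores each, a 4-core request under the cyclic policy takes two cores from each socket, while the block policy takes all four from socket 0.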

Oct 20, 2013

Make slurmd -C format match slurm.conf · e1dc6635
jette authored 11 years ago

Change Sockets to SocketsPerBoard and Procs to CPUs

e1dc6635

sched/backfill - Prevent invalid memory ref with bf_continue · ea1b316c

jette authored 11 years ago

If the backfill scheduler relinquishes locks and the normal job
scheduler starts a job that the backfill scheduler was actively
working, the backfill scheduler will try to re-schedule that
same job, possibly resulting in an invalid memory reference
or other badness.

ea1b316c

Expand description of slurm.conf scheduling options · 211ccca2
jette authored 11 years ago

211ccca2

Oct 19, 2013
- cpu/mem_bind fix · 1537c161
  Morris Jette authored 11 years ago
  
  Fix for --cpu_bind=map_cpu/mask_cpu/map_ldom/mask_ldom plus --mem_bind=map_mem/mask_mem options, broken in 2.6.2. See commit 718382da
  1537c161
- Make regression test more robust · 2ac98769
  Morris Jette authored 11 years ago
  
  Expect was failing periodically due to apparent timing problems
  2ac98769
- Replace the tempname() function call with mkstemp(). · 68deb76d
  David Bigagli authored 11 years ago
  
  68deb76d
Oct 18, 2013
- Clarify PriorityFlags configuration parameter use · 24c67c3b
  Morris Jette authored 11 years ago
  
  24c67c3b
- Move cpuset vars · ffbd7540
  Morris Jette authored 11 years ago
  
  ffbd7540
- Correct began time in logging of slow events · fe0ec976
  Morris Jette authored 11 years ago
  
  This message type: "Warning: Note very large processing time from schedule: usec=9467365 began=11:06:23.003" was reporting the end time as the began value
  fe0ec976
- Fix warning with gcc 4.8 · 346fc106
  Danny Auble authored 11 years ago
  
  346fc106
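
The "Correct began time in logging of slow events" entry above concerns a timer whose log line printed the end time in the `began` field. A minimal sketch of the intended arithmetic follows, assuming the timer knows its end time and elapsed microseconds; `format_slow_event` is a hypothetical helper, not a Slurm function, and only the message layout is taken from the commit text.

```python
from datetime import datetime, timedelta


def format_slow_event(name: str, end: datetime, elapsed_usec: int) -> str:
    """Build the warning with `began` derived from the end time and duration."""
    # The start time is the end time minus the measured duration.
    began = end - timedelta(microseconds=elapsed_usec)
    # Render HH:MM:SS.mmm (millisecond precision, as in the quoted message).
    stamp = began.strftime("%H:%M:%S.") + f"{began.microsecond // 1000:03d}"
    return (
        f"Warning: Note very large processing time from {name}: "
        f"usec={elapsed_usec} began={stamp}"
    )
```

With an end time of 11:06:23.003 and `usec=9467365`, the sketch reports `began=11:06:13.535`, roughly 9.5 seconds earlier than the end time.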
Oct 17, 2013
- Add timers to JobSubmit plugin functions · 0e07b229
  Morris Jette authored 11 years ago
  
  0e07b229
- Set job last active time on cancel · 332ae5eb
  Morris Jette authored 11 years ago
  
  This prevents premature re-sending of job kill RPC (e.g. "Resending TERMINATE_JOB request JobId=#")
  332ae5eb
- task/cgroup - handle new cpuset files, similar to commit c4223940 . · 202cfaca
  Danny Auble authored 11 years ago
  
  202cfaca
- Fixed typo about command case in quickstart.html. · ce0d3775
  David Bigagli authored 11 years ago
  
  ce0d3775
Oct 16, 2013
- init scripts ignore quotes around Pid file name specifications · c667a995
  Chrysovalantis Paschoulas authored 11 years ago
  
  c667a995
- Modify test to work if partition name contains "." · b99ef964
  jette authored 11 years ago
  
  b99ef964
- Disable some reservation tests with shared=force · 6c3f5e2e
  Morris Jette authored 11 years ago
  
  If the default partition has shared=force, then each job is allocated whole nodes and core reservation tests are not valid.
  6c3f5e2e
Oct 15, 2013
- Support default partition name with "." in test suite · 07927348
  Morris Jette authored 11 years ago
  
  07927348
- Report AccountingStorageBackupHost with "scontrol show config" · 9496ea6c
  Trofinoff, Stephen authored 11 years ago
  
  9496ea6c
- Updated documentation to give correct units being displayed. · 71c890a0
  Martin Perry authored 11 years ago
  
  71c890a0
- Memory freeing up to avoid minor memory leaks at close of daemons · 46bac772
  Danny Auble authored 11 years ago
  
  46bac772
- Corrections to job priority calculation · 5bb80164
  Filip Skalski authored 11 years ago
  
  This fixes another error in job priority calculations
  5bb80164
Oct 14, 2013
- Corrections to calculation of a pending job's expected start time. · e1dce4a5
  Filip Skalski authored 11 years ago
  
  e1dce4a5
- Remove some vestigial logic treating job priority of 1 as a special case · 0b68c2ed
  Filip Skalski authored 11 years ago
  
  0b68c2ed
- Add test for job array into two partitions · 222c2db6
  Nathan Yee authored 11 years ago
  
  222c2db6
- Correction to error handling in test 28.5 · d9257969
  jette authored 11 years ago
  
  d9257969
- Purged expired reservation even if it has pending jobs · 4c8af242
  jette authored 11 years ago
  
  The pending jobs will have their reservation info removed (bug 455).
  4c8af242
Oct 11, 2013
- Expand hostlist range count · 9e3b690f
  Morris Jette authored 11 years ago
  
  Increase the maximum number of hostlist ranges from 12k to 64k and use malloc to allocate memory rather than using the stack (bug 458).
  9e3b690f
- start a reservations jobs asap · 4418593e
  Morris Jette authored 11 years ago
  
  Initiate jobs pending to run in a reservation as soon as the reservation becomes active. Partial fix for bug 455
  4418593e
- Revert hostlist range size · ff281dcb
  Morris Jette authored 11 years ago
  
  Revert commit 626be3ea. It was causing stack overflow and memory corruption.
  ff281dcb
- Expand maximum hostlist ranges from 12k to 64k elements. · 626be3ea
  Martin Perry authored 11 years ago
  
  626be3ea
- Expand information reported with DebugFlags=backfill · 260eed9b
  Morris Jette authored 11 years ago
  
  Previous logic only reported the un-reserved node map. New logging adds information about each job being tested and where/when resources are scheduled for it.
  260eed9b
- minor capitalization of words · 6124ee39
  Danny Auble authored 11 years ago
  
  6124ee39
- Minor document update to include note about PrivateData=Usage for the · ce55e76c
  Danny Auble authored 11 years ago
  
  slurm.conf when using the DBD.
  ce55e76c
Oct 10, 2013
- sched/backfill - Prevent invalid memory reference · ee9704c1
  jette authored 11 years ago
  
  Induced by bf_continue option and deleting a partition.
  ee9704c1
- Add note about changing JobAcctGatherType in slurm.conf · 79c15d67
  jette authored 11 years ago
  
  79c15d67
- Bring back original functionality using DEBUG · 48fb03e4
  Danny Auble authored 11 years ago
  
  48fb03e4