Commits · d76b4a60a8b099f9075f79739c940c6af95499b1 · tud-zih-energy / Slurm

May 14, 2014
- Jobs hidden only if ALL partitions are hidden · 5fc21da2
  Morris Jette authored 10 years ago
  
  A job is hidden by default only if ALL of its partitions are hidden. bug 812
  5fc21da2
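
The rule described in this commit can be sketched as a small predicate. This is an illustration only, not Slurm's code: the function name `job_is_hidden`, the `maint` and `batch` partition names, and the handling of a job with no partitions are all assumptions.

```python
def job_is_hidden(job_partitions, hidden_partitions):
    """Return True only if every partition the job was submitted to is hidden."""
    # Assumption: a job that lists no partitions is treated as visible.
    if not job_partitions:
        return False
    return all(p in hidden_partitions for p in job_partitions)


hidden = {"maint"}  # hypothetical hidden partition
print(job_is_hidden(["maint"], hidden))           # every partition is hidden
print(job_is_hidden(["maint", "batch"], hidden))  # one partition is visible
```
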
May 13, 2014

Correct CR_LLN with node selection by job · 899561b1

Morris Jette authored 10 years ago

Correct SelectTypeParameters=CR_LLN with job selection of specific nodes.
Previous logic would in most instances allocate resources on all nodes
to the job.

899561b1

Correct squeue job node & CPU counts on requeue · 4f97cae8

Morris Jette authored 10 years ago

Correct squeue's job node and CPU counts for requeued jobs.
Previously, when a job was requeued, its CPU count reported
was that of the previous execution. When combined with the
--ntasks-per-node option, squeue would compute the expected
node count. If the --exclusive option is also used, the node
count reported by squeue could be off by a large margin (e.g.
"sbatch --exclusive --ntasks-per-node=1 -N1 .." on requeue
would use the number of CPUs on the allocated node to recompute
the expected node count).
bug 756

4f97cae8
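
The arithmetic behind the miscount can be sketched as follows. This is a simplified illustration, not squeue's actual code: the function name `estimated_node_count`, the rounded-up division, and the 16-CPU node are assumptions chosen to mirror the example in the commit message.

```python
import math


def estimated_node_count(cpu_count, ntasks_per_node):
    """Estimate nodes as CPUs divided by tasks per node, rounded up (assumed formula)."""
    return math.ceil(cpu_count / ntasks_per_node)


CPUS_ON_NODE = 16  # hypothetical size of the exclusively allocated node

# First submission with -N1 --ntasks-per-node=1: one CPU is counted.
first_run = estimated_node_count(1, 1)

# After requeue (old behaviour): the CPU count of the previous execution,
# i.e. every CPU on the exclusive node, feeds the same estimate.
after_requeue = estimated_node_count(CPUS_ON_NODE, 1)

print(first_run, after_requeue)
```

Under these assumptions the estimate grows from one node to one node per CPU of the previously allocated node, which is the kind of large error the commit message describes.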

Fix issue where batch cpuset wasn't looked at correctly in · c5728294

Danny Auble authored 10 years ago

jobacct_gather/cgroup.

c5728294

Support non-standard slurm.conf path · 3bf2adcd

Morris Jette authored 10 years ago

Support SLURM_CONF path which does not have "slurm.conf" as the file name.
bug 803

3bf2adcd

May 12, 2014

Retry step create if node not responding · ffad3102

Morris Jette authored 10 years ago

If a job has a non-responding node, retry job step create rather than
returning with a DOWN node error.
bug 734

ffad3102

Cosmetic mods to NEWS · e17ffc1b

Morris Jette authored 10 years ago

e17ffc1b

Fix support for job --profile=none option · 043e1b08

Puenlap Lee authored 10 years ago

Also correct related documentation

043e1b08

fix of comp nodes causing backfill to end early · d508ea95

Hongjia Cao authored 10 years ago

Completing nodes are removed when calling _try_sched() for a job, which
is the case in select_nodes(). If _try_sched() thinks the job can run
now but select_nodes() returns ESLURM_NODES_BUSY, the backfill loop
ends early.

d508ea95

May 09, 2014
- CRAY - make job_container/cncu default when running on a Cray natively · dbf03e40
  Danny Auble authored 10 years ago
  
  dbf03e40
- If an invalid assoc_ptr comes in don't use the id to verify it. · 2261d393
  Danny Auble authored 10 years ago
  
  2261d393
May 08, 2014

Fix sinfo -R to print each node once · b5ace9a8

Morris Jette authored 10 years ago

Fix sinfo -R to print each down/drained node once, rather than once per
partition. This was broken in the sinfo change to process each partition's
information in a separate pthread.

b5ace9a8
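
The intended behaviour, one line per down/drained node even when the node belongs to several partitions, can be sketched as below. This is not sinfo's implementation: the function name `down_nodes_once`, the dictionary layout, and the node and partition names are hypothetical.

```python
def down_nodes_once(partitions):
    """Collect (node, reason) rows, listing each node only the first time it is seen."""
    seen = set()
    rows = []
    for partition in partitions:
        for node, reason in partition["down_nodes"]:
            if node in seen:
                continue  # already reported via another partition
            seen.add(node)
            rows.append((node, reason))
    return rows


# Hypothetical data: node n01 belongs to both partitions.
partitions = [
    {"name": "batch", "down_nodes": [("n01", "bad disk")]},
    {"name": "debug", "down_nodes": [("n01", "bad disk"), ("n02", "maintenance")]},
]
for node, reason in down_nodes_once(partitions):
    print(node, reason)
```
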

Correct sinfo sort fields options · ff518ad1

Morris Jette authored 10 years ago

Correct sinfo --sort fields to match documentation: E -> Reason,
H -> Reason Time (new), R -> Partition Name, u/U -> Reason user (new)

ff518ad1

May 07, 2014
- enforce job preemption GraceTime · b8d55249
  Morris Jette authored 10 years ago
  
  Without this patch, jobs with an infinite time limit would have their preemption GraceTime ignored.
  b8d55249
- Disable time limit reset for job being preempted · 52de11ac
  Morris Jette authored 10 years ago
  
  related to bug 789
  52de11ac
- CRAY - make switch/cray default when running on a Cray natively · 1c2200db
  Danny Auble authored 10 years ago
  
  1c2200db
- Fix issue where not enforcing QOS but a partition either allows or denies · b6333a12
  Danny Auble authored 10 years ago
  
  them.
  b6333a12
May 06, 2014
- Start NEWS for v14.03.4 · b4f3f38d
  Morris Jette authored 10 years ago
  
  b4f3f38d
- update news for tag · 70d1e809
  Danny Auble authored 10 years ago
  
  70d1e809
- BGQ - Fix issue with uninitialized variable. · 950a3fd6
  Danny Auble authored 10 years ago
  
  950a3fd6
- Start NEWS for v14.03.4 · 3e95dc32
  Morris Jette authored 10 years ago
  
  3e95dc32
- in slurm.spec, remove cray-mysql-devel requirement · f85e362c
  Morris Jette authored 10 years ago
  
  In slurm.spec file, replace "Requires cray-MySQL-devel-enterprise" with "Requires mysql-devel" per David Gloe.
  f85e362c
May 05, 2014
- Fix perlapi to compile correctly with perl 5.18 · 21ebf585
  Danny Auble authored 10 years ago
  
  21ebf585
- Handle node ranges better when dealing with accounting max node limits. · d849aadb
  Danny Auble authored 10 years ago
  
  d849aadb
- BGQ - Move code to only start job on a block after limits are checked. · 3a4246cc
  Danny Auble authored 10 years ago
  
  Related to bug 771
  3a4246cc
- Correct default batch job output file name · 4334ab7d
  Morris Jette authored 10 years ago
  
  Version 14.03.2 was using "slurm_<jobid>_4294967294.out" due to an error in the job array logic.
  4334ab7d
- BGQ - Fix issue where limits were checked on midplane counts instead of · 836b654f
  Danny Auble authored 10 years ago
  
  cnode counts.
  836b654f
May 02, 2014
- Update NEWS for next version · 87080f15
  Danny Auble authored 10 years ago
  
  87080f15
- Handle node ranges better when dealing with accounting max node limits. · c6833796
  Danny Auble authored 10 years ago
  
  This is for bug 775
  c6833796
- BGQ - Temp fix issue where job could be left on job_list after it finished. · e4f1a099
  Danny Auble authored 10 years ago
  
  e4f1a099
- Fix issue where user is requesting --acctg-freq=0 and no memory limits. · 17e4e2ac
  Danny Auble authored 10 years ago
  
  17e4e2ac
May 01, 2014
- Fix allowgroup on bad group seg fault with the controller. · 76846134
  Danny Auble authored 10 years ago
  
  regression from 2a674aee
  76846134
- Temporary fix for handling our typemap for the perl api with newer perl. · bffdc7e2
  Danny Auble authored 10 years ago
  
  bffdc7e2
- Fix issue with GrpCPURunMins if a job's timelimit is altered while the job · 98de72e4
  Danny Auble authored 10 years ago
  
  is running.
  98de72e4
- Fix issue where user is requesting --acctg-freq=0 and no memory limits. · 0018cdf4
  Danny Auble authored 10 years ago
  
  0018cdf4
Apr 30, 2014

Correct squeue to not merge jobs with state pending and completing · 8ddadea5

David Bigagli authored 10 years ago

together.

8ddadea5

switch/nrt - CAU and RDMA tracking correction · 6f66fdef

Morris Jette authored 10 years ago

Switch/nrt - Properly track usage of CAU and RDMA resources with multiple
tasks per compute node. Previous logic would allocate resources once per
task and then deallocate once per node, leaking CAU and RDMA resources
and preventing their use by future jobs.

6f66fdef
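
The size of the leak follows directly from the mismatch between the two counts. The sketch below only illustrates that bookkeeping: the function name `leaked_units` is hypothetical, and it assumes one unit is allocated or released per call.

```python
def leaked_units(nodes, tasks_per_node):
    """Units left allocated when allocation is per task but release is per node."""
    allocated = nodes * tasks_per_node  # old logic: one allocation per task
    released = nodes                    # old logic: one release per node
    return allocated - released


print(leaked_units(nodes=2, tasks_per_node=4))  # several tasks per node
print(leaked_units(nodes=2, tasks_per_node=1))  # one task per node
```

With one task per node the two counts match, which is why the problem only appears with multiple tasks per compute node.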

ignore prio reset on held jobs · cbcea672

Morris Jette authored 10 years ago

If a job is held, then only release it with the "scontrol release <jobid>"
command rather than a simple reset of the job's priority. This is needed to
support job arrays better. Otherwise a priority reset of a job array
would free all requeued/held jobs from that job array rather than
leaving them held.

cbcea672
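
A minimal sketch of the behaviour described above, assuming a hypothetical `Job` record with a `held` flag; the names `reset_priority` and `release` are illustrative and are not Slurm functions.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    priority: int
    held: bool = False


def reset_priority(job, new_priority):
    """Apply a priority reset, but leave held jobs untouched."""
    if job.held:
        return False
    job.priority = new_priority
    return True


def release(job, new_priority):
    """Explicit release, standing in for 'scontrol release <jobid>'."""
    job.held = False
    job.priority = new_priority


# Hypothetical job array with one runnable task and one held task.
array = [Job(1, 100), Job(2, 0, held=True)]
for job in array:
    reset_priority(job, 500)
```

After the loop, the first task carries the new priority while the held task is unchanged until `release` is called on it.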

Apr 28, 2014
- Fix segfault of sacct -c if spaces are in the variables. · 61641594
  Danny Auble authored 10 years ago
  
  61641594
- Fix sacct -c when using jobcomp/filetxt to read variables that were added · d6ab20b7
  Danny Auble authored 10 years ago
  
  in 2.0 :)
  d6ab20b7