- Jun 06, 2017
-
-
Isaac Hartung authored
-
Isaac Hartung authored
Routes request to origin and then to the cluster running the job.
-
Isaac Hartung authored
Should be fed1,2,3 and not fed1,2,2
-
Isaac Hartung authored
-
Brian Christiansen authored
-
Morris Jette authored
-
Morris Jette authored
-
- Jun 05, 2017
-
-
Tim Wickberg authored
-
Tim Wickberg authored
The glibc version does not double-fork like Slurm's version. Rename to xdaemon to avoid confusion and ensure no one tries to remove it again without further investigation (see commits 6f45a2bf/954288ad). While here, simplify it, as the arguments are unchanged in the three calling locations. Remove incorrect documentation as well: our xdaemon() call does not close everything.
-
Tim Wickberg authored
Our local implementation was being used instead of glibc's, and it does have one subtle difference: it double-forks, whereas glibc forks only once, leading to some slight differences in process control behavior. Revert this and take a different approach. This reverts commit 6f45a2bf.
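As a rough illustration of the double-fork difference described above, here is a minimal POSIX-only sketch in Python. It is not Slurm's C implementation; the function name `double_fork_report` is hypothetical. It forks, calls setsid() in the first child, forks again, and has the grandchild report its own pid and session id so the effect can be observed.

```python
import os


def double_fork_report():
    """Fork twice; return (pid, sid) as reported by the grandchild."""
    read_fd, write_fd = os.pipe()
    first_child = os.fork()
    if first_child == 0:
        # First child: start a new session, then fork again and exit.
        os.close(read_fd)
        os.setsid()
        if os.fork() == 0:
            # Grandchild: this is where a daemon would keep running.
            report = f"{os.getpid()} {os.getsid(0)}".encode()
            os.write(write_fd, report)
            os._exit(0)
        os._exit(0)
    # Original process: reap the first child, then read the report.
    os.close(write_fd)
    os.waitpid(first_child, 0)
    data = b""
    while chunk := os.read(read_fd, 64):
        data += chunk
    os.close(read_fd)
    pid_text, sid_text = data.decode().split()
    return int(pid_text), int(sid_text)
```

In this sketch the grandchild's pid differs from its session id, meaning it is not a session leader. A single fork followed by setsid() would instead leave the surviving process as the leader of its new session, which is one plausible source of the process control differences mentioned.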
-
- Jun 03, 2017
-
-
Danny Auble authored
Fix regression from commit c05dcb8a (bug 1923) that did not take into consideration a blank char * as a valid option. This fixes scenarios like sacctmgr list associations user='' which would only print account associations. Bug 3862
-
- Jun 02, 2017
-
-
Danny Auble authored
a good return code. This also fixes the situation where the step was ending but had not yet ended, so it sends the KILL_TASK_FAILED error instead of JOB_NOTRUNNING. It also removes the abort in favor of exit, which it should have been anyway. Bug 3758
-
Dominik Bartkiewicz authored
list_for_each)
-
Gary B Skouson authored
which the backfill test window expands. This can be used on a system with a modest number of running jobs (hundreds of jobs) to help prevent expected start times of pending jobs from being pushed forward in time. On systems with large numbers of running jobs, performance of the backfill scheduler will suffer and fewer jobs will be evaluated. Bug 3790
-
- Jun 01, 2017
-
-
Danny Auble authored
This reverts commit da414931.
-
Danny Auble authored
which the backfill test window expands. This can be used on a system with a modest number of running jobs (hundreds of jobs) to help prevent expected start times of pending jobs from being pushed forward in time. On systems with large numbers of running jobs, performance of the backfill scheduler will suffer and fewer jobs will be evaluated. Bug 3790
-
Mark Klein authored
Bug 3671
-
Mark Klein authored
Inadvertently set to one when requested. Bug 3855.
-
Tim Wickberg authored
Bug 3857.
-
Doug Jacobsen authored
-
Doug Jacobsen authored
Bug 3808
-
Danny Auble authored
# Conflicts: # src/slurmctld/job_mgr.c
-
Pablo Escobar authored
bug 3846
-
Danny Auble authored
-
Danny Auble authored
purge_files_list.
-
Danny Auble authored
-
Tim Wickberg authored
File deletion can be slow, especially when StateSaveLocation is on NFS or other network filesystems. Since purge_old_job() holds all the slurmctld write locks, this is especially performance sensitive. Moving this to an independent thread lets the slower filesystem cleanup happen without owning these locks. purge_old_job() then results in the purged job ids being queued in the purge_list. A race with the job id potentially wrapping around again is already prevented by _dup_job_file_test() in get_next_job_id(). Bug 3763.
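The pattern described above (drop the record while holding the lock, queue the id, and let a separate thread do the slow file removal) can be sketched as follows. This is a simplified Python illustration, not the slurmctld C code: the `job.<id>` file layout, the single lock, and the function names other than purge_old_job are assumptions made for the example.

```python
import os
import queue
import tempfile
import threading


def purge_worker(purge_queue, state_dir):
    """Delete queued job files without holding the job table lock."""
    while True:
        job_id = purge_queue.get()
        try:
            if job_id is None:  # sentinel: stop the worker
                return
            try:
                os.remove(os.path.join(state_dir, f"job.{job_id}"))
            except FileNotFoundError:
                pass  # already gone; nothing to clean up
        finally:
            purge_queue.task_done()


def purge_old_job(jobs, jobs_lock, purge_queue, job_id):
    """Drop the in-memory record under the lock; defer the file removal."""
    with jobs_lock:
        jobs.pop(job_id, None)
        purge_queue.put(job_id)  # cheap enqueue, no filesystem I/O here


def run_demo():
    """Purge three hypothetical jobs; return the job table and leftover files."""
    with tempfile.TemporaryDirectory() as state_dir:
        jobs = {}
        for job_id in (1, 2, 3):
            jobs[job_id] = "completed"
            open(os.path.join(state_dir, f"job.{job_id}"), "w").close()
        jobs_lock = threading.Lock()
        purge_queue = queue.Queue()
        worker = threading.Thread(
            target=purge_worker, args=(purge_queue, state_dir)
        )
        worker.start()
        for job_id in (1, 2, 3):
            purge_old_job(jobs, jobs_lock, purge_queue, job_id)
        purge_queue.put(None)  # ask the worker to exit once drained
        worker.join()
        return jobs, sorted(os.listdir(state_dir))
```

The point of the design is that the time spent under the lock no longer depends on filesystem latency: the locked section only touches memory and a queue, while the worker absorbs slow deletes at its own pace.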
-
Tim Wickberg authored
Only called from _list_delete_job once the MinJobAge has passed.
-
Tim Wickberg authored
This will need to be handled differently. The timeout can lead to the purge process falling further and further behind on high throughput systems if the number of job scripts that can be deleted within a second is lower than the job submission and completion rate of the cluster, eventually leading to the MaxJobCount limit being reached. Bug 3763.
-
Danny Auble authored
-
Danny Auble authored
-
- May 31, 2017
-
-
Danny Auble authored
it works better on multi-slurmd installs.
-
Isaac Hartung authored
Should be fed1,2,3 and not fed1,2,2
-
Isaac Hartung authored
Bug 3839
-
Tim Wickberg authored
Revert some of my b50f4661. Elaborate on tradeoffs, and point to HTC page as well which is a better location for this info.
-
Danny Auble authored
-
Brian Christiansen authored
-
Tim Wickberg authored
This is better discussed in the high_throughput.shtml doc. Also, "Contrain" is misspelled, adding to the confusion.
-
Isaac Hartung authored
To submit sibling jobs to clusters that don't have the specified features. Bug 3859
-
Isaac Hartung authored
-