- Sep 03, 2014
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Close old lua file once a new one is loaded
-
John Morrissey authored
-
John Morrissey authored
Use native tables for everything that can be indexed without any setup, so user scripts do not need boilerplate setup code for them. Add reservation state in slurm.reservations.
-
- Sep 02, 2014
-
-
Morris Jette authored
Do not send messages to srun while the process is suspended, because the authentication credential from Munge may expire by the time it is resumed and the message would then be ignored. Bug 1065.
-
Morris Jette authored
-
David Bigagli authored
of --cflags. See #1066.
-
Danny Auble authored
which is also in an array.
-
Danny Auble authored
9fa92d9e
-
Morris Jette authored
-
Morris Jette authored
Refactor the logic which checks a job array task limit and clears the begin time at the same time as setting the wait reason to WAIT_ARRAY_TASK_LIMIT
-
Morris Jette authored
When one task of a job array is started, the remaining job record copy should have its start time cleared.
-
Danny Auble authored
-
Danny Auble authored
-
David Gloe authored
-
Morris Jette authored
Conflicts: testsuite/expect/test1.84 testsuite/expect/test21.32
-
Nathan Yee authored
I just ran the test suite for slurm 14.04.7, and have a few suggestions and bugfixes:
Test 1.35 fails on our system (probably because we limit memory with cgroups). Changing job_mem_opt from "--mem-per-cpu=64" to "--mem-per-cpu=192" in line 61 fixes the problem for us.
Test 1.84 fails to recognise node names like "something1-2", ending up with node names "something1" instead. Changing NodeName=(\w+) to NodeName=([^\s]+) fixes the problem.
Test 1.97 reports FAILURE when it discovers that SelectTypeParameters is not CR_PACK_NODES. Having "exit 0" instead of "exit 1" in line 50 is perhaps preferable.
Test 2.18 fails because the variable $partition never gets set, so no idle nodes are found in line 215. Setting $partition in globals.local helps, but should not be needed, IMO. There is a function "default_partition" in globals that could perhaps be used. The same applies to test 2.19.
Test 12.2 fails on our system because the jobs get killed due to the memory limit. Increasing the "slack" in job_mem_limit from 4 to 10 in line 269 fixes the problem for us.
Tests 21.30, 21.31 and 21.32 fail when run as a non-privileged user. Perhaps they should test for it and exit with a warning instead, like many other tests.
Test 22.1 fails on our system because the time zone is different from where the test was written. The problem is that "set midnight 1201766400" is only correct in one time zone (and unfortunately for us, not in ours :). Perhaps one could use the GNU date command to get the correct seconds-since-epoch regardless of time zone. Something like "date +%s --date=2008-01-31" should do it. Unfortunately, I don't know enough Expect (tcl?) to suggest how to implement that.
-- Regards, Bjørn-Helge Mevik, dr. scient, Department for Research Computing, University of Oslo
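Two of the points in the report above, the node-name pattern in test 1.84 and the hard-coded midnight in test 22.1, can be illustrated with a short sketch. It is written in Python rather than Expect, the sample line and the helper name `local_midnight_epoch` are hypothetical, and it is not the test suite's own code.

```python
import re
import time
from datetime import datetime

# Hypothetical sample line; only the NodeName field matters here.
line = "NodeName=something1-2 Arch=x86_64"

# \w does not match "-", so the capture stops at the hyphen.
narrow = re.search(r"NodeName=(\w+)", line).group(1)
# [^\s]+ runs up to the next whitespace, keeping the full node name.
wide = re.search(r"NodeName=([^\s]+)", line).group(1)


def local_midnight_epoch(year, month, day):
    """Seconds since the epoch for local midnight on the given date."""
    return int(time.mktime(datetime(year, month, day).timetuple()))


# Computed in whatever time zone the test runs in, instead of a constant
# that is only correct in one zone.
midnight = local_midnight_epoch(2008, 1, 31)
```

The second half plays the same role as the suggested "date +%s --date=2008-01-31": the value is derived from the local time zone at run time.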
-
Morris Jette authored
-
Morris Jette authored
This is a revision of commit 1f0c210f
-
- Aug 30, 2014
-
-
Morris Jette authored
Remove the "profile" field from the job step information record. The information does not actually exist in the slurmctld step record, but is passed directly from srun to the slurmd on task launch.
-
Danny Auble authored
-
Danny Auble authored
ran inside the allocation can read the environment correctly.
-
Danny Auble authored
which won't accrue time until the job is actually able to run and then set the begin_time to time(NULL)
-
Nathan Yee authored
-
Danny Auble authored
-
Morris Jette authored
If the highest priority job is a job array which cannot start additional tasks due to the task count limit, do not let that job prevent the initiation of lower priority jobs.
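The behaviour described above can be sketched as a priority-ordered loop that skips a blocked job array instead of stopping at it. This is a simplified illustration in Python, not slurmctld's scheduler; the `Job` class, the `array_limit_reached` flag, and `pick_startable` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int
    # Hypothetical flag: the array has hit its running task count limit.
    array_limit_reached: bool = False


def pick_startable(jobs, free_slots):
    """Return the names of jobs to start, highest priority first."""
    started = []
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if free_slots == 0:
            break
        if job.array_limit_reached:
            # Skip rather than stop: a blocked array must not hold back
            # lower priority jobs.
            continue
        started.append(job.name)
        free_slots -= 1
    return started


queue = [Job("array", 100, array_limit_reached=True), Job("a", 50), Job("b", 10)]
```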
-
- Aug 29, 2014
-
-
David Bigagli authored
message opcode for easier debug.
-
Rémi Palancher authored
Append subsystem name to cgroup mountpoint path extracted from configuration file cgroup.conf. Without this patch, the script does not execute properly and some garbage is left in cgroup filesystems.
-
Rémi Palancher authored
Not a bug but a cosmetic patch to avoid double definition of variables rmcg and uidcg.
-
Morris Jette authored
Without this change, if the last task of a job array is started, then requeued, it will appear twice in the job array hash table, which typically results in a linked list that loops back on itself (an infinite loop)
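How a double insert can turn a hash chain into a loop is easy to show with a minimal sketch. It assumes an intrusive singly linked chain where each record carries its own `next` pointer; the class and function names are hypothetical and this is not the slurmctld hash table code.

```python
class JobRecord:
    """Hypothetical stand-in for a job record with an intrusive chain link."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.next = None


def chain_insert(head, rec):
    """Push rec onto the front of a bucket chain and return the new head."""
    rec.next = head
    return rec


def chain_has_cycle(head):
    """Floyd's tortoise-and-hare cycle check."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


rec = JobRecord(1234)
head = chain_insert(None, rec)  # first insert: rec.next is None
head = chain_insert(head, rec)  # second insert: rec.next now points at rec
```

After the second insert, any walk of the chain that waits for a `None` link never terminates, which is the infinite loop the commit message refers to.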
-
Morris Jette authored
Without this change, canceling the last task of a job array while pending could result in a fatal error due to a NULL bitmap value.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
so when task_id_str is made we update the record in the database.
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Wait up to 20 seconds for gres.conf file before exiting.
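A bounded wait of this kind is usually a poll with a deadline. The sketch below shows the general pattern in Python; the function name, the one-second poll interval, and the return convention are assumptions, not what slurmd does.

```python
import os
import time


def wait_for_file(path, timeout=20.0, interval=1.0):
    """Poll until path exists; return False once timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A caller would exit with an error only when `wait_for_file` returns False, which gives a shared file system some time to make the file visible.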
-