- May 12, 2016
-
-
Morris Jette authored
Put header files in alphabetical order. No change in logic.
-
Danny Auble authored
# Conflicts: # src/slurmctld/controller.c
-
Danny Auble authored
-
Danny Auble authored
Trying to verify the cluster name (which may try to /create/ files or directories) *before* dropping privileges results in a fatal error, because slurmctld tries to create items and those attempts ultimately fail. Moving this step until after the privileges and uid have changed allows it to succeed. Reported by Jon Nelson <jdnelson@dyn.com>. Bug 2728
-
Morris Jette authored
Reject invalid step at submit time rather than leaving it queued. Bug 2722 describes one of the use cases triggering the bug.
-
Morris Jette authored
-
Morris Jette authored
Minor update to commit 2fad3bcf. This leaves the files locked until the file write completes.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Morris Jette authored
Disable a job deadline test with no input time limit if the default partition has a default time limit.
-
Morris Jette authored
This partially restores commit 03b2cfb5. The logic was not closing the file descriptor, which left the file locked and leaked an open file descriptor.
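The failure mode described above can be illustrated with a minimal sketch. It is not the Slurm code: `write_locked` is a hypothetical helper, and the use of POSIX `fcntl` advisory locks is an assumption, since the commit does not say which locking mechanism is involved. The point is that the lock is held until the write completes and that every path after a successful `open` reaches `close`, which releases both the lock and the descriptor.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not Slurm code): write buf to path while holding
 * an exclusive advisory lock. Returns 0 on success, -1 on failure. */
int write_locked(const char *path, const char *buf)
{
    int rc = -1;
    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;              /* nothing was opened, nothing to close */

    struct flock lk;
    memset(&lk, 0, sizeof(lk));
    lk.l_type = F_WRLCK;        /* exclusive lock */
    lk.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */

    /* Block until the lock is granted, then replace the file contents. */
    if (fcntl(fd, F_SETLKW, &lk) == 0 && ftruncate(fd, 0) == 0) {
        size_t len = strlen(buf);
        ssize_t n = write(fd, buf, len);
        if (n == (ssize_t) len) /* a short write is treated as failure */
            rc = 0;
    }

    /* Single exit path: closing the descriptor also drops the lock, so
     * neither the lock nor the descriptor outlives this call. */
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Routing success and failure through the same `close` call is the design choice that matters here; an early `return` after the lock is taken would reproduce the leak the commit describes.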
-
- May 11, 2016
-
-
Danny Auble authored
tasks-per-node/nodes != tasks print warning and ignore ntasks-per-node. Bug 2520
-
Brian Christiansen authored
On a Cray, the output file isn't being created the second time.
-
Morris Jette authored
Make test_id in more tests be just the numeric value rather than "test#.#" for consistency with the other tests.
-
Morris Jette authored
Make test_id in test30.1 be just the numeric value rather than "test30.1" for consistency with the other tests.
-
Brian Christiansen authored
The account still had maxnodes=1 set, which prevented the qos grpnodes from taking effect. This showed up on slower machines because it takes a second for the changes to get to the controller.
-
Morris Jette authored
-
Morris Jette authored
Test would originally try to start more jobs than default_queue_depth in SchedulerParameters and fail.
-
Morris Jette authored
Job was failing on Cray/kachina due to timeout. Increase job time limit from 1 to 2 minutes.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
make it to the slurmctld when using message aggregation.
-
Danny Auble authored
-
- May 10, 2016
-
-
Danny Auble authored
make sure we handle it correctly when the database comes back up.
-
Morris Jette authored
Give test job an extra second to start. Test was failing by one second on kachina.
-
Morris Jette authored
Get the maximum file pathname size from a system include file rather than a local #define. This was causing test failures on kachina.
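As a rough illustration of the change described above, the sketch below sizes a path buffer from the system's `PATH_MAX` instead of a locally defined constant. The commit does not name the header or the code involved, so `<limits.h>`, the fallback value, and the `build_path` helper are all assumptions made for the example.

```c
#include <limits.h>   /* PATH_MAX on POSIX systems (assumed header) */
#include <stdio.h>

/* Fallback only for systems whose headers do not define PATH_MAX;
 * the value 4096 is an assumption for this sketch. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical helper: join dir and file into out, which must hold
 * PATH_MAX bytes. Returns 0 on success, -1 if the result does not fit. */
int build_path(char *out, const char *dir, const char *file)
{
    int n = snprintf(out, PATH_MAX, "%s/%s", dir, file);
    return (n < 0 || n >= PATH_MAX) ? -1 : 0;
}
```

A caller would declare `char buf[PATH_MAX];` and check the return value, so an over-long path is reported rather than silently truncated.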
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Alejandro Sanchez authored
-
Tim Wickberg authored
-
Danny Auble authored
# Conflicts: # src/plugins/select/cray/select_cray.c # testsuite/expect/test1.84
-
Brian Christiansen authored
-
Marlys Kohnke authored
for better robustness. This cray/select plugin code has been modified to remove a possible timing window in which two aeld pthreads could exist, interfering with each other through the global aeld_running variable. An additional validity check has been added for the data provided to aeld through an alpsc_ev_set_application_info() call. If that call returns an error, only certain errors require closing the current socket connection to aeld and establishing a new one. Other error returns will log an error message and keep the current session with aeld established.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Morris Jette authored
This might be related to bug 2334, but it's a long shot.
-