- May 11, 2016
-
-
Danny Auble authored
tasks-per-node/nodes != tasks print warning and ignore ntasks-per-node. Bug 2520
-
Brian Christiansen authored
On a Cray, the output file isn't being created the second time.
-
Morris Jette authored
Make test_id in more tests be just the numeric value rather than "test#.#" for consistency with the other tests.
-
Morris Jette authored
Make test_id in test30.1 be just the numeric value rather than "test30.1" for consistency with the other tests.
-
Brian Christiansen authored
The account still had maxnodes=1 set, which prevented the QOS grpnodes limit from taking effect. This showed up on slower machines because it takes a second for the changes to reach the controller.
-
Morris Jette authored
-
Morris Jette authored
The test originally tried to start more jobs than default_queue_depth in SchedulerParameters allows, and failed.
-
Morris Jette authored
Job was failing on Cray/kachina due to timeout. Increase job time limit from 1 to 2 minutes.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
make it to the slurmctld when using message aggregation.
-
Danny Auble authored
-
- May 10, 2016
-
-
Danny Auble authored
make sure we handle it correctly when the database comes back up.
-
Morris Jette authored
Give test job an extra second to start. Test was failing by one second on kachina.
-
Morris Jette authored
Get the maximum file pathname size from system include file rather than local #define. This was causing failures on kachina test.
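As a rough illustration of the change described above (not the actual test code), a buffer can be sized from the system's PATH_MAX in <limits.h> instead of a locally chosen constant. The helper name build_path() is hypothetical, and the fallback value is an assumption that applies only on platforms where <limits.h> does not define PATH_MAX.

```c
#include <limits.h>
#include <stdio.h>

/* Assumption: only reached if the platform's <limits.h> lacks PATH_MAX. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Hypothetical helper: join dir and file into out.
 * Returns 0 on success, -1 if the result would not fit in len bytes. */
int build_path(char *out, size_t len, const char *dir, const char *file)
{
	int n = snprintf(out, len, "%s/%s", dir, file);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would declare `char path[PATH_MAX];` and pass `sizeof(path)` as len, so the buffer follows the system limit rather than a hard-coded guess.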
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Alejandro Sanchez authored
-
Tim Wickberg authored
-
Danny Auble authored
# Conflicts: # src/plugins/select/cray/select_cray.c # testsuite/expect/test1.84
-
Brian Christiansen authored
-
Marlys Kohnke authored
for better robustness. This cray/select plugin code has been modified to remove a possible timing window where two aeld pthreads could exist, interfering with each other through the global aeld_running variable. An additional validity check has been added to the data provided to aeld through an alpsc_ev_set_application_info() call. If an error is returned from that call, only certain errors need the current socket connection closed to aeld and a new connection established. Other error returns will log an error message and keep the current session established with aeld.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Morris Jette authored
This might possibly be related to bug 2334, but it's a long shot.
-
Danny Auble authored
slurm.conf instead of all. If looking for specific addresses, use the TopologyParam options No*InAddrAny. This was broken in 15.08 with the advent of the referenced TopologyParam options; commits 9378f195 and c5312f52 are no longer needed. Bug 2696
-
Brian Christiansen authored
Thread names can only be 16 characters long, plus we already know that the threads are from the slurmctld.
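For context, on Linux the 16-character limit includes the terminating null byte, which leaves 15 visible characters. A minimal sketch of fitting a name into that limit follows. The helper make_thread_name() and the constant THREAD_NAME_MAX are hypothetical and are not taken from the Slurm source.

```c
#include <stdio.h>

/* Linux thread-name limit, counting the terminating null byte. */
#define THREAD_NAME_MAX 16

/* Hypothetical helper: copy name into out, truncating so that the
 * result (including its null terminator) fits in THREAD_NAME_MAX bytes. */
void make_thread_name(const char *name, char out[THREAD_NAME_MAX])
{
	snprintf(out, THREAD_NAME_MAX, "%s", name);
}
```

On Linux the resulting string could then be passed to prctl(PR_SET_NAME, ...) or pthread_setname_np(). Dropping a redundant daemon prefix, as the commit message suggests, leaves more of those 15 characters for the part that identifies the thread.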
-
Brian Christiansen authored
-
- May 09, 2016
-
-
Danny Auble authored
-
Moe Jette authored
at the same time. Bug 2683. It turns out that making a variable static in a function makes it unsafe when dealing with threads.
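The hazard can be sketched with a generic example. This is not the code touched by this commit, and both function names are hypothetical. A static local has a single instance shared by every caller, so concurrent threads overwrite each other's data. Having the caller supply the storage avoids the sharing.

```c
#include <stdio.h>
#include <stddef.h>

/* Not thread-safe: all callers share the single static buffer, so a
 * second call (from any thread) overwrites the first caller's result. */
const char *job_label_unsafe(unsigned job_id)
{
	static char buf[32];

	snprintf(buf, sizeof(buf), "job-%u", job_id);
	return buf;
}

/* Reentrant variant: the caller owns the storage, so nothing is shared. */
const char *job_label_r(unsigned job_id, char *buf, size_t len)
{
	snprintf(buf, len, "job-%u", job_id);
	return buf;
}
```

Even in a single thread, two calls to job_label_unsafe() return the same pointer. With threads, the overwrite can happen while another thread is still reading the buffer.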
-
Brian Christiansen authored
-
- May 06, 2016
-
-
Morris Jette authored
If the node_feature/knl_cray plugin is configured and a GresType of "hbm" is not defined, then add it to the GRES tables. Without this, references to a GRES of "hbm" (either by a user or Slurm's internal logic) will generate error messages. Bug 2708
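The "add it if it is not already defined" idea can be illustrated on a comma-separated list of GRES type names. This is only a sketch: Slurm's internal GRES tables are not shown in this log, and the helpers gres_list_has() and gres_list_ensure() are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if name is a whole comma-separated
 * token in list, 0 otherwise ("hbm2" does not count as "hbm"). */
int gres_list_has(const char *list, const char *name)
{
	size_t n = strlen(name);
	const char *p = list;

	while (p && *p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len == n && strncmp(p, name, n) == 0)
			return 1;
		p = end ? end + 1 : NULL;
	}
	return 0;
}

/* Hypothetical helper: write list to out, appending name only if it is
 * missing. Returns 0 on success, -1 if out is too small. */
int gres_list_ensure(const char *list, const char *name,
		     char *out, size_t len)
{
	int n;

	if (gres_list_has(list, name))
		n = snprintf(out, len, "%s", list);
	else if (list[0] == '\0')
		n = snprintf(out, len, "%s", name);
	else
		n = snprintf(out, len, "%s,%s", list, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The check is idempotent: a list that already contains "hbm" is returned unchanged.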
-
Morris Jette authored
-
Morris Jette authored
-
John Thiltges authored
With slurm-15.08.10, we're seeing occasional segfaults in slurmstepd. The logs point to the following line: slurm-15.08.10/src/slurmd/slurmstepd/mgr.c:2612. On that line, _get_primary_group() accesses the result of getpwnam_r(): *gid = pwd0->pw_gid; If getpwnam_r() cannot find a matching password record, it sets the result (pwd0) to NULL but still returns 0, so dereferencing the pointer causes a segfault. Checking the result variable (pwd0) to determine success should fix the issue.
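The check described above could look roughly like the following. This is a sketch, not the actual patch to _get_primary_group(): the function name lookup_primary_gid() and the fixed buffer size are assumptions made for illustration.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: store the primary group of user_name in *gid.
 * Returns 0 on success, -1 on lookup error or if the user is unknown.
 * *gid is left untouched on failure. */
int lookup_primary_gid(const char *user_name, gid_t *gid)
{
	struct passwd pwd;
	struct passwd *result = NULL;
	char buf[16384];	/* assumed large enough for this sketch */
	int rc;

	rc = getpwnam_r(user_name, &pwd, buf, sizeof(buf), &result);
	if (rc != 0)
		return -1;	/* the lookup itself failed */
	if (result == NULL)
		return -1;	/* rc == 0 but no matching record */

	*gid = result->pw_gid;
	return 0;
}
```

Testing both the return code and the result pointer matters because a return value of 0 only says the lookup ran without error. It does not say that a record was found.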
-
Morris Jette authored
Note that Slurm cannot support heterogeneous core counts across NUMA nodes. Bug 2704
-