- Mar 03, 2015
-
-
Brian Christiansen authored
Bug 1492
-
Morris Jette authored
For a job running under a debugger, if the exec of the task fails, cancel its I/O and abort immediately rather than waiting 60 seconds for the I/O timeout.
-
- Mar 02, 2015
-
-
David Bigagli authored
-
David Bigagli authored
-
Danny Auble authored
-
Danny Auble authored
-
- Feb 27, 2015
-
-
Nicolas Joly authored
Add missing arguments to slurm_sched_p_newalloc/slurm_sched_p_freealloc documentation.
-
Nicolas Joly authored
-
Nicolas Joly authored
-
Morris Jette authored
-
Brian Christiansen authored
Bug 1476
-
- Feb 26, 2015
-
-
David Bigagli authored
-
Morris Jette authored
Improved logging and some code restructuring. No change in logic.
-
- Feb 25, 2015
-
-
David Bigagli authored
This reverts commit e24a418b.
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
This is a variation on commit 5391b8cc. Check $HOME/.my.cnf last rather than first to follow a more standard search order.
-
- Feb 24, 2015
-
-
Brian Christiansen authored
Bug 1469
-
Nina Suvanphim authored
The /root/.my.cnf would typically contain the login credentials for root. If those are needed for Slurm, then it should be checking that directory.

(In reply to Nina Suvanphim from comment #0)
...
> const char *default_conf_paths[] = {
>     "/root/.my.cnf",   <<<<<<<<<<<<<<<<<------- add this line
>     "/etc/my.cnf", "/etc/opt/cray/MySQL/my.cnf",
>     "/etc/mysql/my.cnf", NULL };

I'll also note that typically the $HOME/.my.cnf file would be checked last rather than first.
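The search order discussed in this report can be sketched as follows. This is a minimal illustration, not the code from the commit: `build_conf_search_list`, `first_readable`, and `MAX_CONF_PATHS` are hypothetical names, and the sketch assumes that the first readable file in the list is the one that gets used. Only the three system-wide paths come from the snippet quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* System-wide paths quoted in the report above. */
static const char *default_conf_paths[] = {
    "/etc/my.cnf", "/etc/opt/cray/MySQL/my.cnf", "/etc/mysql/my.cnf", NULL
};

#define MAX_CONF_PATHS 8 /* hypothetical upper bound for this sketch */

/*
 * Hypothetical helper: fill `out` with the system-wide paths first and
 * "$HOME/.my.cnf" last. `home_buf` provides storage for the per-user path.
 * Returns the number of entries; `out` is NULL-terminated.
 */
size_t build_conf_search_list(const char *home, char *home_buf,
                              size_t home_buf_len,
                              const char *out[MAX_CONF_PATHS])
{
    size_t n = 0;

    assert(home_buf != NULL && out != NULL);

    /* Leave room for the per-user entry and the terminating NULL. */
    for (size_t i = 0; default_conf_paths[i] && n < MAX_CONF_PATHS - 2; i++)
        out[n++] = default_conf_paths[i];

    if (home && *home) {
        int len = snprintf(home_buf, home_buf_len, "%s/.my.cnf", home);
        /* Skip the entry if the path was truncated. */
        if (len > 0 && (size_t)len < home_buf_len)
            out[n++] = home_buf;
    }
    out[n] = NULL;
    return n;
}

/* Assumption: the first readable file in the list wins. */
const char *first_readable(const char *paths[])
{
    for (size_t i = 0; paths[i]; i++) {
        if (access(paths[i], R_OK) == 0)
            return paths[i];
    }
    return NULL;
}
```

A caller would typically pass `getenv("HOME")` as `home` and hand the resulting list to `first_readable`.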
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
don't support strong_alias
-
- Feb 20, 2015
-
-
Dorian Krause authored
We came across the following error message in the slurmctld logs when using non-consumable resources:

error: gres/potion: job 39 dealloc of node node1 bad node_offset 0 count is 0

The error comes from _job_dealloc():

node_gres_data=0x7f8a18000b70, node_offset=0, gres_name=0x1999e00 "potion", job_id=46, node_name=0x1987ab0 "node1") at gres.c:3980
(job_gres_list=0x199b7c0, node_gres_list=0x199bc38, node_offset=0, job_id=46, node_name=0x1987ab0 "node1") at gres.c:4190
job_ptr=0x19e9d50, pre_err=0x7f8a31353cb0 "_will_run_test", remove_all=true) at select_linear.c:2091
bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, max_share=1, req_nodes=1, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40) at select_linear.c:3176
bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, req_nodes=1, mode=2, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40, exc_core_bitmap=0x0) at select_linear.c:3390
bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, req_nodes=1, mode=2, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40, exc_core_bitmap=0x0) at node_select.c:588
avail_bitmap=0x7f8a2f910d38, min_nodes=1, max_nodes=1, req_nodes=1, exc_core_bitmap=0x0) at backfill.c:367

The cause of this problem is that _node_state_dup() in gres.c does not duplicate the no_consume flag. The cr_ptr passed to _rm_job_from_nodes() is created with _dup_cr(), which calls _node_state_dup().

Below is a simple patch to fix the problem. A "future-proof" alternative might be to memcpy() from gres_ptr to new_gres and only handle pointers separately.
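The "future-proof" alternative mentioned above might look roughly like the sketch below. The struct is a hypothetical, heavily simplified stand-in for the per-node GRES state (only `no_consume` is named in the report; the other fields and the function names are assumptions for illustration). The idea is that a `memcpy()` of the whole struct picks up every scalar field, including any added later, so only owned pointers need explicit handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the per-node GRES state. */
typedef struct {
    bool      no_consume;      /* scalar flag a field-by-field dup can miss */
    uint32_t  gres_cnt_avail;  /* assumed scalar field */
    uint32_t  gres_cnt_alloc;  /* assumed scalar field */
    uint32_t  topo_cnt;        /* number of entries in topo_gres_cnt */
    uint32_t *topo_gres_cnt;   /* owned array, needs a deep copy */
} node_state_t;

/* Duplicate by copying all scalars at once, then fixing up owned pointers. */
node_state_t *node_state_dup(const node_state_t *src)
{
    assert(src != NULL);

    node_state_t *dst = malloc(sizeof(*dst));
    if (!dst)
        return NULL;

    /* Every scalar field, including no_consume, is copied here. */
    memcpy(dst, src, sizeof(*dst));

    /* Pointers must not be shared between the original and the copy. */
    dst->topo_gres_cnt = NULL;
    if (src->topo_gres_cnt && src->topo_cnt) {
        size_t bytes = (size_t)src->topo_cnt * sizeof(uint32_t);
        dst->topo_gres_cnt = malloc(bytes);
        if (!dst->topo_gres_cnt) {
            free(dst);
            return NULL;
        }
        memcpy(dst->topo_gres_cnt, src->topo_gres_cnt, bytes);
    }
    return dst;
}

/* Release a copy created by node_state_dup(). */
void node_state_free(node_state_t *state)
{
    if (state) {
        free(state->topo_gres_cnt);
        free(state);
    }
}
```

The trade-off is that every pointer member must be remembered in the fix-up step; a forgotten pointer would be silently shared instead of a forgotten scalar being silently zeroed.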
-
- Feb 19, 2015
-
-
Brian Christiansen authored
Bug 1471
-
Morris Jette authored
"If you specify a maximum node count and the host list contains more nodes, the extra node names will be silently ignored." Not so.
-
Danny Auble authored
runs certain sreport reports.
-
- Feb 18, 2015
-
-
Morris Jette authored
Add the SLURM_JOB_GPUS environment variable to those available in Prolog. Also add a list of the environment variables available in the various prologs and epilogs to the web page. Bug 1458.
-
Brian Christiansen authored
-
- Feb 17, 2015
-
-
Danny Auble authored
runjob happened, and the step was part of an array. This is an addition to commit 49e0f5f2
-
Danny Auble authored
in the runjob_mux plugin.
-
Brian Christiansen authored
Bug 1461 Commit: 2e2d924e
-
Morris Jette authored
See bug 1461
-
- Feb 13, 2015
-
-
David Bigagli authored
-
Morris Jette authored
If a call was made to change a node's state to the state it was already in and to set its reason to the value it already had, an accounting record was still generated. A script (say, NodeHealthCheck) that repeatedly sets a node state (say, DRAIN) could generate a huge number of redundant accounting records. This change eliminates those redundant records. Related to bug 1437.
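The check described above can be illustrated with a small sketch. The `node_info_t` struct and both function names are hypothetical and do not reflect the actual slurmctld data structures. The sketch only shows the idea of comparing the requested state and reason with the current ones before writing an accounting record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified view of a node's current state. */
typedef struct {
    uint32_t    state;   /* e.g. a DRAIN state value */
    const char *reason;  /* may be NULL when no reason is set */
} node_info_t;

/* Compare two reason strings, treating two NULLs as equal. */
static bool same_reason(const char *a, const char *b)
{
    if (!a || !b)
        return a == b;
    return strcmp(a, b) == 0;
}

/*
 * Return true only when the update changes the state or the reason,
 * i.e. when a new accounting record would carry new information.
 */
bool node_update_needs_event(const node_info_t *node, uint32_t new_state,
                             const char *new_reason)
{
    assert(node != NULL);
    return node->state != new_state ||
           !same_reason(node->reason, new_reason);
}
```

With a check like this, a health-check script that sets the same state and reason on every run would produce one record for the first call and none for the repeats.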
-
- Feb 12, 2015
-
-
Morris Jette authored
-
Morris Jette authored
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 1446
-
- Feb 11, 2015
-
-
Danny Auble authored
don't insert a new row in the event table.
-
- Feb 10, 2015
-
-
Brian Christiansen authored
uid's are 0 when associations are loaded.
-