- Feb 27, 2015
-
-
Morris Jette authored
Conflicts: src/slurmctld/job_mgr.c
-
Brian Christiansen authored
Display job's estimated NodeCount based on the partition's configured resources rather than the whole system's. Bug 1478.
-
Morris Jette authored
-
Morris Jette authored
This provides a better global view of what the limits and caps are.
-
Morris Jette authored
Remove time from "capmc get_node_energy_counter" call. If no recent data is available, no data is returned, so just get the latest information. Initialize a variable to avoid xfree of an uninitialized variable. Correct joule to watt calculation (">" changed to "<"). Read configuration once when slurmctld starts rather than twice. Compute a node's power consumption with more precision based upon time to the microsecond.
-
- Feb 26, 2015
-
-
Morris Jette authored
-
Morris Jette authored
Add links to burst buffer and power management pages. Add JSON-C build/installation instructions.
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
Previously, there was no binding of tasks to the appropriate NUMA node. Based upon work by Josko Plazonic <plazonic@princeton.edu>.
-
Morris Jette authored
Improved logging and some code restructuring. No change in logic.
-
David Bigagli authored
This reverts commit e24a418b.
-
David Bigagli authored
-
Morris Jette authored
Improved logging and some code restructuring. No change in logic.
-
- Feb 25, 2015
-
-
Morris Jette authored
Mail notifications on job BEGIN, END and FAIL now apply to a job array as a whole rather than generating individual email messages for each task in the job array.
-
David Bigagli authored
This reverts commit e24a418b.
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This is a variation on commit 5391b8cc. Check $HOME/.my.cnf last rather than first to follow a more standard search order.
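The search order described here can be sketched as follows. This is only an illustration under stated assumptions, not the code from the commit: `build_conf_search_list()`, `CONF_PATH_MAX` and `CONF_LIST_MAX` are hypothetical names, and the system paths are the ones quoted in the Feb 24 entry of this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CONF_PATH_MAX 256	/* hypothetical buffer size per path */
#define CONF_LIST_MAX 8		/* hypothetical maximum list length */

/* System-wide locations, as quoted in the Feb 24 entry of this log. */
static const char *system_conf_paths[] = {
	"/etc/my.cnf", "/etc/opt/cray/MySQL/my.cnf",
	"/etc/mysql/my.cnf", NULL
};

/* Fill "out" with candidate paths in search order: system-wide files
 * first, then $HOME/.my.cnf last. Returns the number of entries. */
static int build_conf_search_list(const char *home,
				  char out[][CONF_PATH_MAX], int max)
{
	int n = 0;

	assert(out != NULL);
	for (int i = 0; system_conf_paths[i] && (n < max); i++) {
		snprintf(out[n], CONF_PATH_MAX, "%s", system_conf_paths[i]);
		n++;
	}
	if (home && (strlen(home) > 0) && (n < max)) {
		int len = snprintf(out[n], CONF_PATH_MAX, "%s/.my.cnf", home);
		if ((len > 0) && (len < CONF_PATH_MAX))
			n++;	/* keep the entry only if it was not truncated */
	}
	return n;
}
```

A caller would typically pass `getenv("HOME")` as `home` and then try each entry in order, for example with `access(path, R_OK)`, stopping at the first readable file.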
-
Morris Jette authored
-
- Feb 24, 2015
-
-
Brian Christiansen authored
Bug 1469
-
Michael A. Raymond authored
-
Morris Jette authored
-
Nina Suvanphim authored
The /root/.my.cnf would typically contain the login credentials for root. If those are needed for Slurm, then it should be checking that directory.
(In reply to Nina Suvanphim from comment #0)
...
> const char *default_conf_paths[] = {
> "/root/.my.cnf", <<<<<<<<<<<<<<<<<------- add this line
> "/etc/my.cnf", "/etc/opt/cray/MySQL/my.cnf",
> "/etc/mysql/my.cnf", NULL };
I'll also note that typically the $HOME/.my.cnf file would be checked last rather than first.
-
Morris Jette authored
Fix some logic related to power distribution across nodes
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
don't support strong_alias
-
Morris Jette authored
Update power management web page: Add notes about powering nodes down/up. Prevent underflow in power distribution logic. Add logic to identify nodes in "ready" state. Only ready nodes can have their power caps modified. Don't change power cap if node not in ready state. Various improvements to logging. Refactor code to eliminate duplicate/repeated building of full NID list. Plug some memory leaks.
-
- Feb 23, 2015
-
-
Morris Jette authored
Modify test 12.7 so that we specify a reason when setting a node DOWN. A recent change to the Slurm code now requires a reason.
-
- Feb 21, 2015
-
-
Morris Jette authored
-
- Feb 20, 2015
-
-
Morris Jette authored
-
Morris Jette authored
Correct capmc arguments to set power cap. Convert "capmc get_node_energy_counter" to use a hostlist expression rather than listing every node in a comma-separated list. Log commands and args run by the plugin via the power_run_script() function in src/plugins/power/common/power_common.c. Use hostlist to build a condensed nid list for power cap set/clear functions.
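To show why a condensed list is shorter than naming every node, the sketch below collapses consecutive node IDs into ranges. It is a stand-alone toy, not Slurm's hostlist implementation: `compress_nids()` is a hypothetical helper, and the input is assumed to be sorted with no duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not Slurm's hostlist code): write a sorted,
 * duplicate-free list of node IDs into "buf" in range form, for
 * example 1,2,3,5 becomes "1-3,5". Returns the string length, or -1
 * if the buffer is too small. */
static int compress_nids(const int *nids, size_t cnt,
			 char *buf, size_t buf_len)
{
	size_t used = 0;

	assert(buf != NULL);
	if (buf_len == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < cnt; ) {
		size_t j = i;
		const char *sep = (i == 0) ? "" : ",";
		int n;

		/* Extend j to the end of the current consecutive run */
		while ((j + 1 < cnt) && (nids[j + 1] == nids[j] + 1))
			j++;

		if (j > i)
			n = snprintf(buf + used, buf_len - used, "%s%d-%d",
				     sep, nids[i], nids[j]);
		else
			n = snprintf(buf + used, buf_len - used, "%s%d",
				     sep, nids[i]);
		if ((n < 0) || ((size_t) n >= buf_len - used))
			return -1;	/* output would be truncated */
		used += (size_t) n;
		i = j + 1;
	}
	return (int) used;
}
```

For the IDs 1, 2, 3, 5, 7, 8 this produces `1-3,5,7-8`. On a large system with mostly contiguous IDs, the condensed form keeps the command line short.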
-
Morris Jette authored
-
Dorian Krause authored
We came across the following error message in the slurmctld logs when using non-consumable resources:
error: gres/potion: job 39 dealloc of node node1 bad node_offset 0 count is 0
The error comes from _job_dealloc():
node_gres_data=0x7f8a18000b70, node_offset=0, gres_name=0x1999e00 "potion", job_id=46, node_name=0x1987ab0 "node1") at gres.c:3980
(job_gres_list=0x199b7c0, node_gres_list=0x199bc38, node_offset=0, job_id=46, node_name=0x1987ab0 "node1") at gres.c:4190
job_ptr=0x19e9d50, pre_err=0x7f8a31353cb0 "_will_run_test", remove_all=true) at select_linear.c:2091
bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, max_share=1, req_nodes=1, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40) at select_linear.c:3176
bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, req_nodes=1, mode=2, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40, exc_core_bitmap=0x0) at select_linear.c:3390
bitmap=0x7f8a18001ad0, min_nodes=1, max_nodes=1, req_nodes=1, mode=2, preemptee_candidates=0x0, preemptee_job_list=0x7f8a2f910c40, exc_core_bitmap=0x0) at node_select.c:588
avail_bitmap=0x7f8a2f910d38, min_nodes=1, max_nodes=1, req_nodes=1, exc_core_bitmap=0x0) at backfill.c:367
The cause of this problem is that _node_state_dup() in gres.c does not duplicate the no_consume flag. The cr_ptr passed to _rm_job_from_nodes() is created with _dup_cr() which calls _node_state_dup().
Below is a simple patch to fix the problem. A "future-proof" alternative might be to memcpy() from gres_ptr to new_gres and only handle pointers separately.
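The "future-proof" alternative mentioned above could look roughly like this. The structure and function names (`demo_gres_state_t`, `demo_state_dup()`, `demo_state_free()`) are hypothetical stand-ins rather than Slurm's actual types; the point is only that a memcpy() carries every scalar field, including a flag such as no_consume, while owned pointers are then duplicated one by one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-node GRES state record. */
typedef struct {
	bool no_consume;	/* scalar flag, easy to forget in a field-by-field copy */
	uint64_t cnt_avail;
	uint64_t cnt_alloc;
	char *name;		/* owned pointer, needs a deep copy */
} demo_gres_state_t;

/* Duplicate a record: copy all scalar fields at once with memcpy(),
 * then replace each owned pointer with its own copy. */
static demo_gres_state_t *demo_state_dup(const demo_gres_state_t *src)
{
	demo_gres_state_t *dst;

	assert(src != NULL);
	dst = malloc(sizeof(*dst));
	if (!dst)
		return NULL;
	memcpy(dst, src, sizeof(*dst));

	dst->name = NULL;	/* never share the source's pointer */
	if (src->name) {
		size_t len = strlen(src->name) + 1;
		dst->name = malloc(len);
		if (!dst->name) {
			free(dst);
			return NULL;
		}
		memcpy(dst->name, src->name, len);
	}
	return dst;
}

/* Release a record created by demo_state_dup(). */
static void demo_state_free(demo_gres_state_t *ptr)
{
	if (ptr) {
		free(ptr->name);
		free(ptr);
	}
}
```

The trade-off is that any pointer field added later must be handled explicitly after the memcpy(), otherwise the copy and the original would share (and eventually double free) the same memory.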
-
Morris Jette authored
-
- Feb 19, 2015
-
-
Brian Christiansen authored
Bug 1471
-
Morris Jette authored
-