- Mar 05, 2004
-
-
Moe Jette authored
-
- Mar 04, 2004
-
-
Moe Jette authored
data structure. This eliminates risks associated with re-reading slurm.conf.
-
- Mar 03, 2004
- Mar 02, 2004
- Feb 27, 2004
-
-
Moe Jette authored
-
Moe Jette authored
to relinquish control. This is necessary since a temporary network problem can result in the BackupController becoming the primary server even while the server on the ControllerMachine continues execution. While this event is impossible to prevent, the new code restores proper operation when communications are restored. (gnats:387)
-
- Feb 26, 2004
-
-
Moe Jette authored
state recovery.
-
- Feb 20, 2004
-
-
Moe Jette authored
them when partition limits change.
-
- Feb 19, 2004
-
-
Moe Jette authored
the plugins themselves. Add new functions to each plugin to return the error number and, given an error number, to return a description.
-
- Feb 14, 2004
-
-
Moe Jette authored
parameter.
-
- Feb 13, 2004
- Feb 12, 2004
- Jan 30, 2004
- Jan 28, 2004
-
-
Moe Jette authored
slurmctld to exit.
-
- Jan 26, 2004
-
-
Moe Jette authored
clusters with FastSchedule configured off
* Only return DOWN nodes to service if the reason for them being in that state is non-responsiveness and ReturnToService configured on
* Some general code clean-up
-
- Jan 23, 2004
-
-
Moe Jette authored
fast_schedule == 0 (i.e. the node's value is used directly rather than the partition's configuration value).
-
- Jan 17, 2004
-
-
Moe Jette authored
This was a problem briefly observed in RedHat 9.
-
- Jan 16, 2004
-
-
Moe Jette authored
-
- Jan 14, 2004
-
-
Moe Jette authored
was being ignored by default when manually initiated, which prevents it from terminating gracefully.
-
- Jan 13, 2004
-
-
Moe Jette authored
when a node becomes DOWN for not responding. This is because, if there are a large number of non-responsive nodes, the ping agent can take a long time to complete (one second per non-responsive node, or a 10-second timeout per node with 10 parallel tasks). This should more properly mark nodes as DOWN.
-
Moe Jette authored
-
Moe Jette authored
then don't treat as fatal error.
-
- Dec 31, 2003
-
-
Moe Jette authored
modifications were relatively minor - mostly changes in function names or arguments.
-
- Dec 24, 2003
-
-
Moe Jette authored
(it was inappropriately going to DRAINING state).
-
- Dec 23, 2003
-
-
Moe Jette authored
see memory leaks.
-
Moe Jette authored
(it grows anyway). Fix state read logic to better handle error conditions.
-
Moe Jette authored
Fix update node RPC to handle a reason field change without a state change. State was being handled as type int instead of uint16_t, so the NO_VAL check was not working properly.
-
Moe Jette authored
-
Moe Jette authored
"Scontrol abort" works. It was leaving a hung pthread due to a recent change. Fix a couple of potential memory leaks. "switch_type" has been added to the config data structure, un/pack, etc., but is not yet reported to the user or documented. The plugins now use function calls to get their type and plugin directory from a common data structure rather than individually reading and parsing the configuration file.
-
- Dec 22, 2003
-
-
Moe Jette authored
-
- Dec 19, 2003
-
-
Mark Grondona authored
-