Commit f16338dd authored by Moe Jette

Permit a change to SchedulerType to take effect upon reconfigure (without daemon restart).
parent 0fe03bc5
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
.SH "NAME" .SH "NAME"
slurm.conf \- Slurm configuration file slurm.conf \- Slurm configuration file
.SH "DESCRIPTION" .SH "DESCRIPTION"
\fB/etc/slurm.conf\fP is an ASCI file which describes general Slurm configuration \fB/etc/slurm.conf\fP is an ASCI file which describes general SLURM configuration
information, the nodes to be managed, information about how those nodes are information, the nodes to be managed, information about how those nodes are
grouped into partitions, and various scheduling parameters associated with grouped into partitions, and various scheduling parameters associated with
those partitions. The file location can be modified at system build time using those partitions. The file location can be modified at system build time using
...@@ -12,7 +12,9 @@ The contents of the file are case insensitive except for the names of nodes ...@@ -12,7 +12,9 @@ The contents of the file are case insensitive except for the names of nodes
and partitions. Any text following a "#" in the configuration file is treated and partitions. Any text following a "#" in the configuration file is treated
as a comment through the end of that line. as a comment through the end of that line.
The size of each line in the file is limited to 1024 characters. The size of each line in the file is limited to 1024 characters.
Changes to the configuration file take effect upon restart of
SLURM daemons, daemon reciept of the SIGHUP signal, or execution
of the command "scontrol reconfigure" unless otherwise noted.
.LP .LP
The overall configuration parameters available include: The overall configuration parameters available include:
.TP .TP
...@@ -110,8 +112,6 @@ stopped a debugger. The default value is unlimited (zero). ...@@ -110,8 +112,6 @@ stopped a debugger. The default value is unlimited (zero).
Define the location where job completion records are to be logged. Define the location where job completion records are to be logged.
The interpretation of this value depends upon the logging mechanism The interpretation of this value depends upon the logging mechanism
specified by the \fBJobCompType\fR parameter. specified by the \fBJobCompType\fR parameter.
Slurmctld must be reconfigured ("scontrol reconfig" or SIGHUP signal)
for a change in \fBJobCompLoc\fR to take effect.
.TP .TP
\fBJobCompType\fR \fBJobCompType\fR
Define the job completion logging mechanism type. Define the job completion logging mechanism type.
...@@ -120,8 +120,6 @@ The default value is "jobcomp/none", which means that upon job completion ...@@ -120,8 +120,6 @@ The default value is "jobcomp/none", which means that upon job completion
the record of the job is purged from the system. the record of the job is purged from the system.
The value "jobcomp/filetxt" indicates that a record of the job should be The value "jobcomp/filetxt" indicates that a record of the job should be
written to a text file specified by the \fBJobCompLoc\fR parameter. written to a text file specified by the \fBJobCompLoc\fR parameter.
Slurmctld must be reconfigured ("scontrol reconfig" or SIGHUP signal) for
a change in \fBJobCompType\fR to take effect.
.TP .TP
\fBJobCredentialPrivateKey\fR \fBJobCredentialPrivateKey\fR
Fully qualified pathname of a file containing a private key used for Fully qualified pathname of a file containing a private key used for
...@@ -191,10 +189,8 @@ Identifies the type of scheduler to be used. Acceptable values include ...@@ -191,10 +189,8 @@ Identifies the type of scheduler to be used. Acceptable values include
the default FIFO scheduling, and the default FIFO scheduling, and
"sched/wiki" for the Wiki interface to the Maui Scheduler. "sched/wiki" for the Wiki interface to the Maui Scheduler.
The default value is "sched/builtin". The default value is "sched/builtin".
Slurmctld must be restared for a change in \fBSchedulerType\fR to When initially setting the value to "sched/wiki", any pending jobs
take effect. must have their priority set to zero (held).
In addition, when initially setting the value to "sched/wiki"
all pending jobs must have their priority set to zero (held).
When changing the value from "sched/wiki", all pending jobs When changing the value from "sched/wiki", all pending jobs
should have their priority change from zero to some large number. should have their priority change from zero to some large number.
The \fBscontrol\fR command can be used to change job priorities. The \fBscontrol\fR command can be used to change job priorities.
...@@ -260,7 +256,8 @@ Fully qualified pathname of a directory into which the \fBslurmd\fR ...@@ -260,7 +256,8 @@ Fully qualified pathname of a directory into which the \fBslurmd\fR
daemon's state information and batch job script information are written. This daemon's state information and batch job script information are written. This
must be a common pathname for all nodes, but should represent a directory which must be a common pathname for all nodes, but should represent a directory which
is local to each node (reference a local file system). The default value is local to each node (reference a local file system). The default value
is "/var/spool/slurmd." \fBNOTE\fR: This directory is also used to store \fBslurmd\fR's is "/var/spool/slurmd." \fBNOTE\fR: This directory is also used to store
\fBslurmd\fR's
shared memory lockfile, and \fBshould not be changed\fR unless the system shared memory lockfile, and \fBshould not be changed\fR unless the system
is being cleanly restarted. If the location of \fBSlurmdSpoolDir\fR is is being cleanly restarted. If the location of \fBSlurmdSpoolDir\fR is
changed and \fBslurmd\fR is restarted, the new daemon will attach to a changed and \fBslurmd\fR is restarted, the new daemon will attach to a
...@@ -281,7 +278,8 @@ If any slurm daemons terminate abnormally, their core files will also be written ...@@ -281,7 +278,8 @@ If any slurm daemons terminate abnormally, their core files will also be written
into this directory. into this directory.
.TP .TP
\fBSwitchType\fR \fBSwitchType\fR
Identifies the type of switch or interconnect used for application communications. Identifies the type of switch or interconnect used for application
communications.
Acceptable values include Acceptable values include
"switch/none" for switches not requiring special processing for job launch "switch/none" for switches not requiring special processing for job launch
or termination (Myrinet, Ethernet, and InfiniBand), or termination (Myrinet, Ethernet, and InfiniBand),
...@@ -294,7 +292,8 @@ value of \fBSwitchType\fR, records of all jobs in any state may be lost. ...@@ -294,7 +292,8 @@ value of \fBSwitchType\fR, records of all jobs in any state may be lost.
.TP .TP
\fBTmpFS\fR \fBTmpFS\fR
Fully qualified pathname of the file system available to user jobs for Fully qualified pathname of the file system available to user jobs for
temporary storage. This parameter is used in establishing a node's \fBTmpDisk\fR space. temporary storage. This parameter is used in establishing a node's \fBTmpDisk\fR
space.
The default value is "/tmp". The default value is "/tmp".
.TP .TP
\fBWaitTime\fR \fBWaitTime\fR
......
...@@ -51,6 +51,7 @@ ...@@ -51,6 +51,7 @@
#include "src/slurmctld/locks.h" #include "src/slurmctld/locks.h"
#include "src/slurmctld/proc_req.h" #include "src/slurmctld/proc_req.h"
#include "src/slurmctld/read_config.h" #include "src/slurmctld/read_config.h"
#include "src/slurmctld/sched_plugin.h"
#include "src/slurmctld/slurmctld.h" #include "src/slurmctld/slurmctld.h"
#define BUF_SIZE 1024 #define BUF_SIZE 1024
...@@ -65,7 +66,7 @@ static void _restore_node_state(struct node_record *old_node_table_ptr, ...@@ -65,7 +66,7 @@ static void _restore_node_state(struct node_record *old_node_table_ptr,
int old_node_record_count); int old_node_record_count);
static void _preserve_plugins(slurm_ctl_conf_t * ctl_conf_ptr, static void _preserve_plugins(slurm_ctl_conf_t * ctl_conf_ptr,
char *old_auth_type, char *old_auth_type,
char *old_sched_type, char *old_switch_type); char *old_switch_type);
static int _sync_nodes_to_comp_job(void); static int _sync_nodes_to_comp_job(void);
static int _sync_nodes_to_jobs(void); static int _sync_nodes_to_jobs(void);
static int _sync_nodes_to_active_job(struct job_record *job_ptr); static int _sync_nodes_to_active_job(struct job_record *job_ptr);
...@@ -686,7 +687,6 @@ int read_slurm_conf(int recover) ...@@ -686,7 +687,6 @@ int read_slurm_conf(int recover)
int old_node_record_count; int old_node_record_count;
struct node_record *old_node_table_ptr; struct node_record *old_node_table_ptr;
char *old_auth_type = xstrdup(slurmctld_conf.authtype); char *old_auth_type = xstrdup(slurmctld_conf.authtype);
char *old_sched_type = xstrdup(slurmctld_conf.schedtype);
char *old_switch_type = xstrdup(slurmctld_conf.switch_type); char *old_switch_type = xstrdup(slurmctld_conf.switch_type);
/* initialization */ /* initialization */
...@@ -770,11 +770,11 @@ int read_slurm_conf(int recover) ...@@ -770,11 +770,11 @@ int read_slurm_conf(int recover)
fclose(slurm_spec_file); fclose(slurm_spec_file);
_preserve_plugins(&slurmctld_conf, _preserve_plugins(&slurmctld_conf,
old_auth_type, old_auth_type, old_switch_type);
old_sched_type, old_switch_type);
validate_config(&slurmctld_conf); validate_config(&slurmctld_conf);
update_logging(); update_logging();
g_slurm_jobcomp_init(slurmctld_conf.job_comp_loc); g_slurm_jobcomp_init(slurmctld_conf.job_comp_loc);
slurm_sched_init();
switch_init(); switch_init();
if (default_part_loc == NULL) if (default_part_loc == NULL)
...@@ -872,15 +872,11 @@ static void _purge_old_node_state(struct node_record *old_node_table_ptr, ...@@ -872,15 +872,11 @@ static void _purge_old_node_state(struct node_record *old_node_table_ptr,
* plugin value changes to take effect. * plugin value changes to take effect.
*/ */
static void _preserve_plugins(slurm_ctl_conf_t * ctl_conf_ptr, static void _preserve_plugins(slurm_ctl_conf_t * ctl_conf_ptr,
char *old_auth_type, char *old_auth_type, char *old_switch_type)
char *old_sched_type, char *old_switch_type)
{ {
xfree(ctl_conf_ptr->authtype); xfree(ctl_conf_ptr->authtype);
ctl_conf_ptr->authtype = old_auth_type; ctl_conf_ptr->authtype = old_auth_type;
xfree(ctl_conf_ptr->schedtype);
ctl_conf_ptr->schedtype = old_sched_type;
xfree(ctl_conf_ptr->switch_type); xfree(ctl_conf_ptr->switch_type);
ctl_conf_ptr->switch_type = old_switch_type; ctl_conf_ptr->switch_type = old_switch_type;
......
...@@ -141,6 +141,10 @@ slurm_sched_context_create( const char *sched_type ) ...@@ -141,6 +141,10 @@ slurm_sched_context_create( const char *sched_type )
static int static int
slurm_sched_context_destroy( slurm_sched_context_t *c ) slurm_sched_context_destroy( slurm_sched_context_t *c )
{ {
/*
* Must check return code here because plugins might still
* be loaded and active.
*/
if ( c->plugin_list ) { if ( c->plugin_list ) {
if ( plugrack_destroy( c->plugin_list ) != SLURM_SUCCESS ) { if ( plugrack_destroy( c->plugin_list ) != SLURM_SUCCESS ) {
return SLURM_ERROR; return SLURM_ERROR;
...@@ -165,7 +169,8 @@ slurm_sched_init( void ) ...@@ -165,7 +169,8 @@ slurm_sched_init( void )
slurm_mutex_lock( &g_sched_context_lock ); slurm_mutex_lock( &g_sched_context_lock );
if ( g_sched_context ) goto done; if ( g_sched_context )
slurm_sched_context_destroy( g_sched_context );
sched_type = slurm_get_sched_type(); sched_type = slurm_get_sched_type();
g_sched_context = slurm_sched_context_create( sched_type ); g_sched_context = slurm_sched_context_create( sched_type );
......
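
The change to slurm_sched_init() in the last hunk replaces an early return with a destroy-then-recreate step, so that a reconfigure can pick up a new SchedulerType. A minimal, self-contained sketch of that pattern in C follows. All names in it (sched_context_t, context_create, context_destroy, sched_reinit, sched_current_type) are hypothetical stand-ins rather than the Slurm API, and the context holds only a type string instead of a plugin rack. Unlike the hunk as shown, the sketch checks the destroy return code before creating the replacement, which is one possible way to act on the comment added to slurm_sched_context_destroy().

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the scheduler plugin context. */
typedef struct {
    char *sched_type;
} sched_context_t;

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static sched_context_t *g_context = NULL;

/* Allocate a context that remembers the configured scheduler type. */
static sched_context_t *context_create(const char *type)
{
    sched_context_t *c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->sched_type = malloc(strlen(type) + 1);
    if (c->sched_type == NULL) {
        free(c);
        return NULL;
    }
    strcpy(c->sched_type, type);
    return c;
}

/* Release a context. Returns 0 on success (assumed never to fail here). */
static int context_destroy(sched_context_t *c)
{
    free(c->sched_type);
    free(c);
    return 0;
}

/*
 * (Re)initialize the global context. An existing context is destroyed
 * first, so calling this again after a configuration change switches
 * to the newly configured type without restarting the process.
 */
int sched_reinit(const char *configured_type)
{
    int rc = 0;

    assert(configured_type != NULL);
    pthread_mutex_lock(&g_lock);
    if (g_context != NULL) {
        if (context_destroy(g_context) != 0) {
            rc = -1;            /* old context still in use; keep it */
            goto done;
        }
        g_context = NULL;
    }
    g_context = context_create(configured_type);
    if (g_context == NULL)
        rc = -1;
done:
    pthread_mutex_unlock(&g_lock);
    return rc;
}

/* Copy the active scheduler type into buf. Returns 0 on success. */
int sched_current_type(char *buf, size_t len)
{
    int rc = -1;

    pthread_mutex_lock(&g_lock);
    if (g_context != NULL && len > 0) {
        snprintf(buf, len, "%s", g_context->sched_type);
        rc = 0;
    }
    pthread_mutex_unlock(&g_lock);
    return rc;
}
```

Calling sched_reinit("sched/builtin") and later sched_reinit("sched/wiki") leaves the second type active, which mirrors what a reconfigure is meant to achieve after this commit. The whole swap happens under one mutex so that no caller observes a destroyed context.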