Commit 765600fd, authored 17 years ago by Moe Jette
svn merge -r12134:12171
https://eris.llnl.gov/svn/slurm/branches/slurm-1.2
parent 8c939577
Showing 4 changed files with 17 additions and 10 deletions

contribs/mpich1.slurm.patch    +2  -2
src/api/slurm_pmi.c            +13 -6
src/api/step_ctx.c             +1  -1
src/sattach/opt.c              +1  -1

contribs/mpich1.slurm.patch (+2 -2)
@@ -282,8 +282,8 @@ Index: README
 +DETAILS: The srun command opens two socket connections and passes
 +their ports to all tasks via the SLURM_MPICH1_P4_PORT1 and
 +SLURM_MPICH1_P4_PORT2 environment variables. Task zero connects to
-+SLURM_MPICH1_P4_PORT1 and writes port number. The other tasks connect
-+to SLURM_MPICH1_P4_PORT2 and that port number. This avoid the requirement
++SLURM_MPICH1_P4_PORT1 and writes its port number. The other tasks connect to
++SLURM_MPICH1_P4_PORT2 and read that port number. This avoid the requirement
 +of having task zero launch all subsequent tasks and also launches
 +all tasks under the direct control of SLURM (for process management
 +and accounting). SLURM only launches one task per node and that

src/api/slurm_pmi.c (+13 -6)
@@ -197,12 +197,19 @@ int slurm_get_kvs_comm_set(struct kvs_comm_set **kvs_set_ptr,
 	/* Send the RPC to the local srun communcation manager.
 	 * Since the srun can be sent thousands of messages at
 	 * the same time and refuse some connections, retry as
-	 * needed. Spread out messages by task's rank. Also
-	 * increase the timeout if many tasks since the srun
-	 * command is very overloaded.
-	 * We also increase the timeout (default timeout is
-	 * 10 secs). */
-	usleep(pmi_rank * pmi_time);
+	 * needed. Wait until all key-pairs have been sent by
+	 * all tasks then spread out messages by task's rank.
+	 * Also increase the message timeout if many tasks
+	 * since the srun command can get very overloaded (the
+	 * default timeout is 10 secs).
+	 *
+	 * TaskID SendTime GetTime (Units are PMI_TIME, default=500 usec)
+	 * 0      0        N+0
+	 * 1      1        N+1
+	 * 2      2        N+2
+	 * N-1    N-1      N+N-1
+	 */
+	usleep(pmi_size * pmi_time);
 	if      (pmi_size > 1000)	/* 100 secs */
 		timeout = slurm_get_msg_timeout() * 10000;
 	else if (pmi_size > 100)	/* 50 secs */
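
Note on the slurm_pmi.c hunk: the table added to the comment describes a two-phase schedule in which task `rank` sends in slot `rank` and fetches in slot `N + rank`, where N is the task count. The following is only a sketch of that arithmetic. The helper names `send_slot`, `get_slot`, and `slot_to_usec` are hypothetical and do not exist in Slurm; the 500 usec default is taken from the comment itself.

```c
#include <stdint.h>

/* Default PMI_TIME in microseconds, as stated in the hunk's comment. */
#define PMI_TIME_USEC 500u

/* Hypothetical helper: slot (in PMI_TIME units) in which a task
 * sends its key-pairs. Per the table, SendTime equals the task rank. */
static uint32_t send_slot(uint32_t rank)
{
	return rank;
}

/* Hypothetical helper: slot in which a task fetches the full key-pair
 * set. Per the table, GetTime is N + rank, so no task fetches before
 * the last send slot (N-1) has passed. */
static uint32_t get_slot(uint32_t rank, uint32_t ntasks)
{
	return ntasks + rank;
}

/* Convert a slot number to microseconds. The cast to 64 bits avoids
 * overflow for large task counts. */
static uint64_t slot_to_usec(uint32_t slot, uint32_t pmi_time_usec)
{
	return (uint64_t)slot * pmi_time_usec;
}
```

With four tasks, for example, rank 3 would send in slot 3 and fetch in slot 7, which is 3500 usec at the default PMI_TIME. Whether the surrounding Slurm code composes its delays exactly this way is not visible in the hunk.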

src/api/step_ctx.c (+1 -1)
@@ -329,7 +329,7 @@ extern void slurm_step_ctx_params_t_init (slurm_step_ctx_params_t *ptr)
 	char *jobid_str;
 	/* zero the entire structure */
-	memset(ptr, 0, sizeof(job_step_create_request_msg_t));
+	memset(ptr, 0, sizeof(slurm_step_ctx_params_t));
 	/* now set anything that shouldn't be 0 or NULL by default */
 	ptr->relative = (uint16_t)NO_VAL;
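
Note on the step_ctx.c hunk: the fix replaces a `sizeof` taken on a type that does not match the pointer being cleared. One common C idiom that sidesteps this class of mistake is to take the size from the pointee itself. The sketch below uses a hypothetical stand-in struct, `demo_params_t`, and a stand-in sentinel, `DEMO_NO_VAL`; neither is Slurm's real definition, and this is not what the commit does.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in sentinel for illustration only (an assumption, not taken
 * from Slurm's headers). */
#define DEMO_NO_VAL 0xfffffffeu

/* Hypothetical stand-in for a parameter struct. */
typedef struct demo_params {
	uint32_t job_id;
	uint32_t node_count;
	uint16_t relative;
} demo_params_t;

static void demo_params_init(demo_params_t *ptr)
{
	/* sizeof(*ptr) always matches the type ptr points to, so the
	 * size cannot drift out of sync if the parameter type changes. */
	memset(ptr, 0, sizeof(*ptr));

	/* Now set anything that should not default to 0. */
	ptr->relative = (uint16_t)DEMO_NO_VAL;
}
```

Naming the type explicitly, as the commit does, is equally correct as long as the name matches the parameter; the `sizeof(*ptr)` form just removes the need to keep the two in agreement by hand.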

src/sattach/opt.c (+1 -1)
@@ -305,7 +305,7 @@ void set_options(const int argc, char **argv)
 		switch (opt_char) {
 		case '?':
-			fprintf(stderr, "Try \"sbatch --help\" for more "
+			fprintf(stderr, "Try \"sattach --help\" for more "
 				"information\n");
 			exit(1);
 			break;