- Feb 14, 2004
-
-
Moe Jette authored
parameter.
-
- Feb 13, 2004
- Feb 12, 2004
- Jan 30, 2004
- Jan 27, 2004
-
-
Moe Jette authored
(gnats:372)
-
- Jan 26, 2004
-
-
Moe Jette authored
Export version info via new API
-
- Jan 16, 2004
-
-
Moe Jette authored
-
- Jan 02, 2004
-
-
Moe Jette authored
1. New srun argument --mpi=<type> (e.g. "--mpi=lam")
2. New environment variable set SLURM_TASKS_PER_NODE
3. Launch a single task per node if --mpi=lam
-
- Dec 31, 2003
- Dec 18, 2003
-
-
Moe Jette authored
-
- Dec 09, 2003
-
-
Moe Jette authored
its support.
-
- Dec 08, 2003
-
-
Moe Jette authored
-
- Dec 05, 2003
-
-
Moe Jette authored
sets new environment variable SLURM_CPUS_ON_NODE for use by LAM/MPI. Also fixed bug in srun task distribution logic for block distribution in heterogeneous cluster.
-
- Dec 02, 2003
-
-
Moe Jette authored
-
- Nov 26, 2003
-
-
Moe Jette authored
-
- Nov 25, 2003
-
-
Moe Jette authored
Otherwise signal all steps associated with the job (unless individual job steps are identified).
-
- Nov 24, 2003
-
-
Moe Jette authored
signals all job steps, but not the job script itself).
-
- Nov 21, 2003
-
-
Mark Grondona authored
- fixes to help ensure slurmd uses the same key for shared memory on a restart (to avoid losing track of jobs) - slurmd only runs one launch thread at a time - fix bug in slurmd where multiple threads used same address space for connecting client address. - srun always sends SIGKILL to job step before issuing complete request - Changed short string for draining nodes to drng from drain. - srun default launch message timeout increased to 5s.
-
- Nov 20, 2003
-
-
Mark Grondona authored
o Modify srun(1) man page to reflect that, really, slurmd debug level is only allowed to be set up to 4.
-
- Nov 18, 2003
-
-
Moe Jette authored
-
- Nov 17, 2003
-
-
jwindley authored
-
- Nov 14, 2003
-
-
Moe Jette authored
-
- Nov 10, 2003
-
-
Moe Jette authored
-
- Nov 07, 2003
- Nov 05, 2003
-
-
Moe Jette authored
to take effect.
-
- Nov 03, 2003
-
-
Moe Jette authored
-
- Oct 31, 2003
-
-
Moe Jette authored
scontrol command to use it.
-
- Oct 29, 2003
-
-
Moe Jette authored
and/or job step(s) will have their resources de-allocated and be killed. A resource allocation will not be released unless no job steps are active for at least InactiveLimit seconds. DPCS jobs will be subject to this forced de-allocation if they remain inactive for an extended period of time, which can get SLURM and DPCS back in sync if DPCS does a cold-start.
-
- Oct 24, 2003
-
-
Moe Jette authored
report an error message.
-
Moe Jette authored
-
Moe Jette authored
avoid highly fragmented resource allocations. Add list of excluded nodes to job info dumped and reported. Fix how mismatched RPC version numbers are handled. Let error code get back to the API function. Dump job state information upon each job's termination via plugin. Re-issue incomplete write requests in job/partition state save. Make slurmctld continue proper operation without any default partition (gnats:317). Add command/RPC to delete a partition. Retry socket connection for slurmd/io.c as needed (gnats:253).
-
jwindley authored
-
- Oct 15, 2003