- Feb 10, 2006
-
-
- Jan 30, 2006
-
-
Moe Jette authored
of SlurmdTimeout and SlurmctldTimeout for communications to slurmd and slurmctld daemons respectively.
-
- Jan 27, 2006
-
-
Danny Auble authored
-
- Jan 24, 2006
-
-
Moe Jette authored
debug() to info().
-
- Jan 05, 2006
-
-
Moe Jette authored
UCRL-CODE-2002-040 to UCRL-CODE-217948. No changes in any logic.
-
- Nov 23, 2005
-
-
- Nov 21, 2005
-
-
Danny Auble authored
-
- Sep 22, 2005
-
-
Christopher J. Morrone authored
-
- Jul 15, 2005
-
-
Moe Jette authored
-
- May 19, 2005
-
-
Moe Jette authored
-
- May 10, 2005
-
-
Moe Jette authored
-
- Mar 04, 2005
-
-
Moe Jette authored
-
- Dec 14, 2004
-
-
Danny Auble authored
-
- Dec 01, 2004
-
-
Moe Jette authored
prolog and epilog. Needed to synchronize slurmctld with user job execution: Don't start script until partition is ready and clear all jobs upon termination (offloading work from slurmctld).
-
- Nov 18, 2004
-
-
Moe Jette authored
-
- Nov 16, 2004
-
-
Moe Jette authored
-
- Oct 29, 2004
-
-
Moe Jette authored
-
- Oct 08, 2004
- Oct 07, 2004
- Sep 24, 2004
-
-
Moe Jette authored
Move pretty much all BGL-specific logic into that module and the associated plugin and make use of an opaque data object for maintaining the information.
-
- Aug 25, 2004
-
-
Moe Jette authored
code for current Linux clusters and develop all new logic for Blue Gene).
-
- Aug 17, 2004
-
-
Moe Jette authored
Map all nodes in cluster to a single front-end node. Don't repeat ping/register/kill/etc. RPCs to all pseudo nodes, just the front-end. Treat single message for some RPCs as representing all nodes in the cluster: register, ping response, epilog complete, etc.
-
- Aug 04, 2004
-
-
Moe Jette authored
-
- Jul 29, 2004
-
-
Moe Jette authored
-
- Jul 26, 2004
-
-
Moe Jette authored
-
- Jul 23, 2004
-
-
Moe Jette authored
scontrol options. For now only the NULL plugin is available, but this is required for ASC Purple.
-
- Jul 09, 2004
-
-
Moe Jette authored
-
- May 24, 2004
-
-
Moe Jette authored
-
- May 14, 2004
- Apr 23, 2004
-
-
Moe Jette authored
* Memory leak in slurm_cred.c, added EVP_MD_CTX_cleanup().
* Pthread stack size too small on AIX, resulting in stack corruption and ugly failure modes. Added slurm_attr_init to macros.h to explicitly set the stack size for all pthreads.
* /dev/urandom not present on AIX, use rand() as needed instead in constructing a credential. Used in "srun --join".
* getsockopt(Socket, Level, SO_ERROR, &err, OptionLength) sometimes returns an error code of -1. This causes an assert failure in slurmd/io.c:_update_error_state().
* Function aliasing is not working on AIX. It is being turned off via a variable in config.h and "#if" logic in macros.h and slurm_xlator.h.
* dlopen failing if plugins reference any functions not present in caller. This may be fixed with the LDFLAG "-Wl,-bgcbypass=1000" being added for the slurm commands (avoid garbage collection of unused functions).
* read() sometimes generates an EAGAIN error, which was not handled in some places.
* vsnprintf() for string NULL is printing "" instead of "(null)" as produced by snprintf(). More format printing was added to log.c to produce more consistent log messages.
* poll() takes a timeout of -1 for unlimited rather than any negative number. Modify logic that was always multiplying by 1000 to convert usec to msec.
* getopt_long keyword table was not NULL terminated, resulting in segfault with invalid command-line argument in most commands.
* xmalloc module assert failures were not generating a core file. Changed "fatal();abort();" to "error();abort();".
* Change msg timeout from 3 sec to 5 sec. Running everything on single AIX node was very slow.
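
The getopt_long item above can be illustrated with a minimal sketch. This is not SLURM's code: the option names, the `long_opts` table, and the `count_unknown_options` helper are hypothetical. It only shows the general rule that the `struct option` array must end with an all-zero entry, so that an unrecognized option makes getopt_long() return '?' instead of reading past the end of the table.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table (not taken from any SLURM command).
 * The final all-zero entry is the required terminator. */
static const struct option long_opts[] = {
	{"verbose", no_argument,       NULL, 'v'},
	{"nodes",   required_argument, NULL, 'N'},
	{NULL,      0,                 NULL, 0}
};

/* Hypothetical helper: scan argv and count options that
 * getopt_long() does not recognize (reported as '?'). */
static int count_unknown_options(int argc, char **argv)
{
	int c, unknown = 0;

	optind = 1;	/* start scanning at the first argument */
	opterr = 0;	/* suppress getopt's own error messages */
	while ((c = getopt_long(argc, argv, "vN:", long_opts, NULL)) != -1) {
		if (c == '?')
			unknown++;
	}
	return unknown;
}
```

With the terminator in place, an invalid argument such as `--bogus` is simply counted as unknown; without it, the search for a matching name could run off the end of the array.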
-
- Mar 20, 2004
-
-
Moe Jette authored
order to interrupt accept(), because this can fail if the authentication plugin is bad or there are other communications problems. Use interrupt instead.
-
- Mar 19, 2004
-
-
Moe Jette authored
and slurmctld (specify different configuration file).
-
- Mar 16, 2004