- Jun 08, 2016
-
-
Morris Jette authored
Valgrind's drd tool reported a race condition on a variable, so a mutex was added to wrap it. The chances of this causing a problem are nil, but this makes it zero.
-
Morris Jette authored
-
Morris Jette authored
Valgrind's drd tool reported a race condition on a variable, so a mutex was added to wrap it. The chances of this causing a problem are nil, but this makes it zero.
-
Danny Auble authored
-
Morris Jette authored
Valgrind's drd tool reported a race condition on a variable, so a mutex was added to wrap it. The chances of this causing a problem are nil, but this makes it zero.
-
Morris Jette authored
Valgrind's drd tool reported a race condition on a variable, so a mutex was added to wrap it. The chances of this causing a problem are nil, but this makes it zero.
-
Morris Jette authored
Valgrind's drd tool reported a race condition on a variable, so a mutex was added to wrap it. The chances of this causing a problem are nil, but this makes it zero.
-
Morris Jette authored
This modifies code added in Slurm version 16.05 so that it works properly with a change made in version 15.08 to the definition of SLURMSTEPD_MEMCHECK. See commit c4d0d306.
-
Morris Jette authored
Previous logic could hang the slurmstepd in an infinite loop.
-
Morris Jette authored
slurmctld could die if a job existed in completing state with an invalid/defunct partition name and "scontrol reconfigure" was run.
-
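The mutex fix described in the Jun 08 entries above can be pictured with a minimal sketch. The log does not name the variable or the lock, so `shared_flag`, `shared_flag_lock`, and both accessor functions are hypothetical names chosen for illustration, and plain POSIX mutex calls are assumed rather than any Slurm-specific wrapper.

```c
#include <pthread.h>

/* Hypothetical shared variable; the log does not say which one was involved. */
static int shared_flag = 0;
/* Hypothetical lock guarding every access to shared_flag. */
static pthread_mutex_t shared_flag_lock = PTHREAD_MUTEX_INITIALIZER;

/* Write the variable only while holding the lock. */
static void set_shared_flag(int value)
{
	pthread_mutex_lock(&shared_flag_lock);
	shared_flag = value;
	pthread_mutex_unlock(&shared_flag_lock);
}

/* Read the variable under the same lock so drd sees no unordered access. */
static int get_shared_flag(void)
{
	int value;

	pthread_mutex_lock(&shared_flag_lock);
	value = shared_flag;
	pthread_mutex_unlock(&shared_flag_lock);
	return value;
}
```

Both the reader and the writer must take the lock; protecting only one side would still be reported as a race.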
- Jun 07, 2016
-
-
Andy Riebs authored
-
Morris Jette authored
-
Morris Jette authored
Make sure that the /proc/#/stat file contents are NUL-terminated to avoid having sscanf() go off the end of a buffer. bug 2234
-
David Gloe authored
I've created this patch which we're using in-house to allow snc4+cache and reject other snc4 combinations.
-
Morris Jette authored
Fix for tracking job resource allocation when slurmctld is reconfigured while Cray Node Health Check (NHC) is running. Previous logic would fail to record the job's allocation, then perform the release operation upon NHC completion, resulting in underflow error messages. bug 2353
-
Tim Wickberg authored
-
Dominik Bartkiewicz authored
While here, mark options const, and add leading underscore to denote this as a static function (only called within hostlist.c). Also change strcmp to xstrcmp. Commit a6ffef22 changed this function and would alter the input hn, which led to subsequent calls to the function having wrong prefix lengths for that hostrange precluding it from matching correctly. Bug 2558.
-
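The /proc/#/stat fix for bug 2234 listed above amounts to terminating the buffer before it is handed to sscanf(). The following is a rough sketch of that idea, not the actual Slurm code: `read_stat_terminated`, `parse_stat_pid`, and `STAT_BUF_SIZE` are hypothetical names, and the 4k zero-initialized heap buffer is an assumption borrowed from the related Jun 06 entry in this log.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define STAT_BUF_SIZE 4096	/* assumed size, per the 4k mentioned in this log */

/*
 * Hypothetical helper: read at most STAT_BUF_SIZE - 1 bytes from fd into a
 * zero-initialized heap buffer and guarantee a terminating NUL.
 * Returns the buffer (caller frees it) or NULL on error.
 */
static char *read_stat_terminated(int fd)
{
	char *buf = calloc(1, STAT_BUF_SIZE);
	ssize_t n;

	if (!buf)
		return NULL;
	n = read(fd, buf, STAT_BUF_SIZE - 1);	/* leave room for the NUL */
	if (n < 0) {
		free(buf);
		return NULL;
	}
	buf[n] = '\0';	/* explicit terminator so sscanf() stops inside buf */
	return buf;
}

/* Hypothetical helper: parse the leading pid field. Returns 0 on success. */
static int parse_stat_pid(const char *buf, int *pid)
{
	return (sscanf(buf, "%d", pid) == 1) ? 0 : -1;
}
```

A single read() is used for brevity; a fuller version would loop until end of file or until the buffer is full.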
- Jun 06, 2016
-
-
Morris Jette authored
-
Robbert Eggermont authored
Preserving the original subject has its pros, both for backward compatibility and for spotting failed jobs among thousands of successful jobs. bug 1611
-
Robbert Eggermont authored
seff crashes when Data::Dumper is not installed (and that one is not required by the slurm-seff rpm). It's not used so it doesn't need to be loaded. bug 1611
-
Danny Auble authored
-
Danny Auble authored
and error files.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
The buffer to be used for reading the system /proc/*/stat files is moved from the stack to the heap (i.e. malloc'ed memory) and initialized to zero, and increased in size from 1k to 4k. I don't see how this could make any difference, but this anomaly was reported by valgrind. bug 2234
-
- Jun 03, 2016
-
-
Morris Jette authored
bug 2792
-
Morris Jette authored
bug 2793
-
Alejandro Sanchez authored
The documentation was obsolete: it specified that this option applied only to job allocations, not to steps. It now applies to both.
-
Morris Jette authored
-
Morris Jette authored
The #define of SLURMSTEPD_MEMCHECK in src/slurmd/common/slurmstepd_init.h must be changed to enable memcheck or valgrind. Also change a #if in src/slurmd/slurmd/req.c near where you find the "valgrind" references. Bug 2334 diagnostics
-
Tim Wickberg authored
If the QOS includes a time limit, skip checking the partition limit. The QOS limit is checked separately elsewhere.
-
Tim Wickberg authored
-
Tim Wickberg authored
'the the' is is a a mistake mistake.
-
- Jun 02, 2016
-
-
Tim Wickberg authored
Wrong order of operations results in the return code being 0/1.
-
Morris Jette authored
Fix for "scontrol -dd show job" with respect to displaying the specific CPUs allocated to a job on each node. Prior logic would only display the CPU information for the first node in the job allocation. Bug introduced in commit 0f826c0b due to a misplaced parenthesis.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Wrong order of operations results in the return code being 0/1.
-
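The "wrong order of operations" entries above do not show the affected code. One common way a C return code collapses to 0/1 is assigning inside a comparison without parentheses, since `!=` binds more tightly than `=`. The sketch below illustrates that pattern only as an assumption about the kind of bug involved; `sketch_op`, `SKETCH_ERROR`, `rc_buggy`, and `rc_fixed` are hypothetical names.

```c
#define SKETCH_ERROR 22	/* hypothetical nonzero error code */

/* Hypothetical call standing in for any function that returns an error code. */
static int sketch_op(void)
{
	return SKETCH_ERROR;
}

/*
 * Buggy form: the comparison is evaluated first, so rc receives the
 * result of (sketch_op() != 0), which is always 0 or 1.
 */
static int rc_buggy(void)
{
	int rc;

	if ((rc = sketch_op() != 0))
		return rc;
	return 0;
}

/* Fixed form: parenthesize the assignment so rc keeps the real code. */
static int rc_fixed(void)
{
	int rc;

	if ((rc = sketch_op()) != 0)
		return rc;
	return 0;
}
```

With these definitions, `rc_buggy()` yields 1 while `rc_fixed()` yields 22, so a caller of the buggy form loses the real error code.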