- Aug 23, 2017
-
-
Alejandro Sanchez authored
Running slurmctld under valgrind while operating with jobcomp/elasticsearch reported the following bytes definitely lost:

==27403== 658 bytes in 1 blocks are definitely lost in loss record 301 of 342
==27403==    at 0x4C2FD4F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27403==    by 0x2281B3: slurm_xrealloc (xmalloc.c:137)
==27403==    by 0x22856A: makespace (xstring.c:114)
==27403==    by 0x2285D0: _xstrcat (xstring.c:132)
==27403==    by 0x228CE0: _xstrfmtcat (xstring.c:291)
==27403==    by 0x83C5BCD: ???
==27403==    by 0x30A913: g_slurm_jobcomp_write (slurm_jobcomp.c:172)
==27403==    by 0x18D8FC: job_completion_logger (job_mgr.c:13652)

It turns out the generated buffer in slurm_jobcomp_log_record was xstrdup'ed to the corresponding job_node->serialized_job, but the originally generated buffer wasn't freed afterwards. The fix consists of changing the transfer so that instead of xstrdup'ing the char * we just assign the pointer and NULL the buffer. The job_node->serialized_job was already xfree'd properly later when the job was indexed. Discovered while working on Bug 4065.
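The difference between duplicating the buffer and handing over the pointer can be sketched in plain C. This is a minimal illustration, not the plugin's code: it uses malloc/free instead of Slurm's xmalloc family, and the struct layout and the function names store_by_copy and store_by_move are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the plugin's per-job record. */
struct job_node {
    char *serialized_job;
};

/* Copying pattern: the node gets its own duplicate, so the caller
 * still owns `buffer` and leaks it unless it frees it afterwards. */
void store_by_copy(struct job_node *node, const char *buffer)
{
    assert(node != NULL && buffer != NULL);
    size_t len = strlen(buffer) + 1;
    node->serialized_job = malloc(len);
    if (node->serialized_job != NULL)
        memcpy(node->serialized_job, buffer, len);
}

/* Ownership transfer: the node takes the pointer itself and the
 * caller's pointer is set to NULL, so only one free is ever needed. */
void store_by_move(struct job_node *node, char **buffer)
{
    assert(node != NULL && buffer != NULL);
    node->serialized_job = *buffer;
    *buffer = NULL;
}
```

With store_by_move the caller has nothing left to free, and the single release happens wherever node->serialized_job is freed later.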
-
- Aug 22, 2017
-
-
Alejandro Sanchez authored
Otherwise the resulting URL may be invalid. Update documentation while here as well. Bug 4065.
-
Tim Shaw authored
Otherwise a race between threads in _check_node_status leads to a crash. Bug 4093.
-
Philip Kovacs authored
Bug 4094
-
- Aug 21, 2017
-
-
Alejandro Sanchez authored
Given a configuration where TopologyParam includes the Dragonfly option, if a job requested a --switches count, the timeout specified by either the job request or the max_switch_wait SchedulerParameters option was not respected. This was because the leaf_switch_count variable was not incremented in _eval_nodes_dfly() when needed, as is done in _eval_nodes_topo(), the latter being an execution path that already waits correctly for the switch count timeout. Bug 4056
-
- Aug 17, 2017
-
-
Morris Jette authored
Coverity CID 44649 Bug 4085
-
- Aug 16, 2017
-
-
Danny Auble authored
instead of local. Bug 3546
-
- Aug 14, 2017
-
-
Morris Jette authored
-
Danny Auble authored
This reverts commit 00a691b9.
-
Morris Jette authored
-
- Aug 11, 2017
-
-
Danny Auble authored
This will allow Dell's custom syscfg to work correctly. NOTE: Dell calls flat memory just memory. Bug 4034
-
Danny Auble authored
No code change, just moving existing code into a switch ready to handle multiple options. Bug 4034
-
Danny Auble authored
Add SystemType to knl_generic.conf for knl_generic in preparation for making KNL work on a Dell system. This is used to distinguish differences between vendors such as 'Dell'. Bug 4034
-
- Aug 10, 2017
-
-
Danny Auble authored
-
- Aug 07, 2017
-
-
Justin Lecher authored
Starting from glibc-2.25 the macros major and minor are only available from sys/sysmacros.h. This patch uses an autoconf macro to detect the location and includes the header accordingly. Bug 3982.
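A conditional include of this kind might look like the sketch below. It assumes the autoconf macro in question is the stock AC_HEADER_MAJOR, which defines MAJOR_IN_MKDEV or MAJOR_IN_SYSMACROS in config.h; the commit message does not name the macro, and split_device_number is a hypothetical helper added only to exercise major() and minor().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Normally config.h, generated by configure, provides one of these
 * macros. MAJOR_IN_SYSMACROS is defined by hand here (assumption:
 * Linux with glibc) so the sketch compiles on its own. */
#ifndef MAJOR_IN_MKDEV
#ifndef MAJOR_IN_SYSMACROS
#define MAJOR_IN_SYSMACROS 1
#endif
#endif

#if defined(MAJOR_IN_MKDEV)
#include <sys/mkdev.h>
#elif defined(MAJOR_IN_SYSMACROS)
#include <sys/sysmacros.h>
#endif

/* Hypothetical helper: split a device number into its two parts. */
void split_device_number(dev_t dev, unsigned int *maj, unsigned int *min)
{
    assert(maj != NULL && min != NULL);
    *maj = major(dev);
    *min = minor(dev);
}
```

In a real build the hand-written define would be dropped and config.h included first, so that configure decides which header is pulled in.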
-
Danny Auble authored
-
Dominik Bartkiewicz authored
Bug 4019
-
- Aug 04, 2017
-
-
Danny Auble authored
-
Marshall Garey authored
Fix mysql plugin to correctly return parent limits for all children. Bug 4050
-
Danny Auble authored
the tree. Bug 4050
-
- Aug 01, 2017
-
-
Tim Shaw authored
Bug 3999
-
- Jul 28, 2017
-
-
Danny Auble authored
to have 'socket=' in AuthInfo to work. This is to make it so people don't have to update their slurmdbd.conf's when upgrading (and to match documentation). Continuation of last commit Bug 4009
-
- Jul 26, 2017
-
-
Dominik Bartkiewicz authored
Fix regression in commit e5c05549 that would put the stepd pid into the memory cgroup instead of the task's pid. Before this patch, the result of getpid() was put into the cgroup. Before e5c05549 this was done in the child of the fork, which yields the task's pid, but moving it to run in the parent broke this logic. This patch adds pid to the input parameters of task_g_pre_launch_priv so that the correct pid can be used.
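The pid mix-up is easy to reproduce outside Slurm. In the sketch below, which uses only fork(), getpid() and waitpid(), the parent plays the stepd role and the child stands in for the task. The names launch_record, pre_launch_priv and launch_one_task are hypothetical and do not reflect the real task_g_pre_launch_priv signature.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record of which pid was handed to the cgroup logic. */
struct launch_record {
    pid_t pid_added;
};

/* Hypothetical hook: takes the pid explicitly instead of calling
 * getpid(), so it is correct no matter which process runs it. */
void pre_launch_priv(struct launch_record *rec, pid_t task_pid)
{
    assert(rec != NULL);
    rec->pid_added = task_pid;
}

/* Fork one child and run the hook in the parent. Returns 0 on success. */
int launch_one_task(struct launch_record *rec)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(0);               /* child: stands in for the task */

    /* Parent: getpid() here would return the parent's own pid, which
     * is the regression described above. fork()'s return value is the
     * task's pid, so that is what gets passed along. */
    pre_launch_priv(rec, child);

    int status = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return 0;
}
```

Passing the pid as a parameter keeps the hook independent of which side of the fork it is called from.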
-
- Jul 19, 2017
-
-
Morris Jette authored
Fix for possible slurmctld abort with use of salloc/sbatch/srun --gres-flags=enforce-binding option. bug 4008
-
- Jul 07, 2017
-
-
Danny Auble authored
will have a time displayed when truncating time. Bug 3940.
-
Alejandro Sanchez authored
Otherwise we can end up printing Start times greater than End times, leading to confusion when reading sacct output. 0 is displayed as Unknown. Cosmetic change. Bug 3940.
-
- Jun 30, 2017
-
-
Alejandro Sanchez authored
burst_buffer logic modified to support sizes in both SI and IEC size units (e.g. M/MiB for powers of 1024, MB for powers of 1000). bug 3922
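The suffix convention in the example (M and MiB as powers of 1024, MB as a power of 1000) can be sketched as a small lookup. This is an assumption-laden illustration, not the burst_buffer code: unit_multiplier is a hypothetical function, and the accepted prefixes (K, M, G, T) and the case-sensitive matching are choices made for the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map a size suffix to its byte multiplier.
 * "M" and "MiB" use base 1024, "MB" uses base 1000.
 * Returns 1 for an empty suffix and 0 for an unrecognized one. */
uint64_t unit_multiplier(const char *suffix)
{
    static const char prefixes[] = "KMGT";

    if (suffix == NULL || suffix[0] == '\0')
        return 1;

    const char *p = strchr(prefixes, suffix[0]);
    if (p == NULL)
        return 0;

    /* K -> 1, M -> 2, G -> 3, T -> 4 */
    int power = (int)(p - prefixes) + 1;
    assert(power >= 1 && power <= 4);

    uint64_t base;
    if (suffix[1] == '\0' || strcmp(suffix + 1, "iB") == 0)
        base = 1024;            /* bare prefix or IEC suffix */
    else if (strcmp(suffix + 1, "B") == 0)
        base = 1000;            /* SI suffix */
    else
        return 0;

    uint64_t mult = 1;
    for (int i = 0; i < power; i++)
        mult *= base;
    return mult;
}
```

A caller would parse the numeric part of a size string separately and multiply it by the value returned here, treating 0 as a parse error.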
-
- Jun 13, 2017
-
-
Tim Wickberg authored
Changes the alpsc_configure_nic() call to set the exclusive flag, and 100 for both the cpu and memory scaling values. Should only be used with exclusive jobs without concurrent steps running on a node, otherwise oversubscription of the GNI resources can occur leading to performance issues. Bug 3713.
-
- Jun 12, 2017
-
-
Morris Jette authored
An array was only being partially cleared due to bad logic. Bug 3876
-
Tim Wickberg authored
Bug 3874.
-
- Jun 09, 2017
-
-
Morris Jette authored
-
- Jun 08, 2017
-
-
Dominik Bartkiewicz authored
Improve selection of jobs to preempt when there are multiple partitions with jobs subject to preemption. bug 3824
-
- Jun 02, 2017
-
-
Dominik Bartkiewicz authored
list_for_each)
-
- May 31, 2017
-
-
Tim Shaw authored
Bug 3840.
-
- May 30, 2017
-
-
Tim Shaw authored
node_features/knl_cray plugin: Don't clear configured GRES from non-KNL node. bug 3768
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- May 26, 2017
-
-
Dominik Bartkiewicz authored
Initial fix for handling floating partitions that use qos grp limits. Bug 3776
-