- Aug 24, 2017
-
-
Alejandro Sanchez authored
The if/else statements immediately above already test whether curl_handle != NULL or rc != SLURM_SUCCESS and jump to the cleanup label when needed. The removed test could therefore never evaluate to true, and Coverity correctly warned about it. Regression introduced in commit 5f5e6472 (code cleanup).
-
Alejandro Sanchez authored
-
Alejandro Sanchez authored
Calling bit_unfmt() with a bitmap whose bit_size() is zero leads to a later call to bit_nclear() with start=0 and stop=-1, which triggers the ABRT. This scenario happened when cgroup.conf has ConstrainDevices=yes and task_cgroup_devices_create() tries to collect the GRES devices but gres_cpu_cnt=0, thus creating p->cpus_bitmap = bit_alloc(gres_cpu_cnt); of zero size, which is passed as an argument to bit_unfmt(). gres_cpu_cnt is 0 because gres.conf is defined like this:

    Name=gpu Type=tesla File=/tmp/gres/tesla0 CPUs=0,1
    Name=gpu Type=tesla File=/tmp/gres/tesla1 CPUs=0,1
    Name=gpu Type=kepler File=/tmp/gres/kepler0 CPUs=2,3
    Name=gpu Type=kepler File=/tmp/gres/kepler1 CPUs=2,3

but there is neither a GresTypes nor a GRES option in slurm.conf / the node config definition. Bug 3974
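The failure mode described above can be illustrated with a toy bitmap. This is only a sketch: toy_bitmap_t, toy_alloc(), toy_nclear() and toy_clear_all() are hypothetical names, not Slurm's bitstring API, and the guard shown is one possible way to avoid the start=0, stop=-1 range rather than the change actually made in the commit.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy bitmap, NOT Slurm's bitstr_t: one byte per bit for simplicity. */
typedef struct {
	int64_t nbits;
	unsigned char *bits;
} toy_bitmap_t;

static toy_bitmap_t *toy_alloc(int64_t nbits)
{
	toy_bitmap_t *b = malloc(sizeof(*b));
	if (!b)
		return NULL;
	b->nbits = nbits;
	/* Allocate at least one byte so a zero-size bitmap is still valid. */
	b->bits = calloc(nbits > 0 ? (size_t) nbits : 1, 1);
	if (!b->bits) {
		free(b);
		return NULL;
	}
	return b;
}

static void toy_free(toy_bitmap_t *b)
{
	if (b) {
		free(b->bits);
		free(b);
	}
}

/* Clear bits start..stop (inclusive); reject ranges outside the bitmap. */
static bool toy_nclear(toy_bitmap_t *b, int64_t start, int64_t stop)
{
	if (start < 0 || stop < start || stop >= b->nbits)
		return false;
	memset(b->bits + start, 0, (size_t) (stop - start + 1));
	return true;
}

/*
 * Hypothetical guard: an empty bitmap has nothing to clear, so return
 * early instead of building the invalid range 0..-1.
 */
static bool toy_clear_all(toy_bitmap_t *b)
{
	if (b->nbits == 0)
		return true;
	return toy_nclear(b, 0, b->nbits - 1);
}
```

With a zero-size bitmap, calling toy_nclear(b, 0, b->nbits - 1) directly is rejected as an invalid range, while toy_clear_all(b) returns without touching any range.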
-
Alejandro Sanchez authored
Bug 3217
-
Danny Auble authored
-
- Aug 23, 2017
-
-
Alejandro Sanchez authored
-
Alejandro Sanchez authored
-
Alejandro Sanchez authored
Running slurmctld under valgrind while operating with jobcomp/elasticsearch reported the following bytes as definitely lost:

    ==27403== 658 bytes in 1 blocks are definitely lost in loss record 301 of 342
    ==27403==    at 0x4C2FD4F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==27403==    by 0x2281B3: slurm_xrealloc (xmalloc.c:137)
    ==27403==    by 0x22856A: makespace (xstring.c:114)
    ==27403==    by 0x2285D0: _xstrcat (xstring.c:132)
    ==27403==    by 0x228CE0: _xstrfmtcat (xstring.c:291)
    ==27403==    by 0x83C5BCD: ???
    ==27403==    by 0x30A913: g_slurm_jobcomp_write (slurm_jobcomp.c:172)
    ==27403==    by 0x18D8FC: job_completion_logger (job_mgr.c:13652)

It turns out that the buffer generated in slurm_jobcomp_log_record was xstrdup'ed into the corresponding job_node->serialized_job, but the original buffer was never freed afterwards. The fix changes the transfer so that, instead of xstrdup'ing the char *, we just assign the pointer and set the buffer to NULL. job_node->serialized_job was already xfree'd properly later, once the job was indexed. Discovered while working on Bug 4065.
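The ownership transfer described above can be sketched in plain C. This is a minimal illustration that assumes standard malloc()/free() in place of Slurm's xmalloc family; struct job_node here is a hypothetical one-field stand-in, and job_node_take_buffer() and job_node_release() are invented helper names, not functions from the plugin.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the plugin's per-job record. */
struct job_node {
	char *serialized_job;
};

/*
 * Take ownership of *buffer instead of duplicating it: the node now owns
 * the allocation, and the caller's pointer is cleared so the same memory
 * cannot be freed twice or leaked by a forgotten free.
 */
static void job_node_take_buffer(struct job_node *node, char **buffer)
{
	node->serialized_job = *buffer;
	*buffer = NULL;
}

/* Release the record's string, e.g. once the job has been indexed. */
static void job_node_release(struct job_node *node)
{
	free(node->serialized_job);
	node->serialized_job = NULL;
}
```

Compared with duplicating the string, this leaves exactly one owner for the allocation, so the single free at indexing time is enough.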
-
Tim Wickberg authored
This should only happen due to ESLURM_RESULT_TOO_LARGE, which leads to no list being packed. Follow on to 390da8cf / 8cf1835c. Bug 3624.
-
Danny Auble authored
launch_g_step_wait() function.
-
- Aug 22, 2017
-
-
Alejandro Sanchez authored
Otherwise the resulting URL may be invalid. Update documentation while here as well. Bug 4065.
-
Tim Shaw authored
Otherwise a race between threads in _check_node_status leads to a crash. Bug 4093.
-
Tim Wickberg authored
Modification of commit c7e6d864. Bug 4095.
-
Danny Auble authored
-
Morris Jette authored
-
Philip Kovacs authored
bug 4095
-
Morris Jette authored
-
Philip Kovacs authored
bug 4095
-
Morris Jette authored
-
Philip Kovacs authored
Bug 4094
-
Morris Jette authored
Coverity CID 166001
-
Morris Jette authored
Coverity CID 44725, 44726, 44747, 44728
-
Morris Jette authored
Coverity CID 44968
-
Morris Jette authored
Coverity CID 44810
-
Morris Jette authored
Coverity CID 53126
-
Morris Jette authored
Coverity CID 53127
-
Morris Jette authored
Coverity CID 44761
-
Morris Jette authored
Coverity CID 44696
-
Morris Jette authored
Coverity CID 44729
-
Morris Jette authored
Coverity CID 44700
-
- Aug 21, 2017
-
-
Alejandro Sanchez authored
The exit status value for these two fields was incorrectly saved as-is. The patch uses the appropriate macros to properly decode the low-order 8 bits of the exit status and the signal number (if any). bug 3942
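A minimal sketch of this kind of decoding, assuming the macros in question are the POSIX ones from <sys/wait.h> (the commit message does not name them). struct decoded_status and decode_status() are hypothetical names used only for illustration.

```c
#include <sys/wait.h>

/* Hypothetical holder for the two decoded values. */
struct decoded_status {
	int exit_code;	/* low-order 8 bits passed to exit(), 0 if signaled */
	int signal;	/* terminating signal number, 0 if none */
};

/* Decode a raw wait()-style status word instead of storing it as-is. */
static struct decoded_status decode_status(int status)
{
	struct decoded_status d = { 0, 0 };

	if (WIFEXITED(status))
		d.exit_code = WEXITSTATUS(status);
	else if (WIFSIGNALED(status))
		d.signal = WTERMSIG(status);
	return d;
}
```

The raw status packs the exit code and the signal number into one integer, so storing it unmodified yields values that match neither what the process passed to exit() nor the signal that ended it.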
-
Isaac Hartung authored
Print numbers using exponential format if required to fit in allocated field width. The sacctmgr and sshare commands are impacted. bug 1749
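The idea of falling back to exponential notation can be sketched with snprintf(). This is an illustration only, not the code used by sacctmgr or sshare; format_to_width() is a hypothetical helper, and the precision calculation assumes a two-digit exponent, which holds for any unsigned long long value.

```c
#include <stdio.h>

/*
 * Write value into out using plain decimal if it fits in width characters,
 * otherwise using exponential notation sized to the field.
 * Returns the number of characters snprintf() produced.
 */
static int format_to_width(char *out, size_t out_size,
			   unsigned long long value, int width)
{
	int len = snprintf(out, out_size, "%llu", value);
	int precision;

	if (len <= width)
		return len;

	/* "d.ddd...e+XX": one digit, '.', precision digits, 4-char exponent */
	precision = width - 6;
	if (precision < 0)
		precision = 0;
	return snprintf(out, out_size, "%.*e", precision, (double) value);
}
```

For example, with a field width of 10, the value 123456789012 does not fit as 12 decimal digits and is written as 1.2346e+11 instead, while 42 is left as plain decimal. Fields narrower than 5 characters cannot hold even the shortest exponential form, so the output may still overflow there.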
-
Brian Christiansen authored
was removed in 2705f9c5. Caused sview to crash when viewing the debug_flags.
-
Morris Jette authored
-
Morris Jette authored
bug 4056
-
Alejandro Sanchez authored
Given a configuration whose TopologyParam includes the Dragonfly option, if a job requested a --switches count, the count timeout specified by either the job request or the max_switch_wait SchedulerParameters option was not respected. This was because the leaf_switch_count variable was not incremented in _eval_nodes_dfly() when needed, as is done in _eval_nodes_topo(), the latter being an execution path that already waits correctly for the switch count timeout. Bug 4056
-
- Aug 19, 2017
-
-
Morris Jette authored
For commit 35b505cc, bug 3982
-
Morris Jette authored
Coverity CID 44808
-
Morris Jette authored
Coverity CID 45157
-
- Aug 18, 2017
-
-
Brian Christiansen authored
-