- Jan 29, 2016
-
-
Morris Jette authored
When the slurmctld is in background mode, it will issue double free calls on the incoming message buffers, likely leading to an abort.
-
Morris Jette authored
-
- Jan 28, 2016
-
-
Morris Jette authored
Do not automatically relocate an advanced reservation for individual cores that spans multiple nodes when nodes in that reservation go down (e.g. a 1 core reservation on node "tux1" will be moved if node "tux1" goes down, but a reservation containing 2 cores on node "tux1" and 3 cores on "tux2" will not be moved if node "tux1" goes down). Advanced reservations for whole nodes will be moved by default for down nodes. bug 2326
-
Tim Wickberg authored
avoid attempting to execve() a directory with a name that happens to match that of the desired command. bug 2392.
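
The check described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper name `is_executable_file` is hypothetical. It relies only on POSIX `stat()` and `access()`. Directories normally carry the execute (search) bit, so an `access(path, X_OK)` test alone would accept them; the `S_ISREG` test is what rules them out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not Slurm's actual function): return true only if
 * `path` names a regular file that the caller is allowed to execute.
 */
static bool is_executable_file(const char *path)
{
	struct stat st;

	assert(path != NULL);

	if (stat(path, &st) != 0)
		return false;		/* missing or inaccessible path */
	if (!S_ISREG(st.st_mode))
		return false;		/* directory, device, socket, ... */

	return access(path, X_OK) == 0;
}
```

A command search loop could call such a helper on each candidate path and move on to the next directory in the search path when it returns false, rather than handing a directory to execve().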
-
Morris Jette authored
Allow an existing reservation with running jobs to be modified without Flags=IGNORE_JOBS. bug 2389
-
Morris Jette authored
burst_buffer/cray - Increase size of intermediate variable used to store buffer byte size read from DW instance from 32 bits to 64 bits to avoid overflow and reporting invalid buffer sizes. bug 2378
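
To illustrate why a 32-bit intermediate is a problem here, the sketch below passes a byte count through a 32-bit and a 64-bit temporary. The function names are hypothetical and the use of unsigned types is an assumption; the plugin's actual variables are not shown in this log.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(uint32_t) == 4, "sketch assumes a 4-byte uint32_t");

/* Hypothetical: byte count stored in a 32-bit intermediate (the old width). */
static uint64_t bytes_via_u32(uint64_t reported)
{
	uint32_t tmp = (uint32_t) reported;	/* keeps only the low 32 bits */

	return tmp;
}

/* Hypothetical: byte count stored in a 64-bit intermediate (the new width). */
static uint64_t bytes_via_u64(uint64_t reported)
{
	uint64_t tmp = reported;		/* full value preserved */

	return tmp;
}
```

With these helpers, a 5 GiB buffer (5368709120 bytes) comes back as 1073741824 bytes (1 GiB) through the 32-bit path and unchanged through the 64-bit path.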
-
Danny Auble authored
-
- Jan 27, 2016
-
-
Danny Auble authored
-
Danny Auble authored
gres types without a File.
-
Danny Auble authored
-
Danny Auble authored
to debug3 when trying to find the correct association. a continuation to commit 87d9370f
-
Alejandro Sanchez authored
-
- Jan 26, 2016
-
-
Morris Jette authored
Add slurmd "-b" option to report node rebooted at daemon start time. Used for testing purposes.
-
Tim Wickberg authored
reduce reliance on fixed-sized buffers for output, helps reduce warnings from coverity et al. split up key/value pairs in preparation for JSON output work. xstrfmtcat exists and is cleaner than snprintf followed by xstrcat. use a consistent line ending rather than repeat conditional block. output format should be unchanged, and has been tested to match on common cases and passes all relevant regression tests.
-
- Jan 25, 2016
-
-
Morris Jette authored
-
Morris Jette authored
Previously, under some conditions, that boot completion was ignored and the job was kept pending.
-
Sergey Meirovich authored
-
- Jan 22, 2016
-
-
Morris Jette authored
-
Danny Auble authored
-
Morris Jette authored
-
- Jan 21, 2016
-
-
Danny Auble authored
Bug 2364
-
Danny Auble authored
Commit fa331e30 fixes this. The logic was bad to begin with... uint32_t new_cpus = detail_ptr->num_tasks / detail_ptr->cpus_per_task; The / should have been * this whole time. This was the reason we found this in the first place.
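
A small worked example of the difference, using a stand-in struct. Only the field names `num_tasks` and `cpus_per_task` come from the message above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two job detail fields quoted above. */
struct job_details_sketch {
	uint32_t num_tasks;
	uint32_t cpus_per_task;
};

/* The expression as quoted in the commit message (wrong operator). */
static uint32_t new_cpus_buggy(const struct job_details_sketch *d)
{
	assert(d->cpus_per_task != 0);	/* avoid division by zero */
	return d->num_tasks / d->cpus_per_task;
}

/* The corrected expression: total CPUs = tasks * CPUs per task. */
static uint32_t new_cpus_fixed(const struct job_details_sketch *d)
{
	return d->num_tasks * d->cpus_per_task;
}
```

For a job with 4 tasks and 2 CPUs per task, the quoted expression yields 2 CPUs while the corrected one yields 8.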
-
Morris Jette authored
If scancel is operating on a large number of jobs and RPC responses from the slurmctld daemon are slow, then introduce a delay in sending the cancel job requests from scancel in order to reduce load on slurmctld. bug 2256
-
Morris Jette authored
-
Danny Auble authored
-
Morris Jette authored
Backfill scheduling properly synchronized with Cray Node Health Check. Prior logic could result in highest priority job getting improperly postponed. bug 2350
-
Danny Auble authored
-
- Jan 20, 2016
-
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Properly account for memory, CPUs and GRES when slurmctld is reconfigured while there is a suspended job. Previous logic would add the CPUs, but not memory or GPUs. This would result in underflow/overflow errors in select cons_res plugin. bug 2353
-
- Jan 18, 2016
-
-
jette authored
Add reservation flag of "purge_comp" which will purge an advanced reservation once it has no more active (pending, suspended or running) jobs. bug 2355
-
- Jan 17, 2016
-
-
jette authored
Fix backfill scheduling bug which could postpone the scheduling of jobs due to avoidance of nodes in COMPLETING state. bug 2350
-
- Jan 15, 2016
-
-
Brian Christiansen authored
Bug 2255
-
Morris Jette authored
-
Brian Christiansen authored
Bug 2343
-
Morris Jette authored
Fix for configuration of "AuthType=munge" and "AuthInfo=socket=..." with alternate munge socket path. bug 2348
-
Morris Jette authored
-
Brian Christiansen authored
Bug 2343
-
- Jan 14, 2016
-
-
Morris Jette authored
Fix for configuration of "AuthType=munge" and "AuthInfo=socket=..." with alternate munge socket path. bug 2348
-
Janne Blomqvist authored
The initgroups()/getgrouplist() caching in slurmd is changed to not require enumeration; instead, individual entries are cached when first needed. This cache is always enabled, thus the CacheGroups configuration setting has been removed. The time that each cache entry is considered valid is determined by the GroupUpdateTime configuration parameter. scontrol reconfig will purge the cache. The default value for the GroupUpdateForce configuration parameter has changed, as systems where /etc/group contains all the groups, rather than an external system such as NIS or LDAP, are nowadays probably the exception rather than the rule. For slurmctld, the group cache still uses enumeration, but this is needed only to take care of special situations like multiple groups with the same GID. With enumeration disabled, group caching still works otherwise. validate_groups() does a little more optional work in order to handle the case where the user primary group is in the AllowGroups list, but getgrnam_r() does not return that user as a group member. bug 1629
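
The lazy, time-limited caching described above can be sketched roughly as below. This is not slurmd's implementation: every name here is hypothetical, the table is a tiny fixed-size array, and the cached group list itself is omitted so that only the fill-on-first-use, expiry and purge behaviour is shown. The `ttl` field plays the role of GroupUpdateTime, and `cache_purge()` models what scontrol reconfig is said to trigger.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

#define CACHE_SLOTS 8
#define NAME_LEN 32

/* Hypothetical cache entry; the group list payload is left out. */
struct cache_entry {
	char user[NAME_LEN];
	time_t stored_at;
	bool used;
};

struct group_cache {
	struct cache_entry slot[CACHE_SLOTS];
	time_t ttl;		/* seconds an entry stays valid */
};

static void cache_init(struct group_cache *c, time_t ttl)
{
	memset(c, 0, sizeof(*c));
	c->ttl = ttl;
}

/* Drop every entry but keep the configured lifetime. */
static void cache_purge(struct group_cache *c)
{
	memset(c->slot, 0, sizeof(c->slot));
}

/* True only if `user` is cached and the entry has not yet expired. */
static bool cache_lookup(const struct group_cache *c, const char *user,
			 time_t now)
{
	for (int i = 0; i < CACHE_SLOTS; i++) {
		const struct cache_entry *e = &c->slot[i];

		if (e->used && strcmp(e->user, user) == 0)
			return (now - e->stored_at) < c->ttl;
	}
	return false;
}

/* Insert or refresh one entry on first use; no enumeration is needed. */
static void cache_store(struct group_cache *c, const char *user, time_t now)
{
	int target = 0;		/* table full: overwrite slot 0 */

	assert(strlen(user) < NAME_LEN);
	for (int i = 0; i < CACHE_SLOTS; i++) {
		const struct cache_entry *e = &c->slot[i];

		if (!e->used || strcmp(e->user, user) == 0) {
			target = i;
			break;
		}
	}
	strcpy(c->slot[target].user, user);
	c->slot[target].stored_at = now;
	c->slot[target].used = true;
}
```

A caller would try `cache_lookup()` first and, on a miss, perform the real group lookup (for example with getgrouplist()) before calling `cache_store()`. The current time is passed in explicitly so that the expiry logic can be exercised without waiting.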
-