- Apr 13, 2021
-
-
Skyler Malinowski authored
UnkillableStepProgram exposes environment variables to the script. Those variables are now documented. Bug 11231
-
Tim McMullan authored
Bug 11352
-
Tim McMullan authored
Bug 11352
-
Tim McMullan authored
Bug 11352
-
Marcin Stolarek authored
dealing with --mem-per-cpu and --threads-per-core. The handling of --threads-per-core and --mem-per-cpu introduced by 49a7d7f9 is inconsistent with the memory values calculated from the job credential, which resulted in memory over-allocation. This needs to be revisited for a proper solution, since it relies on non-signed information and will require modification of the job credential. Bug 11148
-
- Apr 12, 2021
-
-
Nate Rini authored
Update example for correctness and style. Bug 7573
-
Brian Christiansen authored
-
Brian Christiansen authored
Previous 3 commits. Bug 10980
-
Michael Hinton authored
The last task reuses array_job_id. So if job_ptr doesn't change, that means we have already scheduled it and should never go to next_task. job_launch() can requeue the job if it fails. This puts the job in the completing and pending state, which could allow the last task to get scheduled *again* without the check added here. Rescheduling a completing job will destroy its node_bitmap and job_resrcs and cause the job to stay completing forever. Bug 10980
Co-authored-by: Brian Christiansen <brian@schedmd.com>
-
Nate Rini authored
In practice, there are some rare situations where the job can get corrupted and lose its job_resrcs object, preventing the controller from starting due to a segfault in _step_dealloc_lps(). Instead of allowing this segfault, simply emit an error and return. The xassert is left in place so that non-production builds will still catch this situation. Bug 10980, 7757, 9474, 6837
Co-authored-by: Marshall Garey <marshall@schedmd.com>
Co-authored-by: Michael Hinton <hinton@schedmd.com>
-
Michael Hinton authored
Bug 10980
-
Marcin Stolarek authored
Remove the misleading statement in MaxArraySize documentation. Bug 11317
-
- Apr 10, 2021
-
-
Brian Christiansen authored
-
Colby Ashley authored
Bug 10458
-
Colby Ashley authored
Bug 10458
-
Colby Ashley authored
Bug 10458
-
Tim McMullan authored
Bug 11331
-
Tim McMullan authored
Bug 11331
-
Tim McMullan authored
Bug 11331
-
Tim McMullan authored
Bug 11331
-
Tim McMullan authored
Bug 11333
-
Tim McMullan authored
Bug 11333
-
Tim McMullan authored
Bug 11333
-
- Apr 09, 2021
-
-
Tim Wickberg authored
Continuation of bde072c6. Explicitly mention addition of --exact, and deprecation of --whole. Bug 10914.
-
Marcin Stolarek authored
slurm_auth_init() may fail because of AuthAltTypes, not just AuthType, so generalize this error message to make it less confusing in such cases. Bug 11334.
-
Marcin Stolarek authored
Ensure an appropriate error message is printed. Only applies to slurmdbd (which lacks an equivalent to StateSaveLocation). Bug 11334.
-
- Apr 07, 2021
-
-
Scott Hilton authored
Bug 11251
-
Nate Rini authored
The plugins are only reloaded when the daemons restart, so configuration changes will likely be missed otherwise. Bug 10974
-
Nate Rini authored
Bug 10974
-
Nate Rini authored
Bug 10974
-
Nate Rini authored
Bug 10974
-
Nate Rini authored
Bug 10974
-
Nate Rini authored
Bug 10974
-
Nate Rini authored
Bug 10974
-
Nate Rini authored
Update documentation for the v2.x release of InfluxDB. Describe how the current configuration values differ for a v2.x database. Correct formatting in several parts for readability. Bug 10974
-
Nate Rini authored
Set the formatting to match the source code to avoid needing inline formatting macro instructions. Bug 10974
-
- Apr 06, 2021
-
-
Marshall Garey authored
Due to other changes for bug 9193, finding job_ptr here is no longer needed, so remove it. Because job_ptr isn't used here, we can also move where we lock the job_write_lock since we don't need it in the else block anymore, but we still need it in the if block. Bug 9193
-
Marshall Garey authored
The job burst buffer state was already set to BB_STATE_STAGING_IN before this function, so setting it here was redundant. Bug 9193
-
Marshall Garey authored
The burst buffer record which is state saved was not created for a job until the middle of the stage-in thread. If slurmctld was shut down during burst buffer stage-in for a job, the thread would be killed before creating the burst buffer record, and therefore the burst buffer was not state saved. Bug 9193
-
Marshall Garey authored
Previously, even though burst buffers were state saved, the state of the burst buffer was always reset to PENDING. This commit gets the state of the burst buffer when slurmctld was shut down and then uses that information to do whatever is needed: certain threads are restarted and the burst buffer state may be changed, or the burst buffer may just be cleaned up. Previously, a burst buffer that was state saved was ignored if the job was completed. This commit removes that logic, since the burst buffer may not have completed stage-out and teardown. Bug 9193
-