Commit 82bbae66 authored by Ben Roberts, committed by Danny Auble

Docs - Clarify different OOM behavior for cgroups vs polling

Bug 11318
parent 68ee435c
@@ -116,7 +116,13 @@ which case the job's RAM limit will be set to its swap space limit if
\fBConstrainSwapSpace\fR is set to "yes".
Also see \fBAllowedSwapSpace\fR, \fBAllowedRAMSpace\fR and
\fBConstrainSwapSpace\fR.
-NOTE: When enabled, ConstrainRAMSpace can lead to a noticeable decline in
+\fBNOTE\fR: When using \fBConstrainRAMSpace\fR, if a process tries to consume
+more memory than is available, the step that process is running in will be
+killed. This differs from the behavior when using \fBOverMemoryKill\fR,
+where just the offending process will be killed.
+\fBNOTE\fR: When enabled, ConstrainRAMSpace can lead to a noticeable decline in
per-node job throughput. Sites with high-throughput requirements should
carefully weigh the tradeoff between per-node throughput, versus potential
problems that can arise from unconstrained memory usage on the node. See
@@ -1207,8 +1207,11 @@ allocation may affect other processes and/or machine health.
task/cgroup as a TaskPlugin and making use of ConstrainRAMSpace=yes in the
cgroup.conf instead of using this JobAcctGather mechanism for memory
enforcement. With OverMemoryKill, memory limit is applied against each process
-individually and is not applied to the step as a whole as it is with
-ConstrainRAMSpace=yes. Using JobAcctGather is polling based and there is a
+individually and is not applied to the step as a whole. This means that when
+jobs have a process that consumes too much memory, the process will be killed
+but the step will continue to run. When using cgroups with
+ConstrainRAMSpace=yes, a process that consumes too much memory will result in
+the job step being killed. Using JobAcctGather is polling based and there is a
delay before a job is killed, which could lead to system Out of Memory events.
.RE
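
For orientation, the two enforcement mechanisms contrasted in this change could be configured roughly as follows. This is a minimal sketch and not part of the commit: it assumes a Linux site, the file split and the choice of jobacct_gather/linux are illustrative, and any real deployment will need other settings not shown here.

```ini
# --- Option A: cgroup-based enforcement ---
# slurm.conf
TaskPlugin=task/cgroup

# cgroup.conf
# A process exceeding the limit causes the whole job step to be killed.
ConstrainRAMSpace=yes

# --- Option B: polling-based enforcement ---
# slurm.conf
# The limit is checked per process at each polling interval; only the
# offending process is killed and the step keeps running.
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherParams=OverMemoryKill
```

The two options are shown as alternatives. As the documentation text above notes, the polling approach acts only after a delay, so a fast-growing process may trigger a system Out of Memory event before it is killed.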