  1. Oct 22, 2012
  2. Sep 13, 2012
  3. Aug 14, 2012
  4. Jul 31, 2012
      Use mount and umount syscalls when handling cgroup namespaces. · 485c80bc
      Janne Blomqvist authored
      Using the syscalls directly rather than calling bin/(u)mount via
      system() avoids a few fork + exec calls, and provides better error
      handling if something goes wrong.
      
      Users of this functionality are also updated to use slurm_strerror in
      order to provide a more informative error message.
      
      The mount and umount syscalls are Linux-specific, but so are cgroups
      so no portability is lost.
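      
      A minimal sketch of the approach described above, assuming a cgroup v1
      hierarchy. The function names cgroup_ns_mount() and cgroup_ns_umount()
      are hypothetical, and plain strerror(3) stands in for slurm_strerror;
      this is not the code from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Hypothetical helper: mount a cgroup (v1) hierarchy for the given
 * comma-separated subsystems at `target` using mount(2) directly.
 * Returns 0 on success, or the errno value describing the failure. */
int cgroup_ns_mount(const char *target, const char *subsystems)
{
    assert(target != NULL && subsystems != NULL);

    unsigned long flags = MS_NOSUID | MS_NOEXEC | MS_NODEV;
    if (mount("cgroup", target, "cgroup", flags, subsystems) != 0) {
        int err = errno; /* save errno before calling anything else */
        fprintf(stderr, "mount %s on %s failed: %s\n",
                subsystems, target, strerror(err));
        return err;
    }
    return 0;
}

/* Hypothetical helper: unmount the hierarchy with umount(2).
 * Returns 0 on success, or the errno value describing the failure. */
int cgroup_ns_umount(const char *target)
{
    assert(target != NULL);

    if (umount(target) != 0) {
        int err = errno;
        fprintf(stderr, "umount %s failed: %s\n", target, strerror(err));
        return err;
    }
    return 0;
}
```

      Because the error code comes straight from the syscall, the caller can
      tell a missing mount point (ENOENT) from a permission problem (EPERM),
      which the exit status of system() does not readily allow.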
      remove last patch to give author credit · 557c52d1
      Danny Auble authored
      Use mount and umount syscalls when handling cgroup namespaces. · b4c1d3d7
      Danny Auble authored
      Using the syscalls directly rather than calling bin/(u)mount via
      system() avoids a few fork + exec calls, and provides better error
      handling if something goes wrong.
      
      Users of this functionality are also updated to use slurm_strerror in
      order to provide a more informative error message.
      
      The mount and umount syscalls are Linux-specific, but so are cgroups
      so no portability is lost.
  5. Mar 18, 2012
      task/cgroup: delete job step memcg instead of using force_empty · a93afcd1
      Mark A. Grondona authored
      The current task/cgroup memory code writes to force_empty at job step
      completion and then waits for the release agent to be triggered to
      remove the memcg. However, force_empty only causes clean cache pages
      to be dropped from the memcg and does not actually move charges to
      the parent [1].
      
      This has two unfortunate side effects. First, pages that can't be
      dropped by force_empty are in use and could stay that way indefinitely
      (e.g. a system library that remains in use until just after force_empty
      completes). Thus, the step memcg never becomes 'empty' and the release
      agent is not activated. Second, cached pages that can be freed are
      likely associated with the job itself, and those files and libraries
      will have to be paged in again for subsequent job steps.
      
      In contrast, calling rmdir(2) on a memcg with no active tasks
      causes *all* current charges to move to parent, which is really what
      we want in this case. This allows cached libraries and binaries to
      stay resident and be associated with the job, and also ensures that
      the step memcg is removed immediately as the job step ends.
      
      Thus, this patch replaces the write to force_empty with a call
      to xcgroup_delete() on the step memcg, which in turn removes
      the memcg with rmdir(2).
      
      The functionality of this patch depends on the previous fix that
      uses xcgroup_move_process() to move slurmstepd to the root memcg.
      Otherwise, there will be leftover slurmstepd threads in the job
      step memcg, and the rmdir will fail with EBUSY.
      
       [1] Sec 4.3: http://www.kernel.org/doc/Documentation/cgroups/memory.txt
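      
      The removal step can be sketched as follows. The function name
      delete_step_memcg() is hypothetical and only illustrates the rmdir(2)
      call and the EBUSY case mentioned above; it is not the body of
      xcgroup_delete().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: remove a job step memcg directory with rmdir(2).
 * Returns 0 on success, or the errno value describing the failure. */
int delete_step_memcg(const char *memcg_path)
{
    assert(memcg_path != NULL);

    if (rmdir(memcg_path) == 0)
        return 0;

    int err = errno; /* save errno before calling anything else */
    if (err == EBUSY) {
        /* The cgroup still contains tasks (for example leftover
         * slurmstepd threads) or child cgroups. */
        fprintf(stderr, "%s: cgroup still in use\n", memcg_path);
    } else {
        fprintf(stderr, "rmdir %s failed: %s\n", memcg_path, strerror(err));
    }
    return err;
}
```

      A caller that gets EBUSY back knows the step memcg was not emptied
      first, which is the situation the previous fix is meant to prevent.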
      task/cgroup: use xcgroup_move_process to move slurmstepd to root memcg · 2dd13506
      Mark A. Grondona authored
      In task_cgroup_memory_fini() the implementation attempts to move
      the existing slurmstepd task to the root memory cgroup by writing
      the result of getpid(2) to the root memory cgroup's 'tasks' file.
      This does not work, however, because slurmstepd is multi-threaded
      and writing a single ID to 'tasks' moves only the main thread.
      
      This patch replaces the explicit write to 'tasks' with a call to
      the new xcgroup_move_process() function, which handles moving all
      threads in the process.
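      
      One way to move every thread is to walk /proc/<pid>/task and write
      each thread ID to the target 'tasks' file. The sketch below assumes
      that approach; move_all_threads() is a hypothetical name, and the
      actual xcgroup_move_process() may be implemented differently.

```c
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the ID of every thread of `pid` to the
 * cgroup v1 'tasks' file at `tasks_file`, one write(2) per thread.
 * Returns the number of thread IDs written, or -1 on error. */
int move_all_threads(pid_t pid, const char *tasks_file)
{
    assert(tasks_file != NULL);

    char dirpath[64];
    snprintf(dirpath, sizeof(dirpath), "/proc/%ld/task", (long)pid);

    DIR *dir = opendir(dirpath);
    if (dir == NULL)
        return -1;

    int moved = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue; /* skip "." and ".." */

        char buf[32];
        int len = snprintf(buf, sizeof(buf), "%s\n", ent->d_name);
        if (len < 0 || (size_t)len >= sizeof(buf))
            continue; /* not a plausible thread ID */

        /* Open and close per thread so each ID is a separate write. */
        int fd = open(tasks_file, O_WRONLY);
        if (fd < 0) {
            closedir(dir);
            return -1;
        }
        ssize_t n = write(fd, buf, (size_t)len);
        close(fd);
        if (n != (ssize_t)len) {
            closedir(dir);
            return -1;
        }
        moved++;
    }
    closedir(dir);
    return moved;
}
```

      Threads created while the directory is being read can be missed, so
      a production version would likely need to re-scan until no thread
      remains in the old cgroup.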
  6. Oct 13, 2011
  7. Oct 12, 2011
      task/cgroup: Expand debug message during memcg creation · abfdfcbe
      Mark A. Grondona authored
      Add the amount of memory allocated by slurm to the job or step
      to the debug message in memcg_initialize(). Also, change the
      message from debug to info, so that a user can see the information
      by using --slurmd-debug=1.
      task/cgroup: Add debug message after memory cgroup initialization · 25d51e90
      Mark A. Grondona authored
      For debugging purposes, add a debug level message with some values
      of interest just after task_cgroup_memory has initialized.
      cgroups: Add new config parameter MinRAMSpace · 6ce0e77b
      Mark A. Grondona authored
      Add a new configuration parameter MinRAMSpace which sets a lower bound on
      memory.limit_in_bytes and memory.memsw.limit_in_bytes. This is required in
      case an administrator or user sets an absurdly low value for the memory
      limit, potentially causing slurmstepd to be terminated by the OOM killer.
      
      MinRAMSpace is set in MB of RAM and is 30 by default (an arbitrarily
      chosen value).
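      
      The lower bound amounts to a simple clamp, sketched below. The
      function name apply_min_ram_space() is hypothetical, and treating
      1 MB as 1024 * 1024 bytes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raise `limit_bytes` to the MinRAMSpace floor.
 * Assumes MinRAMSpace is given in MB with 1 MB = 1024 * 1024 bytes. */
uint64_t apply_min_ram_space(uint64_t limit_bytes, uint64_t min_ram_space_mb)
{
    const uint64_t mb = 1024ULL * 1024ULL;

    /* Guard against overflow when converting MB to bytes. */
    assert(min_ram_space_mb <= UINT64_MAX / mb);

    uint64_t floor_bytes = min_ram_space_mb * mb;
    return limit_bytes < floor_bytes ? floor_bytes : limit_bytes;
}
```

      With the default of 30, a requested limit of 1024 bytes would be
      raised to 31457280 bytes, while a 64 MB limit is left unchanged.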
      cgroups: Allow percent values in cgroup.conf to be floating point · fa38c431
      Mark A. Grondona authored
      The use of whole percent values for cgroup.conf parameters such
      as AllowedRAMSpace, MaxRAMPercent, AllowedSwapSpace and MaxSwapPercent
      may be too coarse-grained on systems with large amounts of memory
      (e.g. 1% of 64G is over 650MB).
      
      This patch allows these percentage values to be arbitrary floating
      point numbers to allow finer grained tuning of these limits and
      parameters.
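      
      To illustrate the granularity argument, here is a small sketch of a
      percentage-to-bytes conversion using a floating point percentage.
      percent_of() is a hypothetical name, not a function from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return `percent` percent of `total_bytes`,
 * truncated to whole bytes. `percent` may be fractional, e.g. 0.5. */
uint64_t percent_of(uint64_t total_bytes, double percent)
{
    assert(percent >= 0.0);
    return (uint64_t)((double)total_bytes * (percent / 100.0));
}
```

      On a node with 64G of RAM (taken here as 64 * 2^30 bytes),
      percent_of(total, 1.0) is about 655 MB, whereas percent_of(total, 0.1)
      is about 65 MB, a step size that whole percent values cannot express.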
      task/cgroup: Don't create memory cgroups with limit of 0 bytes · e1bb1689
      Mark A. Grondona authored
      Treat a 0 byte memory limit from SLURM as unlimited and instead use
      MaxRAMPercent and MaxSwapPercent as RAM and Swap limits for the job/job
      step. This avoids creating a memory cgroup with limit_in_bytes = 0,
      which would end up causing the cgroup to OOM before slurmstepd could
      even be started.
      
      This also allows systems in which SLURM isn't explicitly allocating
      memory to use the task/cgroup plugin with ConstrainRAMSpace=yes.
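      
      The rule can be sketched as below, assuming the MaxRAMPercent or
      MaxSwapPercent bound has already been converted to bytes. The name
      effective_limit() is hypothetical, and capping non-zero requests at
      the same bound reflects the related MaxRAMPercent change rather than
      this patch alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: choose the limit to write to the memory cgroup.
 * `requested_bytes` is the limit from SLURM (0 means no explicit
 * allocation); `max_bytes` is the upper bound derived from
 * MaxRAMPercent or MaxSwapPercent. */
uint64_t effective_limit(uint64_t requested_bytes, uint64_t max_bytes)
{
    /* A zero upper bound would recreate the 0 byte cgroup problem. */
    assert(max_bytes > 0);

    if (requested_bytes == 0)
        return max_bytes; /* treat 0 as unlimited, bounded by the max */

    return requested_bytes > max_bytes ? max_bytes : requested_bytes;
}
```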
      task/cgroup: Apply MaxRAMPercent and MaxSwapPercent to memory cgroups · db99233d
      Mark A. Grondona authored
      Calculate the upper bound RAM in bytes and Swap in bytes that may
      be used by any one cgroup and apply this limit in the task/cgroup
      code.
      task/cgroup: Refactor task_cgroup_memory_create · 941262a3
      Mark A. Grondona authored
      There was some duplicated code in task_cgroup_memory_create. In order
      to facilitate extending this code in the future, refactor it into
      a common function memcg_initialize().
      cgroups: Allow cgroup mount point to be configurable · c9ea11b5
      Mark A. Grondona authored
      The cgroups code currently assumes cgroup subsystems will be mounted
      under /cgroup, which is not the ideal location in many situations.
      Add a new cgroup.conf parameter to redefine the mount point to an
      arbitrary location (for example, some systems may already have
      cgroupfs mounted under /dev/cgroup or /sys/fs/cgroup).
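      
      With a configurable mount point, subsystem paths have to be built
      from the configured value instead of a hard-coded /cgroup prefix.
      A minimal sketch, using the hypothetical helper
      cgroup_subsystem_path():

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<mountpoint>/<subsystem>" into `buf`.
 * Returns 0 on success, or -1 if the mount point is empty or the
 * result does not fit in `buflen` bytes. */
int cgroup_subsystem_path(char *buf, size_t buflen,
                          const char *mountpoint, const char *subsystem)
{
    assert(buf != NULL && mountpoint != NULL && subsystem != NULL);

    if (strlen(mountpoint) == 0)
        return -1;

    int n = snprintf(buf, buflen, "%s/%s", mountpoint, subsystem);
    if (n < 0 || (size_t)n >= buflen)
        return -1; /* output error or truncated path */
    return 0;
}
```

      Given the mount point /sys/fs/cgroup and the subsystem memory, the
      helper produces /sys/fs/cgroup/memory.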
  8. Aug 09, 2011
  9. May 25, 2011
  10. May 24, 2011
  11. Mar 30, 2011