  1. Oct 20, 2011
  2. Oct 19, 2011
  3. Oct 18, 2011
  4. Oct 13, 2011
  5. Oct 12, 2011
    • task/cgroup: Expand debug message during memcg creation · abfdfcbe
      Mark A. Grondona authored
      Add the amount of memory allocated by slurm to the job or step
      to the debug message in memcg_initialize(). Also, change the
      message from debug to info, so that a user can see the information
      by using --slurmd-debug=1.
    • task/cgroup: Add debug message after memory cgroup initialization · 25d51e90
      Mark A. Grondona authored
      For debugging purposes, add a debug level message with some values
      of interest just after task_cgroup_memory has initialized.
    • cgroups: Add new config parameter MinRAMSpace · 6ce0e77b
      Mark A. Grondona authored
      Add a new configuration parameter MinRAMSpace which sets a lower bound on
      memory.limit_in_bytes and memory.memsw.limit_in_bytes. This is required in
      case an administrator or user sets an absurdly low value for the memory limit,
      potentially causing the slurmstepd to be terminated by the OOM killer.
      
      MinRAMSpace is specified in MB of RAM and defaults to 30 (an arbitrarily
      chosen value).
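The lower bound described in this commit can be pictured as a simple clamp. The following is a minimal Python sketch, not the plugin's C code; the function name `apply_min_ram_space` is hypothetical, and only the 30 MB default comes from the commit message.

```python
MIB = 1024 * 1024  # bytes per MB as used for MinRAMSpace in this sketch


def apply_min_ram_space(limit_bytes: int, min_ram_space_mb: int = 30) -> int:
    """Raise a requested memory limit to the MinRAMSpace floor if it is lower.

    Hypothetical helper: the default of 30 MB is taken from the commit
    message, everything else is an illustration.
    """
    floor_bytes = min_ram_space_mb * MIB
    return max(limit_bytes, floor_bytes)


# An absurdly low 1 KB limit is raised to the 30 MB floor.
print(apply_min_ram_space(1024))
# A limit above the floor is left unchanged.
print(apply_min_ram_space(512 * MIB))
```

The same clamp would presumably apply to both memory.limit_in_bytes and memory.memsw.limit_in_bytes, since the commit names both files.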
    • cgroups: Allow percent values in cgroup.conf to be floating point · fa38c431
      Mark A. Grondona authored
      The use of whole percent values for cgroup.conf parameters such
      as AllowedRAMSpace, MaxRAMPercent, AllowedSwapSpace and MaxSwapPercent
      may be too coarse grained on systems with large amounts of memory.
      (e.g. 1% of 64G is over 650MB).
      
      This patch allows these percentage values to be arbitrary floating
      point numbers to allow finer grained tuning of these limits and
      parameters.
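To illustrate why whole percent values are coarse on large-memory nodes, here is a small Python sketch of the arithmetic. The helper names `parse_percent` and `percent_of` are hypothetical and do not correspond to functions in the SLURM source.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def parse_percent(value: str) -> float:
    """Parse a percentage as a float so that values such as '0.25' are accepted."""
    percent = float(value)
    if percent < 0.0:
        raise ValueError("percentage must not be negative")
    return percent


def percent_of(total_bytes: int, percent: float) -> int:
    """Return the given percentage of total_bytes, truncated to whole bytes."""
    return int(total_bytes * (percent / 100.0))


total = 64 * GIB
for text in ("1", "0.25"):
    limit = percent_of(total, parse_percent(text))
    print(f"{text}% of 64G is about {limit // MIB} MB")
```

With integer-only parsing the smallest non-zero step on this node would be the 1% case; accepting floats lets an administrator pick values in between.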
    • task/cgroup: Don't create memory cgroups with limit of 0 bytes · e1bb1689
      Mark A. Grondona authored
      Treat a 0 byte memory limit from SLURM as unlimited and instead use
      MaxRAMPercent and MaxSwapPercent as RAM and Swap limits for the job/job
      step. This avoids creating a memory cgroup with limit_in_bytes = 0,
      which would end up causing the cgroup to OOM before slurmstepd could
      even be started.
      
      This also allows systems in which SLURM isn't explicitly allocating
      memory to use the task/cgroup plugin with ConstrainRAMSpace=yes.
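The fallback described in this commit can be sketched as follows. This is an illustrative Python model only; `effective_limit` is a hypothetical name, and `max_bytes` stands in for the byte value derived from MaxRAMPercent or MaxSwapPercent.

```python
GIB = 1024 ** 3


def effective_limit(requested_bytes: int, max_bytes: int) -> int:
    """Treat a 0-byte request as 'unlimited' and fall back to the configured maximum.

    Hypothetical helper: a literal limit of 0 would make the cgroup OOM
    immediately, so the maximum is used instead.
    """
    if requested_bytes == 0:
        return max_bytes
    return requested_bytes


# No memory allocated by SLURM: the cgroup gets the configured maximum.
print(effective_limit(0, 48 * GIB))
# An explicit allocation is used as given.
print(effective_limit(2 * GIB, 48 * GIB))
```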
    • task/cgroup: Apply MaxRamPercent and MaxSwapPercent to memory cgroups · db99233d
      Mark A. Grondona authored
      Calculate the upper bound RAM in bytes and Swap in bytes that may
      be used by any one cgroup and apply this limit in the task/cgroup
      code.
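A rough Python sketch of applying such upper bounds is shown below. The names `percent_in_bytes` and `cap_limits` are hypothetical, and the choice to compute the swap bound as a percentage of total RAM is an assumption made for illustration; the commit message does not say what the swap percentage is relative to.

```python
GIB = 1024 ** 3


def percent_in_bytes(total_bytes: int, percent: float) -> int:
    """Convert a percentage of total_bytes into a byte count."""
    return int(total_bytes * (percent / 100.0))


def cap_limits(ram_request: int, swap_request: int, total_ram: int,
               max_ram_percent: float, max_swap_percent: float) -> tuple:
    """Cap RAM and swap requests at the per-cgroup upper bounds.

    Assumption: both percentages are taken relative to total RAM.
    """
    max_ram = percent_in_bytes(total_ram, max_ram_percent)
    max_swap = percent_in_bytes(total_ram, max_swap_percent)
    return min(ram_request, max_ram), min(swap_request, max_swap)


# On a 4G node with a 50% RAM cap and a 25% swap cap,
# an 8G/8G request is reduced to the upper bounds.
ram, swap = cap_limits(8 * GIB, 8 * GIB, 4 * GIB, 50.0, 25.0)
print(ram // GIB, swap // GIB)
```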
    • task/cgroup: Refactor task_cgroup_memory_create · 941262a3
      Mark A. Grondona authored
      There was some duplicated code in task_cgroup_memory_create. In order
      to facilitate extending this code in the future, refactor it into
      a common function memcg_initialize().
    • cgroups: Allow cgroup mount point to be configurable · c9ea11b5
      Mark A. Grondona authored
      The cgroups code currently assumes cgroup subsystems will be mounted
      under /cgroup, which is not the ideal location in many situations.
      Add a new cgroup.conf parameter to redefine the mount point to an
      arbitrary location (for example, some systems may already have
      cgroupfs mounted under /dev/cgroup or /sys/fs/cgroup).
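The effect of a configurable mount point can be sketched as building cgroup file paths from a base directory instead of a hard-coded /cgroup. The Python below is only an illustration: `cgroup_file` and its arguments are hypothetical, and the relative cgroup path `slurm/job_42` is made up for the example.

```python
from pathlib import PurePosixPath

DEFAULT_MOUNT_POINT = "/cgroup"  # the location the commit says was previously assumed


def cgroup_file(subsystem: str, relative_cgroup: str, filename: str,
                mount_point: str = DEFAULT_MOUNT_POINT) -> str:
    """Build the path of a cgroup control file under a configurable mount point."""
    return str(PurePosixPath(mount_point) / subsystem / relative_cgroup / filename)


# Default location.
print(cgroup_file("memory", "slurm/job_42", "memory.limit_in_bytes"))
# A system that already has cgroupfs mounted under /sys/fs/cgroup.
print(cgroup_file("memory", "slurm/job_42", "memory.limit_in_bytes",
                  mount_point="/sys/fs/cgroup"))
```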
  6. Oct 11, 2011
    • proctrack/cgroup: no longer rely on release agent to clean step cg · ef8cc0a7
      Matthieu Hautreux authored
      With release_agent notified at the step cgroup level, the step cgroup
      can be removed while slurmstepd has not yet finished its internal
      epilog mechanisms. Inhibiting the release agent at the step level and
      ensuring the step cgroup's proper removal helps to guarantee that the node
      is only eligible for job execution once its resources are completely
      available (no longer used by the job or the epilogs).
  7. Oct 05, 2011
  8. Oct 03, 2011
  9. Sep 30, 2011
  10. Sep 29, 2011
  11. Sep 26, 2011
    • Cosmetic mods for GCC v4.6 · 413b1c2c
      Morris Jette authored
      Many cosmetic modifications to eliminate warning messages from the GCC
      version 4.6 compiler, mostly due to unused variables.
  12. Sep 17, 2011
  13. Sep 16, 2011
    • Problem using salloc/mpirun with task affinity socket binding · 98b203d4
      Morris Jette authored
      salloc/mpirun does not work well with task affinity socket binding. The following example illustrates the problem.
      
      [sulu] (slurm) mnp> salloc -p bones-only -N1-1 -n3 --cpu_bind=socket mpirun cat /proc/self/status | grep Cpus_allowed_list
      salloc: Granted job allocation 387
      --------------------------------------------------------------------------
      An invalid physical processor id was returned ...
      
      The problem is that for mpirun jobs Slurm launches only a single task, regardless of the value of -n. This confuses the socket binding logic in task affinity: the task is bound to a single CPU instead of all the allocated CPUs on the socket. When MPI then attempts to bind to any of the other allocated CPUs on the socket, it gets the "invalid physical processor id" error.
      
      Note that the problem may occur even if the user does not explicitly request socket binding. If task/affinity is configured and the allocated CPUs are a whole number of sockets, Slurm will use "implicit auto binding" to sockets, triggering the problem.
      Patch from Martin Perry (Bull).
  14. Sep 12, 2011
  15. Sep 10, 2011
  16. Sep 09, 2011
    • Improve performance of preemption logic · b5a8a742
      Morris Jette authored
      This modification improves the performance of SLURM's preemption logic
      by reducing the execution time of the scheduling logic and doing a better
      job of minimizing the number of jobs preempted to initiate a new job.
      Based largely upon work by Phil Eckert, LLNL.
  17. Sep 08, 2011
  18. Sep 06, 2011