  1. Mar 02, 2012
  2. Feb 29, 2012
  3. Feb 28, 2012
  4. Feb 24, 2012
  5. Feb 23, 2012
  6. Feb 20, 2012
  7. Feb 06, 2012
      The openpty(3) call used by slurmstepd to allocate a pseudo-terminal · 2a1c08b0
      Danny Auble authored
      openpty(3) is a convenience function in BSD and glibc that
      internally calls the equivalent of
      
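          int masterfd = open("/dev/ptmx", flags);
          grantpt(masterfd);                 /* set slave owner and mode */
          unlockpt(masterfd);                /* allow the slave to be opened */
          char *slave = ptsname(masterfd);   /* path of the slave device */
          int slavefd = open(slave, O_RDWR|O_NOCTTY);
      
      (in pseudocode, error handling omitted)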
      
      On Linux, with some combinations of glibc and kernel (in this
      case glibc-2.14/Linux-3.1), the equivalent of grantpt(3) was failing
      in slurmstepd with EPERM, because the allocated pty was owned by
      root instead of by the user running the SLURM job.
      
      From the POSIX description of grantpt:
      
       "The grantpt() function shall change the mode and ownership of the
        slave pseudo-terminal device... The user ID of the slave shall
        be set to the real UID of the calling process..."
      
       http://pubs.opengroup.org/onlinepubs/007904875/functions/grantpt.html
      
      This means that for POSIX-compliance, the real user id of slurmstepd
      must be the user executing the SLURM job at the time openpty(3) is
      called. Unfortunately, the real user id of slurmstepd at this
      point is still root, and only the effective uid is set to the user.
      
      This patch is a work-around that uses the (non-portable) setresuid(2)
      system call to reset the real and effective uids of the slurmstepd
      process to the job user, but keep the saved uid of root. Then after
      the openpty(3) call, the previous credentials are reestablished
      using the same call.
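
      The credential switch described above can be sketched roughly as
      follows. This is an illustration only, not the actual SLURM patch:
      the helper name open_pty_as_user() and its signature are
      hypothetical, and it spells out the posix_openpt()/grantpt()/
      unlockpt() sequence instead of calling openpty(3), so that it builds
      without libutil. It assumes Linux with glibc, where setresuid(2) and
      getresuid(2) are available under _GNU_SOURCE.

```c
#define _GNU_SOURCE            /* for setresuid(), getresuid(), posix_openpt() */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not SLURM code): allocate a pty while the real and
 * effective uids are temporarily set to job_uid, then restore the previous
 * real and effective uids. Returns 0 on success, -1 on failure.
 */
int open_pty_as_user(uid_t job_uid, int *master_fd, int *slave_fd)
{
    uid_t ruid, euid, suid;
    int rc = -1;

    assert(master_fd != NULL && slave_fd != NULL);
    *master_fd = *slave_fd = -1;

    if (getresuid(&ruid, &euid, &suid) < 0)
        return -1;

    /* Real and effective uid become the job user; passing -1 leaves the
     * saved uid (root, in the slurmstepd case) untouched. */
    if (setresuid(job_uid, job_uid, (uid_t)-1) < 0)
        return -1;

    int m = posix_openpt(O_RDWR | O_NOCTTY);
    if (m >= 0 && grantpt(m) == 0 && unlockpt(m) == 0) {
        const char *name = ptsname(m);
        int s = name ? open(name, O_RDWR | O_NOCTTY) : -1;
        if (s >= 0) {
            *master_fd = m;
            *slave_fd = s;
            rc = 0;
        }
    }
    if (rc != 0 && m >= 0)
        close(m);

    /* Restore the previous real and effective uids. This is permitted
     * because the saved uid was kept. Treat a failure here as an error. */
    if (setresuid(ruid, euid, (uid_t)-1) < 0) {
        if (rc == 0) {
            close(*slave_fd);
            close(*master_fd);
            *master_fd = *slave_fd = -1;
        }
        return -1;
    }
    return rc;
}
```

      A caller in the situation described above would pass the job
      user's uid. Restoring the real uid to root afterwards only works
      because the saved uid was left as root.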
  8. Feb 03, 2012
      Fix for srun with --exclude and --nodes · a4551158
      Morris Jette authored
      Fix for srun running within an existing allocation with the --exclude
      option and a --nodes count small enough to remove more nodes.
      
          > salloc -N 8
          salloc: Granted job allocation 1000008
          > srun -N 2 -n 2 --exclude=tux3 hostname
          srun: error: Unable to create job step: Requested node configuration is not available
      
      Patch from Phil Eckert, LLNL.
  9. Feb 02, 2012
  10. Feb 01, 2012
      Fix job requeue bug · c0a7a7a4
      Morris Jette authored
      Fix bug when a requeued batch job is scheduled to run on a different
      node zero, but attempts job launch on the old node zero, causing the
      fatal error "Invalid host_index -1 for job #"
      Avoid slurmctld abort due to bad pointer · 43936335
      Morris Jette authored
      Avoid slurmctld abort due to bad pointer when setting an advanced
      reservation MAINT flag if it contains no nodes (only licenses).
  11. Jan 31, 2012
  12. Jan 27, 2012
  13. Jan 25, 2012
      Set DEFAULT flag in partition structure · 9f4ef925
      Morris Jette authored
      Set DEFAULT flag in partition structure when slurmctld reads the
      configuration file. Patch from Rémi Palancher. Note the flag is set
      when the information is sent via RPC for sinfo.
  14. Jan 24, 2012
  15. Jan 20, 2012
  16. Jan 19, 2012
  17. Jan 18, 2012
  18. Jan 13, 2012
  19. Jan 09, 2012
  20. Dec 28, 2011
  21. Dec 21, 2011
  22. Dec 19, 2011
  23. Dec 17, 2011
  24. Dec 15, 2011
  25. Dec 14, 2011
  26. Dec 09, 2011
  27. Dec 08, 2011