- May 18, 2016
-
-
Morris Jette authored
Remove some SQUEUE output format environment variables set by default on Cray systems, which break some tests
-
Morris Jette authored
AND should have been an OR. Modify test to log more details on failure.
-
Brian Christiansen authored
Group id was being overwritten by user id.
-
Brian Christiansen authored
Writing to and reading from a file on a slow shared filesystem could cause this test to fail.
-
Brian Christiansen authored
-
Brian Christiansen authored
The filesystem may have problems removing and creating the same filename quickly.
-
- May 17, 2016
-
-
Morris Jette authored
-
Morris Jette authored
Correct description of the SLURMD_NODENAME environment variable in the sbatch and srun man pages.
-
Morris Jette authored
Also add some job time limits
-
Morris Jette authored
Increase a sleep to allow for slower step start times on a Cray. Make the error message more detailed (add expected/actual count to message).
-
Morris Jette authored
One memory leak fixed. Some other code moved around for better clarity.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Run autogen.sh to pick up changes and resolve conflicts.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Morris Jette authored
Previous logic could truncate CPU masks at 32 bits.
-
Morris Jette authored
(1 << x), where "x" is a uint64_t, would treat "1" as an int and roll over at 32 bits. Typecasting "1" to uint64_t before the shift eliminates that and supports a full 64-bit value.
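The cast described above can be illustrated with a minimal sketch. The helper name `bit_mask64` is hypothetical and is not taken from the Slurm source; it only shows the cast being applied before the shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Slurm function): return a 64-bit mask
 * with only bit "x" set. */
static uint64_t bit_mask64(uint64_t x)
{
	assert(x < 64);           /* a shift count of 64 or more is undefined */
	return (uint64_t)1 << x;  /* cast first so the shift is done in 64 bits */
}
```

Writing `1 << x` instead would perform the shift on an `int`, so bits 32 and above could not be represented regardless of the type of `x`.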
-
- May 16, 2016
-
-
Josko Plazonic authored
Update slurm.spec file to have seff depend on slurm-perlapi.
-
Tim Wickberg authored
-
Jason Bacon authored
-
Morris Jette authored
-
- May 13, 2016
-
-
Giovanni Torres authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Make test more robust for compute nodes with large CPU counts.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This change fixes Slurm's ability to optimize selection of resources for a job requesting feature counts where some of those node features are currently inactive (require node reboot to claim).
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
# Conflicts:
#	src/common/slurm_acct_gather_energy.c
#	src/common/slurm_acct_gather_filesystem.c
#	src/common/slurm_acct_gather_infiniband.c
#	src/common/slurm_acct_gather_profile.c
#	src/common/slurm_jobacct_gather.c
-
Danny Auble authored
when in use. The problem here is that the polling threads in the various acct_gather codes were detached and could still be polling after the plugin had been unloaded, causing a seg fault with a backtrace like this:
#0 0x00007fe7af008c00 in ?? ()
#1 0x00007fe7b1138479 in __nptl_deallocate_tsd () at pthread_create.c:175
#2 0x00007fe7b11398b0 in __nptl_deallocate_tsd () at pthread_create.c:326
#3 start_thread (arg=0x7fe7b1f12700) at pthread_create.c:346
#4 0x00007fe7b0e6fb5d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
The fix was to make the threads non-detached and join them before calling dlclose.
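The join-before-unload pattern described in this commit might look roughly like the sketch below. All names (`poll_thread`, `poll_start`, `poll_stop`, and the state variables) are hypothetical and do not come from the acct_gather plugins; a real poller would also wake on a timer rather than only on shutdown.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical state for one polling thread; not Slurm's actual structures. */
static pthread_mutex_t poll_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t poll_cond = PTHREAD_COND_INITIALIZER;
static bool poll_shutdown = false;
static int poll_count = 0;
static pthread_t poll_tid;

static void *poll_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&poll_lock);
	while (!poll_shutdown) {
		poll_count++;  /* stand-in for gathering one sample */
		/* Sleep until signalled; a real poller would use a timed wait. */
		pthread_cond_wait(&poll_cond, &poll_lock);
	}
	pthread_mutex_unlock(&poll_lock);
	return NULL;
}

/* Start the poller. Default attributes give a joinable (non-detached) thread. */
static int poll_start(void)
{
	return pthread_create(&poll_tid, NULL, poll_thread, NULL);
}

/* Ask the poller to exit and wait until it has really finished. */
static int poll_stop(void)
{
	pthread_mutex_lock(&poll_lock);
	poll_shutdown = true;
	pthread_cond_signal(&poll_cond);
	pthread_mutex_unlock(&poll_lock);
	/* Only after the join returns is it safe to dlclose() the plugin code. */
	return pthread_join(poll_tid, NULL);
}
```

Because `poll_stop()` returns only after the thread has exited, no thread can still be executing plugin code when the caller unloads the shared object.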
-
Morris Jette authored
Whenever possible, avoid allocating nodes that require a reboot. Previous logic failed to re-sort the job set table based upon the need for rebooting to achieve the desired features (e.g. KNL MCDRAM or CACHE mode). Bug 2726.
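As a rough illustration of the re-sort idea only, the sketch below orders a table so that entries needing no reboot come first. The `struct node_set` layout and the function names are assumptions made for this example; they are not Slurm's actual data structures or selection logic.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for one entry of the job set table. */
struct node_set {
	int id;
	bool needs_reboot;  /* true if the desired features require a reboot */
};

/* Order entries that need no reboot ahead of those that do. */
static int cmp_reboot(const void *a, const void *b)
{
	const struct node_set *x = a;
	const struct node_set *y = b;
	return (int)x->needs_reboot - (int)y->needs_reboot;
}

/* Re-sort the table so that allocation considers no-reboot entries first. */
static void sort_node_sets(struct node_set *sets, size_t n)
{
	qsort(sets, n, sizeof(*sets), cmp_reboot);
}
```

Note that `qsort` is not guaranteed to be stable, so the relative order of entries with the same reboot requirement may change.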
-
- May 12, 2016
-
-
Morris Jette authored
Put header files in alphabetical order. No change in logic.
-
Danny Auble authored
# Conflicts:
#	src/slurmctld/controller.c
-
Danny Auble authored
-
Danny Auble authored
Trying to verify the cluster name (which may try to create files or directories) before dropping privileges results in a fatal error, because slurmctld tries to create items and the attempt ultimately fails. Moving this step until after the privileges and uid have changed allows it to succeed. Reported by Jon Nelson <jdnelson@dyn.com>. Bug 2728.
-
Morris Jette authored
Reject invalid step at submit time rather than leaving it queued. Bug 2722 describes one of the use cases triggering the bug.
-
Morris Jette authored
-