Correct anomalies when accounting for GRES with types
In slurm.conf:
    AccountingStorageTRES=gres/craynetwork,gres/gpu

    $ srun --gres=craynetwork:3,gpu:volta:1,gpu:tesla:1 -N1 sleep 100
    $ scontrol show job
    TRES=cpu=1,mem=1000M,node=1,billing=1,fs/disk=0,vmem=0,pages=0,gres/craynetwork=3,gres/gpu=1
    TresPerNode=craynetwork:3,gpu:volta:1,gpu:tesla:1

Note that TRES does not report the total GPU count (1 tesla and 1 volta), but only 1 GPU total.

With new logic:
    TRES=cpu=1,mem=1000M,node=1,billing=1,gres/craynetwork=3,gres/gpu=2
    TresPerNode=craynetwork:3,gpu:volta:1,gpu:tesla:1

=======================================

In slurm.conf:
    AccountingStorageTRES=gres/craynetwork,gres/gpu:volta,gres/gpu:tesla

    $ srun --gres=craynetwork:3,gpu:2 -N1 sleep 100
    $ scontrol show job
    TRES=cpu=1,mem=1000M,node=1,billing=1,fs/disk=0,vmem=0,pages=0,gres/craynetwork=3,gres/gpu:tesla=0,gres/gpu:volta=0
    TresPerNode=craynetwork:3,gpu:2

Here the GPUs have to be either tesla or volta, but none are accounted for (in TRES, the count of both GPU types is 0).

With new logic:
    TRES=cpu=1,mem=1000M,node=1,billing=1,gres/craynetwork=3,gres/gpu:tesla=2
    TresPerNode=craynetwork:3,gpu:2

bug 6070
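The counting rule that the two examples imply can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from this commit: `struct gres_alloc` and `tres_count()` are hypothetical names, and the sketch assumes that each allocated GRES record carries the type of the device actually assigned, even when the request did not name one. A tracked TRES without a type (`gres/gpu`) sums every allocated record of that name, while a typed TRES (`gres/gpu:tesla`) sums only records of that type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One allocated GRES record (hypothetical layout, not a Slurm struct). */
struct gres_alloc {
	const char *name;	/* e.g. "gpu" */
	const char *type;	/* e.g. "tesla", or NULL if the GRES has no type */
	uint64_t count;		/* number of devices allocated */
};

/*
 * Count the allocated GRES matching one tracked TRES name such as
 * "gres/gpu" or "gres/gpu:tesla". Hypothetical helper, not a Slurm API.
 */
uint64_t tres_count(const char *tres, const struct gres_alloc *alloc, size_t n)
{
	static const char prefix[] = "gres/";
	const size_t prefix_len = sizeof(prefix) - 1;

	assert(tres != NULL && (alloc != NULL || n == 0));

	/* Only GRES-based TRES are handled in this sketch. */
	if (strncmp(tres, prefix, prefix_len) != 0)
		return 0;

	/* Split "name[:type]" that follows the "gres/" prefix. */
	const char *name = tres + prefix_len;
	const char *colon = strchr(name, ':');
	size_t name_len = colon ? (size_t)(colon - name) : strlen(name);
	const char *type = colon ? colon + 1 : NULL;

	uint64_t total = 0;
	for (size_t i = 0; i < n; i++) {
		/* The GRES name must match exactly. */
		if (strlen(alloc[i].name) != name_len ||
		    strncmp(alloc[i].name, name, name_len) != 0)
			continue;
		/* A typed TRES only counts records of that type. */
		if (type && (!alloc[i].type || strcmp(alloc[i].type, type) != 0))
			continue;
		/* An untyped TRES counts every type of that GRES. */
		total += alloc[i].count;
	}
	return total;
}
```

With the first example's allocation (craynetwork 3, gpu volta 1, gpu tesla 1), `tres_count("gres/gpu", ...)` yields 2. With two tesla GPUs allocated for the untyped `gpu:2` request, `tres_count("gres/gpu:tesla", ...)` yields 2 and `tres_count("gres/gpu:volta", ...)` yields 0.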
Showing
- doc/man/man5/slurm.conf.5: 20 additions, 0 deletions
- slurm/slurmdb.h: 7 additions, 6 deletions
- src/common/assoc_mgr.c: 66 additions, 6 deletions
- src/common/assoc_mgr.h: 20 additions, 2 deletions
- src/common/gres.c: 80 additions, 25 deletions
- src/common/gres.h: 2 additions, 1 deletion
- src/slurmctld/slurmctld.h: 1 addition, 1 deletion