NEWS

This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins.

* Changes in SLURM 2.2.0.pre8
=============================
 -- Add DebugFlags parameter of "Backfill" for sched/backfill detailed logging.
 -- Add run time to the mail message sent upon job termination and wait time
    to the mail message sent upon job begin.
 -- Add email notification option for job requeue.
 -- Generate a fatal error if the srun --relative option is used when not
    within an existing job allocation.
 -- Modify the meaning of InactiveLimit slightly. It will now cancel the job
    allocation created using the salloc or srun command if those commands
    cease responding for the InactiveLimit regardless of any running job steps.
    This parameter will no longer affect jobs spawned using sbatch.
 
* Changes in SLURM 2.2.0.pre7
=============================
 -- Fixed issue with sacctmgr so that querying against a non-existent cluster
    works the same way as in 2.1.
 -- Added infrastructure to support allocation of generic node resources (gres).
    -Modified select/linear and select/cons_res plugins to allocate resources
     at the level of a job without oversubscription.
    -Get sched/backfill operating with gres allocations.
    -Get gres configuration changes (reconfiguration) working.
    -Have job steps allocate resources.
    -Modified job step credential to include the job's and step's gres
     allocation details.
    -Integrate with HWLOC library to identify GPUs and NICs configured on each
     node.
 -- SLURM commands (squeue, sinfo, etc.) can now go cross-cluster on like
    linux systems.  Cross-cluster operation from bluegene to linux and similar
    combinations should work fine, even in sview.
 -- Added the ability to configure PreemptMode on a per-partition basis.
 -- Change slurmctld's default thread limit count to 1024, but adjust that down
    as needed based upon the process's resource limits.
 -- Removed the non-functional "SystemCPU" and "TotalCPU" reporting fields from
    sstat and updated the man page.
 -- Correct location of apbasil command on Cray XT systems.
 -- Fixed bug in MinCPU and AveCPU calculations in sstat command.
 -- Send message to srun when the Prolog takes too long (MessageTimeout) to
    complete.
 -- Change timeout for socket connect() to be half of configured MessageTimeout.
 -- Added high-throughput computing web page with configuration guidance.
 -- Use more srun sockets to process incoming PMI (MPICH2) connections for
    better scalability.
 -- Added DebugFlags for the select/bluegene plugin: DEBUG_FLAG_BG_PICK,
    DEBUG_FLAG_BG_WIRES, DEBUG_FLAG_BG_ALGO, and DEBUG_FLAG_BG_ALGO_DEEP.
 -- Remove vestigial job record field "kill_on_step_done" (internal to the
    slurmctld daemon only).
 -- For MPICH2 jobs: Clear PMI state between job steps.

* Changes in SLURM 2.2.0.pre6
=============================
 -- sview - added ability to see database configuration.
 -- sview - added ability to add/remove visible tabs.
 -- sview - change way grid highlighting takes place on selected objects.
 -- Added infrastructure to support allocation of generic node resources.
    -Added node configuration parameter of Gres=.
    -Added ability to view/modify a node's gres using scontrol, sinfo and sview.
    -Added salloc, sbatch and srun --gres option.
    -Added ability to view a job or job step's gres using scontrol, squeue and
     sview.
    -Added new configuration parameter GresPlugins to define plugins used to
     manage generic resources.
    -Added framework for gres plugins.
    -Added DebugFlags option of "gres" for detailed debugging of gres actions.
 -- Slurmd modified to log slow slurmstepd startup and note possible file system
    problem.
 -- sview - A .slurm/sviewrc file is now created when running sview. It stores
    the defaults for how sview looks when first launched. You can set these
    with Ctrl-S or Options->Set Default Settings.
 -- Add scontrol "wait_job <job_id>" option to wait for nodes to boot as needed.
    Useful for batch jobs (in Prolog, PrologSlurmctld or the script) if powering
    down idle nodes.
 -- Added salloc and sbatch option --wait-for-nodes. If set non-zero, job 
    initiation will be delayed until all allocated nodes have booted. Salloc
    will log the delay with the messages "Waiting for nodes to boot" and "Nodes
    are ready for use". 
 -- The priority/multifactor plugin now takes into consideration size of job
    in CPUs as well as size in nodes when looking at the job size factor.
    Previously only nodes were considered.
 -- When using the SlurmDBD, messages waiting to be sent will be combined
    and sent in one message.
 -- Remove srun's --core option. Move the logic to an optional SPANK plugin
    (currently in the contribs directory, but plan to distribute through
    http://code.google.com/p/slurm-spank-plugins/).
 -- Patch for adding CR_CORE_DEFAULT_DIST_BLOCK as a select option to lay out
    jobs using block layout across cores within each node instead of cyclic,
    which was previously the default.
 -- Accounting - When removing associations, if jobs are running, those jobs
    must be killed before proceeding. Previously the jobs were killed
    automatically, causing user confusion over what is most likely an
    admin's mistake.
 -- sview - color column keeps reference color when highlighting.
 -- Configuration parameter MaxJobCount changed from 16-bit to 32-bit field.
    The default MaxJobCount was changed from 5,000 to 10,000.
 -- SLURM commands (squeue, sinfo, etc.) can now go cross-cluster on like
    linux systems.  Cross-cluster for bluegene to linux and such does not
    currently work.  You can submit jobs with sbatch.  Salloc and srun are not
    cross-cluster compatible, and since by nature they talk to actual compute
    nodes, they likely never will be.
 -- salloc modified to forward SIGTERM to the spawned program.
 -- In sched/wiki2 (for Moab support) - Add GRES and WCKEY fields to MODIFYJOBS
    and GETJOBS commands. Add GRES field to GETNODES command.
 -- In struct job_descriptor and struct job_info: rename min_sockets to
    sockets_per_node, min_cores to cores_per_socket, and min_threads to
    threads_per_core (the values are not minimum, but represent the target
    values).
 -- Fixed bug in clearing a partition's DisableRootJobs value reported by
    Hongjia Cao.
 -- Purge (or ignore) terminated jobs in a more timely fashion based upon the
    MinJobAge configuration parameter. Small values for MinJobAge should improve
    responsiveness for high job throughput.

* Changes in SLURM 2.2.0.pre5
=============================
 -- Modify commands to accept time format with one or two digit hour value
    (e.g. 8:00 or 08:00 or 8:00:00 or 08:00:00).
 -- Modify time parsing logic to accept "minute", "hour", "day", and "week" in
    addition to the currently accepted "minutes", "hours", etc.
 -- Add slurmd option of "-C" to print actual hardware configuration and exit.
 -- Pass EnforcePartLimits configuration parameter from slurmctld for user
    commands to see the correct value instead of always "NO".
 -- Modify partition data structures to replace the default_part,
    disable_root_jobs, hidden and root_only fields with a single field called
    "flags" populated with the flags PART_FLAG_DEFAULT, PART_FLAG_NO_ROOT,
    PART_FLAG_HIDDEN and/or PART_FLAG_ROOT_ONLY. This is a more flexible
    solution and also makes for smaller data structures.
 -- Add job state flag of JOB_RESIZING. This will only exist when a job's
    accounting record is being written immediately before or after it changes
    size. This permits job accounting records to be written for a job at each
    size.
 -- Make calls to jobcomp and accounting_storage plugins before and after a job
    changes size (with the job state being JOB_RESIZING). All plugins write a
    record for the job at each size with intermediate job states being
    JOB_RESIZING.
 -- When changing a job size using scontrol, generate a script that can be
    executed by the user to reset SLURM environment variables.
 -- Modify select/linear and select/cons_res to use resources released by job
    resizing.
 -- Added to contribs foundation for Perl extension for slurmdb library.
 -- Add new configuration parameter JobSubmitPlugins which provides a mechanism
    to set default job parameters or perform other site-configurable actions at
    job submit time.
 -- Better postgres support for accounting, still beta.
 -- Speed up job start when using the slurmdbd.
 -- Forward step failure reason back to slurmd. Previously, in some cases only
    SLURM_FAILURE would be returned.
 -- Changed squeue to fail when passed invalid -o <output_format> or
    -S <sort_list> specifications.
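
The time formats listed in this release (one or two digit hours, and singular
or plural unit names) can be illustrated with a minimal Python sketch. This is
not SLURM's parser: the function names parse_clock and unit_seconds, the
regular expression, and the seconds table are assumptions made only for
illustration.

```python
import re

# Hypothetical pattern (not SLURM's own code): H:MM or HH:MM, optional :SS.
_CLOCK_RE = re.compile(r"^(\d{1,2}):(\d{2})(?::(\d{2}))?$")

# Seconds per unit; singular and plural spellings map to the same value.
_UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400, "week": 604800}


def parse_clock(text):
    """Return (hour, minute, second) for strings such as 8:00 or 08:00:00."""
    match = _CLOCK_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized time: {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    second = int(match.group(3) or 0)  # seconds default to 0 when omitted
    if minute > 59 or second > 59:
        raise ValueError(f"time out of range: {text!r}")
    return hour, minute, second


def unit_seconds(unit):
    """Return seconds for 'minute', 'minutes', 'hour', 'hours', and so on."""
    name = unit.strip().lower()
    if name.endswith("s"):
        name = name[:-1]  # treat the plural like the singular
    if name not in _UNIT_SECONDS:
        raise ValueError(f"unrecognized unit: {unit!r}")
    return _UNIT_SECONDS[name]
```

In this sketch all four example spellings (8:00, 08:00, 8:00:00, 08:00:00)
parse to the same tuple, and "hour" and "hours" yield the same number of
seconds.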

* Changes in SLURM 2.2.0.pre4
=============================
 -- Add support for a PropagatePrioProcess configuration parameter value of 2
    to restrict spawned task nice values to that of the slurmd daemon plus 1.
    This ensures that the slurmd daemon always has a higher scheduling
    priority than spawned tasks.
 -- Add support in slurmctld, slurmd and slurmdbd for option of "-n <value>" to
    reset the daemon's nice value.
 -- Fixed slurm_load_slurmd_status and slurm_pid2jobid to work correctly when
    multiple slurmds are in use.
 -- Altered srun to set max_nodes to min_nodes if not set when doing an
    allocation, mimicking what salloc and sbatch do. When running a step,
    max_nodes remains unset if it was not set.
 -- Applied patch from David Egolf (David.Egolf@Bull.com). Added the ability
    to purge/archive accounting data on a day or hour basis. Previously
    it was only available on a monthly basis.
 -- Add support for maximum node count in job step request.
 -- Fix bug in CPU count logic for job step allocation (used count of CPUs per
    node rather than CPUs allocated to the job).
 -- Add new configuration parameters GroupUpdateForce and GroupUpdateTime.
    See "man slurm.conf" for details about how these control when slurmctld
    updates its information of which users are in the groups allowed to use
    partitions.
 -- Added sacctmgr list events which will list events that have happened on
    clusters in accounting.
 -- Permit a running job to shrink in size using a command of
    "scontrol update JobId=# NumNodes=#" or
    "scontrol update JobId=# NodeList=<names>". Subsequent job steps must
    explicitly specify an appropriate node count to work properly.
 -- Added resize_time field to job record noting the time of the latest job
    size change (to be used for accounting purposes).
 -- sview/smap now hides hidden partitions and their jobs by default, with an
    option to display them.
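
The PropagatePrioProcess value of 2 added in this release can be pictured as a
lower bound on a task's nice value. The Python sketch below is an
interpretation of the entry above, not the slurmd implementation: the function
name restricted_task_nice and its parameters are hypothetical, and it ignores
the operating system's valid nice range.

```python
def restricted_task_nice(requested_nice, slurmd_nice):
    """Return a task nice value that is never below slurmd's nice plus 1.

    A higher nice value means a lower scheduling priority, so keeping every
    task at slurmd_nice + 1 or above leaves slurmd with the higher priority.
    """
    floor = slurmd_nice + 1  # assumed reading of "slurmd daemon plus 1"
    return max(requested_nice, floor)
```

For example, with slurmd running at nice 0, a task that requests nice -5 would
run at nice 1 in this sketch, while a task that requests nice 10 keeps nice 10.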

* Changes in SLURM 2.2.0.pre3
=============================
 -- Refine support for TotalView partial attach. Add parameter of
    "--enable-partial-attach" to the configure program.
 -- In select/cons_res, the count of CPUs on required nodes was formerly
    ignored in enforcing the maximum CPU limit. Also enforce maximum CPU
    limit when the topology/tree plugin is configured (previously ignored).
 -- In select/cons_res, allocate cores for a job using a best-fit approach.
 -- In select/cons_res, for jobs that can run on a single node, use a best-fit
    packing approach.
 -- Add support for new partition states of DRAIN and INACTIVE and new partition
    option of "Alternate" (alternate partition to use for jobs submitted to 
    partitions that are currently in a state of DRAIN or INACTIVE).
 -- Add group membership cache. This can substantially speed up slurmctld