Skip to content
Snippets Groups Projects
NEWS 54.1 KiB
Newer Older
Christopher J. Morrone's avatar
Christopher J. Morrone committed
This file describes changes in recent versions of SLURM. It primarily
documents those changes that are of interest to users and admins. 
* Changes in SLURM 0.7.0-pre1
=============================
 -- Support defered initiation of job (e.g. srun --begin=11:30 ...).

* Changes in SLURM 0.6.0-pre6
=============================
 -- Added logic to return scheduled nodes to Maui scheduler (David
    Jackson, Cluster Resources)
 -- Fix bug in handling job request with maximum node count.
 -- Fix node selection scheduling bug with heterogeneous nodes and
    srun --cpus-per-task option
 -- Generate error file to note prolog failures.

* Changes in SLURM 0.6.0-pre5
=============================
 -- Modify sfree (BGL command) so that --all option no longer requires
    an argument.
 -- Modify smap so it shows all nodes and partitions by default (even 
    nodes that the user can't access, otherwise there are holes in 
 -- Added module to parse time string (src/common/parse_time.c) for 
    future use.
 -- Fix BlueGene hostlist processing for non-rectangular prisms and
    add string length checking.
 -- Modify orphan batch job time calculation for BGL to account for 
    slowness when booting many bglblocks at the same time.
* Changes in SLURM 0.6.0-pre4
=============================
 -- Added etc/slurm.epilog.clean to kill processes initiated outside of 
    slurm when a user's last job on a node terminates.
 -- Added config.xml and configurator.html files for use by OSCAR.
 -- Increased maximum job step count from 64 to 130 for BGL systems only.
Christopher J. Morrone's avatar
Christopher J. Morrone committed
* Changes in SLURM 0.6.0-pre3
 -- Add code so job request for shared nodes gets explicitly requested 
    nodes, but lightly loaded nodes otherwise.
 -- Add job step name field.
 -- Add job step network specification field.
Christopher J. Morrone's avatar
Christopher J. Morrone committed
 -- Add proctrack/rms plugin
 -- Change the proctrack API to send a slurmd_job_t pointer to both
    slurm_container_create() and slurm_container_add().  One of those
    functions MUST set job->cont_id.
 -- Remove vestigial node_use (virtual or coprocessor) field from job
    request RPC.
 -- Fix mpich-gm bugs, thanks to Takao Hatazaki (HP).
 -- Fix code for clean build with gcc 2.96, Takao Hatazaki (HP).
 -- Add node update state of "RESUME" to return DRAINED, DRAINING, or 
    DOWN node to service (IDLE or ALLOCATED state).
 -- smap keeps trying to connect to slurmctld in iterative mode rather 
    than just aborting on failure.
 -- Add squeue option --node to filter by node name.
 -- Modify squeue --user option to accept not only user names, but also
    user IDs.
Christopher J. Morrone's avatar
Christopher J. Morrone committed

* Changes in SLURM 0.6.0-pre2
=============================
 -- Removed "make rpm" target.
* Changes in SLURM 0.6.0-pre1
=============================
 -- Added bgl/partition_allocator/smap changes from 0.5.7.
 -- Added configurable resource limit propagation  (Daniel Christians, HP).
Danny Auble's avatar
Danny Auble committed
 -- Added mpi plugin specify at start of srun.
 -- Changed SlurmUser ID from 16-bit to 32-bit.
 -- Added MpiDefault slurm.conf parameter.
 -- Remove KillTree configuration parameter (replace with
    "ProctrackType=proctrack/linuxproc")
 -- Remove MpichGmDirectSupport configuration parameter (replace with
    "MpiDefault=mpich-gm")
 -- Make default plugin be "none" for mpi.
 -- Added mpi/none plugin and made it the default.
 -- Replace extern program_invocation_short_name with program_invocation_name
    due to short name being truncated to 16 bytes on some systems.
 -- Added support for Elan clusters with different CPU counts on nodes
    (Chris Holmes, HP).
 -- Added Consumable Resources web page (Susanne Balle, HP).
Christopher J. Morrone's avatar
Christopher J. Morrone committed
 -- "Session manager" slurmd process has been eliminated.
 -- switch/federation fixes migrated from 0.5.*
 -- srun pthreads really set detached, fixes scaling problem

* Changes in SLURM 0.5.7
========================
 -- added infrastructure for (eventual) support of AIX checkpointing
    of slurm batch and interactive poe jobs
 -- added wiring for BGL to do wiring for physical location first and then
    logical.
 -- only one thread used to query database before polling thread is there.

* Changes in SLURM 0.5.6
========================
 -- fix for BGL hostnames and full system partition finding

* Changes in SLURM 0.5.5
========================
 -- Increase SLURM_MESSAGE_TIMEOUT_MSEC_STATIC to 15000
 -- Fix for premature timeout in _slurm_send_timeout
 -- Fix for federation overlapping calls to non-thread-safe _get_adapters

* Changes in SLURM 0.5.4
========================
 -- Added support for no reboot for VN to CO on BGL
 -- Fix for if a job starts after it finishes on BGL

* Changes in SLURM 0.5.3
========================
 -- federation patch so the slurm controller has sane window status at
    start-up regardless of the window status reported in the slurmd
    registration.
 -- federation driver exits with fatal() if the federation driver can not
    find all of the adapters listed in the federation.conf

* Changes in SLURM 0.5.2
========================
 -- Extra federation driver sanity checks

* Changes in SLURM 0.5.1
========================
 -- Fix federation driver bad free(), other minor fed fixes
 -- Allow slurm to parse very long lines in the slurm.conf
Moe Jette's avatar
Moe Jette committed
* Changes in SLURM 0.5.0
========================
 -- Fix race condition in job accouting plugin, could hang slurmd
 -- Report SlurmUser id over 16 bits as an error (fix on v0.6)

* Changes in SLURM 0.5.0-pre19
==============================
 -- Fix memory management bug in federation driver

* Changes in SLURM 0.5.0-pre18
==============================
 -- elan switch plugin memory leak plugged
 -- added g_slurmctld_jobacct_fini() to release all memory (useful 
    to confirm no memory leaks)
 -- Fix slurmd bug introduced in pre17
* Changes in SLURM 0.5.0-pre17
==============================
 -- slurmd calls the proctrack destroy function at job step completion
 -- federation driver tries harder to clean up switch windows
 -- BGL wiring changes
* Changes in SLURM 0.5.0-pre16
==============================
 -- Check slurm.conf values for under/overflows (some are 16 bit values).
 -- Federation driver clears windows at job step completion
 -- Modify code for clean build with gcc v4.0
 -- New SLURM_NETWORK environmant variable used by slurm_ll_api
* Changes in SLURM 0.5.0-pre15
==============================
 -- Added "network" field to "scontrol show job" output. 
 -- Federation fix for unfreed windows when multiple adapters on
    one node use the same LID
Danny Auble's avatar
Danny Auble committed
* Changes in SLURM 0.5.0-pre14
==============================
 -- RDMA works on fed plugin.

* Changes in SLURM 0.5.0-pre13
==============================
 -- Major mods to support checkpoint on AIX.
 -- Job accounting documenation expanded, added tuning options, minor bug fixes
Danny Auble's avatar
Danny Auble committed
 -- BGL wiring will now work on <= 4 node X-dim partitions and also 8 node 
    X-dim partitions.
 -- ENV variables set for spawning jobs. 
 -- jobacct patch from HP to not erroneously lock a mutex in the 
    jobacct_log plugin.
 -- switch/federation supports multiple adapters per task.  sn_all behaviour
    is now correct, and it also supports sn_single.
* Changes in SLURM 0.5.0-pre12
==============================
 -- Minor build changes to support RPM creation on AIX

Moe Jette's avatar
Moe Jette committed
* Changes in SLURM 0.5.0-pre11
==============================
 -- Slurmd tests for initialized session manager (user's) slurmd pid before 
    killing it to avoid killing system daemon (race condition).
 -- srun --output or --error file names of "none" mapped to /dev/null for 
    batch jobs rather than a file actually named "none".
Moe Jette's avatar
Moe Jette committed
 -- BGL: don't try to read bglblock state until they are all created to 
    avoid having BGL Bridge API seg fault.
Moe Jette's avatar
Moe Jette committed

* Changes in SLURM 0.5.0-pre10
==============================
 -- Fix bug that was resetting BGL job geometry on unrelated field update.
 -- squeue and sinfo print timestamp in interate mode by default.
 -- added scrolling windows in smap
 -- introduced new variable to start polling thread in the bluegene plugin.
 -- Latest accounting patches from Riebs/HP, retry communications.
 -- Added srun option --kill-on-bad-exit from Holmes/HP.
 -- Support large (64-bit address) log files where possible.
 -- Fix problem of signals being delivered twice to tasks.  Note that as
    part of the fix the slurmd session manger no longer calls setsid to
Loading
Loading full blame...