- version 15.08.7.1
* Remove the 1024-character limit on lines in batch scripts.
task/affinity: Disable core-level task binding if more CPUs required than
available cores.
* Preemption/gang scheduling: If a job is suspended at slurmctld restart or
reconfiguration time, then leave it suspended rather than resume+suspend.
* Don't use lower weight nodes for job allocation when topology/tree used.
* Don't allow user specified reservation names to disrupt the normal
reservation sequeuece numbering scheme.
* Avoid hard-link/copy of script/environment files for job arrays. Use the
master job record file for all tasks of the job array.
NOTE: Job arrays submitted to Slurm version 15.08.6 or later will fail if
the slurmctld daemon is downgraded to an earlier version of Slurm.
* In slurmctld log file, log duplicate job ID found by slurmd. Previously was
being logged as prolog/epilog failure.
* If a job is requeued while in the process of being launch, remove it's
job ID from slurmd's record of active jobs in order to avoid generating a
duplicate job ID error when launched for the second time (which would
drain the node).
* Cleanup messages when handling job script and environment variables in
older directory structure formats.
* Prevent triggering gang scheduling within a partition if configured with
PreemptType=partition_prio and PreemptMode=suspend,gang.
* Decrease parallelism in job cancel request to prevent denial of service
when cancelling huge numbers of jobs.
* If all ephemeral ports are in use, try using other port numbers.
* Prevent "scontrol update job" from updating jobs that have already finished.
* Show requested TRES in "squeue -O tres" when job is pending.
* Backfill scheduler: Test association and QOS node limits before reserving
resources for pending job.
* Many bug fixes.
- Use source services to download package.
- Fix code for new API of hwloc-2.0.
- package netloc_to_topology where avialable.
- Package documentation.
OBS-URL: https://build.opensuse.org/request/show/435622
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=10
- version 14.03.6
* Added support for native Slurm operation on Cray systems (without ALPS).
* Added partition configuration parameters AllowAccounts, AllowQOS, DenyAccounts and DenyQOS to provide greater control over use.
* Added the ability to perform load based scheduling. Allocating resources to jobs on the nodes with the largest number if idle CPUs.
* Added support for reserving cores on a compute node for system services (core specialization)
* Add mechanism for job_submit plugin to generate error message for srun, salloc or sbatch to stderr.
* Support for Postgres database has long since been out of date and problematic, so it has been removed entirely. If you would like to use it the code still exists in <= 2.6, but will not be included in this and future versions of the code.
* Added new structures and support for both server and cluster resources.
* Significant performance improvements, especially with respect to job array support.
- update files list
OBS-URL: https://build.opensuse.org/request/show/242503
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=5
- update to version 2.6.7
* Support for job arrays, which increases performance and ease of
use for sets of similar jobs.
* Job profiling capability added to record a wide variety of job
characteristics for each task on a user configurable periodic
basis. Data currently available includes CPU use, memory use,
energy use, Infiniband network use, Lustre file system use, etc.
* Support for MPICH2 using PMI2 communications interface with much
greater scalability.
* Prolog and epilog support for advanced reservations.
* Much faster throughput for job step execution with --exclusive
option. The srun process is notified when resources become
available rather than periodic polling.
* Support improved for Intel MIC (Many Integrated Core) processor.
* Advanced reservations with hostname and core counts now supports
asymmetric reservations (e.g. specific different core count for
each node).
* External sensor plugin infrastructure added to record power
consumption, temperature, etc.
* Improved performance for high-throughput computing.
* MapReduce+ support (launches ~1000x faster, runs ~10x faster).
* Added "MaxCPUsPerNode" partition configuration parameter. This
can be especially useful to schedule GPUs. For example a node
can be associated with two Slurm partitions (e.g. "cpu" and
"gpu") and the partition/queue "cpu" could be limited to only a
subset of the node's CPUs, insuring that one or more CPUs would
be available to jobs in the "gpu" partition/queue.
OBS-URL: https://build.opensuse.org/request/show/226317
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=4
- version 2.5.7
* Fix for linking to the select/cray plugin to not give warning
about undefined variable.
* Add missing symbols to the xlator.h
* Avoid placing pending jobs in AdminHold state due to backfill
scheduler interactions with advanced reservation.
* Accounting - make average by task not cpu.
* POE - Correct logic to support poe option "-euidevice sn_all"
and "-euidevice sn_single".
* Accounting - Fix minor initialization error.
* POE - Correct logic to support srun network instances count
with POE.
* POE - With the srun --launch-cmd option, report proper task
count when the --cpus-per-task option is used without the
--ntasks option.
* POE - Fix logic binding tasks to CPUs.
* sview - Fix race condition where new information could of
slipped past the node tab and we didn't notice.
* Accounting - Fix an invalid memory read when slurmctld sends
data about start job to slurmdbd.
* If a prolog or epilog failure occurs, drain the node rather
than setting it down and killing all of its jobs.
* Priority/multifactor - Avoid underflow in half-life calculation.
* POE - pack missing variable to allow fanout (more than 32
nodes)
* Prevent clearing reason field for pending jobs. This bug was
introduced in v2.5.5 (see "Reject job at submit time ...").
* BGQ - Fix issue with preemption on sub-block jobs where a job
would kill all preemptable jobs on the midplane instead of just
the ones it needed to.
OBS-URL: https://build.opensuse.org/request/show/177944
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=3
- version 2.5.4
* Support for Intel® Many Integrated Core (MIC) processors.
* User control over CPU frequency of each job step.
* Recording power usage information for each job.
* Advanced reservation of cores rather than whole nodes.
* Integration with IBM's Parallel Environment including POE (Parallel Operating Environment) and NRT (Network Resource Table) API.
* Highly optimized throughput for serial jobs in a new "select/serial" plugin.
* CPU load is information available
* Configurable number of CPUs available to jobs in each SLURM partition, which provides a mechanism to reserve CPUs for use with GPUs.
OBS-URL: https://build.opensuse.org/request/show/163479
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=2