forked from pool/slurm
2028708d3a
OBS-URL: https://build.opensuse.org/request/show/435622 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=10
244 lines
10 KiB
Plaintext
-------------------------------------------------------------------
Sat Oct 15 18:11:39 UTC 2016 - eich@suse.com

- version 15.08.7.1
  * Remove the 1024-character limit on lines in batch scripts.
  * task/affinity: Disable core-level task binding if more CPUs required than
    available cores.
  * Preemption/gang scheduling: If a job is suspended at slurmctld restart or
    reconfiguration time, then leave it suspended rather than resume+suspend.
  * Don't use lower weight nodes for job allocation when topology/tree used.
  * Don't allow user specified reservation names to disrupt the normal
    reservation sequence numbering scheme.
  * Avoid hard-link/copy of script/environment files for job arrays. Use the
    master job record file for all tasks of the job array.
    NOTE: Job arrays submitted to Slurm version 15.08.6 or later will fail if
    the slurmctld daemon is downgraded to an earlier version of Slurm.
  * In slurmctld log file, log duplicate job ID found by slurmd. Previously was
    being logged as prolog/epilog failure.
  * If a job is requeued while in the process of being launched, remove its
    job ID from slurmd's record of active jobs in order to avoid generating a
    duplicate job ID error when launched for the second time (which would
    drain the node).
  * Clean up messages when handling job script and environment variables in
    older directory structure formats.
  * Prevent triggering gang scheduling within a partition if configured with
    PreemptType=partition_prio and PreemptMode=suspend,gang.
  * Decrease parallelism in job cancel request to prevent denial of service
    when cancelling huge numbers of jobs.
  * If all ephemeral ports are in use, try using other port numbers.
  * Prevent "scontrol update job" from updating jobs that have already finished.
  * Show requested TRES in "squeue -O tres" when job is pending.
  * Backfill scheduler: Test association and QOS node limits before reserving
    resources for pending job.
  * Many bug fixes.
- Use source services to download package.
- Fix code for new API of hwloc-2.0.
- Package netloc_to_topology where available.
- Package documentation.

-------------------------------------------------------------------
Sun Nov 1 13:45:52 UTC 2015 - scorot@free.fr

- version 15.08.3
  * Many new features and bug fixes. See NEWS file
- update files list accordingly
- fix wrong end of line in some files

-------------------------------------------------------------------
Thu Aug 6 19:06:18 UTC 2015 - scorot@free.fr

- version 14.11.8
  * Many bug fixes. See NEWS file
- update files list accordingly

-------------------------------------------------------------------
Sun Nov 2 22:12:34 UTC 2014 - scorot@free.fr

- add missing systemd requirements
- add missing rclink

-------------------------------------------------------------------
Sun Nov 2 15:04:42 UTC 2014 - scorot@free.fr

- version 14.03.9
  * Many bug fixes. See NEWS file
- add systemd support

-------------------------------------------------------------------
Sat Jul 26 10:22:32 UTC 2014 - scorot@free.fr

- version 14.03.6
  * Added support for native Slurm operation on Cray systems
    (without ALPS).
  * Added partition configuration parameters AllowAccounts,
    AllowQOS, DenyAccounts and DenyQOS to provide greater control
    over use.
  * Added the ability to perform load-based scheduling, allocating
    resources to jobs on the nodes with the largest number of idle
    CPUs.
  * Added support for reserving cores on a compute node for system
    services (core specialization).
  * Add mechanism for job_submit plugin to generate error message
    for srun, salloc or sbatch to stderr.
  * Support for Postgres database has long since been out of date
    and problematic, so it has been removed entirely. If you
    would like to use it, the code still exists in <= 2.6, but will
    not be included in this and future versions of the code.
  * Added new structures and support for both server and cluster
    resources.
  * Significant performance improvements, especially with respect
    to job array support.
- update files list

-------------------------------------------------------------------
Sun Mar 16 15:59:01 UTC 2014 - scorot@free.fr

- update to version 2.6.7
  * Support for job arrays, which increases performance and ease of
    use for sets of similar jobs.
  * Job profiling capability added to record a wide variety of job
    characteristics for each task on a user configurable periodic
    basis. Data currently available includes CPU use, memory use,
    energy use, Infiniband network use, Lustre file system use, etc.
  * Support for MPICH2 using PMI2 communications interface with much
    greater scalability.
  * Prolog and epilog support for advanced reservations.
  * Much faster throughput for job step execution with --exclusive
    option. The srun process is notified when resources become
    available rather than periodic polling.
  * Support improved for Intel MIC (Many Integrated Core) processor.
  * Advanced reservations with hostname and core counts now support
    asymmetric reservations (e.g. specific different core count for
    each node).
  * External sensor plugin infrastructure added to record power
    consumption, temperature, etc.
  * Improved performance for high-throughput computing.
  * MapReduce+ support (launches ~1000x faster, runs ~10x faster).
  * Added "MaxCPUsPerNode" partition configuration parameter. This
    can be especially useful to schedule GPUs. For example a node
    can be associated with two Slurm partitions (e.g. "cpu" and
    "gpu") and the partition/queue "cpu" could be limited to only a
    subset of the node's CPUs, ensuring that one or more CPUs would
    be available to jobs in the "gpu" partition/queue.

-------------------------------------------------------------------
Thu Jun 6 20:31:49 UTC 2013 - scorot@free.fr

- version 2.5.7
  * Fix for linking to the select/cray plugin to not give warning
    about undefined variable.
  * Add missing symbols to the xlator.h
  * Avoid placing pending jobs in AdminHold state due to backfill
    scheduler interactions with advanced reservation.
  * Accounting - make average by task not cpu.
  * POE - Correct logic to support poe option "-euidevice sn_all"
    and "-euidevice sn_single".
  * Accounting - Fix minor initialization error.
  * POE - Correct logic to support srun network instances count
    with POE.
  * POE - With the srun --launch-cmd option, report proper task
    count when the --cpus-per-task option is used without the
    --ntasks option.
  * POE - Fix logic binding tasks to CPUs.
  * sview - Fix race condition where new information could have
    slipped past the node tab and we didn't notice.
  * Accounting - Fix an invalid memory read when slurmctld sends
    data about start job to slurmdbd.
  * If a prolog or epilog failure occurs, drain the node rather
    than setting it down and killing all of its jobs.
  * Priority/multifactor - Avoid underflow in half-life calculation.
  * POE - pack missing variable to allow fanout (more than 32
    nodes)
  * Prevent clearing reason field for pending jobs. This bug was
    introduced in v2.5.5 (see "Reject job at submit time ...").
  * BGQ - Fix issue with preemption on sub-block jobs where a job
    would kill all preemptable jobs on the midplane instead of just
    the ones it needed to.
  * switch/nrt - Validate dynamic window allocation size.
  * BGQ - When --geo is requested do not impose the default
    conn_types.
  * RebootNode logic - Defers (rather than forgets) reboot request
    with job running on the node within a reservation.
  * switch/nrt - Correct network_id use logic. Correct support for
    user sn_all and sn_single options.
  * sched/backfill - Modify logic to reduce overhead under heavy
    load.
  * Fix job step allocation with --exclusive and --hostlist option.
  * Select/cons_res - Fix bug resulting in error of "cons_res: sync
    loop not progressing, holding job #"
  * checkpoint/blcr - Reset max_nodes from zero to NO_VAL on job
    restart.
  * launch/poe - Fix for hostlist file support with repeated host
    names.
  * priority/multifactor2 - Prevent possible divide by zero.
  * srun - Don't check for executable if --test-only flag is
    used.
  * energy - On a single node, only use the last task for gathering
    energy, since we don't currently track energy usage per task
    (only per step); otherwise we get double the energy.

-------------------------------------------------------------------
Sat Apr 6 11:13:17 UTC 2013 - scorot@free.fr

- version 2.5.4
  * Support for Intel® Many Integrated Core (MIC) processors.
  * User control over CPU frequency of each job step.
  * Recording power usage information for each job.
  * Advanced reservation of cores rather than whole nodes.
  * Integration with IBM's Parallel Environment including POE (Parallel
    Operating Environment) and NRT (Network Resource Table) API.
  * Highly optimized throughput for serial jobs in a new
    "select/serial" plugin.
  * CPU load information is available.
  * Configurable number of CPUs available to jobs in each SLURM
    partition, which provides a mechanism to reserve CPUs for use
    with GPUs.

-------------------------------------------------------------------
Sat Nov 17 18:02:16 UTC 2012 - scorot@free.fr

- remove runlevel 4 from init script thanks to patch1
- fix self obsoletion of slurm-munge package
- use fdupes to remove duplicates
- spec file reformatting

-------------------------------------------------------------------
Sat Nov 17 17:30:11 UTC 2012 - scorot@free.fr

- put perl macro in a better place within install section

-------------------------------------------------------------------
Sat Nov 17 17:01:20 UTC 2012 - scorot@free.fr

- enable numa on x86_64 arch only

-------------------------------------------------------------------
Sat Nov 17 16:54:18 UTC 2012 - scorot@free.fr

- add numa and hwloc support
- fix rpath with patch0

-------------------------------------------------------------------
Fri Nov 16 21:46:49 UTC 2012 - scorot@free.fr

- fix perl module files list

-------------------------------------------------------------------
Mon Nov 5 21:48:52 UTC 2012 - scorot@free.fr

- use perl_process_packlist macro for the perl files cleanup
- fix some summaries length
- add cgroups directory and the example cgroup.release_common file

-------------------------------------------------------------------
Sat Nov 3 18:19:59 UTC 2012 - scorot@free.fr

- spec file cleanup

-------------------------------------------------------------------
Sat Nov 3 15:57:47 UTC 2012 - scorot@free.fr

- first package