- Updated to 20.02.3 which fixes CVE-2020-12693 (bsc#1172004).
- Other changes are:
* Factor in ntasks-per-core=1 with cons_tres.
* Fix formatting in error message in cons_tres.
* Fix calling stat on a NULL variable.
* Fix minor memory leak when using reservations with flags=first_cores.
* Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
* Fix --mem-per-gpu for heterogeneous --gres requests.
* Fix slurmctld load order in load_all_part_state().
* Fix race condition not finding jobacct gather task cgroup entry.
* Suppress error message when selecting nodes on disjoint topologies.
* Improve performance of _pack_default_job_details() with a large number of job
arguments.
* Fix archive loading previous to 17.11 jobs per-node req_mem.
* Fix regression validating that --gpus-per-socket requires --sockets-per-node
for steps. Should only validate allocation requests.
* error() instead of fatal() when parsing an invalid hostlist.
* nss_slurm - fix potential deadlock in slurmstepd on overloaded systems.
* cons_tres - fix --gres-flags=enforce-binding and related --cpus-per-gres.
* cons_tres - Allocate lowest numbered cores when filtering cores with gres.
* Fix getting system counts for named GRES/TRES.
* MySQL - Fix for handling typed GRES for association rollups.
* Fix step allocations when tasks_per_core > 1.
* Fix allocating more GRES than requested when asking for multiple GRES types.
OBS-URL: https://build.opensuse.org/request/show/808569
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=45
- Updated to 20.02.3 which fixes CVE-2020-12693
- Other changes are:
* Factor in ntasks-per-core=1 with cons_tres.
* Fix formatting in error message in cons_tres.
* Fix calling stat on a NULL variable.
* Fix minor memory leak when using reservations with flags=first_cores.
* Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
* Fix --mem-per-gpu for heterogeneous --gres requests.
* Fix slurmctld load order in load_all_part_state().
* Fix race condition not finding jobacct gather task cgroup entry.
* Suppress error message when selecting nodes on disjoint topologies.
* Improve performance of _pack_default_job_details() with a large number of job
arguments.
* Fix archive loading previous to 17.11 jobs per-node req_mem.
* Fix regression validating that --gpus-per-socket requires --sockets-per-node
for steps. Should only validate allocation requests.
* error() instead of fatal() when parsing an invalid hostlist.
* nss_slurm - fix potential deadlock in slurmstepd on overloaded systems.
* cons_tres - fix --gres-flags=enforce-binding and related --cpus-per-gres.
* cons_tres - Allocate lowest numbered cores when filtering cores with gres.
* Fix getting system counts for named GRES/TRES.
* MySQL - Fix for handling typed GRES for association rollups.
* Fix step allocations when tasks_per_core > 1.
* Fix allocating more GRES than requested when asking for multiple GRES types.
OBS-URL: https://build.opensuse.org/request/show/808130
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=147
- Updated to 20.02.1 with the following changes:
* Improve job state reason for jobs hitting partition_job_depth.
* Speed up testing of singleton dependencies.
* Fix negative loop bound in cons_tres.
* srun - capture the MPI plugin return code from mpi_hook_client_fini() and
use as final return code for step failure.
* Fix segfault in cli_filter/lua.
* Fix --gpu-bind=map_gpu reusability if tasks > elements.
* Make sure config_flags on a gres are sent to the slurmctld on node
registration.
* Prolog/Epilog - Fix missing GPU information.
* Fix segfault when using config parser for expanded lines.
* Fix bit overlap test function.
* Don't accrue time if job begin time is in the future.
* Remove accrue time when updating a job start/eligible time to the future.
* Fix regression in 20.02.0 that broke --depend=expand.
* Reset begin time on job release if it's not in the future.
* Fix for recovering burst buffers when using high-availability.
* Fix invalid read due to freeing an incorrectly allocated env array.
* Update slurmctld -i message to warn about losing data.
* Fix scontrol cancel_reboot so it clears the DRAIN flag and node reason for a
pending ASAP reboot.
OBS-URL: https://build.opensuse.org/request/show/788905
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=145
- Update to version 20.02.0 (jsc#SLE-8491)
* Fix minor memory leak in slurmd on reconfig.
* Fix invalid ptr reference when rolling up data in the database.
* Change shtml2html.py to require python3 for RHEL8 support, and match
man2html.py.
* slurm.spec - override "hardening" linker flags to ensure RHEL8 builds
in a usable manner.
* Fix type mismatches in the perl API.
* Prevent use of uninitialized slurmctld_diag_stats.
* Fixed various Coverity issues.
* Only show warning about root-less topology in daemons.
* Fix accounting of jobs in IGNORE_JOBS reservations.
* Fix issue with batch steps state not loading correctly when upgrading from
19.05.
* Deprecate max_depend_depth in SchedulerParameters and move it to
DependencyParameters.
* Silence erroneous error on slurmctld upgrade when loading federation state.
* Break infinite loop in cons_tres dealing with incorrect tasks per tres
request resulting in slurmctld hang.
* Improve handling of --gpus-per-task to make sure appropriate number of GPUs
is assigned to job.
* Fix seg fault on cons_res when requesting --spread-job.
- Move to python3 for everything but SLE-11-SP4
* For SLE-11-SP4 add a workaround to handle a python3 script (python2.7
compliant).
OBS-URL: https://build.opensuse.org/request/show/779379
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=136
- Update to version 20.02.0-rc1
* sbatch - fix segfault when no newline at the end of a burst buffer file.
* Change scancel to only check job's base state when matching -t options.
* Save job dependency list in state files.
* cons_tres - allow jobs to be run on systems with root-less topologies.
* Restore pre-20.02pre1 PrologSlurmctld synchronization behavior to avoid
various race conditions, and ensure proper batch job launch.
* Add new slurmrestd command/daemon which implements the Slurm REST API.
- Update to version 20.02.0-0pre1, highlights are
OBS-URL: https://build.opensuse.org/request/show/774250
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=134
- Updated to version 20.02.0-0pre1. Highlights:
* Exclusive behavior of a node includes all GRES on a node as well
as the cpus.
* Use python3 instead of python for internal build/test scripts.
The slurm.spec file has been updated to depend on python3 as well.
* Added new NodeSet configuration option to help simplify partition
configuration sections for heterogeneous / condo-style clusters.
* Added slurm.conf option MaxDBDMsgs to control how many messages will be
stored in the slurmctld before throwing them away when the slurmdbd is down.
* The checkpoint plugin interface and all associated API calls have been
removed.
* slurm_init_job_desc_msg() initializes mail_type as uint16_t. This allows
mail_type to be set to NONE with scontrol.
* Add new slurm_spank_log() function to print messages back to the user from
within a SPANK plugin without prepending "error: " from slurm_error().
* Enforce having partition name and nodelist=ALL when creating reservations
with flags=PART_NODES.
* SPANK - removed never-implemented slurm_spank_slurmd_init() interface. This
hook has always been accessible through slurm_spank_init() in the
S_CTX_SLURMD context instead.
* sbcast - add new BcastAddr option to NodeName lines to allow sbcast traffic
to flow over an alternate network path.
* Added auth/jwt plugin, and 'scontrol token' subcommand.
* PMIx - improve performance of proc map generation.
* Deprecate kill_invalid_depend in SchedulerParameters and move it to a new
option called DependencyParameters.
* Enable job dependencies for any job on any cluster in the same federation.
* Allow clusters to be added automatically to db at startup of ctld.
* Add AccountingStorageExternalHost slurm.conf parameter. The
OBS-URL: https://build.opensuse.org/request/show/773459
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=130
- BuildRequire pkgconfig(systemd) instead of systemd: allow OBS to
shortcut through the -mini flavors.
- Use systemd_ordering instead of systemd_requires: systemd is
never a strict requirement; but in case the system is scheduled
for installation together with systemd, we want systemd to be
installed prior to slurm.
- start slurmdbd after mariadb (bsc#1161716)
OBS-URL: https://build.opensuse.org/request/show/766872
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=123
- Update to version 19.05.5 (jsc#SLE-8491)
* Check %docdir/NEWS for details.
* Includes security fixes CVE-2019-19727, CVE-2019-19728,
CVE-2019-12838.
* Disable i586 builds as this is no longer supported.
* Create libnss_slurm package to support user and group resolution
thru slurmstepd.
* slurm-2.4.4-rpath.patch -> Remove-rpath-from-build.patch
Obsoleted:
- pam_slurm_adopt-avoid-running-outside-of-the-sshd-PA.patch
- pam_slurm_adopt-send_user_msg-don-t-copy-undefined-d.patch
- pam_slurm_adopt-use-uid-to-determine-whether-root-is.patch
OBS-URL: https://build.opensuse.org/request/show/762650
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=118
- Deprecate "ControlMachine" only for SLURM version upgrades and
products newer than 1501. This ensures that the original setting
is retained for the SLURM version shipped originally with SLE-15-SP1
or Leap 15.1.
- Update to v18.08.9 for fixing CVE-2019-19728 (bsc#1159692).
* Wrap END_TIMER{,2,3} macro definition in "do {} while (0)" block.
* Make sview work with glib2 v2.62.
* Make Slurm compile on linux after sys/sysctl.h was deprecated.
* Install slurmdbd.conf.example with 0600 permissions to encourage secure
use. CVE-2019-19727.
* srun - do not continue with job launch if --uid fails. CVE-2019-19728.
- added pmix support jsc#SLE-10800
- Use --with-shared-libslurm to build slurm binaries using libslurm.
- Make libslurm depend on slurm-config.
- Fix ownership of /var/spool/slurm on new installations
and upgrade (boo#1158696).
- Fix permissions of slurmdbd.conf (bsc#1155784, CVE-2019-19727).
- Fix %posttrans macro _res_update to cope with added newline
(bsc#1153259).
- Add package slurm-webdoc which sets up a web server to provide
the documentation for the version shipped.
- Move srun from 'slurm' to 'slurm-node': srun is required on the
nodes as well so sbatch will work. 'slurm-node' is a requirement (forwarded request 760450 from eeich)
OBS-URL: https://build.opensuse.org/request/show/761961
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=34
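The END_TIMER change listed above relies on a common C idiom: wrapping a
multi-statement macro in "do {} while (0)" so that it expands to a single
statement. The sketch below illustrates the idiom with two hypothetical macros,
BUMP_BOTH_UNSAFE and BUMP_BOTH_SAFE. They are made up for this example and are
not Slurm's END_TIMER definition.

```c
/*
 * Hypothetical macros for illustration only; not taken from Slurm.
 *
 * Unsafe: expands to two statements, so an unbraced `if` guards only
 * the first one.
 */
#define BUMP_BOTH_UNSAFE(a, b) (a)++; (b)++

/* Safe: do { } while (0) turns the expansion into one statement. */
#define BUMP_BOTH_SAFE(a, b) do { (a)++; (b)++; } while (0)

/* Returns a * 10 + b so that both counters are visible in the result. */
int demo_unsafe(int cond)
{
	int a = 0, b = 0;

	if (cond)
		BUMP_BOTH_UNSAFE(a, b); /* (b)++ runs even when cond is 0 */
	return a * 10 + b;
}

int demo_safe(int cond)
{
	int a = 0, b = 0;

	if (cond)
		BUMP_BOTH_SAFE(a, b); /* both increments are guarded */
	return a * 10 + b;
}
```

With cond set to 0, demo_unsafe() still increments b, while demo_safe() leaves
both counters untouched. The unwrapped form also fails to compile when an
"else" follows the macro call, because the stray second statement separates the
"else" from its "if".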
- Deprecate "ControlMachine" only for SLURM version upgrades and
products newer than 1501. This ensures that the original setting
is retained for the SLURM version shipped originally with SLE-15-SP1
or Leap 15.1.
- Update to v18.08.9 for fixing CVE-2019-19728 (bsc#1159692).
* Wrap END_TIMER{,2,3} macro definition in "do {} while (0)" block.
* Make sview work with glib2 v2.62.
* Make Slurm compile on linux after sys/sysctl.h was deprecated.
* Install slurmdbd.conf.example with 0600 permissions to encourage secure
use. CVE-2019-19727.
* srun - do not continue with job launch if --uid fails. CVE-2019-19728.
- added pmix support jsc#SLE-10800
- Use --with-shared-libslurm to build slurm binaries using libslurm.
- Make libslurm depend on slurm-config.
- Fix ownership of /var/spool/slurm on new installations
and upgrade (boo#1158696).
- Fix permissions of slurmdbd.conf (bsc#1155784, CVE-2019-19727).
- Fix %posttrans macro _res_update to cope with added newline
(bsc#1153259).
- Add package slurm-webdoc which sets up a web server to provide
the documentation for the version shipped.
- Move srun from 'slurm' to 'slurm-node': srun is required on the
nodes as well so sbatch will work. 'slurm-node' is a requirement
OBS-URL: https://build.opensuse.org/request/show/760450
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=116
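The slurmdbd.conf entries above (CVE-2019-19727) are about keeping the file
readable by its owner only, i.e. mode 0600. As a rough sketch of how such a
check could look in C, the helpers below test a file mode for group/other
access. The function names mode_is_owner_only() and file_is_owner_only() are
assumptions made for this example and are not part of Slurm.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: return 1 if the mode grants no access to group
 * or others (for example 0600), 0 otherwise.
 */
int mode_is_owner_only(mode_t mode)
{
	return (mode & (S_IRWXG | S_IRWXO)) == 0;
}

/*
 * Hypothetical helper: return 1 if `path` is owner-only, 0 if it is
 * more permissive, and -1 if stat() fails (e.g. the file is missing).
 */
int file_is_owner_only(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return mode_is_owner_only(st.st_mode);
}
```

A packaging test or admin tool could call file_is_owner_only() on the installed
configuration file and warn when it returns 0.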
- added cray depend libraries to separate package, as they are now
built, since json is enabled
- Updated to 18.0.7 for fixing CVE-2019-12838 (bsc#1140709)
* Update "xauth list" to use the same 10000ms timeout as the other xauth
commands.
* Fix issue in gres code to handle a gres cnt of 0.
* Don't purge jobs if backfill is running.
* Verify job is pending add/removing accrual time.
* Don't abort when the job doesn't have an association that was removed
before the job was able to make it to the database.
* Set state_reason if select_nodes() fails job for QOS or Account.
* Avoid seg_fault on referencing association without a valid_qos bitmap.
* If Association/QOS is removed on a pending job set that job as ineligible.
* When changing a jobs account/qos always make sure you remove the old limits.
* Don't reset a FAIL_QOS or FAIL_ACCOUNT job reason until the qos or
account changed.
* Restore "sreport -T ALL" functionality.
* Correctly typecast signals being sent through the api.
* Properly initialize structures throughout Slurm.
* Sync "numtask" squeue format option for jobs and steps to "numtasks".
* Fix sacct -PD to avoid CA before start jobs.
* Fix potential deadlock with backup slurmctld.
* Fixed issue with jobs not appearing in sacct after dependency satisfied.
* Fix showing non-eligible jobs when asking with -j and not -s.
* Fix issue with backfill scheduler scheduling tasks of an array
when not the head job.
* accounting_storage/mysql - fix SIGABRT in the archive load logic.
* accounting_storage/mysql - fix memory leak in the archive load logic.
* Limit records per single SQL statement when loading archived data. (forwarded request 714908 from mslacken)
OBS-URL: https://build.opensuse.org/request/show/714909
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=28
- added cray depend libraries to separate package, as they are now
built, since json is enabled
- Updated to 18.0.7 for fixing CVE-2019-12838 (bsc#1140709)
* Update "xauth list" to use the same 10000ms timeout as the other xauth
commands.
* Fix issue in gres code to handle a gres cnt of 0.
* Don't purge jobs if backfill is running.
* Verify job is pending add/removing accrual time.
* Don't abort when the job doesn't have an association that was removed
before the job was able to make it to the database.
* Set state_reason if select_nodes() fails job for QOS or Account.
* Avoid seg_fault on referencing association without a valid_qos bitmap.
* If Association/QOS is removed on a pending job set that job as ineligible.
* When changing a jobs account/qos always make sure you remove the old limits.
* Don't reset a FAIL_QOS or FAIL_ACCOUNT job reason until the qos or
account changed.
* Restore "sreport -T ALL" functionality.
* Correctly typecast signals being sent through the api.
* Properly initialize structures throughout Slurm.
* Sync "numtask" squeue format option for jobs and steps to "numtasks".
* Fix sacct -PD to avoid CA before start jobs.
* Fix potential deadlock with backup slurmctld.
* Fixed issue with jobs not appearing in sacct after dependency satisfied.
* Fix showing non-eligible jobs when asking with -j and not -s.
* Fix issue with backfill scheduler scheduling tasks of an array
when not the head job.
* accounting_storage/mysql - fix SIGABRT in the archive load logic.
* accounting_storage/mysql - fix memory leak in the archive load logic.
* Limit records per single SQL statement when loading archived data.
OBS-URL: https://build.opensuse.org/request/show/714908
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=100