- Update to version 23.11.03
* `slurmrestd` - Reject a single HTTP query with multiple path
requests.
* Fix launching Singularity v4.x containers with
`srun --container` by setting `.process.terminal` to true in the
generated `config.json` when the step requests a pseudoterminal
(`--pty`).
* Fix loading in `dynamic/cloud` node jobs after `net_cred`
expired.
* Fix cgroup null path error on `slurmd/slurmstepd` tear down.
* `data_parser/v0.0.40` - Prevent failure if accounting is
disabled; instead, issue a warning if needed data cannot be
retrieved from the database.
* `openapi/slurmctld` - Prevent failure if accounting is disabled.
* Prevent `slurmscriptd` processing delays from blocking other
threads in `slurmctld` while trying to launch various scripts.
This is additional work for a fix in 23.02.6.
* Fix memory leak when receiving alias addrs from controller.
* `scontrol` - Accept `scontrol token lifespan=infinite` to
create tokens that effectively do not expire.
* Avoid errors when Slurmdb accounting is disabled and `--json` or
`--yaml` is invoked with CLI commands and `slurmrestd`. Issue
warnings instead of errors when a query would have populated
data from Slurmdb.
* Fix `slurmctld` memory leak when running a job with
`--tres-per-task=gres:shard:#`.
* Fix backfill trying to start jobs outside of backfill window.
* Fix oversubscription on partitions with `PreemptMode=OFF`.
* Preserve node reason on power up if the node is downed
or drained.
OBS-URL: https://build.opensuse.org/request/show/1150524
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=289
- Update to 23.02.6 to fix (CVE-2023-49933 - bsc#1218046, CVE-2023-49935 -
bsc#1218049, CVE-2023-49936 - bsc#1218050, CVE-2023-49937 - bsc#1218051,
CVE-2023-49938 - bsc#1218053)
* Security Fixes:
+ Add `JobAcctGatherParams=DisableGPUAcct` to disable GPU accounting.
+ `acct_gather_energy/ipmi` - Improve logging of DCMI issues.
+ `gpu/oneapi` - Add support for new env vars `ZE_FLAT_DEVICE_HIERARCHY`
and `ZE_ENABLE_PCI_ID_DEVICE_ORDER`.
+ `data_parser/v0.0.39` - Skip empty strings when parsing QOS IDs.
+ Remove error message from `assoc_mgr_update_assocs` when purposefully
resetting the default QOS.
* Bug Fixes:
+ `libslurm_nss` - Avoid causing glibc to assert due to an unexpected
return from slurm_nss due to an error during lookup.
+ Fix job requests with `--tres-per-task` sometimes resulting in bad
allocations that cannot run subsequent job steps.
+ Fix issue with `slurmd` where `srun` is not warned when a node
prolog script runs beyond the `MsgTimeout` set in `slurm.conf`.
+ `gres/shard` - Fix plugin functions to have matching parameter orders.
+ `gpu/nvml` - Fix issue that resulted in the wrong MIG devices being
constrained to a job.
+ `gpu/nvml` - Fix linking issue with MIGs that prevented multiple MIGs
being used in a single job for certain MIG configurations.
+ Fix file descriptor leak in `slurmd` when using `acct_gather_energy/ipmi`
with DCMI devices.
+ `sview` - Avoid crash when a job has a node list string longer than
49 characters.
+ Prevent `slurmctld` crash during reconfigure when packing job start
messages.
+ Preserve reason uid on reconfig.
+ Update node reason with updated `INVAL` state reason if different from
OBS-URL: https://build.opensuse.org/request/show/1136624
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=282
- Explicitly create an Obsoletes: entry for each package version
that is obsoleted by the present version. These are all published
versions of the last two major releases as well as all minor
versions of the present release lower than the current one
(bsc#1216869 2nd part).
This prevents the current version from upgrading an old Slurm version
for which no upgrade path exists.
OBS-URL: https://build.opensuse.org/request/show/1129638
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=279
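As a rough illustration of the Obsoletes scheme described in the entry above, the following minimal sketch generates one `Obsoletes:` line per published version lower than the current one. The function name `obsoletes_entries`, the package name, and the version list are hypothetical inputs chosen for the example; the actual spec file may build these entries differently, and restricting the list to the supported release series is left to the caller.

```python
def version_key(version):
    """Turn a dotted version such as '23.02.6' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def obsoletes_entries(package, current, published):
    """Return spec-style 'Obsoletes:' lines for every published version
    that is strictly lower than the current one.

    'published' is an assumed, caller-supplied list of version strings;
    it is not taken from the real package.
    """
    current_key = version_key(current)
    lines = []
    for version in sorted(set(published), key=version_key):
        if version_key(version) < current_key:
            lines.append(f"Obsoletes: {package} = {version}")
    return lines


if __name__ == "__main__":
    # Hypothetical version list, for illustration only.
    for line in obsoletes_entries(
        "slurm", "23.02.6", ["22.05.10", "23.02.5", "23.02.6", "23.11.0"]
    ):
        print(line)
```

Versions equal to or higher than the current one are skipped, so the package never declares that it obsoletes itself or a newer release.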