forked from pool/slurm
Accepting request 1117145 from home:mslacken:branches:network:cluster
* Bug Fixes:
  + Fix CpusPerTres= not updatable with scontrol update
  + Fix unintentional gres removal when validating the gres job state.
  + Fix --without-hpe-slingshot configure option.
  + Fix cgroup v2 memory calculations when transparent huge pages are used.
  + Fix parsing of sgather --timeout option.
  + Fix regression from 22.05.0 that caused srun --cpu-bind "=verbose" and "=v"
    options to give different CPU bind masks.
  + Fix "_find_node_record: lookup failure for node" error message appearing
    for all dynamic nodes during reconfigure.
  + Avoid segfault if loading serializer plugin fails.
  + slurmrestd - Correct OpenAPI format for 'GET /slurm/v0.0.39/licenses'.
  + slurmrestd - Correct OpenAPI format for 'GET /slurm/v0.0.39/job/{job_id}'.
  + slurmrestd - Change format to multiple fields in 'GET
    /slurmdb/v0.0.39/associations' and 'GET /slurmdb/v0.0.39/qos' to handle
    infinite and unset states.
  + When a node fails in a job with --no-kill, preserve the extern step on the
    remaining nodes to avoid breaking features that rely on the extern step
    such as pam_slurm_adopt, x11, and job_container/tmpfs.
  + auth/jwt - Ignore 'x5c' field in JWKS files.
  + auth/jwt - Treat 'alg' field as optional in JWKS files.
  + Allow job_desc.selinux_context to be read from the job_submit.lua script.
  + Skip check in slurmstepd that causes a large number of errors in the munge
    log: "Unauthorized credential for client UID=0 GID=0". This error will
    still appear on slurmd/slurmctld/slurmdbd start up and is not a cause for
    concern.
  + slurmctld - Allow startup with zero partitions.
  + Fix some MIG profile names in Slurm not matching NVIDIA MIG profiles.
  + Prevent slurmscriptd processing delays from blocking other threads in
    slurmctld while trying to launch {Prolog|Epilog}Slurmctld.

OBS-URL: https://build.opensuse.org/request/show/1117145
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=268
parent 90bba6a8aa
commit cd2c5bfc50
@@ -3,6 +3,63 @@ Thu Oct 12 08:23:20 UTC 2023 - Christian Goll <cgoll@suse.com>
- update to 23.02.6 to fix (CVE-2023-41914)
  * Removed Fix-test-32.8.patch as fixed upstream
  * Bug Fixes:
    + Fix CpusPerTres= not updatable with scontrol update
    + Fix unintentional gres removal when validating the gres job state.
    + Fix --without-hpe-slingshot configure option.
    + Fix cgroup v2 memory calculations when transparent huge pages are used.
    + Fix parsing of sgather --timeout option.
    + Fix regression from 22.05.0 that caused srun --cpu-bind "=verbose" and "=v"
      options to give different CPU bind masks.
    + Fix "_find_node_record: lookup failure for node" error message appearing
      for all dynamic nodes during reconfigure.
    + Avoid segfault if loading serializer plugin fails.
    + slurmrestd - Correct OpenAPI format for 'GET /slurm/v0.0.39/licenses'.
    + slurmrestd - Correct OpenAPI format for 'GET /slurm/v0.0.39/job/{job_id}'.
    + slurmrestd - Change format to multiple fields in 'GET
      /slurmdb/v0.0.39/associations' and 'GET /slurmdb/v0.0.39/qos' to handle
      infinite and unset states.
    + When a node fails in a job with --no-kill, preserve the extern step on the
      remaining nodes to avoid breaking features that rely on the extern step
      such as pam_slurm_adopt, x11, and job_container/tmpfs.
    + auth/jwt - Ignore 'x5c' field in JWKS files.
    + auth/jwt - Treat 'alg' field as optional in JWKS files.
    + Allow job_desc.selinux_context to be read from the job_submit.lua script.
    + Skip check in slurmstepd that causes a large number of errors in the munge
      log: "Unauthorized credential for client UID=0 GID=0". This error will
      still appear on slurmd/slurmctld/slurmdbd start up and is not a cause for
      concern.
    + slurmctld - Allow startup with zero partitions.
    + Fix some MIG profile names in Slurm not matching NVIDIA MIG profiles.
    + Prevent slurmscriptd processing delays from blocking other threads in
      slurmctld while trying to launch {Prolog|Epilog}Slurmctld.
    + Fix sacct printing ReqMem field when memory doesn't exist in requested TRES.
    + Fix how heterogeneous steps in an allocation with CR_PACK_NODE or -mpack are
      created.
    + Fix slurmctld crash from race condition within job_submit_throttle plugin.
    + Fix --with-systemdsystemunitdir when requesting a default location.
    + Fix not being able to cancel an array task by the jobid (i.e. not
      <jobid>_<taskid>) through scancel, job launch failure or prolog failure.
    + Fix cancelling the whole array job when the array task is the meta job and
      it fails job or prolog launch and is not requeueable. Cancel only the
      specific task instead.
    + Fix regression in 21.08.2 where MailProg did not run for mail-type=end for
      jobs with non-zero exit codes.
    + Fix incorrect setting of memory.swap.max in cgroup/v2.
    + Fix jobacctgather/cgroup collection of disk/io, gpumem, gpuutil TRES values.
    + Fix -d singleton for heterogeneous jobs.
    + Downgrade info logs about a job meeting a "maximum node limit" in the
      select plugin to DebugFlags=SelectType. These info logs could spam the
      slurmctld log file under certain circumstances.
    + prep/script - Fix [Srun|Task]<Prolog|Epilog> missing SLURM_JOB_NODELIST.
    + gres - Rebuild GRES core bitmap for nodes at startup. This fixes error:
      "Core bitmaps size mismatch on node [HOSTNAME]", which causes jobs to enter
      state "Requested node configuration is not available".
    + slurmctld - Allow startup with zero nodes.
    + Fix filesystem handling race conditions that could lead to an attacker
      taking control of an arbitrary file, or removing entire directories'
      contents. CVE-2023-41914.

-------------------------------------------------------------------
Mon Sep 18 05:23:19 UTC 2023 - Egbert Eich <eich@suse.com>