
- Update to 23.11.1 with the following major improvements, fixing
  CVE-2023-49933, CVE-2023-49934, CVE-2023-49935, CVE-2023-49936
  and CVE-2023-49937
  * Substantially overhauled the SlurmDBD association management
    code. For clusters updated to 23.11, account and user
    additions or removals are significantly faster than in prior
    releases.
  * Overhauled `scontrol reconfigure` to prevent configuration
    mistakes from disabling `slurmctld` and `slurmd`. Instead, an
    error will be returned, and the running configuration will
    persist. This does require updates to the systemd service
    files to use the `--systemd` option to `slurmctld` and `slurmd`
    (a unit-file sketch follows this list).
  * Added a new internal `auth/cred` plugin - `auth/slurm`. This
    builds off the prior `auth/jwt` model and, with a suitable
    configuration, permits operation of `slurmdbd` and `slurmctld`
    without access to full directory information (a configuration
    sketch follows this list).
  * Added a new `--external-launcher` option to `srun`, which is
    automatically set by common MPI launcher implementations and
    ensures processes using those non-srun launchers have full
    access to all resources allocated on each node.
  * Reworked the dynamic/cloud modes of operation to allow for
    "fanout" - where Slurm communication can be automatically
    offloaded to compute nodes for increased cluster scalability.
  * Overhauled and extended the Reservation subsystem to allow
    for most of the same resource requirements as are placed on
    the job. Notably, this permits reservations to now reserve
    GRES directly (an example invocation follows this list).
  * Fix `scontrol update job=... TimeLimit+=/-=` when used with
    the raw JobId of a job array element (see the example after
    this list).
  * Reject `TimeLimit` increment/decrement when called on job with
    `TimeLimit=UNLIMITED`.
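
The `--systemd` change above affects how the daemons are started. A minimal
sketch of what an updated `slurmctld` unit might look like is below;
`Type=notify`, the binary path, and the reload handling are assumptions, so
defer to the unit files actually shipped with the package.

```
# slurmctld.service (sketch, not the packaged file)
[Unit]
Description=Slurm controller daemon
After=network-online.target remote-fs.target munge.service

[Service]
Type=notify
# --systemd keeps the daemon in the foreground under systemd supervision,
# which the reworked 'scontrol reconfigure' relies on.
ExecStart=/usr/sbin/slurmctld --systemd
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target
```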
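
For the new `auth/slurm` plugin, a minimal configuration sketch follows. The
`CredType` spelling and the default key location are assumptions here; verify
them against the 23.11 `slurm.conf` and `slurmdbd.conf` man pages.

```
# slurm.conf / slurmdbd.conf (sketch)
AuthType=auth/slurm      # replaces auth/munge
CredType=auth/slurm      # the same plugin also provides credentials
# auth/slurm authenticates against a shared key rather than full
# directory information; assumed default location: /etc/slurm/slurm.key
#   dd if=/dev/urandom of=/etc/slurm/slurm.key bs=1024 count=1
#   chown slurm:slurm /etc/slurm/slurm.key; chmod 600 /etc/slurm/slurm.key
```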
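
An illustrative reservation that reserves GRES directly might look like the
following; the reservation name, user, node list, and exact `TRES=` spelling
are made up for illustration and are not taken from the release notes.

```
# Hypothetical: reserve two GPUs for user alice for one hour
scontrol create reservation ReservationName=gpu_repro \
    Users=alice StartTime=now Duration=01:00:00 \
    Nodes=gpu[01-02] TRES=gres/gpu=2
```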
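
The two `TimeLimit` fixes above concern relative updates of a job's time
limit, used roughly like this (job IDs are made up):

```
# Add 30 minutes to an array task addressed by its raw JobId
scontrol update JobId=123457 TimeLimit+=30
# Decrements work the same way; both are rejected for TimeLimit=UNLIMITED
scontrol update JobId=123458 TimeLimit-=15
```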

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=285
Egbert Eich 2024-01-22 16:26:43 +00:00 committed by Git OBS Bridge
parent e7275730c8
commit e59754da76

Fri Jan 12 11:08:01 UTC 2024 - Christian Goll <cgoll@suse.com>

- Update to 23.11.1 with the following major improvements, fixing
  CVE-2023-49933, CVE-2023-49934, CVE-2023-49935, CVE-2023-49936
  and CVE-2023-49937
  * Substantially overhauled the SlurmDBD association management
    code. For clusters updated to 23.11, account and user
    additions or removals are significantly faster than in prior
    releases.
  * Overhauled `scontrol reconfigure` to prevent configuration
    mistakes from disabling slurmctld and slurmd. Instead, an
    error will be returned, and the running configuration will
    persist. This does require updates to the systemd service
    files to use the `--systemd` option to `slurmctld` and `slurmd`.
  * Added a new internal `auth/cred` plugin - `auth/slurm`. This
    builds off the prior `auth/jwt` model and, with a suitable
    configuration, permits operation of `slurmdbd` and `slurmctld`
    without access to full directory information.
  * Added a new `--external-launcher` option to `srun`, which is
    automatically set by common MPI launcher implementations and
    ensures processes using those non-srun launchers have full
    access to all resources allocated on each node.
  * Reworked the dynamic/cloud modes of operation to allow for
    "fanout" - where Slurm communication can be automatically
    offloaded to compute nodes for increased cluster scalability.
  * Overhauled and extended the Reservation subsystem to allow
    for most of the same resource requirements as are placed on
    the job. Notably, this permits reservations to now reserve
    GRES directly.
- Details of changes:
  * Fix `scontrol update job=... TimeLimit+=/-=` when used with
    the raw JobId of a job array element.
  * Reject `TimeLimit` increment/decrement when called on a job
    with `TimeLimit=UNLIMITED`.
  * Fix issue with requesting a job with `--licenses` as well as
    `--tres-per-task=license`.
  * `slurmctld` - Prevent segfault in `getopt_long()` with an
    invalid long option.
  * `slurmrestd` - Added `/meta/slurm/cluster` field to responses.
  * Adjust systemd service files to start daemons after
    `remote-fs.target`.
  * Fix `task/cgroup` indexing tasks in cgroup plugins, which
    caused `jobacct/gather` to match the gathered stats with the
    wrong task id.
  * `select/linear` - Fix regression in 23.11 in which jobs that
    requested `--cpus-per-task` were rejected.
  * `data_parser/v0.0.40` - Fix the parsing of the
    `/slurmdb/v0.0.40/jobs` `exit_code` query parameter.
  * If a job requests shards that would allocate more than one
    sharing GRES (gpu) per node, refuse it unless
    `SelectTypeParameters` has `MULTIPLE_SHARING_GRES_PJ`.
  * Trigger a fatal exit when a Slurm API function is called
    before `slurm_init()`.
  * `slurmd` - Fix issue with `scontrol reconfigure` when started
    with `-c`.
  * `slurmrestd` - Job submissions that result in the following
    error codes will be considered successfully submitted (with
    a warning), instead of returning an HTTP 500 error:
    `ESLURM_NODES_BUSY`, `ESLURM_RESERVATION_BUSY`,
    `ESLURM_JOB_HELD`, `ESLURM_NODE_NOT_AVAIL`, `ESLURM_QOS_THRES`,
    `ESLURM_ACCOUNTING_POLICY`, `ESLURM_RESERVATION_NOT_USABLE`,
    `ESLURM_REQUESTED_PART_CONFIG_UNAVAILABLE`,
    `ESLURM_BURST_BUFFER_WAIT`, `ESLURM_PARTITION_DOWN`,
    `ESLURM_LICENSES_UNAVAILABLE`.
  * Fix a `slurmctld` fatal error when upgrading to 23.11 and
    changing from `select/cons_res` to `select/cons_tres` at the
    same time.
  * `slurmctld` - Reject arbitrary distribution jobs that have a
    minimum node count that differs from the number of unique
    nodes in the hostlist.
  * Prevent `slurmdbd` errors when updating reservations with
    names containing apostrophes.
  * Prevent message extension attacks that could bypass the
    message hash. CVE-2023-49933.
  * Prevent SQL injection attacks in `slurmdbd`. CVE-2023-49934.
  * Prevent message hash bypass in `slurmd` which can allow an
    attacker to reuse root-level MUNGE tokens and escalate
    permissions. CVE-2023-49935.
  * Prevent NULL pointer dereference on `size_valp` overflow.
    CVE-2023-49936.
  * Prevent double-`xfree()` on error in `_unpack_node_reg_resp()`.
    CVE-2023-49937.
  * For jobs that request `--cpus-per-gpu`, ensure that the
    `--cpus-per-gpu` request is honored on every node in the job
    and not just for the job as a whole.
  * Fix listing available `data_parser` plugins for json and yaml
    when giving no commands to `scontrol` or `sacctmgr`.
  * `slurmctld` - Rework `scontrol reconfigure` to avoid race
    conditions that can result in stray jobs.
  * `slurmctld` - Shave ~1 second off average reconfigure time by
    terminating internal processing threads faster.
  * Skip running `slurmdbd -R` if the connected cluster is 23.11
    or newer. This operation is no longer relevant for 23.11.
  * Ensure `slurmscriptd` shuts down before `slurmctld` is stopped
    or reconfigured.
  * Improve error handling and error messages in `slurmctld` to
    `slurmscriptd` communications. This includes avoiding a
    potential deadlock in `slurmctld` if `slurmscriptd` dies
    unexpectedly.
  * Do not hold batch jobs whose extra constraints cannot be
    immediately satisfied, and set the state reason to
    `Constraints` instead of `BadConstraints`.
  * Fix verbose log message printing a hex number instead of a
    job id.
  * Upgrade rate limit parameters message from debug to info.
  * For `SchedulerParameters=extra_constraints`, prevent `slurmctld`
    segfault when starting a `slurmd` with `--extra` for a node
    that did not previously set this. This also ensures the extra
    constraints model works off the current node state, not the
    prior state.
  * Fix `--tres-per-task` assertion.
  * Fix a few issues when creating reservations.
  * Add `SchedulerParameters=time_min_as_soft_limit` option.
  * Remove `SLURM_WORKING_CLUSTER` env from batch and srun
    environments.
  * `cli_filter/lua` - return nil for unset time options rather
    than the string `2982616-04:14:00` (which is the internal
    macro `NO_VAL` represented as a time string).
  * Remove 'none' plugins for all but auth and cred.
    `scontrol show config` will report (null) now.
  * Removed `select/cons_res`. Please update your configuration to
    `select/cons_tres`.
  * `mpi/pmix` - When aborted with status 0, avoid marking
    job/step as failed.
  * Fixed typo on `initialized` for the description of
    `ESLURM_PLUGIN_NOT_LOADED`.
  * `cgroup.conf` - Removed deprecated parameters `AllowedKmemSpace`,
    `ConstrainKmemSpace`, `MaxKmemPercent`, and `MinKmemSpace`.
  * `proctrack/cgroup` - Add `SignalChildrenProcesses=<yes|no>`
    option to `cgroup.conf`. This allows signals for cancelling,
    suspending, resuming, etc. to be sent to child processes in
    a step/job rather than just the parent.
  * Add `PreemptParameters=suspend_grace_time` parameter to
    control the amount of time between `SIGTSTP` and `SIGSTOP`
    signals when suspending jobs.
  * `job_submit/throttle` - improve reset of submitted job counts
    per user in order to better honor
    `SchedulerParameters=jobs_per_user_per_hour=#`.
  * Load the user environment in a private pid namespace to
    avoid user scripts leaving background processes on a node.
  * `scontrol show assoc_mgr` will display `Lineage` instead of
    `Lft` for associations.
  * Add `SlurmctldParameters=no_quick_restart` to avoid a new
    `slurmctld` taking over the old `slurmctld` by accident.
  * Fix `--cpus-per-gpu` for step allocations, which was
    previously ignored for job steps. `--cpus-per-gpu` implies
    `--exact`.
  * Fix mutual exclusivity of `--cpus-per-gpu` and
    `--cpus-per-task`: fatal if both options are requested on the
    command line or both are requested in the environment. If one
    option is requested on the command line, it will override the
    other option in the environment.
  * `slurmrestd` - `openapi/dbv0.0.37` and `openapi/v0.0.37`
    plugins have been removed.
  * `slurmrestd` - `openapi/dbv0.0.38` and `openapi/v0.0.38`
    plugins have been tagged as deprecated.
  * `slurmrestd` - added auto population of the `info/version`
    field.
  * `sdiag` - add `--yaml` and `--json` arg support to specify
    the `data_parser` plugin.
  * `sacct` - add `--yaml` and `--json` arg support to specify
    the `data_parser` plugin.
  * `scontrol` - add `--yaml` and `--json` arg support to specify
    the `data_parser` plugin.
  * `sinfo` - add `--yaml` and `--json` arg support to specify
    the `data_parser` plugin.
  * `squeue` - add `--yaml` and `--json` arg support to specify
    the `data_parser` plugin.
  * Changed the default `SelectType` to `select/cons_tres` (from
    `select/linear`).
  * Allow `SlurmUser`/`root` to use reservations without specific
    permissions.
  * Fix sending step signals to nodes not allocated by the step.
  * Remove `CgroupAutomount=` option from `cgroup.conf`.
  * Add `TopologyRoute=RoutePart` to route communications based
    on partition node lists.
  * Added ability for configless to push Prolog and Epilog
    scripts to `slurmd`s.
  * Prolog and Epilog do not have to be fully qualified pathnames.
  * Changed default value of `PriorityType` from `priority/basic`
    to `priority/multifactor`.
  * `torque/mpiexec` - Propagate exit code from the launched
    process.
  * `slurmrestd` - Add new rlimits fields for job submission.
  * Define SPANK options environment variables when
    `--export=[NIL|NONE]` is specified.
  * `slurmrestd` - Numeric input fields provided with a null
    formatted value will now convert to zero (0) where it can be
    a valid value. This is expected to only be notable for job
    submissions against v0.0.38 versioned endpoints where job
    requests provide fields with null values. These fields were
    already rejected by v0.0.39+ endpoints, unless the `+complex`
    parser value is provided to v0.0.40+ endpoints.
  * `slurmrestd` - Improve parsing of integers and floating point
    numbers when handling incoming user provided numeric fields.
    Numeric fields that previously accepted a number followed by
    other non-numeric characters will now be rejected. This is
    expected to only be notable for job submissions against
    v0.0.38 versioned endpoints with malformed job requests.
  * Reject reservation update if it will result in previously
    submitted jobs losing access to the reservation.
  * `data_parser/v0.0.40` - output partition state when dumping
    partitions.
  * Allow for a shared suffix to be used with the hostlist format.
    E.g., `node[0001-0010]-int`.
  * Fix perlapi build when using non-default libdir.
  * Replace `SRUN_CPUS_PER_TASK` with `SLURM_CPUS_PER_TASK` and
    get back the previous behavior before Slurm 22.05, since now
    we have the new external launcher step.
  * `job_container/tmpfs` - Add `BasePath=none` option to disable
    the plugin on node subsets when there is a global setting.
  * Add QOS flag `Relative`. If set, the QOS limits will be
    treated as percentages of a cluster/partition instead of
    absolutes.
  * Remove `FIRST_CORES` flag from reservations.
  * Add cloud instance id and instance type to node records.
    Can be viewed/updated with `scontrol`.
  * `slurmd` - add `instance-id`, `instance-type`, and `extra`
    options to allow them to be set on startup.
  * Add cloud instance accounting to the database that can be
    viewed with `sacctmgr show instance`.
  * `select/linear` - fix task launch failure that sometimes
    occurred when requesting `--threads-per-core` or
    `--hint=nomultithread`. This also fixes memory calculation
    with one of these options and `--mem-per-cpu`: previously,
    memory = mem-per-cpu * all cpus including unusable threads;
    now, memory = mem-per-cpu * only usable threads. This
    behavior matches the documentation and `select/cons_tres`.
  * `gpu/nvml` - Reduce chances of `NVML_ERROR_INSUFFICIENT_SIZE`
    error when getting gpu memory information.
  * `slurmrestd` - Convert to generating OperationIDs based on
    path for all v0.0.40 tagged paths.
  * `slurmrestd` - Reduce memory used while dumping a job's stdio
    paths.
  * `slurmrestd` - Jobs queried from `data_parser/v0.0.40` from
    `slurmdb` will have the `step/id` field given as a string to
    match CLI formatting instead of an object.
  * `sacct` - JSON or YAML output will have the `step/id` field
    given as a string instead of an object.
  * `scontrol`/`squeue` - Step output in JSON or YAML will have
    the `id` field given as a string instead of an object.
  * `slurmrestd` - For `GET /slurmdb/v0.0.40/jobs`, mimic sacct's
    default behavior for handling of job start and end times when
    one or both fields are not provided as a query parameter.
  * `openapi/slurmctld` - Add `GET /slurm/v0.0.40/shares` endpoint
    to dump the same output as `sshare`.
  * `sshare` - add JSON/YAML support.
  * `data_parser/v0.0.40` - Remove `required/memory` output in
    json. It is replaced by `required/memory_per_cpu` and
    `required/memory_per_node`.
  * `slurmrestd` - Add numeric id to all association identifiers
    to allow unique identification where an association has been
    deleted but is still referenced by an accounting record.
  * `slurmrestd` - Add accounting, id, and comment fields to
    association dumps.
  * Use `memory.current` in cgroup/v2 instead of manually
    calculating RSS. This makes accounting consistent with the
    OOM Killer.
  * `sreport` - cluster Utilization `PlannedDown` field now
    includes the time that all nodes were in the `POWERED_DOWN`
    state instead of just cloud nodes.
  * `scontrol update partition` now allows `Nodes+=<node-list>`
    and `Nodes-=<node-list>` to add/delete nodes from the existing
    partition node list. `Nodes=+host1,-host2` is also allowed.
  * `sacctmgr` - add `--yaml` and `--json` arg support to specify
    the `data_parser` plugin.
  * `sacctmgr` can now modify a QOS's `RawUsage` to zero or a
    positive value.
  * `sdiag` - Added statistics on why the main and backfill
    schedulers have stopped evaluation on each scheduling cycle.
    the number of `RPC limit exceeded...` messages that are logged.
  * Rename `sbcast --fanout` to `--treewidth`.
  * Remove `SLURM_NODE_ALIASES` env variable.
  * Enable fanout for dynamic and unaddressable cloud nodes.
  * Fix how steps are deallocated in an allocation if the last
    step of an srun never completes due to a node failure.
  * Remove redundant database indexes.
  * Add database index to suspend table to speed up archive/purges.
  * When requesting `--tres-per-task`, alter an incorrect request
    for TRES: it should be `TRESType/TRESName`, not
    `TRESType:TRESName`.
  * Make it so reservations can reserve GRES.
  * `sbcast` - use the specified `--fanout` value on all hops in
    message forwarding; previously the specified fanout was only
    used on the first hop, and additional hops used `TreeWidth` in
    `slurm.conf`.
  * `slurmrestd` - remove logger prefix from `-s/-a list` options
    outputs.
  * `switch/hpe_slingshot` - Add support for collectives.
  * Nodes with suspended jobs can now be displayed as `MIXED`.
  * Fix inconsistent handling of using cli and/or environment
    options for `tres_per_task=cpu:#` and `cpus_per_gpu`.
  * Requesting `--cpus-per-task` will now set
    `SLURM_TRES_PER_TASK=cpu:#` in the environment.
  * For some tres related environment variables such as
    `SLURM_TRES_PER_TASK`, when `srun` requests a different value
    for that option, set these environment variables to the value
    requested by `srun`. Previously these environment variables
    were unchanged from the job allocation. This bug only affected
    the output environment variables, not the actual step resource
    allocation.
  * `RoutePlugin=route/topology` has been replaced with
    `TopologyParam=RouteTree`.
  * If `ThreadsPerCore` in `slurm.conf` is configured with fewer
    than the number of hardware threads, fix a bug where the task
    plugins used fewer cores instead of using fewer threads per
    core.
  * Fix arbitrary distribution, allowing it to be used with
    `salloc` and `sbatch`, and fix how cpus are allocated to
    nodes.
  * Allow nodes to reboot while the node is drained or in a
    maintenance state.
  * Allow `scontrol reboot` to use nodesets to filter nodes to
    reboot.
  * Fix how the topology of typed gres gets updated.
  * Changes to the Type option in `gres.conf` can now be applied
    with `scontrol reconfig`.
  * Allow jobs that request a newly configured gres type to be
    queued even when the needed `slurmd`s have not yet registered.
  * Kill recovered jobs that require unconfigured gres types.
  * If keepalives are configured, enable them on all persistent
    connections.
  * Configless - Also send Includes from configuration files not
    parsed by the controller (i.e. from `plugstack.conf`).
  * Add `gpu/nrt` plugin for nodes using Trainium/Inferentia
    devices.
  * `data_parser/v0.0.40` - Add `START_RECEIVED` to job flags in
    dumped output.
  * SPANK - Failures from most spank functions (not epilog or
    exit) will now cause the step to be marked as failed and the
    command (`srun`, `salloc`, `sbatch --wait`) to return 1.
-------------------------------------------------------------------
Wed Jan 3 10:45:48 UTC 2024 - Egbert Eich <eich@suse.com>