From e59754da76268fae3659e1ea132832db61a7e0c2530e10c3df7e4ef7f94ad723 Mon Sep 17 00:00:00 2001 From: Egbert Eich Date: Mon, 22 Jan 2024 16:26:43 +0000 Subject: [PATCH] CVE-2023-49933, CVE-2023-49934, CVE-2023-49935, CVE-2023-49936 and CVE-2023-49937 * Substantially overhauled the SlurmDBD association management code. For clusters updated to 23.11, account and user additions or removals are significantly faster than in prior releases. * Overhauled `scontrol reconfigure` to prevent configuration mistakes from disabling slurmctld and slurmd. Instead, an error will be returned, and the running configuration will persist. This does require updates to the systemd service files to use the `--systemd` option to `slurmctld` and `slurmd`. * Added a new internal `auth/cred` plugin - `auth/slurm`. This builds off the prior `auth/jwt` model, and permits operation of the `slurmdbd` and `slurmctld` without access to full directory information with a suitable configuration. * Added a new `--external-launcher` option to `srun`, which is automatically set by common MPI launcher implementations and ensures processes using those non-srun launchers have full access to all resources allocated on each node. * Reworked the dynamic/cloud modes of operation to allow for "fanout" - where Slurm communication can be automatically offloaded to compute nodes for increased cluster scalability. * Overhauled and extended the Reservation subsystem to allow for most of the same resource requirements as are placed on the job. Notably, this permits reservations to now reserve GRES directly. * Fix `scontrol update job=... TimeLimit+=/-=` when used with a raw JobId of job array element. * Reject `TimeLimit` increment/decrement when called on job with `TimeLimit=UNLIMITED`. 
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=285 --- slurm.changes | 568 +++++++++++++++++++++++++++----------------------- 1 file changed, 310 insertions(+), 258 deletions(-) diff --git a/slurm.changes b/slurm.changes index 5217833..3d89146 100644 --- a/slurm.changes +++ b/slurm.changes @@ -2,277 +2,329 @@ Fri Jan 12 11:08:01 UTC 2024 - Christian Goll - Update to 23.11.1 with following major improvements and fixing - CVE-2023-49933, CVE-2023-49934, CVE-2023-49935, CVE-2023-49936 and - CVE-2023-49937 - * Substantially overhauled the SlurmDBD association management code. For - clusters updated to 23.11, account and user additions or removals are - significantly faster than in prior releases. - * Overhauled 'scontrol reconfigure' to prevent configuration mistakes from - disabling slurmctld and slurmd. Instead, an error will be returned, and the - running configuration will persist. This does require updates to the - systemd service files to use the --systemd option to slurmctld and slurmd. - * Added a new internal auth/cred plugin - "auth/slurm". This builds off the - prior auth/jwt model, and permits operation of the slurmdbd and slurmctld - without access to full directory information with a suitable configuration. - * Added a new --external-launcher option to srun, which is automatically set - by common MPI launcher implementations and ensures processes using those - non-srun launchers have full access to all resources allocated on each - node. - * Reworked the dynamic/cloud modes of operation to allow for "fanout" - where - Slurm communication can be automatically offloaded to compute nodes for - increased cluster scalability. - Added initial official Debian packaging support. - * Overhauled and extended the Reservation subsystem to allow for most of the - same resource requirements as are placed on the job. Notably, this permits - reservations to now reserve GRES directly. 
+ CVE-2023-49933, CVE-2023-49934, CVE-2023-49935, CVE-2023-49936 + and CVE-2023-49937 + * Substantially overhauled the SlurmDBD association management + code. For clusters updated to 23.11, account and user + additions or removals are significantly faster than in prior + releases. + * Overhauled `scontrol reconfigure` to prevent configuration + mistakes from disabling slurmctld and slurmd. Instead, an + error will be returned, and the running configuration will + persist. This does require updates to the systemd service + files to use the `--systemd` option to `slurmctld` and `slurmd`. + * Added a new internal `auth/cred` plugin - `auth/slurm`. This + builds off the prior `auth/jwt` model, and permits operation + of the `slurmdbd` and `slurmctld` without access to full + directory information with a suitable configuration. + * Added a new `--external-launcher` option to `srun`, which is + automatically set by common MPI launcher implementations and + ensures processes using those non-srun launchers have full + access to all resources allocated on each node. + * Reworked the dynamic/cloud modes of operation to allow for + "fanout" - where Slurm communication can be automatically + offloaded to compute nodes for increased cluster scalability. + * Overhauled and extended the Reservation subsystem to allow + for most of the same resource requirements as are placed on + the job. Notably, this permits reservations to now reserve + GRES directly. - Details of changes: - * Fix scontrol update job=... TimeLimit+=/-= when used with a raw JobId of job - array element. - * Reject TimeLimit increment/decrement when called on job with - TimeLimit=UNLIMITED. - * Fix issue with requesting a job with *licenses as well as - *tres-per-task=license. - * slurmctld - Prevent segfault in getopt_long() with an invalid long option. - * Switch to man2html-base in Build-Depends for Debian package. - * slurmrestd - Added /meta/slurm/cluster field to responses. 
- * Adjust systemd service files to start daemons after remote-fs.target. - * Fix task/cgroup indexing tasks in cgroup plugins, which caused - jobacct/gather to match the gathered stats with the wrong task id. - * select/linear - Fix regression in 23.11 in which jobs that requested - *cpus-per-task were rejected. - * data_parser/v0.0.40 - Fix the parsing for /slurmdb/v0.0.40/jobs exit_code - query parameter. - * If a job requests more shards which would allocate more than one sharing - GRES (gpu) per node refuse it unless SelectTypeparameters has - MULTIPLE_SHARING_GRES_PJ. - * Trigger fatal exit when Slurm API function is called before slurm_init() is - called. - * slurmd - Fix issue with 'scontrol reconfigure' when started with '-c'. - * slurmrestd - Job submissions that result in the following error codes - will be considered as successfully submitted (with a warning), instead - of returning an HTTP 500 error back: - ESLURM_NODES_BUSY, ESLURM_RESERVATION_BUSY, ESLURM_JOB_HELD, - ESLURM_NODE_NOT_AVAIL, ESLURM_QOS_THRES, ESLURM_ACCOUNTING_POLICY, - ESLURM_RESERVATION_NOT_USABLE, ESLURM_REQUESTED_PART_CONFIG_UNAVAILABLE, - ESLURM_BURST_BUFFER_WAIT, ESLURM_PARTITION_DOWN, - ESLURM_LICENSES_UNAVAILABLE. - * Fix a slurmctld fatal error when upgrading to 23.11 and changing from - select/cons_res to select/cons_tres at the same time. - * slurmctld - Reject arbitrary distribution jobs that have a minimum node - count that differs from the number of unique nodes in the hostlist. - * Prevent slurmdbd errors when updating reservations with names containing - apostrophes. - * Prevent message extension attacks that could bypass the message hash. - CVE-2023-49933. - * Prevent SQL injection attacks in slurmdbd. CVE-2023-49934. - * Prevent message hash bypass in slurmd which can allow an attacker to reuse - root-level MUNGE tokens and escalate permissions. CVE-2023-49935. - * Prevent NULL pointer dereference on size_valp overflow. CVE-2023-49936. 
- * Prevent double-xfree() on error in _unpack_node_reg_resp(). + * Fix `scontrol update job=... TimeLimit+=/-=` when used with a + raw JobId of job array element. + * Reject `TimeLimit` increment/decrement when called on job with + `TimeLimit=UNLIMITED`. + * Fix issue with requesting a job with `*licenses` as well as + `*tres-per-task=license`. + * `slurmctld` - Prevent segfault in `getopt_long()` with an + invalid long option. + * `slurmrestd` - Added `/meta/slurm/cluster` field to responses. + * Adjust systemd service files to start daemons after + `remote-fs.target`. + * Fix `task/cgroup` indexing tasks in cgroup plugins, which + caused `jobacct/gather` to match the gathered stats with the + wrong task id. + * `select/linear` - Fix regression in 23.11 in which jobs that + requested `*cpus-per-task` were rejected. + * `data_parser/v0.0.40` - Fix the parsing for + `/slurmdb/v0.0.40/jobs` exit_code query parameter. + * If a job requests more shards which would allocate more than + one sharing GRES (gpu) per node refuse it unless + `SelectTypeParameters` has `MULTIPLE_SHARING_GRES_PJ`. + * Trigger fatal exit when Slurm API function is called before + `slurm_init()` is called. + * `slurmd` - Fix issue with `scontrol reconfigure` when started + with `-c`. + * `slurmrestd` - Job submissions that result in the following + error codes will be considered as successfully submitted (with + a warning), instead of returning an HTTP 500 error back: + `ESLURM_NODES_BUSY`, `ESLURM_RESERVATION_BUSY`, `ESLURM_JOB_HELD`, + `ESLURM_NODE_NOT_AVAIL`, `ESLURM_QOS_THRES`, + `ESLURM_ACCOUNTING_POLICY`, `ESLURM_RESERVATION_NOT_USABLE`, + `ESLURM_REQUESTED_PART_CONFIG_UNAVAILABLE`, + `ESLURM_BURST_BUFFER_WAIT`, `ESLURM_PARTITION_DOWN`, + `ESLURM_LICENSES_UNAVAILABLE`. + * Fix a `slurmctld` fatal error when upgrading to 23.11 and + changing from `select/cons_res` to `select/cons_tres` at the + same time.
+ * `slurmctld` - Reject arbitrary distribution jobs that have a + minimum node count that differs from the number of unique + nodes in the hostlist. + * Prevent `slurmdbd` errors when updating reservations with names + containing apostrophes. + * Prevent message extension attacks that could bypass the + message hash. CVE-2023-49933. + * Prevent SQL injection attacks in `slurmdbd`. CVE-2023-49934. + * Prevent message hash bypass in `slurmd` which can allow an + attacker to reuse root-level MUNGE tokens and escalate + permissions. CVE-2023-49935. + * Prevent NULL pointer dereference on size_valp overflow. + CVE-2023-49936. + * Prevent double-xfree() on error in `_unpack_node_reg_resp()`. CVE-2023-49937. - * For jobs that request *cpus-per-gpu, ensure that the *cpus-per-gpu request - is honored on every node in the and not just for the job as a whole. - * Fix listing available data_parser plugins for json and yaml when giving no - commands to scontrol or sacctmgr. - * slurmctld - Rework 'scontrol reconfigure' to avoid race conditions that - can result in stray jobs. - * slurmctld - Shave ~1 second off average reconfigure time by terminating - internal processing threads faster. - * Skip running slurmdbd -R if the connected cluster is 23.11 or newer. - This operation is nolonger relevant for 23.11. - * Ensure slurmscriptd shuts down before slurmctld is stopped / reconfigured. - * Improve error handling and error messages in slurmctld to slurmscriptd - communications. This includes avoiding potential deadlock in slurmctld if - slurmscript dies unexpectedly. - * Do not hold batch jobs whose extra constraints cannot be immediately - satisfied, and set the state reason to "Constraints" instead of - "BadConstraints". - * Fix verbose log message printing a hex number instead of a job id. + * For jobs that request `*cpus-per-gpu`, ensure that the + `*cpus-per-gpu` request is honored on every node in the job and
+ * Fix listing available `data_parser` plugins for json and yaml + when giving no commands to `scontrol` or `sacctmgr`. + * `slurmctld` - Rework `scontrol reconfigure` to avoid race + conditions that can result in stray jobs. + * `slurmctld` - Shave ~1 second off average reconfigure time by + terminating internal processing threads faster. + * Skip running `slurmdbd -R` if the connected cluster is 23.11 + or newer. This operation is no longer relevant for 23.11. + * Ensure `slurmscriptd` shuts down before `slurmctld` is stopped + or reconfigured. + * Improve error handling and error messages in `slurmctld` to + `slurmscriptd` communications. This includes avoiding + potential deadlock in `slurmctld` if slurmscript dies + unexpectedly. + * Do not hold batch jobs whose extra constraints cannot be + immediately satisfied, and set the state reason to + `Constraints` instead of `BadConstraints`. + * Fix verbose log message printing a hex number instead of a job + id. * Upgrade rate limit parameters message from debug to info. - * For SchedulerParameters=extra_constraints, prevent slurmctld segfault when - starting a slurmd with *extra for a node that did not previously set this. - This also ensures the extra constraints model works off the current node - state, not the prior state. - * Fix *tres-per-task assertion. + * For `SchedulerParameters=extra_constraints`, prevent `slurmctld` + segfault when starting a `slurmd` with `*extra` for a node + that did not previously set this. + This also ensures the extra constraints model works off the + current node state, not the prior state. + * Fix `*tres-per-task` assertion. * Fix a few issues when creating reservations. - * Add SchedulerParameters=time_min_as_soft_limit option. - * Remove SLURM_WORKING_CLUSTER env from batch and srun environments. - * cli_filter/lua - return nil for unset time options rather than the string - "2982616-04:14:00" (which is the internal macro "NO_VAL" represented as - time string). 
- * Remove 'none' plugins for all but auth and cred. scontrol show config - will report (null) now. - * Removed select/cons_res. Please update your configuration to - select/cons_tres. - * mpi/pmix - When aborted with status 0, avoid marking job/step as failed. - * Fixed typo on "initialized" for the description of ESLURM_PLUGIN_NOT_LOADED. - * cgroup.conf - Removed deprecated parameters AllowedKmemSpace, - ConstrainKmemSpace, MaxKmemPercent, and MinKmemSpace. - * proctrack/cgroup - Add "SignalChildrenProcesses=" option to - cgroup.conf. This allows signals for cancelling, suspending, resuming, etc. - to be sent to children processes in a step/job rather than just the parent. - * Add PreemptParameters=suspend_grace_time parameter to control amount of - time between SIGTSTP and SIGSTOP signals when suspending jobs. - * job_submit/throttle - improve reset of submitted job counts per user in - order to better honor SchedulerParameters=jobs_per_user_per_hour=#. - * Load the user environment into a private pid namespace to avoid user scripts - leaving background processes on a node. - * scontrol show assoc_mgr will display Lineage instead of Lft for - associations. - * Add SlurmctldParameters=no_quick_restart to avoid a new slurmctld taking - over the old slurmctld on accedent. - * Fix --cpus-per-gpu for step allocations, which was previously ignored for - job steps. *cpus-per-gpu implies --exact. - * Fix mutual exclusivity of --cpus-per-gpu and --cpus-per-task: fatal if both - options are requested in the commandline or both are requested in the - environment. If one option is requested in the command line, it will - override the other option in the environment. - * slurmrestd - openapi/dbv0.0.37 and openapi/v0.0.37 plugins have been - removed. - * slurmrestd - openapi/dbv0.0.38 and openapi/v0.0.38 plugins have been tagged - as deprecated. - * slurmrestd - added auto population of info/version field. 
- * sdiag - add --yaml and --json arg support to specify data_parser plugin. - * sacct - add --yaml and --json arg support to specify data_parser plugin. - * scontrol - add --yaml and --json arg support to specify data_parser plugin. - * sinfo - add --yaml and --json arg support to specify data_parser plugin. - * squeue - add --yaml and --json arg support to specify data_parser plugin. - * Changed the default SelectType to select/cons_tres (from select/linear). - * Allow SlurmUser/root to use reservations without specific permissions. + * Add `SchedulerParameters=time_min_as_soft_limit` option. + * Remove `SLURM_WORKING_CLUSTER` env from batch and srun + environments. + * `cli_filter/lua` - return nil for unset time options rather + than the string `2982616-04:14:00` (which is the internal + macro `NO_VAL` represented as time string). + * Remove 'none' plugins for all but auth and cred. scontrol show + config will report (null) now. + * Removed `select/cons_res`. Please update your configuration to + `select/cons_tres`. + * `mpi/pmix` - When aborted with status 0, avoid marking + job/step as failed. + * Fixed typo on `initialized` for the description of + `ESLURM_PLUGIN_NOT_LOADED`. + * `cgroup.conf` - Removed deprecated parameters `AllowedKmemSpace`, + `ConstrainKmemSpace`, `MaxKmemPercent`, and `MinKmemSpace`. + * `proctrack/cgroup` - Add `SignalChildrenProcesses=` + option to `cgroup.conf`. This allows signals for cancelling, + suspending, resuming, etc. to be sent to children processes in + a `step/job` rather than just the parent. + * Add `PreemptParameters=suspend_grace_time` parameter to + control amount of time between `SIGTSTP` and `SIGSTOP` signals + when suspending jobs. + * `job_submit/throttle` - improve reset of submitted job counts + per user in order to better honor + `SchedulerParameters=jobs_per_user_per_hour=#`. + * Load the user environment into a private pid namespace to + avoid user scripts leaving background processes on a node. 
+ * `scontrol` show `assoc_mgr` will display Lineage instead of Lft + for associations. + * Add `SlurmctldParameters=no_quick_restart` to avoid a new + `slurmctld` taking over the old `slurmctld` by accident. + * Fix `--cpus-per-gpu` for step allocations, which was + previously ignored for job steps. `*cpus-per-gpu` implies + `--exact`. + * Fix mutual exclusivity of `--cpus-per-gpu` and + `--cpus-per-task`: fatal if both options are requested in the + commandline or both are requested in the environment. If one + option is requested in the command line, it will override the + other option in the environment. + * `slurmrestd` - `openapi/dbv0.0.37` and `openapi/v0.0.37` + plugins have been removed. + * `slurmrestd` - `openapi/dbv0.0.38` and `openapi/v0.0.38` + plugins have been tagged as deprecated. + * `slurmrestd` - added auto population of `info/version` field. + * `sdiag` - add `--yaml` and `--json` arg support to specify + data_parser plugin. + * `sacct` - add `--yaml` and `--json` arg support to specify + `data_parser` plugin. + * `scontrol` - add `--yaml` and `--json` arg support to specify + `data_parser` plugin. + * `sinfo` - add `--yaml` and `--json` arg support to specify + `data_parser` plugin. + * `squeue` - add `--yaml` and `--json` arg support to specify + `data_parser` plugin. + * Changed the default `SelectType` to `select/cons_tres` (from + `select/linear`). + * Allow `SlurmUser`/`root` to use reservations without specific + permissions. * Fix sending step signals to nodes not allocated by the step. - * Remove CgroupAutomount= option from cgroup.conf. - * Add TopologyRoute=RoutePart to route communications based on partition node - lists. - * Added ability for configless to push Prolog and Epilog scripts to slurmds. + * Remove `CgroupAutomount=` option from `cgroup.conf`. + * Add `TopologyRoute=RoutePart` to route communications based + on partition node lists. + * Added ability for configless to push Prolog and Epilog + scripts to `slurmd`s. 
* Prolog and Epilog do not have to be fully qualified pathnames. - * Changed default value of PriorityType from priority/basic to - priority/multifactor. - * torque/mpiexec - Propogate exit code from launched process. - * slurmrestd - Add new rlimits fields for job submission. - * Define SPANK options environment variables when --export=[NIL|NONE] is - specified. - * slurmrestd - Numeric input fields provided with a null formatted value will - now convert to zero (0) where it can be a valid value. This is expected to - be only be notable with job submission against v0.0.38 versioned endpoints - with job requests with fields provided with null values. These fields were - already rejected by v0.0.39+ endpoints, unless +complex parser value is - provided to v0.0.40+ endpoints. - * slurmrestd - Improve parsing of integers and floating point numbers when - handling incoming user provided numeric fields. Fields that would have not - rejected a number for a numeric field followed by other non-numeric - characters will now get rejected. This is expected to be only be notable - with job submission against v0.0.38 versioned endpoints with malformed job - requests. - * Reject reservation update if it will result in previously submitted - jobs losing access to the reservation. - * data_parser/v0.0.40 - output partition state when dumping partitions. - * Allow for a shared suffix to be used with the hostlist format. E.g., - "node[0001-0010]-int". - * Fix perlapi build when using non-default libdir. - * Replace SRUN_CPUS_PER_TASK with SLURM_CPUS_PER_TASK and get back the - previous behavior before Slurm 22.05 since now we have the new external - launcher step. - * job_container/tmpfs - Add "BasePath=none" option to disable plugin on node - subsets when there is a global setting. - * Add QOS flag 'Relative'. If set the QOS limits will be treated as - percentages of a cluster/partition instead of absolutes. - * Remove FIRST_CORES flag from reservations. 
- * Add cloud instance id and instance type to node records. Can be viewed/ - updated with scontrol. - * slurmd - add "instance-id", "instance-type", and "extra" options to allow - them to be set on startup. - * Add cloud instance accounting to database that can be viewed with 'sacctmgr - show instance'. - * select/linear - fix task launch failure that sometimes occurred when - requesting *threads-per-core or --hint=nomultithread. This also fixes - memory calculation with one of these options and *mem-per-cpu: - Previously, memory = mem-per-cpu * all cpus including unusable threads. - Now, memory = mem-per-cpu * only usuable threads. This behavior matches - the documentation and select/cons_tres. - * gpu/nvml - Reduce chances of NVML_ERROR_INSUFFICIENT_SIZE error when getting - gpu memory information. - * slurmrestd - Convert to generating OperationIDs based on path for all - v0.0.40 tagged paths. - * slurmrestd - Reduce memory used while dumping a job's stdio paths. - * slurmrestd - Jobs queried from data_parser/v0.0.40 from slurmdb will have - 'step/id' field given as a string to match CLI formatting instead of an + * Changed default value of `PriorityType` from `priority/basic` + to `priority/multifactor`. + * `torque/mpiexec` - Propagate exit code from launched process. + * `slurmrestd` - Add new rlimits fields for job submission. + * Define SPANK options environment variables when + `--export=[NIL|NONE]` is specified. + * `slurmrestd` - Numeric input fields provided with a null + formatted value will now convert to zero (0) where it can be + a valid value. This is expected to only be notable with job + submission against v0.0.38 versioned endpoints with job + requests with fields provided with null values. These fields + were already rejected by v0.0.39+ endpoints, unless `+complex` + parser value is provided to v0.0.40+ endpoints.
+ * `slurmrestd` - Improve parsing of integers and floating point + numbers when handling incoming user provided numeric fields. + Fields that previously accepted a number for a numeric + field followed by other non-numeric characters will now get + rejected. This is expected to only be notable with job + submission against v0.0.38 versioned endpoints with malformed + job requests. + * Reject reservation update if it will result in previously + submitted jobs losing access to the reservation. + * `data_parser/v0.0.40` - output partition state when dumping + partitions. + * Allow for a shared suffix to be used with the hostlist format. + E.g., `node[0001-0010]-int`. + * Replace `SRUN_CPUS_PER_TASK` with `SLURM_CPUS_PER_TASK` and + get back the previous behavior before Slurm 22.05 since now we + have the new external launcher step. + * `job_container/tmpfs` - Add `BasePath=none` option to disable + plugin on node subsets when there is a global setting. + * Add QOS flag `Relative`. If set, the QOS limits will be treated + as percentages of a cluster/partition instead of absolutes. + * Remove `FIRST_CORES` flag from reservations. + * Add cloud instance id and instance type to node records. + Can be viewed/updated with `scontrol`. + * `slurmd` - add `instance-id`, `instance-type`, and `extra` + options to allow them to be set on startup. + * Add cloud instance accounting to database that can be viewed + with `sacctmgr show instance`. + * `select/linear` - fix task launch failure that sometimes + occurred when requesting `*threads-per-core` or + `--hint=nomultithread`. This also fixes memory calculation + with one of these options and `*mem-per-cpu`: + Previously, memory = mem-per-cpu * all cpus including unusable + threads. + Now, memory = mem-per-cpu * only usable threads. This + behavior matches the documentation and `select/cons_tres`. + * `gpu/nvml` - Reduce chances of `NVML_ERROR_INSUFFICIENT_SIZE` + error when getting gpu memory information.
+ * `slurmrestd` - Convert to generating `OperationIDs` based on + path for all v0.0.40 tagged paths. + * `slurmrestd` - Reduce memory used while dumping a job's stdio + paths. + * `slurmrestd` - Jobs queried from `data_parser/v0.0.40` from + `slurmdb` will have `step/id` field given as a string to match + CLI formatting instead of an object. + * `sacct` - Output in JSON or YAML output will have the + `step/id` field given as a string instead of an object. + * `scontrol`/`squeue` - Step output in JSON or YAML output will + have the `id` field given as a string instead of an object. - * sacct - Output in JSON or YAML output will will have the 'step/id' field - given as a string instead of an object. - * scontrol/squeue - Step output in JSON or YAML output will will have the - 'id' field given as a string instead of an object. - * slurmrestd - For 'GET /slurmdb/v0.0.40/jobs' mimick default behavior for - handling of job start and end times as sacct when one or both fields are - not provided as a query parameter. - * openapi/slurmctld - Add 'GET /slurm/v0.0.40/shares' endpoint to dump same - output as sshare. - * sshare - add JSON/YAML support. - * data_parser/v0.0.40 - Remove "required/memory" output in json. It is - replaced by "required/memory_per_cpu" and "required/memory_per_node". - * slurmrestd - Add numeric id to all association identifiers to allow unique - identification where association has been deleted but is still referenced by - accounting record. - * slurmrestd - Add accounting, id, and comment fields to association dumps. - * Use memory.current in cgroup/v2 instead of manually calculating RSS. This - makes accounting consistent with OOM Killer. - * sreport - cluster Utilization PlannedDown field now includes the time that - all nodes were in the POWERED_DOWN state instead of just cloud nodes. - * scontrol update partition now allows Nodes+= and - Nodes-= to add/delete nodes from the existing partition node - list.
Nodes=+host1,-host2 is also allowed. - * sacctmgr - add --yaml and --json arg support to specify data_parser plugin. - * sacctmgr can now modify QOS's RawUsage to zero or a positive value. - * sdiag - Added statistics on why the main and backfill schedulers have - stopped evaluation on each scheduling cycle. - the number of 'RPC limit exceeded...' messages that are logged. - * Rename sbcast --fanout to --treewidth. - * Remove SLURM_NODE_ALIASES env variable. + * `slurmrestd` - For `GET /slurmdb/v0.0.40/jobs` mimick default + behavior for handling of job start and end times as `sacct` + when one or both fields are not provided as a query parameter. + * `openapi/slurmctld` - Add `GET /slurm/v0.0.40/shares` endpoint + to dump same output as `sshare`. + * `sshare` - add JSON/YAML support. + * `data_parser/v0.0.40` - Remove `required/memory` output in + json. It is replaced by `required/memory_per_cpu` and + `required/memory_per_node`. + * `slurmrestd` - Add numeric id to all association identifiers + to allow unique identification where association has been + deleted but is still referenced by accounting record. + * `slurmrestd` - Add accounting, id, and comment fields to + association dumps. + * Use `memory.current` in cgroup/v2 instead of manually + calculating RSS. This makes accounting consistent with + OOM Killer. + * `sreport` - cluster Utilization `PlannedDown` field now + includes the time that all nodes were in the `POWERED_DOWN` + state instead of just cloud nodes. + * `scontrol` update partition now allows `Nodes+=` and + `Nodes-=` to add/delete nodes from the existing + partition node list. `Nodes=+host1,-host2` is also allowed. + * `sacctmgr` - add `--yaml` and `--json` arg support to specify + `data_parser` plugin. + * `sacctmgr` can now modify QOS's RawUsage to zero or a positive + value. + * `sdiag` - Added statistics on why the main and backfill + schedulers have stopped evaluation on each scheduling cycle. 
+ the number of `RPC limit exceeded...` messages that are logged. + * Rename `sbcast --fanout` to `--treewidth`. + * Remove `SLURM_NODE_ALIASES` env variable. * Enable fanout for dynamic and unaddresable cloud nodes. - * Fix how steps are dealloced in an allocation if the last step of an srun - never completes due to a node failure. + * Fix how steps are dealloced in an allocation if the last step + of an srun never completes due to a node failure. * Remove redundant database indexes. * Add database index to suspend table to speed up archive/purges. - * When requesting --tres-per-task alter incorrect request for TRES, - it should be TRESType/TRESName not TRESType:TRESName. + * When requesting `--tres-per-task` alter incorrect request for + TRES, it should be `TRESType/TRESName` not `TRESType:TRESName`. * Make it so reservations can reserve GRES. - * sbcast - use the specified --fanout value on all hops in message - forwarding; previously the specified fanout was only used on the first hop, - and additional hops used TreeWidth in slurm.conf. - * slurmrestd - remove logger prefix from '-s/-a list' options outputs. - * switch/hpe_slingshot - Add support for collectives. - * Nodes with suspended jobs can now be displayed as MIXED. - * Fix inconsistent handling of using cli and/or environment options for - tres_per_task=cpu:# and cpus_per_gpu. - * Requesting --cpus-per-task will now set SLURM_TRES_PER_TASK=cpu:# in the - environment. - * For some tres related environment variables such as SLURM_TRES_PER_TASK, - when srun requests a different value for that option, set these environment - variables to the value requested by srun. Previously these environment - variables were unchanged from the job allocation. This bug only affected the - output environment variables, not the actual step resource allocation. - * RoutePlugin=route/topology has been replaced with TopologyParam=RouteTree. 
- * If ThreadsPerCore in slurm.conf is configured with less - than the number of hardware threads, fix a bug where the task plugins used - fewer cores instead of using fewer threads per core. - * Fix arbitrary distribution allowing it to be used with salloc and sbatch and - fix how cpus are allocated to nodes. - * Allow nodes to reboot while node is drained or in a maintenance state. - * Allow scontrol reboot to use nodesets to filter nodes to reboot. + * `sbcast` - use the specified `--fanout` value on all hops in + message forwarding; previously the specified fanout was only + used on the first hop, and additional hops used `TreeWidth` in + `slurm.conf`. + * `slurmrestd` - remove logger prefix from `-s/-a list` options + outputs. + * `switch/hpe_slingshot` - Add support for collectives. + * Nodes with suspended jobs can now be displayed as `MIXED`. + * Fix inconsistent handling of using cli and/or environment + options for `tres_per_task=cpu:#` and `cpus_per_gpu`. + * Requesting `--cpus-per-task` will now set + `SLURM_TRES_PER_TASK=cpu:#` in the environment. + * For some tres related environment variables such as + `SLURM_TRES_PER_TASK`, when `srun` requests a different value + for that option, set these environment variables to the value + requested by `srun`. Previously these environment variables + were unchanged from the job allocation. This bug only affected + the output environment variables, not the actual step resource + allocation. + * `RoutePlugin=route/topology` has been replaced with + `TopologyParam=RouteTree`. + * If `ThreadsPerCore` in `slurm.conf` is configured with less + than the number of hardware threads, fix a bug where the task + plugins used fewer cores instead of using fewer threads per core. + * Fix arbitrary distribution allowing it to be used with `salloc` + and `sbatch` and fix how cpus are allocated to nodes. + * Allow nodes to reboot while node is drained or in a + maintenance state.
+ * Allow `scontrol` reboot to use nodesets to filter nodes to reboot. * Fix how the topology of typed gres gets updated. - * Changes to the Type option in gres.conf now can be applied with scontrol - reconfig. - * Allow for jobs that request a newly configured gres type to be queued - even when the needed slurmds have not yet registered. + * Changes to the Type option in gres.conf now can be applied with + `scontrol` reconfig. + * Allow for jobs that request a newly configured gres type to be + queued even when the needed `slurmd`s have not yet registered. * Kill recovered jobs that require unconfigured gres types. - * If keepalives are configured, enable them on all persistent connections. - * Configless - Also send Includes from configuration files not parsed by the - controller (i.e. from plugstack.conf). - * Add gpu/nrt plugin for nodes using Trainium/Inferentia devices. - * data_parser/v0.0.40 - Add START_RECEIVED to job flags in dumped output. - * SPANK - Failures from most spank functions (not epilog or exit) will now - cause the step to be marked as failed and the command (srun, salloc, - sbatch *wait) to return 1. - + * If keepalives are configured, enable them on all persistent + connections. + * Configless - Also send Includes from configuration files not + parsed by the controller (i.e. from `plugstack.conf`). + * Add `gpu/nrt` plugin for nodes using Trainium/Inferentia + devices. + * `data_parser/v0.0.40` - Add `START_RECEIVED` to job flags in + dumped output. + * SPANK - Failures from most spank functions (not epilog or + exit) will now cause the step to be marked as failed and the + command (`srun`, `salloc`, `sbatch *wait`) to return 1. ------------------------------------------------------------------- Wed Jan 3 10:45:48 UTC 2024 - Egbert Eich