forked from pool/slurm
plugins makes use of the MpiParams=ports= option, and previously

features with the `|` operator, which could prevent jobs from
+ `node_features/helpers` - Fix inconsistent handling of `&` and `|`,
instead of just the current set. E.g. `foo|bar&baz` was interpreted
`{foo} or {bar,baz}`.
tasks fewer than GPUs, which resulted in incorrectly rejecting these jobs.
+ `slurmrestd` - For `GET /slurm/v0.0.39/node[s]`, change format of
node's energy field `current_watts` to a dictionary to account for
+ `slurmrestd` - For `GET /slurm/v0.0.39/qos`, change format of QOS's
+ slurmrestd - For `GET /slurm/v0.0.39/job[s]`, the 'return code'
`GET /slurmdb/v0.0.39/jobs` from slurmrestd.
were present in the log: `error: Attempt to change gres/gpu Count`.
+ Hold the job with `(Reservation ... invalid)` state reason if the

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=265
parent 74529b6cc2
commit f0b994e220
@@ -9,7 +9,7 @@ Mon Sep 18 05:23:19 UTC 2023 - Egbert Eich <eich@suse.com>
 accurate in more situations.
 + Change pmi2 plugin to honor the `SrunPortRange` option. This matches the
 new behavior of the pmix plugin in 23.02.0. Note that neither of these
-plugins makes use of the "`MpiParams=ports=`" option, and previously
+plugins makes use of the `MpiParams=ports=` option, and previously
 were only limited by the systems ephemeral port range.
 + Fix regression in 23.02.2 that caused slurmctld -R to crash on startup if
 a node features plugin is configured.
@@ -44,13 +44,13 @@ Mon Sep 18 05:23:19 UTC 2023 - Egbert Eich <eich@suse.com>
 federation before they have registered with the dbd.
 + `node_features/helpers` - Fix node selection for jobs requesting
 changeable.
-features with the '`|`' operator, which could prevent jobs from
+features with the `|` operator, which could prevent jobs from
 running on some valid nodes.
-+ `node_features/helpers` - Fix inconsistent handling of '`&`' and '`|`',
++ `node_features/helpers` - Fix inconsistent handling of `&` and `|`,
 where an AND'd feature was sometimes AND'd to all sets of features
-instead of just the current set. E.g. "`foo|bar&baz`" was interpreted
+instead of just the current set. E.g. `foo|bar&baz` was interpreted
 as `{foo,baz}` or `{bar,baz}` instead of how it is documented:
-"`{foo} or {bar,baz}`".
+`{foo} or {bar,baz}`.
 + Fix job accounting so that when a job is requeued its allocated node
 count is cleared. After the requeue, sacct will correctly show that
 the job has 0 `AllocNodes` while it is pending or if it is canceled
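Reviewer note on the `node_features/helpers` entry in the hunk above: the documented reading of `foo|bar&baz` is that `&` binds tighter than `|`. The sketch below only illustrates that precedence rule and is not Slurm's parser. The function names `parse_feature_expr` and `node_matches` are hypothetical, and parentheses, counts and other operators are deliberately not handled.

```python
def parse_feature_expr(expr: str) -> list[set[str]]:
    """Return the alternative feature sets described by expr.

    Splitting on '|' first and on '&' second makes '&' bind tighter,
    so each '|' alternative keeps only its own AND'd features.
    """
    return [set(term.split("&")) for term in expr.split("|")]


def node_matches(node_features: set[str], expr: str) -> bool:
    """A node qualifies if it has every feature of at least one alternative."""
    return any(required <= node_features for required in parse_feature_expr(expr))


if __name__ == "__main__":
    # Documented reading: {foo} or {bar,baz}
    print(parse_feature_expr("foo|bar&baz"))
    # A node with only "foo" qualifies under the documented reading; it would
    # not under the old reading {foo,baz} or {bar,baz} quoted in the changelog.
    print(node_matches({"foo"}, "foo|bar&baz"))
```

Under the old behaviour described in the entry, the trailing `&baz` leaked into every alternative, which is why a node offering only `foo` could be skipped.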
@@ -60,7 +60,8 @@ Mon Sep 18 05:23:19 UTC 2023 - Egbert Eich <eich@suse.com>
 + Fix intel OneAPI autodetect: detect the `/dev/dri/renderD[0-9]+` GPUs,
 and do not detect `/dev/dri/card[0-9]+`.
 + Fix node selection for jobs that request `--gpus` and a number of
-tasks fewer than GPUs, which resulted in incorrectly rejecting these jobs.
+tasks fewer than GPUs, which resulted in incorrectly rejecting these
+jobs.
 + Remove `MYSQL_OPT_RECONNECT` completely.
 + Fix cloud nodes in `POWERING_UP` state disappearing (getting set
 to `FUTURE`)
@@ -102,13 +103,13 @@ Mon Sep 18 05:23:19 UTC 2023 - Egbert Eich <eich@suse.com>
 + Fix minor memory leak with `--tres-per-task` and licenses.
 + Fix cyclic socket cpu distribution for tasks in a step where
 `--cpus-per-task` < usable threads per core.
-+ `slurmrestd` - For '`GET /slurm/v0.0.39/node[s]`', change format of
-node's energy field "`current_watts`" to a dictionary to account for
++ `slurmrestd` - For `GET /slurm/v0.0.39/node[s]`, change format of
+node's energy field `current_watts` to a dictionary to account for
 unset value instead of dumping 4294967294.
-+ `slurmrestd` - For '`GET /slurm/v0.0.39/qos`', change format of QOS's
++ `slurmrestd` - For `GET /slurm/v0.0.39/qos`, change format of QOS's
 field "priority" to a dictionary to account for unset value instead of
 dumping 4294967294.
-+ slurmrestd - For '`GET /slurm/v0.0.39/job[s]`', the 'return code'
++ slurmrestd - For `GET /slurm/v0.0.39/job[s]`, the 'return code'
 code field in `v0.0.39_job_exit`_code will be set to -127 instead of
 being left unset where job does not have a relevant return code.
 * Other Changes:
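Reviewer note on the `slurmrestd` entries in the hunk above: 4294967294 is `0xFFFFFFFE`, the 32-bit sentinel Slurm uses for "no value", which is why dumping it raw was misleading. The sketch below shows the general idea of replacing a sentinel with a small dictionary. The key names `set` and `number` and the helper `to_number_dict` are assumptions made for illustration; the changelog does not spell out the exact schema `slurmrestd` emits.

```python
NO_VAL = 0xFFFFFFFE  # 4294967294, the "unset" sentinel quoted in the changelog


def to_number_dict(raw: int) -> dict:
    """Wrap a raw 32-bit field so that "unset" is explicit.

    The key names are hypothetical; only the sentinel value itself
    comes from the changelog entry.
    """
    if raw == NO_VAL:
        return {"set": False, "number": 0}
    return {"set": True, "number": raw}


if __name__ == "__main__":
    print(to_number_dict(4294967294))  # unset field
    print(to_number_dict(250))         # a real reading, e.g. watts
```

A client reading such a field can then test the flag instead of comparing against a magic number.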
@@ -127,7 +128,7 @@ Mon Sep 18 05:23:19 UTC 2023 - Egbert Eich <eich@suse.com>
 + `slurmrestd` - Reduce memory usage when printing out job CPU frequency.
 + `data_parser/v0.0.39` - Add `required/memory_per_cpu` and
 `required/memory_per_node` to `sacct --json` and `sacct --yaml` and
-'`GET /slurmdb/v0.0.39/jobs`' from slurmrestd.
+`GET /slurmdb/v0.0.39/jobs` from slurmrestd.
 + `gpu/oneapi` - Store cores correctly so CPU affinity is tracked.
 + Allow `slurmdbd -R` to work if the root assoc id is not 1.
 + Limit periodic node registrations to 50 instead of the full `TreeWidth`.
@@ -156,7 +157,7 @@ Mon Aug 21 09:43:08 UTC 2023 - Christian Goll <cgoll@suse.com>
 + Fix regression in 23.02.2 when checking gres state on `slurmctld`
 startup or reconfigure. Gres changes in the configuration were not
 updated on slurmctld startup. On startup or reconfigure, these messages
-were present in the log: `"error: Attempt to change gres/gpu Count`".
+were present in the log: `error: Attempt to change gres/gpu Count`.
 + Fix potential double count of gres when dealing with limits.
 + Fix `slurmstepd` segfault when `ContainerPath` is not set in `oci.conf`
 + Fixed an issue where jobs requesting licenses were incorrectly rejected.
@@ -300,7 +301,7 @@ Mon Aug 21 09:43:08 UTC 2023 - Christian Goll <cgoll@suse.com>
 lookups.
 + `sacct` - when printing `PLANNED` time, use end time instead of start
 time for jobs cancelled before they started.
-+ Hold the job with "`(Reservation ... invalid)`" state reason if the
++ Hold the job with `(Reservation ... invalid)` state reason if the
 reservation is not usable by the job.
 + `sbatch` - Added new `--export=NIL` option.
 - Removed:
@@ -1321,7 +1321,7 @@ rm -rf /srv/slurm-testsuite/src /srv/slurm-testsuite/testsuite \
 %{_mandir}/man5/cgroup.*
 %{_mandir}/man5/gres.*
 %{_mandir}/man5/helpers.*
-%{_mandir}/man5/nonstop.conf.5.*
+#%%{_mandir}/man5/nonstop.conf.5.*
 %{_mandir}/man5/oci.conf.5.gz
 %{_mandir}/man5/topology.*
 %{_mandir}/man5/knl.conf.5.*