Accepting request 845108 from home:anag:branches:network:cluster
- Updated to 20.02.5, changes: * Fix leak of TRESRunMins when job time is changed with --time-min * pam_slurm - explicitly initialize slurm config to support configless mode. * scontrol - Fix exit code when creating/updating reservations with wrong Flags. * When a GRES has a no_consume flag, report 0 for allocated. * Fix cgroup cleanup by jobacct_gather/cgroup. * When creating reservations/jobs don't allow counts on a feature unless using an XOR. * Improve number of boards discovery * Fix updating a reservation NodeCnt on a zero-count reservation. * slurmrestd - provide an explicit error messages when PSK auth fails. * cons_tres - fix job requesting single gres per-node getting two or more nodes with less CPUs than requested per-task. * cons_tres - fix calculation of cores when using gres and cpus-per-task. * cons_tres - fix job not getting access to socket without GPU or with less than --gpus-per-socket when not enough cpus available on required socket and not using --gres-flags=enforce binding. * Fix HDF5 type version build error. * Fix creation of CoreCnt only reservations when the first node isn't available. * Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none. * Improve job constraints XOR option logic. * Fix preemption of hetjobs when needed nodes not in leader component. * Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing bad node deallocations and even allocation of nodes from other partitions. * Fix double-deallocation of preempted non-leader hetjob components. * slurmdbd - prevent truncation of the step nodelists over 4095. * Fix nodes remaining in drain state state after rebooting with ASAP option. - changes from 20.02.4: OBS-URL: https://build.opensuse.org/request/show/845108 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=156
This commit is contained in:
parent
e3512185d8
commit
e481851f5a
@ -12,14 +12,11 @@ PAM is security critical:
|
||||
|
||||
Signed-off-by: Egbert Eich <eich@suse.com>
|
||||
---
|
||||
contribs/pam/pam_slurm.c | 20 +++++++++++---------
|
||||
1 file changed, 11 insertions(+), 9 deletions(-)
|
||||
|
||||
diff --git a/contribs/pam/pam_slurm.c b/contribs/pam/pam_slurm.c
|
||||
index 0968a9c..ee179d5 100644
|
||||
diff -Nrua a/contribs/pam/pam_slurm.c b/contribs/pam/pam_slurm.c
|
||||
--- a/contribs/pam/pam_slurm.c
|
||||
+++ b/contribs/pam/pam_slurm.c
|
||||
@@ -266,9 +266,9 @@ static int
|
||||
@@ -266,9 +266,9 @@
|
||||
_gethostname_short (char *name, size_t len)
|
||||
{
|
||||
int error_code, name_len;
|
||||
@ -31,7 +28,7 @@ index 0968a9c..ee179d5 100644
|
||||
if (error_code)
|
||||
return error_code;
|
||||
|
||||
@@ -296,11 +296,11 @@ static int
|
||||
@@ -296,13 +296,13 @@
|
||||
_slurm_match_allocation(uid_t uid)
|
||||
{
|
||||
int authorized = 0, i;
|
||||
@ -40,12 +37,14 @@ index 0968a9c..ee179d5 100644
|
||||
char *nodename = NULL;
|
||||
job_info_msg_t * msg;
|
||||
|
||||
slurm_conf_init(NULL);
|
||||
|
||||
- if (_gethostname_short(hostname, sizeof(hostname)) < 0) {
|
||||
+ if (_gethostname_short(hostname, sizeof(hostname) - 1) < 0) {
|
||||
_log_msg(LOG_ERR, "gethostname: %m");
|
||||
return 0;
|
||||
}
|
||||
@@ -409,7 +409,7 @@ _send_denial_msg(pam_handle_t *pamh, struct _options *opts,
|
||||
@@ -425,7 +425,7 @@
|
||||
*/
|
||||
extern void libpam_slurm_init (void)
|
||||
{
|
||||
@ -54,7 +53,7 @@ index 0968a9c..ee179d5 100644
|
||||
|
||||
if (slurm_h)
|
||||
return;
|
||||
@@ -417,10 +417,10 @@ extern void libpam_slurm_init (void)
|
||||
@@ -433,10 +433,10 @@
|
||||
/* First try to use the same libslurm version ("libslurm.so.24.0.0"),
|
||||
* Second try to match the major version number ("libslurm.so.24"),
|
||||
* Otherwise use "libslurm.so" */
|
||||
@ -67,7 +66,7 @@ index 0968a9c..ee179d5 100644
|
||||
_log_msg (LOG_ERR, "Unable to write libslurmname\n");
|
||||
} else if ((slurm_h = dlopen(libslurmname, RTLD_NOW|RTLD_GLOBAL))) {
|
||||
return;
|
||||
@@ -429,8 +429,10 @@ extern void libpam_slurm_init (void)
|
||||
@@ -445,8 +445,10 @@
|
||||
libslurmname, dlerror ());
|
||||
}
|
||||
|
||||
|
@ -1,3 +0,0 @@
|
||||
version https://git-lfs.github.com/spec/v1
|
||||
oid sha256:b73ea8ce63dd73a0744c444f2fbe56fc98e79c9a1ea34e9c82e09534b55c44a3
|
||||
size 6330257
|
3
slurm-20.02.5.tar.bz2
Normal file
3
slurm-20.02.5.tar.bz2
Normal file
@ -0,0 +1,3 @@
|
||||
version https://git-lfs.github.com/spec/v1
|
||||
oid sha256:c32a7a32010a526bb8a303df1df081c79dbe15423576543a73c65bdd33301723
|
||||
size 6325393
|
118
slurm.changes
118
slurm.changes
@ -1,3 +1,121 @@
|
||||
-------------------------------------------------------------------
|
||||
Thu Oct 29 12:35:18 UTC 2020 - Ana Guerrero Lopez <aguerrero@suse.com>
|
||||
|
||||
- Updated to 20.02.5, changes:
|
||||
* Fix leak of TRESRunMins when job time is changed with --time-min
|
||||
* pam_slurm - explicitly initialize slurm config to support configless mode.
|
||||
* scontrol - Fix exit code when creating/updating reservations with wrong
|
||||
Flags.
|
||||
* When a GRES has a no_consume flag, report 0 for allocated.
|
||||
* Fix cgroup cleanup by jobacct_gather/cgroup.
|
||||
* When creating reservations/jobs don't allow counts on a feature unless
|
||||
using an XOR.
|
||||
* Improve number of boards discovery
|
||||
* Fix updating a reservation NodeCnt on a zero-count reservation.
|
||||
* slurmrestd - provide an explicit error messages when PSK auth fails.
|
||||
* cons_tres - fix job requesting single gres per-node getting two or more
|
||||
nodes with less CPUs than requested per-task.
|
||||
* cons_tres - fix calculation of cores when using gres and cpus-per-task.
|
||||
* cons_tres - fix job not getting access to socket without GPU or with less
|
||||
than --gpus-per-socket when not enough cpus available on required socket
|
||||
and not using --gres-flags=enforce binding.
|
||||
* Fix HDF5 type version build error.
|
||||
* Fix creation of CoreCnt only reservations when the first node isn't
|
||||
available.
|
||||
* Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none.
|
||||
* Improve job constraints XOR option logic.
|
||||
* Fix preemption of hetjobs when needed nodes not in leader component.
|
||||
* Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing
|
||||
bad node deallocations and even allocation of nodes from other partitions.
|
||||
* Fix double-deallocation of preempted non-leader hetjob components.
|
||||
* slurmdbd - prevent truncation of the step nodelists over 4095.
|
||||
* Fix nodes remaining in drain state state after rebooting with ASAP option.
|
||||
|
||||
- changes from 20.02.4:
|
||||
* srun - suppress job step creation warning message when waiting on
|
||||
PrologSlurmctld.
|
||||
* slurmrestd - fix incorrect return values in data_list_for_each() functions.
|
||||
* mpi/pmix - fix issue where HetJobs could fail to launch.
|
||||
* slurmrestd - set content-type header in responses.
|
||||
* Fix cons_res GRES overallocation for --gres-flags=disable-binding.
|
||||
* Fix cons_res incorrectly filtering cores with respect to GRES locality for
|
||||
--gres-flags=disable-binding requests.
|
||||
* Fix regression where a dependency on multiple jobs in a single array using
|
||||
underscores would only add the first job.
|
||||
* slurmrestd - fix corrupted output due to incorrect use of memcpy().
|
||||
* slurmrestd - address a number of minor Coverity warnings.
|
||||
* Handle retry failure when slurmstepd is communicating with srun correctly.
|
||||
* Fix jobacct_gather possibly duplicate stats when _is_a_lwp error shows up.
|
||||
* Fix tasks binding to GRES which are closest to the allocated CPUs.
|
||||
* Fix AMD GPU ROCM 3.5 support.
|
||||
* Fix handling of job arrays in sacct when querying specific steps.
|
||||
* slurmrestd - avoid fallback to local socket authentication if JWT
|
||||
authentication is ill-formed.
|
||||
* slurmrestd - restrict ability of requests to use different authentication
|
||||
plugins.
|
||||
* slurmrestd - unlink named unix sockets before closing.
|
||||
* slurmrestd - fix invalid formatting in openapi.json.
|
||||
* Fix batch jobs stuck in CF state on FrontEnd mode.
|
||||
* Add a separate explicit error message when rejecting changes to active node
|
||||
features.
|
||||
* cons_common/job_test - fix slurmctld SIGABRT due to double-free.
|
||||
* Fix updating reservations to set the duration correctly if updating the
|
||||
start time.
|
||||
* Fix update reservation to promiscuous mode.
|
||||
* Fix override of job tasks count to max when ntasks-per-node present.
|
||||
* Fix min CPUs per node not being at least CPUs per task requested.
|
||||
* Fix CPUs allocated to match CPUs requested when requesting GRES and
|
||||
threads per core equal to one.
|
||||
* Fix NodeName config parsing with Boards and without CPUs.
|
||||
* Ensure SLURM_JOB_USER and SLURM_JOB_UID are set in SrunProlog/Epilog.
|
||||
* Fix error messages for certain invalid salloc/sbatch/srun options.
|
||||
* pmi2 - clean up sockets at step termination.
|
||||
* Fix 'scontrol hold' to work with 'JobName'.
|
||||
* sbatch - handle --uid/--gid in #SBATCH directives properly.
|
||||
* Fix race condition in job termination on slurmd.
|
||||
* Print specific error messages if trying to run use certain
|
||||
priority/multifactor factors that cannot work without SlurmDBD.
|
||||
* Avoid partial GRES allocation when --gpus-per-job is not satisfied.
|
||||
* Cray - Avoid referencing a variable outside of it's correct scope when
|
||||
dealing with creating steps within a het job.
|
||||
* slurmrestd - correctly handle larger addresses from accept().
|
||||
* Avoid freeing wrong pointer with SlurmctldParameters=max_dbd_msg_action
|
||||
with another option after that.
|
||||
* Restore MCS label when suspended job is resumed.
|
||||
* Fix insufficient lock levels.
|
||||
* slurmrestd - use errno from job submission.
|
||||
* Fix "user" filter for sacctmgr show transactions.
|
||||
* Fix preemption logic.
|
||||
* Fix no_consume GRES for exclusive (whole node) requests.
|
||||
* Fix regression in 20.02 that caused an infinite loop in slurmctld when
|
||||
requesting --distribution=plane for the job.
|
||||
* Fix parsing of the --distribution option.
|
||||
* Add CONF READ_LOCK to _handle_fed_send_job_sync.
|
||||
* prep/script - always call slurmctld PrEp callback in _run_script().
|
||||
* Fix node estimation for jobs that use GPUs or --cpus-per-task.
|
||||
* Fix jobcomp, job_submit and cli_filter Lua implementation plugins causing
|
||||
slurmctld and/or job submission CLI tools segfaults due to bad return
|
||||
handling when the respective Lua script failed to load.
|
||||
* Fix propagation of gpu options through hetjob components.
|
||||
* Add SLURM_CLUSTERS environment variable to scancel.
|
||||
* Fix packing/unpacking of "unlinked" jobs.
|
||||
* Connect slurmstepd's stderr to srun for steps launched with --pty.
|
||||
* Handle MPS correctly when doing exclusive allocations.
|
||||
* slurmrestd - fix compiling against libhttpparser in a non-default path.
|
||||
* slurmrestd - avoid compilation issues with libhttpparser < 2.6.
|
||||
* Fix compile issues when compiling slurmrestd without --enable-debug.
|
||||
* Reset idle time on a reservation that is getting purged.
|
||||
* Fix reoccurring reservations that have Purge_comp= to keep correct
|
||||
duration if they are purged.
|
||||
* scontrol - changed the "PROMISCUOUS" flag to "MAGNETIC"
|
||||
* Early return from epilog_set_env in case of no_consume.
|
||||
* Fix cons_common/job_test start time discovery logic to prevent skewed
|
||||
results between "will run test" executions.
|
||||
* Ensure TRESRunMins limits are maintained during "scontrol reconfigure".
|
||||
* Improve error message when host lookup fails.
|
||||
|
||||
- Refresh patch: pam_slurm-Initialize-arrays-and-pass-sizes.patch
|
||||
|
||||
-------------------------------------------------------------------
|
||||
Tue Jul 7 09:05:40 UTC 2020 - Egbert Eich <eich@suse.com>
|
||||
|
||||
|
@ -18,7 +18,7 @@
|
||||
|
||||
# Check file META in sources: update so_version to (API_CURRENT - API_AGE)
|
||||
%define so_version 35
|
||||
%define ver 20.02.3
|
||||
%define ver 20.02.5
|
||||
%define _ver _20_02
|
||||
%define dl_ver %{ver}
|
||||
# so-version is 0 and seems to be stable
|
||||
|
Loading…
Reference in New Issue
Block a user