SHA256
1
0
forked from pool/slurm
slurm/pam_slurm-Initialize-arrays-and-pass-sizes.patch

85 lines
2.8 KiB
Diff
Raw Normal View History

Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
From: Egbert Eich <eich@suse.com>
Date: Mon Feb 20 21:29:27 2023 +0100
Subject: pam_slurm: Initialize arrays and pass sizes
Patch-mainline: Not yet
Git-commit: 5feca5c29d4e820dafd8d34c0343944b28890902
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
References: bsc#1007053
PAM is security critical:
- clear arrays
- ensure strings are NULL-terminated.
Signed-off-by: Egbert Eich <eich@suse.com>
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
Originally-from: Sebastian Krahmer <krahmer@suse.com>
Signed-off-by: Egbert Eich <eich@suse.de>
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
---
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
contribs/pam/pam_slurm.c | 20 +++++++++++---------
1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/contribs/pam/pam_slurm.c b/contribs/pam/pam_slurm.c
index 20d21a9..363b6ae 100644
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
--- a/contribs/pam/pam_slurm.c
+++ b/contribs/pam/pam_slurm.c
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
@@ -266,9 +266,9 @@ static int
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
_gethostname_short (char *name, size_t len)
{
int error_code, name_len;
- char *dot_ptr, path_name[1024];
+ char *dot_ptr, path_name[1024] = {0};
- error_code = gethostname(path_name, sizeof(path_name));
+ error_code = gethostname(path_name, sizeof(path_name) - 1);
if (error_code)
return error_code;
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
@@ -296,13 +296,13 @@ static int
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
_slurm_match_allocation(uid_t uid)
{
int authorized = 0, i;
- char hostname[MAXHOSTNAMELEN];
+ char hostname[MAXHOSTNAMELEN] = {0};
char *nodename = NULL;
job_info_msg_t * msg;
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
slurm_init(NULL);
Accepting request 845108 from home:anag:branches:network:cluster - Updated to 20.02.5, changes: * Fix leak of TRESRunMins when job time is changed with --time-min * pam_slurm - explicitly initialize slurm config to support configless mode. * scontrol - Fix exit code when creating/updating reservations with wrong Flags. * When a GRES has a no_consume flag, report 0 for allocated. * Fix cgroup cleanup by jobacct_gather/cgroup. * When creating reservations/jobs don't allow counts on a feature unless using an XOR. * Improve number of boards discovery * Fix updating a reservation NodeCnt on a zero-count reservation. * slurmrestd - provide an explicit error messages when PSK auth fails. * cons_tres - fix job requesting single gres per-node getting two or more nodes with less CPUs than requested per-task. * cons_tres - fix calculation of cores when using gres and cpus-per-task. * cons_tres - fix job not getting access to socket without GPU or with less than --gpus-per-socket when not enough cpus available on required socket and not using --gres-flags=enforce binding. * Fix HDF5 type version build error. * Fix creation of CoreCnt only reservations when the first node isn't available. * Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none. * Improve job constraints XOR option logic. * Fix preemption of hetjobs when needed nodes not in leader component. * Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing bad node deallocations and even allocation of nodes from other partitions. * Fix double-deallocation of preempted non-leader hetjob components. * slurmdbd - prevent truncation of the step nodelists over 4095. * Fix nodes remaining in drain state state after rebooting with ASAP option. - changes from 20.02.4: OBS-URL: https://build.opensuse.org/request/show/845108 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=156
2020-11-02 14:42:03 +01:00
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
- if (_gethostname_short(hostname, sizeof(hostname)) < 0) {
+ if (_gethostname_short(hostname, sizeof(hostname) - 1) < 0) {
_log_msg(LOG_ERR, "gethostname: %m");
return 0;
}
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
@@ -425,7 +425,7 @@ _send_denial_msg(pam_handle_t *pamh, struct _options *opts,
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
*/
extern void libpam_slurm_init (void)
{
- char libslurmname[64];
+ char libslurmname[64] = {0};
if (slurm_h)
return;
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
@@ -433,10 +433,10 @@ extern void libpam_slurm_init (void)
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
/* First try to use the same libslurm version ("libslurm.so.24.0.0"),
* Second try to match the major version number ("libslurm.so.24"),
* Otherwise use "libslurm.so" */
- if (snprintf(libslurmname, sizeof(libslurmname),
+ if (snprintf(libslurmname, sizeof(libslurmname) - 1,
"libslurm.so.%d.%d.%d", SLURM_API_CURRENT,
SLURM_API_REVISION, SLURM_API_AGE) >=
- sizeof(libslurmname) ) {
+ sizeof(libslurmname) - 1) {
_log_msg (LOG_ERR, "Unable to write libslurmname\n");
} else if ((slurm_h = dlopen(libslurmname, RTLD_NOW|RTLD_GLOBAL))) {
return;
Accepting request 1067475 from home:eeich:branches:network:cluster - updated to 23.02.0-0rc1 * Highlights + slurmctld - Add new RPC rate limiting feature. This is enabled through SlurmctldParameters=rl_enable, otherwise disabled by default. + Make scontrol reconfigure and sending a SIGHUP to the slurmctld behave the same. If you were using SIGHUP as a 'lighter' scontrol reconfigure to rotate logs please update your scripts to use SIGUSR2 instead. + Change cloud nodes to show by default. PrivateData=cloud is no longer needed. + sreport - Count planned (FKA reserved) time for jobs running in IGNORE_JOBS reservations. Previously was lumped into IDLE time. + job_container/tmpfs - Support running with an arbitrary list of private mount points (/tmp and /dev/shm are the default, but not required). + job_container/tmpfs - Set more environment variables in InitScript. + Make all cgroup directories created by Slurm owned by root. This was the behavior in cgroup/v2 but not in cgroup/v1 where by default the step directories ownership were set to the user and group of the job. + accounting_storage/mysql - change purge/archive to calculate record ages based on end time, rather than start or submission times. + job_submit/lua - add support for log_user() from slurm_job_modify(). + Run the following scripts in slurmscriptd instead of slurmctld: ResumeProgram, ResumeFailProgram, SuspendProgram, ResvProlog, ResvEpilog, and RebootProgram (only with SlurmctldParameters=reboot_from_controller). + Only permit changing log levels with 'srun --slurmd-debug' by root or SlurmUser. + slurmctld will fatal() when reconfiguring the job_submit plugin fails. + Add PowerDownOnIdle partition option to power down nodes after nodes become idle. + Add "[jobid.stepid]" prefix from slurmstepd and "slurmscriptd" prefix from slurmcriptd to Syslog logging. Previously was only happening when OBS-URL: https://build.opensuse.org/request/show/1067475 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=231
2023-02-23 20:32:51 +01:00
@@ -445,8 +445,10 @@ extern void libpam_slurm_init (void)
Accepting request 454272 from home:eeich:branches:network:cluster - Updated to 16.05.8.1 * Remove StoragePass from being printed out in the slurmdbd log at debug2 level. * Defer PATH search for task program until launch in slurmstepd. * Modify regression test1.89 to avoid leaving vestigial job. Also reduce logging to reduce likelyhood of Expect buffer overflow. * Do not PATH search for mult-prog launches if LaunchParamters=test_exec is enabled. * Fix for possible infinite loop in select/cons_res plugin when trying to satisfy a job's ntasks_per_core or socket specification. * If job is held for bad constraints make it so once updated the job doesn't go into JobAdminHeld. * sched/backfill - Fix logic to reserve resources for jobs that require a node reboot (i.e. to change KNL mode) in order to start. * When unpacking a node or front_end record from state and the protocol version is lower than the min version, set it to the min. * Remove redundant lookup for part_ptr when updating a reservation's nodes. * Fix memory and file descriptor leaks in slurmd daemon's sbcast logic. * Do not allocate specialized cores to jobs using the --exclusive option. * Cancel interactive job if Prolog failure with "PrologFlags=contain" or "PrologFlags=alloc" configured. Send new error prolog failure message to the salloc or srun command as needed. * Prevent possible out-of-bounds read in slurmstepd on an invalid #! line. * Fix check for PluginDir within slurmctld to work with multiple directories. * Cancel interactive jobs automatically on communication error to launching srun/salloc process. * Fix security issue caused by insecure file path handling triggered by the failure of a Prolog script. To exploit this a user needs to anticipate or cause the Prolog to fail for their job. CVE-2016-10030 (bsc#1018371). - Replace group/user add macros with function calls. - Disable building with netloc support: the netloc API is part of the devel branch of hwloc. Since this devel branch was included accidentally and has been reversed since, we need to disable this for the time being. - Conditionalized architecture specific pieces to support non-x86 architectures better. - Remove: unneeded 'BuildRequires: python' - Add: BuildRequires: freeipmi-devel BuildRequires: libibmad-devel BuildRequires: libibumad-devel so they are picked up by the slurm build. - Enable modifications from openHPC Project. - Enable lua API package build. - Add a recommends for slurm-munge to the slurm package: This is way, the munge auth method is available and slurm works out of the box. - Create /var/lib/slurm as StateSaveLocation directory. /tmp is dangerous. - Keep %{_libdir}/libpmi* and %{_libdir}/mpi_pmi2* on SUSE. OBS-URL: https://build.opensuse.org/request/show/454272 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=13
2017-02-02 21:23:02 +01:00
libslurmname, dlerror ());
}
- if (snprintf(libslurmname, sizeof(libslurmname), "libslurm.so.%d",
- SLURM_API_CURRENT) >= sizeof(libslurmname) ) {
+ memset(libslurmname, 0, sizeof(libslurmname));
+
+ if (snprintf(libslurmname, sizeof(libslurmname) - 1, "libslurm.so.%d",
+ SLURM_API_CURRENT) >= sizeof(libslurmname) - 1) {
_log_msg (LOG_ERR, "Unable to write libslurmname\n");
} else if ((slurm_h = dlopen(libslurmname, RTLD_NOW|RTLD_GLOBAL))) {
return;