Accepting request 629222 from home:eeich:branches:network:cluster
- Update to 17.11.9 * Fix segfault in slurmctld when a job's node bitmap is NULL during a scheduling cycle. Primarily caused by EnforcePartLimits=ALL. * Remove erroneous unlock in acct_gather_energy/ipmi. * Enable support for hwloc version 2.0.1. * Fix 'srun -q' (--qos) option handling. * Fix socket communication issue that can lead to lost task completition messages, which will cause a permanently stuck srun process. * Handle creation of TMPDIR if environment variable is set or changed in a task prolog script. * Avoid node layout fragmentation if running with a fixed CPU count but without Sockets and CoresPerSocket defined. * burst_buffer/cray - Fix datawarp swap default pool overriding jobdw. * Fix incorrect job priority assignment for multi-partition job with different PriorityTier settings on the partitions. * Fix sinfo to print correct node state. - When using a remote shared StateSaveLocation, slurmctld needs to be started after remote filesystems have become available. Add 'remote-fs.target' to the 'After=' directive in slurmctld.service (boo#1103561). - Update to 17.11.8 * Fix incomplete RESPONSE_[RESOURCE|JOB_PACK]_ALLOCATION building path. * Do not allocate nodes that were marked down due to the node not responding by ResumeTimeout. * task/cray plugin - search for "mems" cgroup information in the file "cpuset.mems" then fall back to the file "mems". * Fix ipmi profile debug uninitialized variable. * PMIx: fixed the direct connect inline msg sending. OBS-URL: https://build.opensuse.org/request/show/629222 OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=64
This commit is contained in:
parent
62ef6634bc
commit
d5a2e95d8c
@ -1,3 +0,0 @@
|
|||||||
version https://git-lfs.github.com/spec/v1
|
|
||||||
oid sha256:a4ab10870b1c35f67a3465796960b32e4270e52acc257987b10acc4f17035a57
|
|
||||||
size 6249399
|
|
3
slurm-17.11.9.tar.bz2
Normal file
3
slurm-17.11.9.tar.bz2
Normal file
@ -0,0 +1,3 @@
|
|||||||
|
version https://git-lfs.github.com/spec/v1
|
||||||
|
oid sha256:c56ed2eab6d2d2adf2ab5aec203175a64b9e8c5a5ba2af29470358e7808bd942
|
||||||
|
size 6258698
|
@ -1,3 +1,98 @@
|
|||||||
|
-------------------------------------------------------------------
|
||||||
|
Tue Aug 14 10:26:43 UTC 2018 - eich@suse.com
|
||||||
|
|
||||||
|
- Update to 17.11.9
|
||||||
|
* Fix segfault in slurmctld when a job's node bitmap is NULL during a
|
||||||
|
scheduling cycle. Primarily caused by EnforcePartLimits=ALL.
|
||||||
|
* Remove erroneous unlock in acct_gather_energy/ipmi.
|
||||||
|
* Enable support for hwloc version 2.0.1.
|
||||||
|
* Fix 'srun -q' (--qos) option handling.
|
||||||
|
* Fix socket communication issue that can lead to lost task completition
|
||||||
|
messages, which will cause a permanently stuck srun process.
|
||||||
|
* Handle creation of TMPDIR if environment variable is set or changed in
|
||||||
|
a task prolog script.
|
||||||
|
* Avoid node layout fragmentation if running with a fixed CPU count but
|
||||||
|
without Sockets and CoresPerSocket defined.
|
||||||
|
* burst_buffer/cray - Fix datawarp swap default pool overriding jobdw.
|
||||||
|
* Fix incorrect job priority assignment for multi-partition job with
|
||||||
|
different PriorityTier settings on the partitions.
|
||||||
|
* Fix sinfo to print correct node state.
|
||||||
|
|
||||||
|
-------------------------------------------------------------------
|
||||||
|
Thu Aug 2 11:35:55 UTC 2018 - eich@suse.com
|
||||||
|
|
||||||
|
- When using a remote shared StateSaveLocation, slurmctld needs to
|
||||||
|
be started after remote filesystems have become available.
|
||||||
|
Add 'remote-fs.target' to the 'After=' directive in slurmctld.service
|
||||||
|
(boo#1103561).
|
||||||
|
|
||||||
|
-------------------------------------------------------------------
|
||||||
|
Tue Jul 31 18:29:40 UTC 2018 - eich@suse.com
|
||||||
|
|
||||||
|
- Update to 17.11.8
|
||||||
|
* Fix incomplete RESPONSE_[RESOURCE|JOB_PACK]_ALLOCATION building path.
|
||||||
|
* Do not allocate nodes that were marked down due to the node not responding
|
||||||
|
by ResumeTimeout.
|
||||||
|
* task/cray plugin - search for "mems" cgroup information in the file
|
||||||
|
"cpuset.mems" then fall back to the file "mems".
|
||||||
|
* Fix ipmi profile debug uninitialized variable.
|
||||||
|
* PMIx: fixed the direct connect inline msg sending.
|
||||||
|
* MYSQL: Fix issue not handling all fields when loading an archive dump.
|
||||||
|
* Allow a job_submit plugin to change the admin_comment field during
|
||||||
|
job_submit_plugin_modify().
|
||||||
|
* job_submit/lua - fix access into reservation table.
|
||||||
|
* MySQL - Prevent deadlock caused by archive logic locking reads.
|
||||||
|
* Don't enforce MaxQueryTimeRange when requesting specific jobs.
|
||||||
|
* Modify --test-only logic to properly support jobs submitted to more than
|
||||||
|
one partition.
|
||||||
|
* Prevent slurmctld from abort when attempting to set non-existing
|
||||||
|
qos as def_qos_id.
|
||||||
|
* Add new job dependency type of "afterburstbuffer". The pending job will be
|
||||||
|
delayed until the first job completes execution and it's burst buffer
|
||||||
|
stage-out is completed.
|
||||||
|
* Reorder proctrack/task plugin load in the slurmstepd to match that of
|
||||||
|
slurmd
|
||||||
|
and avoid race condition calling task before proctrack can introduce.
|
||||||
|
* Prevent reboot of a busy KNL node when requesting inactive features.
|
||||||
|
* Revert to previous behavior when requesting memory per cpu/node introduced
|
||||||
|
in 17.11.7.
|
||||||
|
* Fix to reinitialize previously adjusted job members to their original
|
||||||
|
value
|
||||||
|
when validating the job memory in multi-partition requests.
|
||||||
|
* Fix _step_signal() from always returning SLURM_SUCCESS.
|
||||||
|
* Combine active and available node feature change logs on one line rather
|
||||||
|
than one line per node for performance reasons.
|
||||||
|
* Prevent occasionally leaking freezer cgroups.
|
||||||
|
* Fix potential segfault when closing the mpi/pmi2 plugin.
|
||||||
|
* Fix issues with --exclusive=[user|mcs] to work correctly
|
||||||
|
with preemption or when job requests a specific list of hosts.
|
||||||
|
* Make code compile with hdf5 1.10.2+
|
||||||
|
* mpi/pmix: Fixed the collectives canceling.
|
||||||
|
* SlurmDBD: improve error message handling on archive load failure.
|
||||||
|
* Fix incorrect locking when deleting reservations.
|
||||||
|
* Fix incorrect locking when setting up the power save module.
|
||||||
|
* Fix setting format output length for squeue when showing array jobs.
|
||||||
|
* Add xstrstr function.
|
||||||
|
* Fix printing out of --hint options in sbatch, salloc --help.
|
||||||
|
* Prevent possible divide by zero in _validate_time_limit().
|
||||||
|
* Add Delegate=yes to the slurmd.service file to prevent systemd from
|
||||||
|
interfering with the jobs' cgroup hierarchies.
|
||||||
|
* Change the backlog argument to the listen() syscall within srun to 4096
|
||||||
|
to match elsewhere in the code, and avoid communication problems at scale.
|
||||||
|
|
||||||
|
-------------------------------------------------------------------
|
||||||
|
Tue Jul 31 17:30:08 UTC 2018 - eich@suse.com
|
||||||
|
|
||||||
|
- Fix race in the slurmctld backup controller which prevents it
|
||||||
|
to clean up allocations on nodes properly after failing over
|
||||||
|
(bsc#1084917).
|
||||||
|
- Handled %license in a backward compatible manner.
|
||||||
|
|
||||||
|
-------------------------------------------------------------------
|
||||||
|
Sat Jul 28 15:30:58 UTC 2018 - eich@suse.com
|
||||||
|
|
||||||
|
- Add a 'Recommends: slurm-munge' to slurm-slurmdbd.
|
||||||
|
|
||||||
-------------------------------------------------------------------
|
-------------------------------------------------------------------
|
||||||
Wed Jul 11 12:04:55 UTC 2018 - eich@suse.com
|
Wed Jul 11 12:04:55 UTC 2018 - eich@suse.com
|
||||||
|
|
||||||
|
33
slurm.spec
33
slurm.spec
@ -18,7 +18,7 @@
|
|||||||
|
|
||||||
# Check file META in sources: update so_version to (API_CURRENT - API_AGE)
|
# Check file META in sources: update so_version to (API_CURRENT - API_AGE)
|
||||||
%define so_version 32
|
%define so_version 32
|
||||||
%define ver 17.11.7
|
%define ver 17.11.9
|
||||||
# so-version is 0 and seems to be stable
|
# so-version is 0 and seems to be stable
|
||||||
%define pmi_so 0
|
%define pmi_so 0
|
||||||
|
|
||||||
@ -73,6 +73,7 @@ Patch5: slurmd-uses-xdaemon_-for-systemd.patch
|
|||||||
Patch6: slurmdbd-uses-xdaemon_-for-systemd.patch
|
Patch6: slurmdbd-uses-xdaemon_-for-systemd.patch
|
||||||
Patch7: slurmsmwd-uses-xdaemon_-for-systemd.patch
|
Patch7: slurmsmwd-uses-xdaemon_-for-systemd.patch
|
||||||
Patch8: removed-deprecated-xdaemon.patch
|
Patch8: removed-deprecated-xdaemon.patch
|
||||||
|
Patch9: slurmctld-rerun-agent_init-when-backup-controller-takes-over.patch
|
||||||
|
|
||||||
Requires: slurm-config = %{version}
|
Requires: slurm-config = %{version}
|
||||||
Requires: slurm-node = %{version}
|
Requires: slurm-node = %{version}
|
||||||
@ -208,6 +209,7 @@ Group: Productivity/Clustering/Computing
|
|||||||
Requires: slurm-config = %{version}
|
Requires: slurm-config = %{version}
|
||||||
Requires: slurm-plugins = %{version}
|
Requires: slurm-plugins = %{version}
|
||||||
Requires: slurm-sql = %{version}
|
Requires: slurm-sql = %{version}
|
||||||
|
Recommends: slurm-munge = %{version}
|
||||||
%if 0%{?with_systemd}
|
%if 0%{?with_systemd}
|
||||||
%{?systemd_requires}
|
%{?systemd_requires}
|
||||||
%else
|
%else
|
||||||
@ -328,6 +330,7 @@ for the slurm daemons.
|
|||||||
%patch6 -p1
|
%patch6 -p1
|
||||||
%patch7 -p1
|
%patch7 -p1
|
||||||
%patch8 -p1
|
%patch8 -p1
|
||||||
|
%patch9 -p1
|
||||||
|
|
||||||
%build
|
%build
|
||||||
%configure --enable-shared \
|
%configure --enable-shared \
|
||||||
@ -399,14 +402,20 @@ PartitionName=normal Nodes=linux Default=YES MaxTime=24:00:00 State=UP
|
|||||||
EOF
|
EOF
|
||||||
# 9/17/14 karl.w.schulz@intel.com - Add option to drop VM cache during epilog
|
# 9/17/14 karl.w.schulz@intel.com - Add option to drop VM cache during epilog
|
||||||
sed -i '/^# No other SLURM jobs,/i \\n# Drop clean caches (OpenHPC)\necho 3 > /proc/sys/vm/drop_caches\n\n#' %{buildroot}/%{_sysconfdir}/%{name}/slurm.epilog.clean
|
sed -i '/^# No other SLURM jobs,/i \\n# Drop clean caches (OpenHPC)\necho 3 > /proc/sys/vm/drop_caches\n\n#' %{buildroot}/%{_sysconfdir}/%{name}/slurm.epilog.clean
|
||||||
# chnage slurmdbd.conf for our needs
|
# change slurmdbd.conf for our needs
|
||||||
sed -i 's@LogFile=/var/log/slurm/slurmdbd.log@LogFile=/var/log/slurmdbd.log@' %{buildroot}/%{_sysconfdir}/%{name}/slurmdbd.conf
|
sed -i 's@LogFile=/var/log/slurm/slurmdbd.log@LogFile=/var/log/slurmdbd.log@'\
|
||||||
sed -i -e "s@PidFile=.*@PidFile=%{_localstatedir}/run/slurm/slurmdbd.pid@" %{buildroot}/%{_sysconfdir}/%{name}/slurmdbd.conf
|
%{buildroot}/%{_sysconfdir}/%{name}/slurmdbd.conf
|
||||||
# manage local state dir
|
sed -i -e "s@PidFile=.*@PidFile=%{_localstatedir}/run/slurm/slurmdbd.pid@" \
|
||||||
|
%{buildroot}/%{_sysconfdir}/%{name}/slurmdbd.conf
|
||||||
|
# manage local state dir and a remote states save location
|
||||||
mkdir -p %{buildroot}/%_localstatedir/lib/slurm
|
mkdir -p %{buildroot}/%_localstatedir/lib/slurm
|
||||||
sed -i -e "s@PIDFile=.*@PIDFile=%{_localstatedir}/run/slurm/slurmctld.pid@" %{buildroot}/%{_unitdir}/slurmctld.service
|
sed -i -e "s@PIDFile=.*@PIDFile=%{_localstatedir}/run/slurm/slurmctld.pid@" \
|
||||||
sed -i -e "s@PIDFile=.*@PIDFile=%{_localstatedir}/run/slurm/slurmd.pid@" %{buildroot}/%{_unitdir}/slurmd.service
|
-e "s@After=.*@After=network.target munge.service remote-fs.target@" \
|
||||||
sed -i -e "s@PIDFile=.*@PIDFile=%{_localstatedir}/run/slurm/slurmdbd.pid@" %{buildroot}/%{_unitdir}/slurmdbd.service
|
%{buildroot}/%{_unitdir}/slurmctld.service
|
||||||
|
sed -i -e "s@PIDFile=.*@PIDFile=%{_localstatedir}/run/slurm/slurmd.pid@" \
|
||||||
|
%{buildroot}/%{_unitdir}/slurmd.service
|
||||||
|
sed -i -e "s@PIDFile=.*@PIDFile=%{_localstatedir}/run/slurm/slurmdbd.pid@" \
|
||||||
|
%{buildroot}/%{_unitdir}/slurmdbd.service
|
||||||
%endif
|
%endif
|
||||||
|
|
||||||
# Delete unpackaged files:
|
# Delete unpackaged files:
|
||||||
@ -604,10 +613,16 @@ exit 0
|
|||||||
%_res_update slurmdbd.service
|
%_res_update slurmdbd.service
|
||||||
%_rest slurmdbd
|
%_rest slurmdbd
|
||||||
|
|
||||||
|
%if 0%{?sle_version} > 120200 || 0%{?suse_version} > 1320
|
||||||
|
%define my_license %license
|
||||||
|
%else
|
||||||
|
%define my_license %doc
|
||||||
|
%endif
|
||||||
|
|
||||||
%files
|
%files
|
||||||
%defattr(-,root,root)
|
%defattr(-,root,root)
|
||||||
%doc AUTHORS NEWS RELEASE_NOTES DISCLAIMER
|
%doc AUTHORS NEWS RELEASE_NOTES DISCLAIMER
|
||||||
%license COPYING
|
%my_license COPYING
|
||||||
%doc doc/html
|
%doc doc/html
|
||||||
%{_bindir}/sacct
|
%{_bindir}/sacct
|
||||||
%{_bindir}/sacctmgr
|
%{_bindir}/sacctmgr
|
||||||
|
@ -0,0 +1,58 @@
|
|||||||
|
From: Egbert Eich <eich@suse.com>
|
||||||
|
Date: Tue Jul 31 17:31:15 2018 +0200
|
||||||
|
Subject: slurmctld: rerun agent_init() when backup controller takes over
|
||||||
|
Patch-mainline: Not yet
|
||||||
|
Git-commit: 169d9522c89a10dcffbf1403c20b4e6249bac79b
|
||||||
|
References:
|
||||||
|
|
||||||
|
A slurmctld backup controller often fails to clean up jobs which have
|
||||||
|
finished, the node appears in an 'IDLE+COMPLETING' state while squeue -l
|
||||||
|
still shows the job in a completing state.
|
||||||
|
This situation persists until the primary controller is restarted and
|
||||||
|
cleans up all tasks in 'COMPLETING' state.
|
||||||
|
This issue is caused by a race condition in the backup controller:
|
||||||
|
When the backup controller detects that the primary controller is
|
||||||
|
inaccessible, it will run thru a restart cycle. To trigger the shutdown
|
||||||
|
of some entities, it will set slurmctld_config.shutdown_time to a value
|
||||||
|
!= 0. Before continuing as the controller in charge, it resets this
|
||||||
|
variable to 0 again.
|
||||||
|
The agent which handles the request queue - from a separate thread -
|
||||||
|
wakes up periodically (in a 2 sec interval) and checks for things to do.
|
||||||
|
If it finds slurmctld_config.shutdown_time set to a value != 0, it will
|
||||||
|
terminate.
|
||||||
|
If this wakeup occurs in the 'takeover window' between the variable
|
||||||
|
being set to !=0 and reset to 0, the agent goes away and will no longer
|
||||||
|
be available to handle queued requests as there is nothing at the end
|
||||||
|
of the 'takeover window' that would restart it.
|
||||||
|
|
||||||
|
This fix adds a restart of the agent by calling agent_init() after
|
||||||
|
slurmctld_config.shutdown_time has been reset to 0.
|
||||||
|
Should an agent still be running (because it didn't wake up during the
|
||||||
|
'takeover window') it will be caught in agent_init().
|
||||||
|
|
||||||
|
Signed-off-by: Egbert Eich <eich@suse.com>
|
||||||
|
---
|
||||||
|
src/slurmctld/backup.c | 4 ++++
|
||||||
|
1 file changed, 4 insertions(+)
|
||||||
|
diff --git a/src/slurmctld/backup.c b/src/slurmctld/backup.c
|
||||||
|
index 24ddcde..cf3bb43 100644
|
||||||
|
--- a/src/slurmctld/backup.c
|
||||||
|
+++ b/src/slurmctld/backup.c
|
||||||
|
@@ -65,6 +65,7 @@
|
||||||
|
#include "src/slurmctld/read_config.h"
|
||||||
|
#include "src/slurmctld/slurmctld.h"
|
||||||
|
#include "src/slurmctld/trigger_mgr.h"
|
||||||
|
+#include "src/slurmctld/agent.h"
|
||||||
|
|
||||||
|
#define SHUTDOWN_WAIT 2 /* Time to wait for primary server shutdown */
|
||||||
|
|
||||||
|
@@ -225,6 +226,9 @@ void run_backup(slurm_trigger_callbacks_t *callbacks)
|
||||||
|
abort();
|
||||||
|
}
|
||||||
|
slurmctld_config.shutdown_time = (time_t) 0;
|
||||||
|
+ /* Reinit agent in case it has been terminated - agent_init()
|
||||||
|
+ will check itself */
|
||||||
|
+ agent_init();
|
||||||
|
unlock_slurmctld(config_write_lock);
|
||||||
|
select_g_select_nodeinfo_set_all();
|
||||||
|
|
Loading…
Reference in New Issue
Block a user