slurm/slurmd-Fix-for-newer-API-versions.patch

Accepting request 435622 from home:eeich:branches:network:cluster

- version 15.08.7.1
  * Remove the 1024-character limit on lines in batch scripts.
  * task/affinity: Disable core-level task binding if more CPUs required than available cores.
  * Preemption/gang scheduling: If a job is suspended at slurmctld restart or reconfiguration time, then leave it suspended rather than resume+suspend.
  * Don't use lower weight nodes for job allocation when topology/tree used.
  * Don't allow user specified reservation names to disrupt the normal reservation sequence numbering scheme.
  * Avoid hard-link/copy of script/environment files for job arrays. Use the master job record file for all tasks of the job array. NOTE: Job arrays submitted to Slurm version 15.08.6 or later will fail if the slurmctld daemon is downgraded to an earlier version of Slurm.
  * In slurmctld log file, log duplicate job ID found by slurmd. Previously was being logged as prolog/epilog failure.
  * If a job is requeued while in the process of being launched, remove its job ID from slurmd's record of active jobs in order to avoid generating a duplicate job ID error when launched for the second time (which would drain the node).
  * Cleanup messages when handling job script and environment variables in older directory structure formats.
  * Prevent triggering gang scheduling within a partition if configured with PreemptType=partition_prio and PreemptMode=suspend,gang.
  * Decrease parallelism in job cancel request to prevent denial of service when cancelling huge numbers of jobs.
  * If all ephemeral ports are in use, try using other port numbers.
  * Prevent "scontrol update job" from updating jobs that have already finished.
  * Show requested TRES in "squeue -O tres" when job is pending.
  * Backfill scheduler: Test association and QOS node limits before reserving resources for pending job.
  * Many bug fixes.
- Use source services to download package.
- Fix code for new API of hwloc-2.0.
- Package netloc_to_topology where available.
- Package documentation.

OBS-URL: https://build.opensuse.org/request/show/435622
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=10
2016-10-16 21:51:20 +02:00
From: Egbert Eich <eich@suse.de>
Date: Fri Oct 14 17:49:13 2016 +0200
Subject: [PATCH] slurmd: Fix for newer API versions
Git-commit: 9f263fa4cd8e9e8090eda2f533294e10ae984190
References:
Signed-off-by: Egbert Eich <eich@suse.com>
Replace hwloc_topology_ignore_type() by hwloc_topology_set_type_filter()
for API versions >= 0x00020000
Signed-off-by: Egbert Eich <eich@suse.de>
---
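Note on the version constant: the guard compares HWLOC_API_VERSION against 0x00020000. To my understanding of the hwloc headers, that number packs a release X.Y.Z as (X<<16)+(Y<<8)+Z, so 0x00020000 corresponds to hwloc 2.0.0. The sketch below only illustrates that encoding; the helper names are hypothetical and belong to neither hwloc nor Slurm.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only. They assume the packed
 * layout (major << 16) + (minor << 8) + patch for HWLOC_API_VERSION.
 */

/* Major release number: everything above bit 15. */
static unsigned api_major(uint32_t version)
{
	return (unsigned)(version >> 16);
}

/* Minor release number: bits 8..15. */
static unsigned api_minor(uint32_t version)
{
	return (unsigned)((version >> 8) & 0xffu);
}

/*
 * Run-time mirror of the patch's preprocessor test: true when the
 * hwloc_topology_set_type_filter() branch would be compiled in.
 */
static bool uses_type_filter_api(uint32_t version)
{
	return version >= 0x00020000u;
}
```

Under that assumption, 0x00010b00 decodes to major 1, minor 11 and selects the old hwloc_topology_ignore_type() branch, while 0x00020000 decodes to major 2, minor 0 and selects the new one.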
src/slurmd/common/xcpuinfo.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/src/slurmd/common/xcpuinfo.c b/src/slurmd/common/xcpuinfo.c
index ee213d3..ae9112f 100644
--- a/src/slurmd/common/xcpuinfo.c
+++ b/src/slurmd/common/xcpuinfo.c
@@ -203,8 +203,23 @@ get_cpuinfo(uint16_t *p_cpus, uint16_t *p_boards,
hwloc_topology_set_flags(topology, HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM);
/* ignores cache, misc */
+#if HWLOC_API_VERSION < 0x00020000
hwloc_topology_ignore_type (topology, HWLOC_OBJ_CACHE);
hwloc_topology_ignore_type (topology, HWLOC_OBJ_MISC);
+#else
+ hwloc_topology_set_type_filter(topology,HWLOC_OBJ_L1CACHE,
+ HWLOC_TYPE_FILTER_KEEP_NONE);
+ hwloc_topology_set_type_filter(topology,HWLOC_OBJ_L2CACHE,
+ HWLOC_TYPE_FILTER_KEEP_NONE);
+ hwloc_topology_set_type_filter(topology,HWLOC_OBJ_L3CACHE,
+ HWLOC_TYPE_FILTER_KEEP_NONE);
+ hwloc_topology_set_type_filter(topology,HWLOC_OBJ_L4CACHE,
+ HWLOC_TYPE_FILTER_KEEP_NONE);
+ hwloc_topology_set_type_filter(topology,HWLOC_OBJ_L5CACHE,
+ HWLOC_TYPE_FILTER_KEEP_NONE);
+ hwloc_topology_set_type_filter(topology,HWLOC_OBJ_MISC,
+ HWLOC_TYPE_FILTER_KEEP_NONE);
+#endif
/* load topology */
debug2("hwloc_topology_load");