forked from pool/slurm
Commit Graph

113 Commits

Author SHA256 Message Date
89b4ed3f9f - Updated to 20.11.7 which fixes CVE-2021-31215 (bsc#1186024)
- New features from 20.11.7:
 * slurmd - handle configless failures gracefully instead of hanging
   indefinitely.
 * select/cons_tres - fix Dragonfly topology not selecting nodes in the same
   leaf switch when it should as well as requests with --switches option.
 * Fix issue where certain step requests wouldn't run if the first node in the
   job allocation was full and there were idle resources on other nodes in
   the job allocation.
 * Fix deadlock issue with <Prolog|Epilog>Slurmctld.
 * torque/qstat - fix printf error message in output.
 * When adding associations or wckeys avoid checking multiple times a user or
   cluster name.
 * Fix wrong jobacctgather information on a step on multiple nodes
   due to timeouts sending the information gathered on its node.
 * Fix missing xstrdup which could result in slurmctld segfault on array jobs.
 * Fix security issue in PrologSlurmctld and EpilogSlurmctld by always
   prepending SPANK_ to all user-set environment variables. CVE-2021-31215.
- New features from 20.11.6:
 * Fix sacct assert with the --qos option.
 * Use pkg-config --atleast-version instead of --modversion for systemd.
 * common/fd - fix getsockopt() call in fd_get_socket_error().
 * Properly handle the return from fd_get_socket_error() in _conn_readable().
 * cons_res - Fix issue where running jobs were not taken into consideration
   when creating a reservation.
 * Avoid a deadlock between job_list for_each and assoc QOS_LOCK.
 * Fix TRESRunMins usage for partition qos on restart/reconfig.
 * Fix printing of number of tasks on a completed job that didn't request
   tasks.
 * Fix updating GrpTRESRunMins when decrementing job time is bigger than it.

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=179
2021-05-14 10:35:47 +00:00
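The CVE-2021-31215 fix above is described as always prepending SPANK_ to user-set environment variables before they reach PrologSlurmctld and EpilogSlurmctld. The following is a minimal sketch of that idea only; it is not Slurm's code (Slurm is written in C), and the function name `spank_prefix` and the sample variables are hypothetical.

```python
def spank_prefix(user_env):
    """Return a copy of user_env with SPANK_ prepended to every name.

    Hypothetical helper: it only illustrates the renaming rule stated in
    the changelog entry, not Slurm's actual implementation.
    """
    return {f"SPANK_{name}": value for name, value in user_env.items()}


# Illustrative user-set variables (assumed, not from the changelog).
user_env = {"LD_PRELOAD": "/tmp/evil.so", "MY_OPTION": "1"}
prolog_env = spank_prefix(user_env)
print(sorted(prolog_env))  # ['SPANK_LD_PRELOAD', 'SPANK_MY_OPTION']
```

Because every name is rewritten, a user-supplied variable such as LD_PRELOAD can no longer appear under its original name in the environment built this way.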
47fc726263 Accepting request 890261 from home:eeich:branches:network:cluster
- Ship REST API version and auth plugins with slurmrestd.
- Add YAML support for REST API to build (bsc#1185603).

OBS-URL: https://build.opensuse.org/request/show/890261
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=177
2021-05-04 08:36:53 +00:00
Ana Guerrero
ff5dc58526 Accepting request 879659 from home:anag:branches:home:mslacken:slurm_up
update + typo fix

OBS-URL: https://build.opensuse.org/request/show/879659
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=175
2021-03-17 10:26:51 +00:00
927cd6ab24 Accepting request 874647 from home:mslacken:branches:network:cluster
- Update to 20.11.04
 * Fix node selection for advanced reservations with features.
 * mpi/pmix: Handle pipe failure better when using ucx.
 * mpi/pmix: include PMIX_NODEID for each process entry.
 * Fix job getting rejected after being requeued on same node that died.
 * job_submit/lua - add "network" field.
 * Fix situations when a reoccurring reservation could erroneously skip a
   period.
 * Ensure that a reservation's [pro|epi]log are run on reoccurring reservations.
 * Fix threads-per-core memory allocation issue when using CR_CPU_MEMORY.
 * Fix scheduling issue with --gpus.
 * Fix gpu allocations that request --cpus-per-task.
 * mpi/pmix: fixed print messages for all PMIXP_* macros
 * Add mapping for XCPU to --signal option.
 * Fix regression in 20.11 that prevented a full pass of the main scheduler
   from ever executing.
 * Work around a glibc bug in which "0" is incorrectly printed as "nan"
   which will result in corrupted association state on restart.
 * Fix regression in 20.11 which made slurmd incorrectly attempt to find the
   parent slurmd address when not applicable and send incorrect reverse-tree
   info to the slurmstepd.
 * Fix cgroup ns detection when using containers (e.g. LXC or Docker).
 * scrontab - change temporary file handling to work with emacs. 
- Removed check-for-lipmix.so.MAJOR.patch
- Added: load-pmix-major-version.patch

OBS-URL: https://build.opensuse.org/request/show/874647
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=173
2021-02-24 09:49:16 +00:00
Ana Guerrero
4ab9986278 Accepting request 864993 from home:anag:branches:network:cluster
- Update to 20.11.03
- This release includes a major functional change to how job step launch is 
  handled compared to the previous 20.11 releases. This affects srun as 
  well as MPI stacks - such as Open MPI - which may use srun internally as 
  part of the process launch.
  One of the changes made in the Slurm 20.11 release was to the semantics 
  for job steps launched through the 'srun' command. This also 
  inadvertently impacts many MPI releases that use srun underneath their 
  own mpiexec/mpirun command.
  For 20.11.{0,1,2} releases, the default behavior for srun was changed  
  such that each step was allocated exactly what was requested by the 
  options given to srun, and did not have access to all resources assigned 
  to the job on the node by default. This change was equivalent to Slurm 
  setting the --exclusive option by default on all job steps. Job steps 
  desiring all resources on the node needed to explicitly request them 
  through the new '--whole' option.
  In the 20.11.3 release, we have reverted to the 20.02 and older behavior 
  of assigning all resources on a node to the job step by default.
  This reversion is a major behavioral change which we would not generally 
  do on a maintenance release, but is being done in the interest of 
  restoring compatibility with the large number of existing Open MPI (and 
  other MPI flavors) and job scripts that exist in production, and to 
  remove what has proven to be a significant hurdle in moving to the new 
  release.
  Please note that one change to step launch remains - by default, in 
  20.11 steps are no longer permitted to overlap on the resources they 
  have been assigned. If that behavior is desired, all steps must 
  explicitly opt-in through the newly added '--overlap' option.
  Further details and a full explanation of the issue can be found at:
  https://bugs.schedmd.com/show_bug.cgi?id=10383#c63

OBS-URL: https://build.opensuse.org/request/show/864993
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=171
2021-01-20 13:58:46 +00:00
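The step-launch options named in the 20.11.3 entry above (--exclusive and --overlap) can be illustrated by building srun command lines. This is a sketch under assumptions: the helper `srun_step_argv` is hypothetical, and it only assembles an argument list without running srun.

```python
def srun_step_argv(command, ntasks, exclusive=False, overlap=False):
    """Build an srun argument list for one job step.

    Hypothetical helper. Per the entry above, from 20.11.3 on a step gets
    all resources of the job on the node by default; --exclusive restricts
    it to what was requested, and --overlap lets steps share resources.
    """
    argv = ["srun", f"--ntasks={ntasks}"]
    if exclusive:
        argv.append("--exclusive")
    if overlap:
        argv.append("--overlap")
    return argv + list(command)


# A monitoring step that is allowed to share resources with other steps.
print(srun_step_argv(["hostname"], ntasks=2, overlap=True))
# ['srun', '--ntasks=2', '--overlap', 'hostname']
```

Job scripts that relied on steps sharing CPUs before 20.11 would need the overlap variant, since steps no longer overlap by default.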
82c61d739d Accepting request 861776 from home:eeich:branches:network:cluster
- Fix fallout introduced by:
  "Replace  '%service_del_postun -n' with '%service_del_postun_without_restart'"
  for older Leap/SLE versions.

OBS-URL: https://build.opensuse.org/request/show/861776
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=169
2021-01-08 17:40:48 +00:00
0d02ad4cfa - Fix Provides:/Conflicts: for libnss_slurm.
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=167
2021-01-08 12:21:49 +00:00
c50d4048dc Accepting request 845752 from home:fbui:branches:network:cluster
- Replace  '%service_del_postun -n' with '%service_del_postun_without_restart'
  '-n' is deprecated and will be removed in the future.

OBS-URL: https://build.opensuse.org/request/show/845752
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=166
2021-01-08 12:18:52 +00:00
Ana Guerrero
08c7233b38 Accepting request 860690 from home:anag:branches:network:cluster
- Add support for configuration files from external plugins. 
  While built-in plugins have their configuration added in slurm.conf,
  external SPANK plugins add their configuration to plugstack.conf.
  To make SPANK plugins easy to package, their configuration files
  should be added independently at /etc/spack/plugstack.conf.d, and
  plugstack.conf should be left with a one-liner including all the
  files under /etc/spack/plugstack.conf.d

OBS-URL: https://build.opensuse.org/request/show/860690
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=164
2021-01-06 10:42:08 +00:00
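The one-line plugstack.conf described in the entry above can be sketched as follows. The directory is copied verbatim from the entry; the `include` keyword with a glob is what the SPANK documentation describes for plugstack.conf, but the exact line shipped by the package is not shown here, so treat the output as an assumption. The helper name `plugstack_include_line` is hypothetical.

```python
from pathlib import PurePosixPath


def plugstack_include_line(conf_dir):
    """Return a single 'include' line covering every file in conf_dir.

    Hypothetical helper: it only formats the line and does not touch
    the file system.
    """
    return f"include {PurePosixPath(conf_dir) / '*'}"


# Directory as written in the changelog entry above.
print(plugstack_include_line("/etc/spack/plugstack.conf.d"))
# include /etc/spack/plugstack.conf.d/*
```

With such a line in place, each external plugin package only has to drop its own file into the directory and never edits plugstack.conf itself.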
Ana Guerrero
caa18eaeaa Accepting request 859114 from home:anag:branches:network:cluster
- Update to 20.11.02 
  * Fix older versions of sacct not working with 20.11.
  * Fix slurmctld crash when using a pre-20.11 srun in a job allocation.
  * Correct logic problem in _validate_user_access.
  * Fix libpmi to initialize Slurm configuration correctly.
- Update to 20.11.01
  * Fix spelling of "overcomited" to "overcomitted" in sreport's cluster
    utilization report.
  * Silence debug message about shutting down backup controllers if none are
    configured.
  * Don't create interactive srun until PrologSlurmctld is done.
  * Fix fd symlink path resolution.
  * Fix slurmctld segfault on subnode reservation restore after node
    configuration change.
  * Fix resource allocation response message environment allocation size.
  * Ensure that details->env_sup is NULL terminated.
  * select/cray_aries - Correctly remove jobs/steps from blades using NPC.
  * cons_tres - Avoid max_node_gres when entire node is allocated with
    --ntasks-per-gpu.
  * Allow NULL arg to data_get_type().
  * In sreport have usage for a reservation contain all jobs that ran in the
    reservation instead of just the ones that ran in the time specified. This
    matches the report for the reservation is not truncated for a time period.
  * Fix issue with sending wrong batch step id to a < 20.11 slurmd.
  * Add a job's alloc_node to lua for job modification and completion.
  * Fix regression getting a slurmdbd connection through the perl API.
  * Stop the extern step terminate monitor right after proctrack_g_wait().
  * Fix removing the normalized priority of assocs.
  * slurmrestd/v0.0.36 - Use correct name for partition field:
    "min nodes per job" -> "min_nodes_per_job".

OBS-URL: https://build.opensuse.org/request/show/859114
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=162
2020-12-29 03:15:30 +00:00
d5d3aa2162 Accepting request 852039 from home:eeich:branches:network:cluster
- Update to version 20.11.0
  Slurm 20.11 includes a number of new features including:
  * Overhaul of the job step management and launch code, alongside improved
    GPU task placement support.
  * A new "Interactive Step" mode of operation for salloc.
  * A new "scrontab" command that can be used to submit and manage
    periodically repeating jobs.
  * IPv6 support.
  * Changes to the reservation logic, with new options allowing users
    to delete reservations, allowing admins to skip the next occurrence of a
    repeated reservation, and allowing for a job to be submitted and eligible
    to run within multiple reservations.
  * Dynamic Future Nodes - automatically associate a dynamically
    provisioned (or "cloud") node against a NodeName definition with matching
    hardware.
  * An experimental new RPC queuing mode for slurmctld to reduce thread
    contention on heavily loaded clusters.
  * SlurmDBD integration with the Slurm REST API.
  Also check
  https://github.com/SchedMD/slurm/blob/slurm-20-11-0-1/RELEASE_NOTES

OBS-URL: https://build.opensuse.org/request/show/852039
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=160
2020-12-05 14:46:07 +00:00
Ana Guerrero
370ac32279 Accepting request 849252 from home:anag:branches:network:cluster
- Updated to 20.02.6, addresses two security fixes:
  * PMIx - fix potential buffer overflows from use of unpackmem().
    CVE-2020-27745 (bsc#1178890)
  * X11 forwarding - fix potential leak of the magic cookie when sent as an
     argument to the xauth command. CVE-2020-27746 (bsc#1178891)
- And many other bugfixes, full log and details available at:
  * https://lists.schedmd.com/pipermail/slurm-announce/2020/000045.html

OBS-URL: https://build.opensuse.org/request/show/849252
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=158
2020-11-18 09:57:56 +00:00
e481851f5a Accepting request 845108 from home:anag:branches:network:cluster
- Updated to 20.02.5, changes:
 * Fix leak of TRESRunMins when job time is changed with --time-min
 * pam_slurm - explicitly initialize slurm config to support configless mode.
 * scontrol - Fix exit code when creating/updating reservations with wrong
   Flags.
 * When a GRES has a no_consume flag, report 0 for allocated.
 * Fix cgroup cleanup by jobacct_gather/cgroup.
 * When creating reservations/jobs don't allow counts on a feature unless
   using an XOR.
 * Improve number of boards discovery
 * Fix updating a reservation NodeCnt on a zero-count reservation.
 * slurmrestd - provide explicit error messages when PSK auth fails.
 * cons_tres - fix job requesting single gres per-node getting two or more
   nodes with less CPUs than requested per-task.
 * cons_tres - fix calculation of cores when using gres and cpus-per-task.
 * cons_tres - fix job not getting access to socket without GPU or with less
   than --gpus-per-socket when not enough cpus available on required socket
   and not using --gres-flags=enforce-binding.
 * Fix HDF5 type version build error.
 * Fix creation of CoreCnt only reservations when the first node isn't
   available.
 * Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none.
 * Improve job constraints XOR option logic.
 * Fix preemption of hetjobs when needed nodes not in leader component.
 * Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing
   bad node deallocations and even allocation of nodes from other partitions.
 * Fix double-deallocation of preempted non-leader hetjob components.
 * slurmdbd - prevent truncation of the step nodelists over 4095.
 * Fix nodes remaining in drain state after rebooting with ASAP option.
 - changes from 20.02.4:

OBS-URL: https://build.opensuse.org/request/show/845108
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=156
2020-11-02 13:42:03 +00:00
e3512185d8 - Disable build on s390 (requires 64bit).
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=154
2020-07-07 20:14:00 +00:00
361d99b111 - Add support for openPMIx also for Leap/SLE 15.0/1 (bsc#1173805).
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=153
2020-07-07 16:20:06 +00:00
4b04d88697 Accepting request 819233 from home:eeich:branches:network:cluster
- Add support for openPMIx also for Leap/SLE 15.0/1.
- Do not run %check on SLE-12-SP2: Some incompatibility in tcl
  makes this fail.
- Remove unneeded build dependency to postgresql-devel.

OBS-URL: https://build.opensuse.org/request/show/819233
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=152
2020-07-07 13:08:10 +00:00
e8d4b0e920 Accepting request 811475 from home:eeich:branches:network:cluster
- Bring QA to the package build: add %%check stage.
- Remove cruft that isn't needed any longer.
- Add 'ghosted' run-file.
- Add rpmlint filter to handle issues with library packages
  for Leap and enterprise upgrade versions.

- Treat libnss_slurm like any other package: add version string to
  upgrade package.

OBS-URL: https://build.opensuse.org/request/show/811475
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=150
2020-06-17 11:15:39 +00:00
85a31ae1b5 - Updated to 20.02.3 which fixes CVE-2020-12693 (bsc#1172004).
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=148
2020-05-25 05:01:16 +00:00
6f1a2e50da Accepting request 808130 from home:mslacken:branches:network:cluster
- Updated to 20.02.3 which fixes CVE-2020-12693
- Other changes are:
 * Factor in ntasks-per-core=1 with cons_tres.
 * Fix formatting in error message in cons_tres.
 * Fix calling stat on a NULL variable.
 * Fix minor memory leak when using reservations with flags=first_cores.
 * Fix gpu bind issue when CPUs=Cores and ThreadsPerCore > 1 on a node.
 * Fix --mem-per-gpu for heterogeneous --gres requests.
 * Fix slurmctld load order in load_all_part_state().
 * Fix race condition not finding jobacct gather task cgroup entry.
 * Suppress error message when selecting nodes on disjoint topologies.
 * Improve performance of _pack_default_job_details() with large number of job
   arguments.
 * Fix archive loading previous to 17.11 jobs per-node req_mem.
 * Fix regression validating that --gpus-per-socket requires --sockets-per-node
   for steps. Should only validate allocation requests.
 * error() instead of fatal() when parsing an invalid hostlist.
 * nss_slurm - fix potential deadlock in slurmstepd on overloaded systems.
 * cons_tres - fix --gres-flags=enforce-binding and related --cpus-per-gres.
 * cons_tres - Allocate lowest numbered cores when filtering cores with gres.
 * Fix getting system counts for named GRES/TRES.
 * MySQL - Fix for handling typed GRES for association rollups.
 * Fix step allocations when tasks_per_core > 1.
 * Fix allocating more GRES than requested when asking for multiple GRES types.

OBS-URL: https://build.opensuse.org/request/show/808130
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=147
2020-05-22 09:31:56 +00:00
8ae99b8cc0 Accepting request 788905 from home:mslacken:branches:network:cluster
- Updated to 20.02.1 with the following changes:
 * Improve job state reason for jobs hitting partition_job_depth.
 * Speed up testing of singleton dependencies.
 * Fix negative loop bound in cons_tres.
 * srun - capture the MPI plugin return code from mpi_hook_client_fini() and
   use as final return code for step failure.
 * Fix segfault in cli_filter/lua.
 * Fix --gpu-bind=map_gpu reusability if tasks > elements.
 * Make sure config_flags on a gres are sent to the slurmctld on node
   registration.
 * Prolog/Epilog - Fix missing GPU information.
 * Fix segfault when using config parser for expanded lines.
 * Fix bit overlap test function.
 * Don't accrue time if job begin time is in the future.
 * Remove accrue time when updating a job start/eligible time to the future.
 * Fix regression in 20.02.0 that broke --depend=expand.
 * Reset begin time on job release if it's not in the future.
 * Fix for recovering burst buffers when using high-availability.
 * Fix invalid read due to freeing an incorrectly allocated env array.
 * Update slurmctld -i message to warn about losing data.
 * Fix scontrol cancel_reboot so it clears the DRAIN flag and node reason for a
   pending ASAP reboot.

OBS-URL: https://build.opensuse.org/request/show/788905
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=145
2020-03-27 08:46:13 +00:00
efb023382f Accepting request 783058 from home:eeich:branches:network:cluster
- Remove legacy_cray: with 20.02 the special treatment for
  cray-specific plugins on SLE version prior to 15SP2 is
  no longer required.

OBS-URL: https://build.opensuse.org/request/show/783058
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=143
2020-03-13 17:33:40 +00:00
cf20470554 Accepting request 781517 from home:mslacken:branches:network:cluster
- slurm-plugins will now also require pmix not only libpmix 
  (bsc#1164326)

OBS-URL: https://build.opensuse.org/request/show/781517
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=141
2020-03-05 10:42:25 +00:00
fd9e32c9b0 Accepting request 780353 from home:eeich:branches:network:cluster
- Removed autopatch as it doesn't work for the SLE-11-SP4 build.

- pmix searches now also for libpmix.so.2 so that there is no dependency
  for devel package (bsc#1164386)
  * added patch file check-for-lipmix.so.MAJOR.patch
  * reworded patch file Remove-rpath-from-build.patch to use %autopatch

OBS-URL: https://build.opensuse.org/request/show/780353
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=139
2020-02-28 17:43:45 +00:00
6bfc8d389d Accepting request 780053 from home:kasimir:branches:network:cluster
- Disable %arm builds as this is no longer supported.

OBS-URL: https://build.opensuse.org/request/show/780053
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=137
2020-02-28 07:48:43 +00:00
63d5c47eb1 Accepting request 779379 from home:eeich:branches:network:cluster
- Update to version 20.02.0 (jsc#SLE-8491)
  * Fix minor memory leak in slurmd on reconfig.
  * Fix invalid ptr reference when rolling up data in the database.
  * Change shtml2html.py to require python3 for RHEL8 support, and match
    man2html.py.
  * slurm.spec - override "hardening" linker flags to ensure RHEL8 builds
    in a usable manner.
  * Fix type mismatches in the perl API.
  * Prevent use of uninitialized slurmctld_diag_stats.
  * Fixed various Coverity issues.
  * Only show warning about root-less topology in daemons.
  * Fix accounting of jobs in IGNORE_JOBS reservations.
  * Fix issue with batch steps state not loading correctly when upgrading from
    19.05.
  * Deprecate max_depend_depth in SchedulerParameters and move it to
    DependencyParameters.
  * Silence erroneous error on slurmctld upgrade when loading federation state.
  * Break infinite loop in cons_tres dealing with incorrect tasks per tres
    request resulting in slurmctld hang.
  * Improve handling of --gpus-per-task to make sure appropriate number of GPUs
    is assigned to job.
  * Fix seg fault on cons_res when requesting --spread-job.
- Move to python3 for everything but SLE-11-SP4
  * For SLE-11-SP4 add a workaround to handle a python3 script (python2.7
    compliant).

OBS-URL: https://build.opensuse.org/request/show/779379
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=136
2020-02-26 11:12:32 +00:00
e5be8f4bf8 - Add explicit version dependency to libpmix as well.
'slurm-devel' has a tight version dependency on libpmix -
  allowing multiple libpmix versions in one package repository
  is therefore essential.

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=135
2020-02-19 21:31:15 +00:00
f9c5d7da3d Accepting request 774250 from home:eeich:branches:network:cluster
- Update to version 20.02.0-rc1
  * sbatch - fix segfault when no newline at the end of a burst buffer file.
  * Change scancel to only check job's base state when matching -t options.
  * Save job dependency list in state files.
  * cons_tres - allow jobs to be run on systems with root-less topologies.
  * Restore pre-20.02pre1 PrologSlurmctld synchronization behavior to avoid
    various race conditions, and ensure proper batch job launch.
  * Add new slurmrestd command/daemon which implements the Slurm REST API.

- Update to version 20.02.0-0pre1, highlights are

OBS-URL: https://build.opensuse.org/request/show/774250
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=134
2020-02-14 07:52:54 +00:00
54640668e5 Accepting request 773459 from home:mslacken:branches:network:cluster
- Updated to version 20.02.0-0pre1. Highlights:
 * Exclusive behavior of a node includes all GRES on a node as well
   as the cpus.
 * Use python3 instead of python for internal build/test scripts.
   The slurm.spec file has been updated to depend on python3 as well.
 * Added new NodeSet configuration option to help simplify partition
   configuration sections for heterogeneous / condo-style clusters.
 * Added slurm.conf option MaxDBDMsgs to control how many messages will be
   stored in the slurmctld before throwing them away when the slurmdbd is down.
 * The checkpoint plugin interface and all associated API calls have been
   removed.
 * slurm_init_job_desc_msg() initializes mail_type as uint16_t. This allows
   mail_type to be set to NONE with scontrol.
 * Add new slurm_spank_log() function to print messages back to the user from
   within a SPANK plugin without prepending "error: " from slurm_error().
 * Enforce having partition name and nodelist=ALL when creating reservations
   with flags=PART_NODES.
 * SPANK - removed never-implemented slurm_spank_slurmd_init() interface. This
   hook has always been accessible through slurm_spank_init() in the
   S_CTX_SLURMD context instead.
 * sbcast - add new BcastAddr option to NodeName lines to allow sbcast traffic
   to flow over an alternate network path.
 * Added auth/jwt plugin, and 'scontrol token' subcommand.
 * PMIx - improve performance of proc map generation.
 * Deprecate kill_invalid_depend in SchedulerParameters and move it to a new
   option called DependencyParameters.
 * Enable job dependencies for any job on any cluster in the same federation.
 * Allow clusters to be added automatically to db at startup of ctld.
 * Add AccountingStorageExternalHost slurm.conf parameter.  The
OBS-URL: https://build.opensuse.org/request/show/773459
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=130
2020-02-11 14:31:26 +00:00
d94a66a178 - The standard slurm.conf now also uses SlurmctldHost on all build
targets (bsc#1162377)

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=128
2020-02-05 15:38:55 +00:00
17b070147f - Fix a missed systemd_requires -> systemd_ordering conversion.
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=126
2020-01-27 08:54:27 +00:00
73e298f12f Accepting request 767005 from home:eeich:branches:network:cluster
- Remove special OHPC compatibility macro: these settings should
  be applied universally.
- Add a Recommends for mariadb to slurm-slurmdbd: it is recommended
  to run the database on the same machine as the daemon.

OBS-URL: https://build.opensuse.org/request/show/767005
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=124
2020-01-25 06:14:47 +00:00
345d1bbb94 Accepting request 766872 from home:dimstar:Factory
- BuildRequire pkgconfig(systemd) instead of systemd: allow OBS to
  shortcut through the -mini flavors.
- Use systemd_ordering instead of systemd_requires: systemd is
  never a strict requirement; but in case the system is scheduled
  for installation together with systemd, we want systemd to be
  installed prior to slurm.

- start slurmdbd after mariadb (bsc#1161716)

OBS-URL: https://build.opensuse.org/request/show/766872
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=123
2020-01-24 17:12:50 +00:00
995841bad4 Accepting request 766677 from home:mslacken:branches:network:cluster
- start slurmdbd after mariadb (bsc#1161716)

OBS-URL: https://build.opensuse.org/request/show/766677
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=122
2020-01-23 17:49:33 +00:00
c39f0bf6fb - Fix base_ver for SLE 15 SP2.
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=120
2020-01-13 15:42:28 +00:00
0581b91660 Accepting request 762650 from home:eeich:branches:network:cluster
- Update to version 19.05.5 (jsc#SLE-8491)
  * Check %docdir/NEWS for details.
  * Includes security fixes CVE-2019-19727, CVE-2019-19728,
    CVE-2019-12838.
  * Disable i586 builds as this is no longer supported.
  * Create libnss_slurm package to support user and group resolution
    thru slurmstepd.
  * slurm-2.4.4-rpath.patch -> Remove-rpath-from-build.patch
    Obsoleted:
    - pam_slurm_adopt-avoid-running-outside-of-the-sshd-PA.patch
    - pam_slurm_adopt-send_user_msg-don-t-copy-undefined-d.patch
    - pam_slurm_adopt-use-uid-to-determine-whether-root-is.patch

OBS-URL: https://build.opensuse.org/request/show/762650
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=118
2020-01-10 10:38:48 +00:00
69c13014d9 Accepting request 760450 from home:eeich:branches:network:cluster
- Deprecate "ControlMachine" only for SLURM version upgrades and
  products newer than 1501. This ensures that the original setting
  is retained for the SLURM version shipped originally with SLE-15-SP1
  or Leap 15.1.

- Update to v18.08.9 for fixing CVE-2019-19728 (bsc#1159692).
  * Wrap END_TIMER{,2,3} macro definition in "do {} while (0)" block.
  * Make sview work with glib2 v2.62.
  * Make Slurm compile on linux after sys/sysctl.h was deprecated.
  * Install slurmdbd.conf.example with 0600 permissions to encourage secure
    use. CVE-2019-19727.
  * srun - do not continue with job launch if --uid fails. CVE-2019-19728.

- added pmix support jsc#SLE-10800 

- Use --with-shared-libslurm to build slurm binaries using libslurm.
- Make libslurm depend on slurm-config.

- Fix ownership of /var/spool/slurm on new installations
  and upgrade (boo#1158696).

- Fix permissions of slurmdbd.conf (bsc#1155784, CVE-2019-19727).
- Fix %posttrans macro _res_update to cope with added newline
  (bsc#1153259).

- Add package slurm-webdoc which sets up a web server to provide
  the documentation for the version shipped.

- Move srun from 'slurm' to 'slurm-node': srun is required on the
  nodes as well so sbatch will work. 'slurm-node' is a requirement

OBS-URL: https://build.opensuse.org/request/show/760450
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=116
2020-01-08 19:27:10 +00:00
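The CVE-2019-19727 items above concern slurmdbd.conf being installed with 0600 permissions. The following is a minimal sketch of how such a mode could be verified; the helper `is_owner_only` is hypothetical, and the demo uses a temporary file rather than a real slurmdbd.conf.

```python
import os
import stat
import tempfile


def is_owner_only(path):
    """Return True if group and others have no permission bits on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


# Demo on a throwaway file standing in for a configuration file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

os.chmod(demo_path, 0o600)
tight = is_owner_only(demo_path)   # owner read/write only
os.chmod(demo_path, 0o644)
loose = is_owner_only(demo_path)   # readable by group and others
os.remove(demo_path)
print(tight, loose)
```

The check masks out the owner bits on purpose: a file that stores a database password is acceptable at 0600 or 0400, but not once any group or other bit is set.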
163930db89 - Set %base_ver for SLE-15-SP2 to 18.08 (for now).
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=114
2019-10-02 08:27:50 +00:00
e3e7bce7dc Accepting request 731004 from home:eeich:branches:network:cluster
- Edit sample configuration to deprecate "ControlMachine",
  "ControlAddr", "BackupController" and "BackupAddr" in favor of
  "SlurmctldHost".

OBS-URL: https://build.opensuse.org/request/show/731004
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=112
2019-09-14 21:47:11 +00:00
9c7abff085 - Updated to 18.08.8 for fixing (CVE-2019-12838, bsc#1140709, jsc#SLE-7341,
jsc#SLE-7342)

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=110
2019-08-18 20:13:20 +00:00
c0e29e647e - Updated to 18.08.8 for fixing (CVE-2019-12838, bsc#1140709, jre#SLE-7341,
jre#SLE-7342)

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=109
2019-08-18 18:46:31 +00:00
f2775f6e1e - Fix logic of slurm-munge recommends: slurm-munge requires munge
already, so if we have munge installed we recommend slurm-munge
  as the authentication when installing slurm or slurm-node.

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=108
2019-08-17 14:25:47 +00:00
89f111874a Accepting request 715613 from home:mslacken:branches:network:cluster
removed explanation of changelog entry

OBS-URL: https://build.opensuse.org/request/show/715613
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=106
2019-07-16 08:32:48 +00:00
5a7922ceef Accepting request 715604 from home:mslacken:branches:network:cluster
- Fixed changelog entry from Jul 11 in order to use the right

OBS-URL: https://build.opensuse.org/request/show/715604
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=105
2019-07-16 08:18:32 +00:00
9d923e48e1 Accepting request 715597 from home:mslacken:branches:network:cluster
- Fixed changelog entry from Jul 11 in order to use the right
  version slurm 18.08.8

- Updated to 18.08.8 for fixing CVE-2019-12838 and (bsc#1140709)

OBS-URL: https://build.opensuse.org/request/show/715597
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=104
2019-07-16 07:57:42 +00:00
f88a1f8e69 Accepting request 715348 from home:eeich:branches:network:cluster
- Fix build for SLE-11-SP4 and older.

OBS-URL: https://build.opensuse.org/request/show/715348
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=102
2019-07-14 21:25:41 +00:00
257676d4f2 Accepting request 714908 from home:mslacken:branches:network:cluster
- added cray depend libraries to separate package, as they are now
  built, since json is enabled

- Updated to 18.0.7 to fix CVE-2019-12838 (bsc#1140709)
  * Update "xauth list" to use the same 10000ms timeout as the other xauth
    commands.
  * Fix issue in gres code to handle a gres cnt of 0.
  * Don't purge jobs if backfill is running.
  * Verify job is pending when adding/removing accrual time.
  * Don't abort when the job doesn't have an association that was removed
    before the job was able to make it to the database.
  * Set state_reason if select_nodes() fails job for QOS or Account.
  * Avoid seg_fault on referencing association without a valid_qos bitmap.
  * If Association/QOS is removed on a pending job set that job as ineligible.
  * When changing a job's account/qos always make sure you remove the old limits.
  * Don't reset a FAIL_QOS or FAIL_ACCOUNT job reason until the qos or
    account changed.
  * Restore "sreport -T ALL" functionality.
  * Correctly typecast signals being sent through the api.
  * Properly initialize structures throughout Slurm.
  * Sync "numtask" squeue format option for jobs and steps to "numtasks".
  * Fix sacct -PD to avoid CA before start jobs.
  * Fix potential deadlock with backup slurmctld.
  * Fixed issue with jobs not appearing in sacct after dependency satisfied.
  * Fix showing non-eligible jobs when asking with -j and not -s.
  * Fix issue with backfill scheduler scheduling tasks of an array
    when not the head job.
  * accounting_storage/mysql - fix SIGABRT in the archive load logic.
  * accounting_storage/mysql - fix memory leak in the archive load logic.
  * Limit records per single SQL statement when loading archived data.

OBS-URL: https://build.opensuse.org/request/show/714908
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=100
2019-07-12 18:09:50 +00:00
fa2138ebce Accepting request 714002 from home:eeich:slurm-staging
- Fix build dependency issue around libibmad-devel introduced
  in SLE-12-SP4.

OBS-URL: https://build.opensuse.org/request/show/714002
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=99
2019-07-08 08:21:33 +00:00
5a25a5ea8b Accepting request 713918 from home:eeich:slurm-staging
- Add BuildRequires to address warnings during build:
  * for libcurl-devel, libssh2-devel and rrdtool-devel
  * for libjson-c-devel and liblz4-devel where available,
    disable these with --without-json and --without-lz4
    where not.
  * disable DataWarp (--without-datawarp).

OBS-URL: https://build.opensuse.org/request/show/713918
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=98
2019-07-08 05:48:14 +00:00
d212ad0245 Accepting request 713773 from home:eeich:branches:network:cluster
- Update SLURM to 18.08.7:
  * Set debug statement to debug2 to avoid benign error messages.
  * Add SchedulerParameters option of bf_hetjob_immediate to attempt to start
    a heterogeneous job as soon as all of its components are determined able
    to do so.
  * Fix underflow causing decay thread to exit.
  * Fix main scheduler not considering hetjobs when building the job queue.
  * Fix regression for sacct to display old jobs without a start time.
  * Fix setting correct number of gres topology bits.
  * Update hetjobs pending state reason when appropriate.
  * Fix accounting_storage/filetxt's understanding of TRES.
  * Set Accrue time when not enforcing limits.
  * Fix srun segfault when requesting a hetjob with test_exec or bcast
    options.
  * Hide multipart priorities log message behind Priority debug flag.
  * sched/backfill - Make hetjobs sensitive to bf_max_job_start.
  * Fix slurmctld segfault due to job's partition pointer NULL dereference.
  * Fix issue with OR'ed job dependencies.
  * Add new job's bit_flags of INVALID_DEPEND to prevent rebuilding a job's
    dependency string when it has at least one invalid and purged dependency.
  * Promote federation unsynced siblings log message from debug to info.
  * burst_buffer/cray - fix slurmctld SIGABRT due to illegal read/writes.
  * burst_buffer/cray - fix memory leak due to unfreed job script content.
  * node_features/knl_cray - fix script_argv use-after-free.
  * burst_buffer/cray - fix script_argv use-after-free.
  * Fix invalid reads of size 1 due to non null-terminated string reads.
  * Add extra debug2 logs to identify why BadConstraints reason is set.

OBS-URL: https://build.opensuse.org/request/show/713773
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=94
2019-07-07 04:27:16 +00:00
0c8ed23dc7 Accepting request 713744 from home:eeich:branches:network:cluster
- Do not build hdf5 support where not available.

OBS-URL: https://build.opensuse.org/request/show/713744
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=93
2019-07-06 20:02:33 +00:00