forked from pool/slurm
Commit Graph

302 Commits

Author SHA256 Message Date
Dominique Leuenberger
0793824683 Accepting request 932162 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/932162
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=68
2021-11-21 22:51:50 +00:00
350be975f5 Accepting request 932063 from home:aginies:branches:network:cluster
add a ref to SLE-22741 (firewall config) in changelog

OBS-URL: https://build.opensuse.org/request/show/932063
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=194
2021-11-18 09:37:45 +00:00
d4c2b2bcf3 - updated to 21.08.4, which fixes CVE-2021-43337, an issue only present
in the 21.08 tree.
  * CVE-2021-43337:
    For sites using the new AccountingStoreFlags=job_script and/or job_env
    options, an issue was reported with the access control rules in SlurmDBD
    that will permit users to request job scripts and environment files that
    they should not have access to. (Scripts/environments are meant to only be
    accessible by user accounts with administrator privileges, by account
    coordinators for jobs submitted under their account, and by the user
    themselves.)
- changes from 21.08.3:
  * This includes a number of fixes since the last release a month ago,
    including one critical fix to prevent a communication issue between
    slurmctld and slurmdbd for sites that have started using the new
    AccountingStoreFlags=job_script functionality.

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=193
2021-11-17 08:37:51 +00:00
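
The intended access rule quoted for CVE-2021-43337 above (administrators, account coordinators for jobs under their account, and the submitting user) can be sketched as a small predicate. This is only an illustrative model, not SlurmDBD code: the `Requester` and `Job` types and the `may_read_job_script` function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Requester:
    # Hypothetical model of the user asking for a job script or environment.
    name: str
    is_admin: bool = False
    coordinated_accounts: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Job:
    # Hypothetical model of a stored job: who submitted it, under which account.
    owner: str
    account: str


def may_read_job_script(requester: Requester, job: Job) -> bool:
    """Return True if the requester falls into one of the three allowed groups."""
    if requester.is_admin:
        return True  # user accounts with administrator privileges
    if job.account in requester.coordinated_accounts:
        return True  # account coordinator for the job's account
    return requester.name == job.owner  # the submitting user themselves


if __name__ == "__main__":
    job = Job(owner="alice", account="physics")
    print(may_read_job_script(Requester("bob"), job))
    print(may_read_job_script(Requester("alice"), job))
```

Any requester outside those three groups is refused, which is the behaviour the fix is described as restoring.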
Dominique Leuenberger
147f929296 Accepting request 928192 from network:cluster
- Utilize sysuser infrastructure to set user/group slurm.
  For munge authentication slurm should have a fixed UID across
  all nodes including the management server. Set it to 120
- Limit firewalld service definitions to SUSE versions >= 15. (forwarded request 928191 from eeich)

OBS-URL: https://build.opensuse.org/request/show/928192
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=67
2021-10-29 20:34:40 +00:00
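
The fixed-UID requirement in the entry above (user slurm set to UID 120 on all nodes, including the management server) can be checked with a short script. The sketch below assumes the passwd database of each node has already been collected as text; the function names and the sample node names are hypothetical.

```python
def slurm_uid(passwd_text, user="slurm"):
    """Return the numeric UID of `user` from passwd-format text, or None if absent."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 3 and fields[0] == user:
            return int(fields[2])
    return None


def nodes_with_wrong_uid(passwd_by_node, expected_uid=120):
    """List nodes whose slurm user is missing or does not have the expected UID."""
    return sorted(
        node
        for node, text in passwd_by_node.items()
        if slurm_uid(text) != expected_uid
    )


if __name__ == "__main__":
    # Hypothetical sample data: one entry per node.
    sample = {
        "mgmt": "root:x:0:0:root:/root:/bin/bash\nslurm:x:120:120::/etc/slurm:/bin/false\n",
        "node1": "slurm:x:120:120::/etc/slurm:/bin/false\n",
        "node2": "slurm:x:467:467::/etc/slurm:/bin/false\n",
    }
    print(nodes_with_wrong_uid(sample))
```

A node reported by this check would need its slurm UID aligned before munge authentication can be expected to work across the cluster.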
c67f43163f Accepting request 928191 from home:eeich:branches:network:cluster
- Utilize sysuser infrastructure to set user/group slurm.
  For munge authentication slurm should have a fixed UID across
  all nodes including the management server. Set it to 120
- Limit firewalld service definitions to SUSE versions >= 15.

OBS-URL: https://build.opensuse.org/request/show/928191
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=192
2021-10-29 17:38:05 +00:00
f4a3f06e75 Accepting request 926016 from home:mslacken:branches:network:cluster
- added service definitions for firewalld

OBS-URL: https://build.opensuse.org/request/show/926016
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=191
2021-10-29 14:17:34 +00:00
Dominique Leuenberger
2cf5062473 Accepting request 924633 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/924633
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=66
2021-10-11 13:31:58 +00:00
7a20fda376 Accepting request 923425 from home:mslacken:branches:network:cluster
- update to 21.08.2 
- major change:
  * removed support for the TaskAffinity=yes option in cgroup.conf. Please
    consider using "TaskPlugins=cgroup,affinity" in slurm.conf as an option.
- minor changes and bugfixes:
  * slurmctld - fix how the max number of cores on a node in a partition are
    calculated when the partition contains multi-socket nodes. This in turn
    corrects certain jobs node count estimations displayed client-side.
  * job_submit/cray_aries - fix "craynetwork" GRES specification after changes
    introduced in 21.08.0rc1 that made TRES always have a type prefix.
  * Ignore nonsensical check in the slurmd for [Pro|Epi]logSlurmctld.
  * Fix writing to stderr/syslog when systemd runs slurmctld in the foreground.
  * Fix issue with updating job started with node range.
  * Fix issue with nodes not clearing state in the database when the slurmctld
    is started with clean-start.
  * Fix hetjob components > 1 timing out due to InactiveLimit.
  * Fix sprio printing -nan for normalized association priority if
    PriorityWeightAssoc was not defined.
  * Disallow FirstJobId=0.
  * Preserve job start info in the database for a requeued job that hadn't
    registered the first time in the database yet.
  * Only send one message on prolog failure from the slurmd.
  * Remove support for TaskAffinity=yes in cgroup.conf.
  * accounting_storage/mysql - fix issue where querying jobs via sacct
    --whole-hetjob=yes or slurmrestd (which automatically includes this flag)
    could in some cases return more records than expected.
  * Fix issue for preemption of job array task that makes afterok dependency
    fail. Additionally, send emails when requeueing happens due to preemption.
  * Fix sending requeue mail type.
  * Properly resize a job's GRES bitmaps and counts when resizing the job.

OBS-URL: https://build.opensuse.org/request/show/923425
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=190
2021-10-11 08:40:56 +00:00
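
Since 21.08.2 no longer supports TaskAffinity=yes in cgroup.conf, a site may want to scan its configuration before upgrading. The following is a minimal sketch, assuming cgroup.conf uses simple `Key=Value` lines with `#` comments and case-insensitive keys; the function name is hypothetical, and the exact slurm.conf replacement setting should be taken from the slurm.conf documentation rather than from this sketch.

```python
def uses_task_affinity(cgroup_conf_text):
    """Return True if the text sets TaskAffinity=yes outside of comments."""
    for raw in cgroup_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = (part.strip().lower() for part in line.split("=", 1))
        if key == "taskaffinity" and value == "yes":
            return True
    return False


if __name__ == "__main__":
    example = "CgroupAutomount=yes\nTaskAffinity=yes  # removed in 21.08.2\n"
    if uses_task_affinity(example):
        print("cgroup.conf still sets TaskAffinity=yes; see the changelog note above")
```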
Dominique Leuenberger
ad0b52bd59 Accepting request 922117 from network:cluster
- moved pam module from /lib64 to /usr/lib64 which fixes boo#1191095 
  via the macro %_pam_moduledir

OBS-URL: https://build.opensuse.org/request/show/922117
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=65
2021-09-29 18:18:55 +00:00
64b9f7f60a macro fixed
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=189
2021-09-29 07:35:03 +00:00
1b26b8910b via the macro %_pam_moduledir
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=188
2021-09-29 07:08:48 +00:00
728a1b3c1e updated major version
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=187
2021-09-28 15:54:50 +00:00
Dominique Leuenberger
de212e226d Accepting request 921717 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/921717
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=64
2021-09-27 18:08:57 +00:00
5b07269e3d Accepting request 919668 from home:mslacken:branches:network:cluster
- updated to 21.08.1 with following bug fixes:
  * Fix potential memory leak if a problem happens while allocating GRES for
    a job.
  * If an overallocation of GRES happens terminate the creation of a job.
  * AutoDetect=nvml: Fatal if no devices found in MIG mode.
  * Print federation and cluster sacctmgr error messages to stderr.
  * Fix off by one error in --gpu-bind=mask_gpu.
  * Add --gpu-bind=none to disable gpu binding when using --gpus-per-task.
  * Handle the burst buffer state "alloc-revoke" which previously would not
    display in the job correctly.
  * Fix issue in the slurmstepd SPANK prolog/epilog handler where configuration
    values were used before being initialized.
  * Restore a step's ability to utilize all of an allocation's memory if --mem=0.
  * Fix --cpu-bind=verbose garbage taskid.
  * Fix cgroup task affinity issues from garbage taskid info.
  * Make gres_job_state_validate() client logging behavior as before 44466a4641.
  * Fix steps with --hint overriding an allocation with --threads-per-core.
  * Require requesting a GPU if --mem-per-gpu is requested.
  * Return error early if a job is requesting --ntasks-per-gpu and no gpus or
    task count.
  * Properly clear out pending step if unavailable to run with available
    resources.
  * Kill all processes spawned by burst_buffer.lua including descendants.
  * openapi/v0.0.{35,36,37} - Avoid setting default values of min_cpus,
    job name, cwd, mail_type, and contiguous on job update.
  * openapi/v0.0.{35,36,37} - Clear user hold on job update if hold=false.
  * Prevent CRON_JOB flag from being cleared when loading job state.
  * sacctmgr - Fix deleting WCKeys when not specifying a cluster.
  * Fix getting memory for a step when the first node in the step isn't the
    first node in the allocation.

OBS-URL: https://build.opensuse.org/request/show/919668
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=186
2021-09-27 09:23:35 +00:00
Dominique Leuenberger
8a0b85fee5 Accepting request 917457 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/917457
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=63
2021-09-08 19:36:49 +00:00
e22daa9ce5 Accepting request 917243 from home:eeich:branches:network:cluster
- Fix-statement-condition-in-netloc-autoconf-macro.patch:
  Fix netloc check, reestablish netloc disable code.
- Make configure arg '--with-pmix' conditional.
- Move openapi plugins to package slurm-restd.

OBS-URL: https://build.opensuse.org/request/show/917243
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=185
2021-09-08 07:34:10 +00:00
Dominique Leuenberger
b0f9c9aa23 Accepting request 917119 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/917119
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=62
2021-09-07 19:21:20 +00:00
562a595d05 Accepting request 915777 from home:mslacken:slurm_update
- updated to 21.08.1, major changes:
  * A new "AccountingStoreFlags=job_script" option to store the job scripts
    directly in SlurmDBD.
  * Added "sacct -o SubmitLine" format option to get the submit line 
    of a job/step.
  * Changes to the node state management so that nodes are marked as PLANNED
    instead of IDLE if the scheduler is still accumulating resources while
    waiting to launch a job on them.
  * RS256 token support in auth/jwt.
  * Overhaul of the cgroup subsystems to simplify operation, mitigate a number
    of inherent race conditions, and prepare for future cgroup v2 support.
  * Further improvements to cloud node power state management.
  * A new child process of the Slurm controller called "slurmscriptd"
    responsible for executing PrologSlurmctld and EpilogSlurmctld scripts,
    which significantly reduces performance issues associated with enabling
    those options.
  * A new burst_buffer/lua plugin allowing for site-specific asynchronous job
    data management.
  * Fixes to the job_container/tmpfs plugin to allow the slurmd process to be
    restarted while the job is running without issue.
  * Added json/yaml output to sacct, squeue, and sinfo commands.
  * Added a new node_features/helpers plugin to provide a generic way to change
    settings on a compute node across a reboot.
  * Added support for automatically detecting and broadcasting shared libraries
    for an executable launched with "srun --bcast".
  * Added initial OCI container execution support with a new --container option
    to sbatch and srun.
  * Improved "configless" support by allowing multiple control servers to be
    specified through the slurmd --conf-server option, and send additional
    configuration files at startup including cli_filter.lua.

OBS-URL: https://build.opensuse.org/request/show/915777
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=184
2021-09-06 13:29:00 +00:00
Dominique Leuenberger
2c3271fa4b Accepting request 903746 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/903746
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=61
2021-07-03 18:50:46 +00:00
b61c5b25fa Accepting request 903744 from home:mslacken:slurm_update
- Updated to 20.11.8:
  * slurmctld - fix erroneous "StepId=CORRUPT" messages in error logs.
  * Correct the error given when auth plugin fails to pack a credential.
  * Fix unused-variable compiler warning on FreeBSD in fd_resolve_path().
  * acct_gather_filesystem/lustre - only emit collection error once per step.
  * Add GRES environment variables (e.g., CUDA_VISIBLE_DEVICES) into the
    interactive step, the same as is done for the batch step.
  * Fix various potential deadlocks when altering objects in the database
    dealing with every cluster in the database.
  * slurmrestd:
   - handle slurmdbd connection failures without segfaulting.
   - fix segfault for searches in slurmdb/v0.0.36/jobs.
   - remove (non-functioning) users query parameter for
     slurmdb/v0.0.36/jobs from openapi.json
   - fix segfault in slurmrestd db/jobs with numeric queries
   - add argv handling for job/submit endpoint.
   - add description for slurmdb/job endpoint.
  * slurmrestd/dbv0.0.36:
   - Fix values dumped in job state/current and
     job step state.
   - Correct description for previous state property.
  * srun:
   - fix broken node step allocation in a heterogeneous allocation.
   - leave SLURM_DIST_UNKNOWN as default for --interactive.
  * Fail step creation if -n is not multiple of --ntasks-per-gpu.
  * job_container/tmpfs - Fix slowdown on teardown.
  * Fix problem with SlurmctldProlog where requeued jobs would never launch.
  * job_container/tmpfs - Fix issue when restarting slurmd where the namespace
    mount points could disappear.
  * sacct:

OBS-URL: https://build.opensuse.org/request/show/903744
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=183
2021-07-02 15:32:26 +00:00
Dominique Leuenberger
fa3ad08714 Accepting request 894432 from network:cluster
- New features in 20.11.7:
- New features in 20.11.6:
- Fix Provides:/Conflicts: for libnss_slurm (bsc#1180700).

OBS-URL: https://build.opensuse.org/request/show/894432
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=60
2021-05-20 17:25:01 +00:00
b4f7e9209d - New features in 20.11.7:
- New features in 20.11.6:
- Fix Provides:/Conflicts: for libnss_slurm (bsc#1180700).

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=181
2021-05-19 18:34:28 +00:00
Dominique Leuenberger
8c583ef6c0 Accepting request 893087 from network:cluster
- Updated to 20.11.7 which fixes CVE-2021-31215 (bsc#1186024)
- New features from 20.11.7:
 * slurmd - handle configless failures gracefully instead of hanging
   indefinitely.
 * select/cons_tres - fix Dragonfly topology not selecting nodes in the same
   leaf switch when it should as well as requests with --switches option.
 * Fix issue where certain step requests wouldn't run if the first node in the
   job allocation was full and there were idle resources on other nodes in
   the job allocation.
 * Fix deadlock issue with <Prolog|Epilog>Slurmctld.
 * torque/qstat - fix printf error message in output.
 * When adding associations or wckeys avoid checking multiple times a user or
   cluster name.
 * Fix wrong jobacctgather information on a step on multiple nodes
   due to timeouts sending the information gathered on its node.
 * Fix missing xstrdup which could result in slurmctld segfault on array jobs.
 * Fix security issue in PrologSlurmctld and EpilogSlurmctld by always
   prepending SPANK_ to all user-set environment variables. CVE-2021-31215.
- New features from 20.11.6:
 * Fix sacct assert with the --qos option.
 * Use pkg-config --atleast-version instead of --modversion for systemd.
 * common/fd - fix getsockopt() call in fd_get_socket_error().
 * Properly handle the return from fd_get_socket_error() in _conn_readable().
 * cons_res - Fix issue where running jobs were not taken into consideration
   when creating a reservation.
 * Avoid a deadlock between job_list for_each and assoc QOS_LOCK.
 * Fix TRESRunMins usage for partition qos on restart/reconfig.
 * Fix printing of number of tasks on a completed job that didn't request
   tasks.
 * Fix updating GrpTRESRunMins when decrementing job time is bigger than it.

OBS-URL: https://build.opensuse.org/request/show/893087
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=59
2021-05-14 23:24:22 +00:00
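
The CVE-2021-31215 fix described above always prepends `SPANK_` to user-set environment variables before PrologSlurmctld and EpilogSlurmctld run. The effect can be modelled in a few lines; this is an illustrative sketch only, and `prefix_user_env` is a hypothetical name, not a Slurm function.

```python
def prefix_user_env(user_env, prefix="SPANK_"):
    """Return a copy of user_env with every variable name prefixed."""
    return {prefix + name: value for name, value in user_env.items()}


if __name__ == "__main__":
    # A user-controlled variable can no longer shadow a name the script trusts.
    print(prefix_user_env({"LD_PRELOAD": "/tmp/evil.so", "PATH": "/tmp"}))
```

Because every user-supplied name is rewritten, a variable such as `LD_PRELOAD` reaches the script only as `SPANK_LD_PRELOAD`, which has no special meaning to the loader or shell.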
89b4ed3f9f - Updated to 20.11.7 which fixes CVE-2021-31215 (bsc#1186024)
- New features from 20.11.7:
 * slurmd - handle configless failures gracefully instead of hanging
   indefinitely.
 * select/cons_tres - fix Dragonfly topology not selecting nodes in the same
   leaf switch when it should as well as requests with --switches option.
 * Fix issue where certain step requests wouldn't run if the first node in the
   job allocation was full and there were idle resources on other nodes in
   the job allocation.
 * Fix deadlock issue with <Prolog|Epilog>Slurmctld.
 * torque/qstat - fix printf error message in output.
 * When adding associations or wckeys avoid checking multiple times a user or
   cluster name.
 * Fix wrong jobacctgather information on a step on multiple nodes
   due to timeouts sending the information gathered on its node.
 * Fix missing xstrdup which could result in slurmctld segfault on array jobs.
 * Fix security issue in PrologSlurmctld and EpilogSlurmctld by always
   prepending SPANK_ to all user-set environment variables. CVE-2021-31215.
- New features from 20.11.6:
 * Fix sacct assert with the --qos option.
 * Use pkg-config --atleast-version instead of --modversion for systemd.
 * common/fd - fix getsockopt() call in fd_get_socket_error().
 * Properly handle the return from fd_get_socket_error() in _conn_readable().
 * cons_res - Fix issue where running jobs were not taken into consideration
   when creating a reservation.
 * Avoid a deadlock between job_list for_each and assoc QOS_LOCK.
 * Fix TRESRunMins usage for partition qos on restart/reconfig.
 * Fix printing of number of tasks on a completed job that didn't request
   tasks.
 * Fix updating GrpTRESRunMins when decrementing job time is bigger than it.

OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=179
2021-05-14 10:35:47 +00:00
Dominique Leuenberger
7cb151db13 Accepting request 890262 from network:cluster
- Ship REST API version and auth plugins with slurmrestd.
- Add YAML support for REST API to build (bsc#1185603). (forwarded request 890261 from eeich)

OBS-URL: https://build.opensuse.org/request/show/890262
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=58
2021-05-04 20:00:59 +00:00
47fc726263 Accepting request 890261 from home:eeich:branches:network:cluster
- Ship REST API version and auth plugins with slurmrestd.
- Add YAML support for REST API to build (bsc#1185603).

OBS-URL: https://build.opensuse.org/request/show/890261
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=177
2021-05-04 08:36:53 +00:00
Dominique Leuenberger
a4d0f3eef7 Accepting request 879660 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/879660
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=57
2021-03-17 19:16:54 +00:00
Ana Guerrero
ff5dc58526 Accepting request 879659 from home:anag:branches:home:mslacken:slurm_up
update + typo fix

OBS-URL: https://build.opensuse.org/request/show/879659
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=175
2021-03-17 10:26:51 +00:00
Richard Brown
7f8f9f1010 Accepting request 874787 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/874787
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=56
2021-02-25 17:27:59 +00:00
927cd6ab24 Accepting request 874647 from home:mslacken:branches:network:cluster
- Update to 20.11.04
 * Fix node selection for advanced reservations with features.
 * mpi/pmix: Handle pipe failure better when using ucx.
 * mpi/pmix: include PMIX_NODEID for each process entry.
 * Fix job getting rejected after being requeued on same node that died.
 * job_submit/lua - add "network" field.
 * Fix situations when a reoccurring reservation could erroneously skip a
   period.
 * Ensure that a reservation's [pro|epi]log are run on reoccurring reservations.
 * Fix threads-per-core memory allocation issue when using CR_CPU_MEMORY.
 * Fix scheduling issue with --gpus.
 * Fix gpu allocations that request --cpus-per-task.
 * mpi/pmix: fixed print messages for all PMIXP_* macros
 * Add mapping for XCPU to --signal option.
 * Fix regression in 20.11 that prevented a full pass of the main scheduler
   from ever executing.
 * Work around a glibc bug in which "0" is incorrectly printed as "nan"
   which will result in corrupted association state on restart.
 * Fix regression in 20.11 which made slurmd incorrectly attempt to find the
   parent slurmd address when not applicable and send incorrect reverse-tree
   info to the slurmstepd.
 * Fix cgroup ns detection when using containers (e.g. LXC or Docker).
 * scrontab - change temporary file handling to work with emacs. 
- Removed check-for-lipmix.so.MAJOR.patch
- Added: load-pmix-major-version.patch

OBS-URL: https://build.opensuse.org/request/show/874647
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=173
2021-02-24 09:49:16 +00:00
Dominique Leuenberger
1a5fe227cc Accepting request 865000 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/865000
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=55
2021-01-20 17:29:15 +00:00
Ana Guerrero
4ab9986278 Accepting request 864993 from home:anag:branches:network:cluster
- Update to 20.11.03
- This release includes a major functional change to how job step launch is 
  handled compared to the previous 20.11 releases. This affects srun as 
  well as MPI stacks - such as Open MPI - which may use srun internally as 
  part of the process launch.
  One of the changes made in the Slurm 20.11 release was to the semantics 
  for job steps launched through the 'srun' command. This also 
  inadvertently impacts many MPI releases that use srun underneath their 
  own mpiexec/mpirun command.
  For 20.11.{0,1,2} releases, the default behavior for srun was changed  
  such that each step was allocated exactly what was requested by the 
  options given to srun, and did not have access to all resources assigned 
  to the job on the node by default. This change was equivalent to Slurm 
  setting the --exclusive option by default on all job steps. Job steps 
  desiring all resources on the node needed to explicitly request them 
  through the new '--whole' option.
  In the 20.11.3 release, we have reverted to the 20.02 and older behavior 
  of assigning all resources on a node to the job step by default.
  This reversion is a major behavioral change which we would not generally 
  do on a maintenance release, but is being done in the interest of 
  restoring compatibility with the large number of existing Open MPI (and 
  other MPI flavors) and job scripts that exist in production, and to 
  remove what has proven to be a significant hurdle in moving to the new 
  release.
  Please note that one change to step launch remains - by default, in 
  20.11 steps are no longer permitted to overlap on the resources they 
  have been assigned. If that behavior is desired, all steps must 
  explicitly opt-in through the newly added '--overlap' option.
  Further details and a full explanation of the issue can be found at:
  https://bugs.schedmd.com/show_bug.cgi?id=10383#c63

OBS-URL: https://build.opensuse.org/request/show/864993
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=171
2021-01-20 13:58:46 +00:00
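
The change in default step behaviour described in the 20.11.03 entry above can be summarised in a small model. This is a simplified sketch under stated assumptions: it considers only CPU counts on one node, it covers only the defaults and the '--whole' option as described in the changelog text, and `default_step_cpus` is a hypothetical helper rather than anything in Slurm.

```python
# Releases in which a step only received what srun asked for by default.
RESTRICTED_RELEASES = {"20.11.0", "20.11.1", "20.11.2"}


def default_step_cpus(release, requested_cpus, job_cpus_on_node, whole=False):
    """Return how many of the job's CPUs on a node a step can use by default."""
    if release in RESTRICTED_RELEASES and not whole:
        # 20.11.{0,1,2}: equivalent to --exclusive being set on every step.
        return requested_cpus
    # 20.02 and older, and 20.11.3 onwards: all resources assigned to the job.
    return job_cpus_on_node


if __name__ == "__main__":
    for release in ("20.02.6", "20.11.2", "20.11.3"):
        print(release, default_step_cpus(release, 4, 16))
```

The model leaves out the '--overlap' rule, which the changelog notes is still in force in 20.11: steps do not share their assigned resources unless each opts in.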
Dominique Leuenberger
9123a17403 Accepting request 861777 from network:cluster
- Fix fallout introduced by:
  "Replace  '%service_del_postun -n' with '%service_del_postun_without_restart'"
  for older Leap/SLE versions. (forwarded request 861776 from eeich)

OBS-URL: https://build.opensuse.org/request/show/861777
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=54
2021-01-10 18:43:37 +00:00
82c61d739d Accepting request 861776 from home:eeich:branches:network:cluster
- Fix fallout introduced by:
  "Replace  '%service_del_postun -n' with '%service_del_postun_without_restart'"
  for older Leap/SLE versions.

OBS-URL: https://build.opensuse.org/request/show/861776
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=169
2021-01-08 17:40:48 +00:00
Dominique Leuenberger
7cf27bf750 Accepting request 861655 from network:cluster
- Fix Provides:/Conflicts: for libnss_slurm.

- Replace  '%service_del_postun -n' with '%service_del_postun_without_restart'
  '-n' is deprecated and will be removed in the future.

OBS-URL: https://build.opensuse.org/request/show/861655
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=53
2021-01-08 16:39:19 +00:00
0d02ad4cfa - Fix Provides:/Conflicts: for libnss_slurm.
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=167
2021-01-08 12:21:49 +00:00
c50d4048dc Accepting request 845752 from home:fbui:branches:network:cluster
- Replace  '%service_del_postun -n' with '%service_del_postun_without_restart'
  '-n' is deprecated and will be removed in the future.

OBS-URL: https://build.opensuse.org/request/show/845752
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=166
2021-01-08 12:18:52 +00:00
Dominique Leuenberger
a8ec215de5 Accepting request 860691 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/860691
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=52
2021-01-06 18:57:04 +00:00
Ana Guerrero
08c7233b38 Accepting request 860690 from home:anag:branches:network:cluster
- Add support for configuration files from external plugins. 
  While built-in plugins have their configuration added in slurm.conf,
  external SPANK plugins add their configuration to plugstack.conf.
  To make SPANK plugins easy to package, their configuration files
  should be added independently under /etc/slurm/plugstack.conf.d, and
  plugstack.conf should be left as a one-liner including all the
  files under /etc/slurm/plugstack.conf.d.

OBS-URL: https://build.opensuse.org/request/show/860690
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=164
2021-01-06 10:42:08 +00:00
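
The one-liner plugstack.conf described in the entry above can be generated as follows. This sketch assumes the `include` keyword documented for SPANK plugstack files; the helper name is hypothetical, and the directory passed in the example is an assumption about where the package keeps its drop-in files.

```python
def plugstack_include_line(conf_dir):
    """Return a single plugstack.conf line that pulls in every file in conf_dir."""
    return f"include {conf_dir.rstrip('/')}/*\n"


if __name__ == "__main__":
    # Assumed drop-in directory; adjust to the path the package actually ships.
    print(plugstack_include_line("/etc/slurm/plugstack.conf.d/"), end="")
```

With this layout each external SPANK plugin package can install its own file into the drop-in directory without editing plugstack.conf itself.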
Dominique Leuenberger
3f82fae399 Accepting request 859115 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/859115
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=51
2020-12-29 14:52:58 +00:00
Ana Guerrero
caa18eaeaa Accepting request 859114 from home:anag:branches:network:cluster
- Update to 20.11.02 
  * Fix older versions of sacct not working with 20.11.
  * Fix slurmctld crash when using a pre-20.11 srun in a job allocation.
  * Correct logic problem in _validate_user_access.
  * Fix libpmi to initialize Slurm configuration correctly.
- Update to 20.11.01
  * Fix spelling of "overcomited" to "overcomitted" in sreport's cluster
    utilization report.
  * Silence debug message about shutting down backup controllers if none are
    configured.
  * Don't create interactive srun until PrologSlurmctld is done.
  * Fix fd symlink path resolution.
  * Fix slurmctld segfault on subnode reservation restore after node
    configuration change.
  * Fix resource allocation response message environment allocation size.
  * Ensure that details->env_sup is NULL terminated.
  * select/cray_aries - Correctly remove jobs/steps from blades using NPC.
  * cons_tres - Avoid max_node_gres when entire node is allocated with
    --ntasks-per-gpu.
  * Allow NULL arg to data_get_type().
  * In sreport have usage for a reservation contain all jobs that ran in the
    reservation instead of just the ones that ran in the time specified. This
    matches the report for the reservation, which is not truncated for a time period.
  * Fix issue with sending wrong batch step id to a < 20.11 slurmd.
  * Add a job's alloc_node to lua for job modification and completion.
  * Fix regression getting a slurmdbd connection through the perl API.
  * Stop the extern step terminate monitor right after proctrack_g_wait().
  * Fix removing the normalized priority of assocs.
  * slurmrestd/v0.0.36 - Use correct name for partition field:
    "min nodes per job" -> "min_nodes_per_job".

OBS-URL: https://build.opensuse.org/request/show/859114
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=162
2020-12-29 03:15:30 +00:00
Dominique Leuenberger
72671d260f Accepting request 853268 from network:cluster
- Update to version 20.11.0
  Slurm 20.11 includes a number of new features including:
  * Overhaul of the job step management and launch code, alongside improved
    GPU task placement support.
  * A new "Interactive Step" mode of operation for salloc.
  * A new "scrontab" command that can be used to submit and manage
    periodically repeating jobs.
  * IPv6 support.
  * Changes to the reservation logic, with new options allowing users
    to delete reservations, allowing admins to skip the next occurrence of a
    repeated reservation, and allowing for a job to be submitted and eligible
    to run within multiple reservations.
  * Dynamic Future Nodes - automatically associate a dynamically
    provisioned (or "cloud") node against a NodeName definition with matching
    hardware.
  * An experimental new RPC queuing mode for slurmctld to reduce thread
    contention on heavily loaded clusters.
  * SlurmDBD integration with the Slurm REST API.
  Also check
  https://github.com/SchedMD/slurm/blob/slurm-20-11-0-1/RELEASE_NOTES (forwarded request 852039 from eeich)

OBS-URL: https://build.opensuse.org/request/show/853268
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=50
2020-12-05 19:37:37 +00:00
d5d3aa2162 Accepting request 852039 from home:eeich:branches:network:cluster
- Update to version 20.11.0
  Slurm 20.11 includes a number of new features including:
  * Overhaul of the job step management and launch code, alongside improved
    GPU task placement support.
  * A new "Interactive Step" mode of operation for salloc.
  * A new "scrontab" command that can be used to submit and manage
    periodically repeating jobs.
  * IPv6 support.
  * Changes to the reservation logic, with new options allowing users
    to delete reservations, allowing admins to skip the next occurrence of a
    repeated reservation, and allowing for a job to be submitted and eligible
    to run within multiple reservations.
  * Dynamic Future Nodes - automatically associate a dynamically
    provisioned (or "cloud") node against a NodeName definition with matching
    hardware.
  * An experimental new RPC queuing mode for slurmctld to reduce thread
    contention on heavily loaded clusters.
  * SlurmDBD integration with the Slurm REST API.
  Also check
  https://github.com/SchedMD/slurm/blob/slurm-20-11-0-1/RELEASE_NOTES

OBS-URL: https://build.opensuse.org/request/show/852039
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=160
2020-12-05 14:46:07 +00:00
Dominique Leuenberger
b50ab109f0 Accepting request 849253 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/849253
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=49
2020-11-19 10:59:53 +00:00
Ana Guerrero
370ac32279 Accepting request 849252 from home:anag:branches:network:cluster
- Updated to 20.02.6, addresses two security fixes:
  * PMIx - fix potential buffer overflows from use of unpackmem().
    CVE-2020-27745 (bsc#1178890)
  * X11 forwarding - fix potential leak of the magic cookie when sent as an
     argument to the xauth command. CVE-2020-27746 (bsc#1178891)
- And many other bugfixes, full log and details available at:
  * https://lists.schedmd.com/pipermail/slurm-announce/2020/000045.html

OBS-URL: https://build.opensuse.org/request/show/849252
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=158
2020-11-18 09:57:56 +00:00
Dominique Leuenberger
c14773b6e9 Accepting request 845437 from network:cluster
OBS-URL: https://build.opensuse.org/request/show/845437
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=48
2020-11-03 14:22:10 +00:00
e481851f5a Accepting request 845108 from home:anag:branches:network:cluster
- Updated to 20.02.5, changes:
 * Fix leak of TRESRunMins when job time is changed with --time-min
 * pam_slurm - explicitly initialize slurm config to support configless mode.
 * scontrol - Fix exit code when creating/updating reservations with wrong
   Flags.
 * When a GRES has a no_consume flag, report 0 for allocated.
 * Fix cgroup cleanup by jobacct_gather/cgroup.
 * When creating reservations/jobs don't allow counts on a feature unless
   using an XOR.
 * Improve number of boards discovery
 * Fix updating a reservation NodeCnt on a zero-count reservation.
 * slurmrestd - provide an explicit error message when PSK auth fails.
 * cons_tres - fix job requesting single gres per-node getting two or more
   nodes with less CPUs than requested per-task.
 * cons_tres - fix calculation of cores when using gres and cpus-per-task.
 * cons_tres - fix job not getting access to socket without GPU or with less
   than --gpus-per-socket when not enough cpus available on required socket
   and not using --gres-flags=enforce-binding.
 * Fix HDF5 type version build error.
 * Fix creation of CoreCnt only reservations when the first node isn't
   available.
 * Fix wrong DBD Agent queue size in sdiag when using accounting_storage/none.
 * Improve job constraints XOR option logic.
 * Fix preemption of hetjobs when needed nodes not in leader component.
 * Fix wrong bit_or() messing potential preemptor jobs node bitmap, causing
   bad node deallocations and even allocation of nodes from other partitions.
 * Fix double-deallocation of preempted non-leader hetjob components.
 * slurmdbd - prevent truncation of the step nodelists over 4095.
 * Fix nodes remaining in drain state after rebooting with ASAP option.
 - changes from 20.02.4:

OBS-URL: https://build.opensuse.org/request/show/845108
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=156
2020-11-02 13:42:03 +00:00
Dominique Leuenberger
b0358c26fb Accepting request 819285 from network:cluster
- Add support for openPMIx also for Leap/SLE 15.0/1 (bsc#1173805).
- Do not run %check on SLE-12-SP2: Some incompatibility in tcl
  makes this fail.
- Remove unneeded build dependency to postgresql-devel.
- Disable build on s390 (requires 64bit).

OBS-URL: https://build.opensuse.org/request/show/819285
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/slurm?expand=0&rev=47
2020-07-08 17:16:29 +00:00
e3512185d8 - Disable build on s390 (requires 64bit).
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=154
2020-07-07 20:14:00 +00:00
361d99b111 - Add support for openPMIx also for Leap/SLE 15.0/1 (bsc#1173805).
OBS-URL: https://build.opensuse.org/package/show/network:cluster/slurm?expand=0&rev=153
2020-07-07 16:20:06 +00:00