- Update to version 0.3.28 (jsc#PED-9676):
* General:
+ Reworked the unfinished implementation of `HUGETLB` from GotoBLAS
for allocating huge memory pages as buffers on suitable systems.
+ Changed the unfinished implementation of `GEMM3M` for the generic
target on all architectures to at least forward to regular GEMM.
+ Improved multithreaded `GEMM` performance for large non-skinny
matrices.
+ Improved BLAS3 performance on larger multicore systems through
improved parallelism.
+ Improved performance of the initial memory allocation by reducing
locking overhead.
+ Improved performance of `GBMV` at small problem sizes by introducing
a size barrier for the switch to multithreading.
+ Added an implementation of the `CBLAS_GEMM_BATCH` extension.
+ Fixed corner cases involving the handling of NAN and INFINITY
arguments in `?SCAL` on all architectures.
+ Fixed NAN handling and potential accuracy issues in compilations
with Intel ICX by supplying a suitable fp-model option by default.
+ It is now possible to register a callback function that replaces
the built-in support for multithreading with an external backend
like TBB (`openblas_set_threads_callback_function`).
+ Fixed potential duplication of suffixes in shared library naming.
+ Improved C compiler detection by the build system to tolerate
more naming variants for gcc builds.
+ Fixed an unnecessary dependency of the utest on CBLAS.
+ Fixed spurious error reports from the BLAS extensions `utest`.
+ Fixed unwanted invocation of the `GEMM3M` tests in cross-
compilation.
+ Fixed a flaw in the makefile build that could lead to the (forwarded request 1234589 from eeich)
OBS-URL: https://build.opensuse.org/request/show/1234592
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=66
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=184
- Duplicate all options passed to `make` also to `make install`:
The openblas build output suggests this: 'Note that any flags
passed to make during build should also be passed to make install
to circumvent any install errors'.
This also makes sure that the minimum CPU requirement set in
the pkgconfig file is the same one as used for building.
This helps to maintain a reproducible build (boo#1228177). (forwarded request 1190850 from eeich)
OBS-URL: https://build.opensuse.org/request/show/1190851
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=65
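One way to keep the two invocations in sync is to hold the options in a single variable and reuse it. The following is only a sketch: the option values and the `PREFIX` path are illustrative assumptions, not the flags used by this package, and the commands are printed rather than executed so the fragment can be tried without a source tree.

```shell
# Options shared by the build and the install step (illustrative values).
MAKE_FLAGS="DYNAMIC_ARCH=1 USE_OPENMP=1 NUM_THREADS=64"

# Print the command lines instead of running them; drop `echo` for a real build.
build_cmd() { echo make $MAKE_FLAGS; }
install_cmd() { echo make $MAKE_FLAGS PREFIX=/usr install; }

build_cmd
install_cmd
```

Because both commands expand the same variable, a later change to the target or threading options cannot reach the build without also reaching the install step.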
OBS-URL: https://build.opensuse.org/request/show/1190850
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=182
- Update to version 0.3.27 (boo#1225869):
General:
* Added initial (generic) support for the `CSKY` architecture.
* Capped the maximum number of threads used in `GEMM`, `GETRF`
and `POTRF` to avoid creating underutilized or idle threads.
* Sped up multithreaded `POTRF` on all platforms.
* Added extension `openblas_set_num_threads_local()` that returns
the previous thread count.
* Re-evaluated the `SGEMV` and `DGEMV` load thresholds to avoid
activating multithreading for too small workloads.
* Improved the fallback code used when the precompiled number of
threads is exceeded, and made it callable multiple times
during the lifetime of an instance.
* Added CBLAS interfaces for the BLAS extensions `?AMIN`, `?AMAX`,
`CAXPYC` and `ZAXPYC`.
* Fixed a potential buffer overflow in the interface to the
`GEMMT` kernels.
* Fixed use of incompatible pointer types in `GEMMT` and
`C`/`ZAXPBY` as flagged by GCC-14.
* Fixed unwanted case sensitivity of the character parameters in
`?TRTRS`.
* Sped up the OpenMP thread management code.
* Fixed sizing of logical variables in `INTERFACE64` builds of
the C version of LAPACK.
* Fixed inclusion of new LAPACK and LAPACKE functions from
LAPACK 3.11 in the shared library.
* Modified the error thresholds for `SGS`/`DGS` functions in
the LAPACK testsuite to suppress spurious errors.
* Added support for calling `?NRM2` with a negative increment value
on all architectures.
* Fixed handling of the `OPENBLAS_LOOPS` variable in several
OBS-URL: https://build.opensuse.org/request/show/1179598
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=175
- Cleaned up changelog:
* Added missing changes from 0.3.22 to 0.3.24 release.
* Formatted list of package changes in markdown format for easier
conversion.
* Dropped all entries that are irrelevant for SUSE or to
users:
- build related - in particular CMAKE
- OS-related except Linux
- related to compilers not supported on SUSE
- related to architectures presently not supported on SUSE (forwarded request 1160107 from eeich)
OBS-URL: https://build.opensuse.org/request/show/1173654
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=61
OBS-URL: https://build.opensuse.org/request/show/1160107
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=173
Please stage together with https://build.opensuse.org/request/show/1153127!
Currently, openSUSE:Factory:Staging:E.
- Remove DYNAMIC_LIST for aarch64 for older gcc versions: This has
been fixed upstream.
- Update to version 0.3.26:
* General:
- Improved the version of openblas.pc that is created by the
CMAKE build.
- Fixed a CMAKE-specific build problem on older versions of
MacOS.
- Worked around linking problems on old versions of MacOS.
- Corrected installation location of the lapacke_mangling
header in CMAKE builds.
- Added type declarations for complex variables to the
MSVC-specific parts of the LAPACK header.
- Significantly sped up ?GESV for small problem sizes by
introducing a lower bound for multithreading.
- Imported additions and corrections from the Reference-LAPACK
project:
+ Added new LAPACK functions for truncated QR with pivoting
(Reference-LAPACK PRs 891&941).
+ Handle miscalculation of minimum work array size in corner
cases (Reference-LAPACK PR 942).
+ Fixed use of uninitialized variables in ?GEDMD and
improved inline documentation.
+ Fixed use of uninitialized variables (and consequential
failures) in ?BBCSD.
+ Added tests for the recently introduced Dynamic Mode
Decomposition functions.
+ Fixed several memory leaks in the LAPACK testsuite.
+ Fixed counting of testsuite results by the Python script.
OBS-URL: https://build.opensuse.org/request/show/1153572
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=60
- Update to version 0.3.26:
* General:
- Improved the version of openblas.pc that is created by the
CMAKE build.
- Fixed a CMAKE-specific build problem on older versions of
MacOS.
- Worked around linking problems on old versions of MacOS.
- Corrected installation location of the lapacke_mangling
header in CMAKE builds.
- Added type declarations for complex variables to the
MSVC-specific parts of the LAPACK header.
- Significantly sped up ?GESV for small problem sizes by
introducing a lower bound for multithreading.
- Imported additions and corrections from the Reference-LAPACK
project:
+ Added new LAPACK functions for truncated QR with pivoting
(Reference-LAPACK PRs 891&941).
+ Handle miscalculation of minimum work array size in corner
cases (Reference-LAPACK PR 942).
+ Fixed use of uninitialized variables in ?GEDMD and
improved inline documentation.
+ Fixed use of uninitialized variables (and consequential
failures) in ?BBCSD.
+ Added tests for the recently introduced Dynamic Mode
Decomposition functions.
+ Fixed several memory leaks in the LAPACK testsuite.
+ Fixed counting of testsuite results by the Python script.
* x86-64:
- Fixed computation of CASUM on SkylakeX and newer targets in
the special case that AVX512 is not supported by the compiler
OBS-URL: https://build.opensuse.org/request/show/1140291
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=168
- Recreate old library scheme for existing products:
It turned out the new scheme on existing systems has
been causing package breakages.
- Do not generate baselibs.conf for HPC builds.
- Add support for gcc11 & 12.
- For SLE/Leap on x86_64 and s390x do not mix compiler versions
as this will make the gfortran ABI version inconsistent. Instead
use the stock compiler and set the list of kernels for x86_64
cores explicitly as Cooperlake requires compiler intrinsics
which are not provided by gcc 7.
- Require at least 7G of disk space for building. (forwarded request 1068121 from eeich)
OBS-URL: https://build.opensuse.org/request/show/1068124
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=56
OBS-URL: https://build.opensuse.org/request/show/1068121
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=159
- Make sure pre-existing (arch-independent) update-alternatives
are wiped before registering new ones.
Since update-alternatives has no reliable way to check if
a certain 'generic name' exists, brute-force it and ignore
any error (boo#1208248).
- Remove totally pointless - i.e. never executed - %%posttrans
script.
- Restore generic link for update-alternatives. This is usually
set by the update-alternatives and it is '%ghost'ed but rpmlint
complains.
- Add rpmlintrc rules to avoid false positives from consistently
guessing the update-alternatives generic name wrong.
- Make arch dependent generic names conditional.
OBS-URL: https://build.opensuse.org/request/show/1066744
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=55
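The brute-force removal described above could look roughly like the fragment below. It is a sketch, not the scriptlet shipped in the package: the function name and the generic name passed to it are hypothetical placeholders.

```shell
# Remove every alternative registered under a generic name, ignoring failure.
wipe_alternative() {
    # --remove-all fails when the generic name is unknown (or when the tool
    # is not installed); `|| :` turns any such failure into success.
    update-alternatives --remove-all "$1" >/dev/null 2>&1 || :
}

# Hypothetical generic name, used for illustration only.
wipe_alternative "example-generic-name"
```

Discarding both the output and the exit status keeps the scriptlet from aborting on systems where the old generic name was never registered.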
OBS-URL: https://build.opensuse.org/request/show/1066169
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=156
- Do not set LIBNAMESUFFIX to mark different flavors as this causes
the SONAME to be different so that different flavors of OpenBLAS
cannot serve as plugin replacements of each other (boo#1177260).
- Fix a fallout of making alternatives directory arch dependent.
- Remove unneeded links that will be created by update-alternatives.
Create remaining links in %post scripts, properly %ghost-ing the files. (forwarded request 1063627 from eeich)
OBS-URL: https://build.opensuse.org/request/show/1063744
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/openblas?expand=0&rev=54
OBS-URL: https://build.opensuse.org/request/show/1063627
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=154
it seems that the 32-bit compatibility packages have revealed a
conflict which was not properly detected by installcheck and actually
has made it into Leap 15.4/SLE-15-SP4 rather than caused it: The same
issue exists in the 'regular' 64-bit packages but has remained undetected
by installcheck so far. Factory hasn't suffered from this as lapack has
been fixed properly - see boo#1207358.
The possible installcheck issue has been reported in:
https://github.com/openSUSE/openSUSE-release-tools/issues/2915
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=152
are causing conflicts with the lapack packages during SLE staging:
found conflict of liblapacke3-32bit-3.5.0-4.6.1.x86_64 with libopenblas_openmp0-32bit-0.3.21-150500.1.2.x86_64
/usr/lib/liblapacke.so.3 [mode mismatch: l777 root:root -> liblapacke.so.3.5.0, g -644 root:root]
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=151
- Update to v0.3.21:
* general:
- Updated the included LAPACK to Reference-LAPACK release 3.10.1
- when no Fortran compiler is available, OpenBLAS builds will now automatically
- function LAPACKE_lsame is now annotated with the GCC attribute "const" to aid static analyzers
- added USE_TLS to the list of options reported by the openblas_get_config() function
- added SYMBOLPREFIX/SYMBOLSUFFIX handling for LAPACK 3.10.0 functions added in 0.3.20
- reverted OpenMP threadpool behaviour in the exec_blas call to its state before 0.3.11, that is
the threadpool will no longer grow or shrink on demand as the overhead for this is too big at least with
GNU OpenMP. The adaptive behaviour introduced in 0.3.11 can still be requested at runtime by setting
the environment variable OMP_ADAPTIVE
- worked around spurious STFSM/CTFSM errors reported by the LAPACK testsuite
* x86_64:
- fixed determination of compiler support for AVX512 and removed the 0.3.19
workaround for building SKYLAKEX kernels on Sandybridge hardware
- fixed compilation for the SKYLAKEX target with gcc 6
- fixed compilation of the SkyLakeX small matrix GEMM kernels with LLVM or ICC
- added support for the Zhaoxin/Centaur KH40000 cpu
- fixed a potential crash in the ZSYMV kernel used for all targets except generic
* POWER:
- worked around an overflow error in the POWER6 DNRM2 kernel
- fixed compilation on PPC440
- fixed a performance regression in the level1 BLAS on POWER10
- fixed the POWER10 ZGEMM kernel
- fixed singlethreaded builds for POWER10
- fixed compilation of the POWER10 DGEMV kernel with older gcc versions
- enabled compilation of the BFLOAT16 kernels by default
- enabled the small matrix kernels by default for DYNAMIC_ARCH builds
- added a workaround for a miscompilation of the CDOT and ZDOT kernels by GCC 12
- Obsolete:
OBS-URL: https://build.opensuse.org/request/show/1039250
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=146
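For the OMP_ADAPTIVE item in the 0.3.21 entry, the variable can be limited to a single program run instead of being exported globally. This is a sketch under assumptions: the value `1` is assumed (the entry only says the variable has to be set), `run_adaptive` is a hypothetical helper name, and the `sh -c` call stands in for a real OpenBLAS-linked program.

```shell
# Run one command with the adaptive OpenMP threadpool requested.
# The value 1 is an assumption; the changelog does not name a value.
run_adaptive() {
    OMP_ADAPTIVE=1 "$@"
}

# Stand-in command that prints what the child process sees.
run_adaptive sh -c 'echo "OMP_ADAPTIVE=$OMP_ADAPTIVE"'
```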