684ed79877
- For SLES16, target POWER9 instead of POWER8, which fixes the reported sgemm testsuite failures. [bsc#1239545]
Atri Bhattacharya
2025-06-03 16:53:11 +00:00
2914a6b45c
Accepting request 1266047 from science
Ana Guerrero
2025-04-03 14:45:54 +00:00
d96cf8ac37
- Disable and remove support for gnu-hpc build flavours (bsc#1239982)
Ana Guerrero
2025-04-01 08:42:12 +00:00
e248fb09b3
Accepting request 1253922 from science
Ana Guerrero
2025-03-18 16:37:29 +00:00
c538c2b94e
Accepting request 1253917 from home:eeich:branches:science
Egbert Eich
2025-03-17 19:36:04 +00:00
a1d44dcaed
- Add test package.
- Add flags: -Wa,--noexecstack -Wl,-z,noexecstack to make sure stack is not executable. This works around problems in assembler code for z.
- Make stack of empty cpuid.S non-executable as well.
Egbert Eich
2025-03-06 12:52:05 +00:00
7c4967fcc3
- Update to version 0.3.29 (jsc#PED-9676):
  General:
  * Fixed a potential NULL pointer dereference in multithreaded builds.
  * Added function aliases for GEMMT using its new name GEMMTR adopted by Reference-BLAS.
  * Fixed the behavior of the recently added CBLAS_?GEMMT functions with row-major data.
  * Improved thread scaling of multithreaded SBGEMV.
  * Improved thread scaling of multithreaded TRTRI.
  * Fixed compilation of the CBLAS testsuite with gcc14 (and no Fortran compiler).
  * Fixed placement of the -fopenmp flag and libsuffix in the generated pkgconfig file.
  * Improved the CMakeConfig file generated by the Makefile build.
  * Fixed const-correctness of cblas_?geadd in cblas.h.
  * Fixed a potential inaccuracy in multithreaded BLAS3 calls.
  * Fixed empty implementations of get/set_affinity that print a warning in OpenMP builds.
  * Fixed function signatures for TRTRS in the converted C version of LAPACK.
  * Fixed omission of several single-precision LAPACK symbols in the shared library.
  * Improved build instructions for the provided "pybench" benchmarks.
  * Improved documentation, including descriptions of environment variables that affect build and runtime behavior.
  * Added a separate "make install_tests" target for use with cross-compilations.
  * Integrated improvements and corrections from Reference-LAPACK:
    - removed a comparison in LAPACKE ?tpmqrt that is always false.
    - fixed the leading dimension for B in tests for GGEV.
Egbert Eich
2025-03-05 20:05:34 +00:00
dd935c2c6b
- Update to version 0.3.28 (jsc#PED-9676):
Egbert Eich
2025-03-05 13:14:18 +00:00
29312f7bef
Accepting request 1242905 from science
Ana Guerrero
2025-02-04 17:10:38 +00:00
bb67ef8e10
Accepting request 1234592 from science
Ana Guerrero
2025-01-06 15:04:58 +00:00
0f2e101f5c
- Update to version 0.3.27 (jsc#PED-9676):
  * General:
    + Reworked the unfinished implementation of HUGETLB from GotoBLAS for allocating huge memory pages as buffers on suitable systems.
    + Changed the unfinished implementation of GEMM3M for the generic target on all architectures to at least forward to regular GEMM.
    + Improved multithreaded GEMM performance for large non-skinny matrices.
    + Improved BLAS3 performance on larger multicore systems through improved parallelism.
    + Improved performance of the initial memory allocation by reducing locking overhead.
    + Improved performance of GBMV at small problem sizes by introducing a size barrier for the switch to multithreading.
    + Added an implementation of the CBLAS_GEMM_BATCH extension.
    + Fixed corner cases involving the handling of NAN and INFINITY arguments in ?SCAL on all architectures.
    + Fixed NAN handling and potential accuracy issues in compilations with Intel ICX by supplying a suitable fp-model option by default.
    + It is now possible to register a callback function that replaces the built-in support for multithreading with an external backend like TBB (openblas_set_threads_callback_function).
    + Fixed potential duplication of suffixes in shared library naming.
    + Improved C compiler detection by the build system to tolerate more naming variants for gcc builds.
    + Fixed an unnecessary dependency of the utest on CBLAS.
    + Fixed spurious error reports from the BLAS extensions utest.
    + Fixed unwanted invocation of the GEMM3M tests in cross-compilation.
    + Fixed a flaw in the makefile build that could lead to the
Egbert Eich
2025-01-02 16:50:32 +00:00
6076de7297
Accepting request 1063627 from home:eeich:branches:science
Egbert Eich
2023-02-08 08:12:25 +00:00
ee08ada4cd
Accepting request 1061191 from home:eeich:branches:science
Egbert Eich
2023-01-26 11:53:30 +00:00
fdaf650bf7
- Reverted last change: it seems that the 32-bit compatibility packages revealed, rather than caused, a conflict which was not properly detected by installcheck and has actually made it into Leap 15.4/SLE-15-SP4. The same issue exists in the 'regular' 64-bit packages but has remained undetected by installcheck so far. Factory hasn't suffered from this as lapack has been fixed properly - see boo#1207358. The possible installcheck issue has been reported in: https://github.com/openSUSE/openSUSE-release-tools/issues/2915
Egbert Eich
2023-01-21 11:14:53 +00:00
12b2052a28
- Disabling 32-bit compatibility packages for Leap/SLE as they are causing conflicts with the lapack packages during SLE staging:
  found conflict of liblapacke3-32bit-3.5.0-4.6.1.x86_64 with libopenblas_openmp0-32bit-0.3.21-150500.1.2.x86_64
  /usr/lib/liblapacke.so.3 [mode mismatch: l777 root:root -> liblapacke.so.3.5.0, g -644 root:root]
Egbert Eich
2023-01-19 17:33:10 +00:00
4f9678748e
- Update to version 0.3.14
  common:
  * Fixed a race condition on thread shutdown in non-OpenMP builds
  * Fixed custom BUFFERSIZE option getting ignored in gmake builds
  * Fixed CMAKE compilation of the TRMM kernels for GENERIC platforms
  * Added CBLAS interfaces for CROTG, ZROTG, CSROT and ZDROT
  * Improved performance of OMATCOPY_RT across all platforms
  * Changed perl scripts to use env instead of a hardcoded /usr/bin/perl
  * Fixed potential misreading of the GCC compiler version in the build scripts
  * Fixed convergence problems in LAPACK complex GGEV/GGES (Reference-LAPACK #477)
  * Reduced the stacksize requirements for running the LAPACK testsuite (Reference-LAPACK #335)
  RISC V:
  * Fixed compilation on RISCV (missing entry in getarch)
  POWER:
  * Fixed compilation for DYNAMIC_ARCH with clang and with older gcc versions
  * Added support for compilation on FreeBSD/ppc64le
  * Added optimized POWER10 kernels for SSCAL, DSCAL, CSCAL, ZSCAL
  * Added optimized POWER10 kernels for SROT, DROT, CDOT, SASUM, DASUM
  * Improved SSWAP, DSWAP, CSWAP, ZSWAP performance on POWER10
  * Improved SCOPY and CCOPY performance on POWER10
  * Improved SGEMM and DGEMM performance on POWER10
  * Added support for compilation with the NVIDIA HPC compiler
  x86_64:
  * Added an optimized bfloat16 GEMM kernel for Cooperlake
  * Added CPUID autodetection for Intel Rocket Lake and Tiger Lake cpus
  * Improved the performance of SASUM,DASUM,SROT,DROT on AMD Ryzen cpus
  * Added support for compilation with the NAG Fortran compiler
  * Fixed recognition of the AMD AOCC compiler
  * Fixed compilation for DYNAMIC_ARCH with clang on Windows
  * Added support for running the BLAS/CBLAS tests on Windows
Ismail Dönmez
2021-03-18 08:47:05 +00:00
5b2fcb1b99
Add back the lost question mark
Ismail Dönmez
2021-02-03 11:53:45 +00:00
b20387a9c9
- BUILD_BFLOAT16=1 is not supported in s390(x) (bsc#1181522)
- Add:
  * 0001-Require-gcc-11-for-builtin_cpu_is-power10.patch
  * 0002-patch-to-support-power10-in-builtin_cpu_is-was-backp.patch:
    Only gcc11 has builtin_cpu_is(power10) - fix build issue for ppc64 (bsc#1181522).
Egbert Eich
2021-02-02 22:07:24 +00:00
51cdcb51a2
- Update to version 0.3.13
  common:
  * Added a generic bfloat16 SBGEMV kernel
  * Fixed a potentially severe memory leak after fork in OpenMP builds that was introduced in 0.3.12
  * Added detection of the Fujitsu Fortran compiler
  * Added detection of the (e)gfortran compiler on OpenBSD
  * Added support for overriding the default name of the library independently from symbol suffixing in the gmake builds (already supported in cmake)
Ismail Dönmez
2020-12-17 07:24:57 +00:00