- Update to version 0.2.20:
* common:
- Improved CMake support
- Fixed several thread race and locking bugs
- Fixed default LAPACK optimization level
- Updated LAPACK to 3.7.0
- Added ReLAPACK (https://github.com/HPAC/ReLAPACK); build with
make BUILD_RELAPACK=1
* POWER:
- Optimizations for Power9
- Fixed several Power8 assembly bugs
* ARM:
- New optimized Vulcan and ThunderX2T99 targets
- Support for ARMV7 SOFT_FP ABI (make ARM_SOFTFP_ABI=1)
- Detect all cpu cores including offline ones
- Fix compilation with CLANG
- Support building a shared library for Android
* MIPS:
- Fixed several threading issues
- Fix compilation with CLANG
* x86_64:
- Detect Intel Bay Trail and Apollo Lake
- Detect Intel Sky Lake and Kaby Lake
- Detect Intel Knights Landing
- Detect AMD A8, A10, A12 and Ryzen
- Support 64bit builds with Visual Studio
- Fix building with Intel and PGI compilers
- Fix building with MINGW and TDM-GCC
- Fix cmake builds for Haswell and related cpus
- Fix building for Sandybridge with CLANG 3.9
OBS-URL: https://build.opensuse.org/request/show/513093
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=52
- Update to version 0.2.16
* Upgrade LAPACK to 3.6.0 version.
* Disable multi-threading for small size swap and ger.
* Improve small zger, zgemv, ztrmv using stack allocation.
* Let openblas_get_num_threads return the number of active threads.
* Fix LAPACK Dormbr, Dormlq bug.
* Avoid potential getenv segfault.
* Import LAPACK svn bugfix #142-#147,#150-#155
* Optimize trsm kernels for AMD Bulldozer, Piledriver, Steamroller.
* Detect Intel Avoton.
* Detect AMD Trinity, Richland, E2-3200.
* Optimize c/zgemv for AMD Bulldozer, Piledriver, Steamroller
* Fix bug with scipy linalg test.
* Support and optimize Cortex-A57 AArch64.
* Update ARMV6 kernels.
* Improve DGEMM for ARM Cortex-A57.
* Fix detection of POWER architecture.
* Optimize D and Z BLAS3 functions for Power8.
- Remove openblas-libs.patch, not needed.
OBS-URL: https://build.opensuse.org/request/show/374076
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=42
- Update to version 0.2.15
* Enable MAX_STACK_ALLOC flags by default.
* Improve ger and gemv for small matrices.
* Improve parallel gemv for the small m, large n case.
* Improve ?imatcopy when lda==ldb
* Add vecLib benchmarks
* Fix LAPACK lantr for row major matrices
* Fix LAPACKE lansy
* Import bug fixes for LAPACKE s/dormlq, c/zunmlq
* Raise the signal when pthread_create fails
* Drop obsolete openblas-arm64-build.patch
x86/x86-64:
* Support pure C generic kernels for x86/x86-64.
* Support Intel Broadwell and Skylake by Haswell kernels.
* Support AMD Excavator by Steamroller kernels.
* Optimize s/d/c/zdot for Intel SandyBridge and Haswell.
* Optimize s/d/c/zdot for AMD Piledriver and Steamroller.
* Optimize s/d/c/zaxpy for Intel SandyBridge and Haswell.
* Optimize s/d/c/zaxpy for AMD Piledriver and Steamroller.
* Optimize d/c/zscal for Intel Haswell, dscal for Intel SandyBridge.
* Optimize d/c/zscal for AMD Bulldozer, Piledriver and Steamroller.
* Optimize s/dger for Intel SandyBridge.
* Optimize s/dsymv for Intel SandyBridge.
* Optimize ssymv for Intel Haswell.
* Optimize dgemv for Intel Nehalem and Haswell.
* Optimize dtrmm for Intel Haswell.
ARM:
* Support Android NDK armeabi-v7a-hard ABI (-mfloat-abi=hard)
* Fix lock, rpcc bugs
POWER:
OBS-URL: https://build.opensuse.org/request/show/341265
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=40
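The "?imatcopy when lda==ldb" item above can be illustrated with a small sketch. It is not OpenBLAS code: the function name imatcopy_notrans is hypothetical, and it only covers the no-transpose, column-major case. It assumes ?imatcopy computes A := alpha * op(A) in place, with lda and ldb as the input and output leading dimensions. When the two are equal, no element changes position, so scaling is all that is left to do.

```python
def imatcopy_notrans(a, rows, cols, alpha, lda, ldb):
    """In-place A := alpha * A on a column-major flat list (hypothetical helper).

    lda / ldb are the leading dimensions before and after the call.
    The list must hold at least max(lda, ldb) * cols elements.
    """
    if lda == ldb:
        # Same layout before and after: elements stay put, only scale them.
        for j in range(cols):
            for i in range(rows):
                a[j * lda + i] *= alpha
        return a

    # Layout changes, so elements move. Pick an iteration order that never
    # overwrites a source element before it has been read.
    shrinking = ldb < lda
    col_order = range(cols) if shrinking else range(cols - 1, -1, -1)
    for j in col_order:
        row_order = range(rows) if shrinking else range(rows - 1, -1, -1)
        for i in row_order:
            a[j * ldb + i] = alpha * a[j * lda + i]
    return a


# 2x2 matrix [[1, 3], [2, 4]] stored column-major with lda == ldb == 2.
same_layout = imatcopy_notrans([1, 2, 3, 4], 2, 2, 2, 2, 2)

# Same matrix padded to lda == 3, compacted to ldb == 2 while scaling.
compacted = imatcopy_notrans([1, 2, 0, 3, 4, 0], 2, 2, 2, 3, 2)
```

In the second call only the first ldb * cols entries are meaningful afterwards; the tail of the buffer is left over from the old layout.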
* drop openblas-soname.patch
- Add RPM %post script for manual BLAS/LAPACK update-alternatives
configuration update
- Use update-alternatives mechanism for OpenBLAS variants (serial,
openmp, pthreads). pthreads variant is default for x86 and x86_64,
OpenMP for other architectures.
- Fix build on ARM64
* openblas-arm64-build.patch
- Add update-alternatives mechanism for CBLAS
- Provide cmake module
- Delete info about host cpu from openblas_config.h for dynamic arch
- Add update-alternatives to 'preun' and 'post' requires list for
libraries
- Add README.SUSE
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=38
- Update to version 0.2.14
* Improve ger and gemv for small matrices by stack allocation.
e.g. make -DMAX_STACK_ALLOC=2048
* Introduce openblas_get_num_threads and openblas_get_num_procs.
* Add ATLAS-style ?geadd function.
* Fix c/zsyr bug with negative incx.
* Fix race condition during shutdown causing a crash in
gotoblas_set_affinity().
x86/x86-64:
* Support AMD Steamroller.
ARM:
* Add Cortex-A9 and Cortex-A15 targets.
OBS-URL: https://build.opensuse.org/request/show/305140
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=36
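The two query functions introduced in 0.2.14 can be called from any language with a C FFI. The sketch below uses Python's ctypes and assumes a shared library that find_library("openblas") can locate and that exports the unsuffixed symbol names; builds with a symbol suffix or a different library name would need adjusting. Both C functions take no arguments and return int.

```python
import ctypes
import ctypes.util


def load_openblas():
    """Return a ctypes handle to OpenBLAS, or None if it cannot be loaded."""
    name = ctypes.util.find_library("openblas")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def thread_info(lib):
    """Call the two query functions; missing symbols are reported as None."""
    info = {}
    for symbol in ("openblas_get_num_threads", "openblas_get_num_procs"):
        try:
            func = getattr(lib, symbol)
        except AttributeError:
            info[symbol] = None
            continue
        func.restype = ctypes.c_int
        func.argtypes = []
        info[symbol] = func()
    return info


lib = load_openblas()
report = thread_info(lib) if lib is not None else {}
print(report or "OpenBLAS not found")
```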
- Update to version 0.2.12
* Added CBLAS interface for ?omatcopy and ?imatcopy.
* Enable ?gemm3m functions.
* Added benchmark for ?gemm3m.
* Optimized multithreading lower limits.
* Disabled SYMM3M and HEMM3M functions because of segment violations.
x86/x86-64:
* Improved axpy and symv performance on AMD Bulldozer.
* Improved gemv performance on modern Intel and AMD CPUs.
OBS-URL: https://build.opensuse.org/request/show/257419
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=31
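For readers unfamiliar with the ?gemm3m routines enabled in 0.2.12: the "3m" refers to forming a complex matrix product from three real matrix products instead of four. The sketch below shows that identity on plain Python lists. It illustrates the arithmetic only and says nothing about how the OpenBLAS kernels are implemented; the helper names are made up for the example.

```python
def matmul(a, b):
    """Plain matrix product on lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add(a, b, sign=1):
    """Elementwise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def gemm3m(ar, ai, br, bi):
    """Real and imaginary parts of (ar + i*ai) @ (br + i*bi), three products."""
    t1 = matmul(ar, br)                     # Ar * Br
    t2 = matmul(ai, bi)                     # Ai * Bi
    t3 = matmul(add(ar, ai), add(br, bi))   # (Ar + Ai) * (Br + Bi)
    real = add(t1, t2, sign=-1)             # Ar*Br - Ai*Bi
    imag = add(add(t3, t1, sign=-1), t2, sign=-1)  # Ar*Bi + Ai*Br
    return real, imag


ar, ai = [[1, 2], [3, 4]], [[0, 1], [1, 0]]
br, bi = [[2, 0], [1, 1]], [[1, 1], [0, 2]]
real, imag = gemm3m(ar, ai, br, bi)
```

The saving of one multiplication is paid for with extra additions, so in floating point the rounding behaviour differs from the conventional four-product form.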
- Update to version 0.2.9
* Update LAPACK to 3.5.0 version
* Fixed compatibility issues with Clang and Pathscale compilers.
* Added OPENBLAS_VERBOSE environment variable.(#338)
* Make OpenBLAS thread-pool resilient to fork via pthread_atfork.
(#294)
* Rewrote rotmg
* Fixed sdsdot bug.
* Improved the result for LAPACK testing. (#372)
x86/x86-64:
* Optimization on Intel Haswell.
* Enable optimization kernels on AMD Bulldozer and Piledriver.
* Detect Intel Haswell for new Macbook.
* To improve LAPACK test results, some optimized kernels were replaced by fallbacks. (#372)
https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List
ARM:
* Support ARMv6 and ARMv7 ISA.
* Optimization on ARM Cortex-A9.
- Update patches:
* openblas-0.2.8-libs.patch
* openblas-0.2.8-noexecstack.patch
- Fix gcc warnings (#385)
* openblas-0.2.9-gcc-warnings.patch
OBS-URL: https://build.opensuse.org/request/show/237012
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=23
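The pthread_atfork item in 0.2.9 addresses a general problem: worker threads do not survive fork(), so a child process that inherits a "running" pool would wait forever on threads that no longer exist. The sketch below shows one common remedy, using Python's os.register_at_fork as a stand-in for pthread_atfork. The LazyPool class is hypothetical and this is not a transcription of the OpenBLAS code: the pool is shut down just before fork and restarted lazily on next use, in parent and child alike.

```python
import os
import queue
import threading


class LazyPool:
    """Toy worker pool: torn down before fork(), restarted on demand."""

    def __init__(self, size=2):
        self.size = size
        self.tasks = None
        self.workers = []

    def _worker(self, tasks):
        while True:
            job = tasks.get()
            if job is None:          # sentinel: leave the loop, end the thread
                return
            func, args, done = job
            done.put(func(*args))

    def _ensure_started(self):
        if not self.workers:
            self.tasks = queue.Queue()
            self.workers = [
                threading.Thread(target=self._worker, args=(self.tasks,),
                                 daemon=True)
                for _ in range(self.size)
            ]
            for thread in self.workers:
                thread.start()

    def run(self, func, *args):
        """Run func(*args) on a worker and return its result."""
        self._ensure_started()
        done = queue.Queue()
        self.tasks.put((func, args, done))
        return done.get()

    def shutdown(self):
        """Stop all workers; safe to call when the pool is already stopped."""
        for _ in self.workers:
            self.tasks.put(None)     # one sentinel per worker
        for thread in self.workers:
            thread.join()
        self.workers = []
        self.tasks = None


pool = LazyPool()

# POSIX only: run pool.shutdown in the parent right before every fork().
if hasattr(os, "register_at_fork"):
    os.register_at_fork(before=pool.shutdown)
```

Because the pool restarts itself on the next run() call, no "after fork" handlers are needed in this sketch.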
- version 0.2.7
* Support LSB (Linux Standard Base) 4.1.
e.g. make CC=lsbcc
* Include LAPACK 3.4.2 sources in the repo.
Avoid downloading at compile time.
* Add NO_PARALLEL_MAKE flag to disable parallel make.
* Create openblas_get_parallel to retrieve which parallelization model OpenBLAS uses. (Thank grisuthedragon)
* Detect LLVM/Clang compiler.
* A workaround for the dtrti_U single thread bug. Replace it with LAPACK code. (#191)
* Optimize c/zgemm, trsm, dgemv_n, ddot, daxpy, dcopy on AMD Bulldozer. (Thank Werner Saar)
* Add Intel Haswell support (using Sandybridge optimizations). (Thank Dan Luu)
* Add AMD Piledriver support (using Bulldozer optimizations).
* Fix the computational error in zgemm avx kernel on Sandybridge. (#237)
* Fix the overflow bug in gemv.
* Fix the overflow bug in multi-threaded BLAS3, getrf when NUM_THREADS is very large.(#214, #221, #246).
- rebase patch noexecstack.patch
- remove lapack source tarball since lapack sources are included in openblas sources
- increase NUM_THREAD from 32 to 64
OBS-URL: https://build.opensuse.org/request/show/184489
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=19
- version 0.2.6
* Improved OpenMP performance slightly. (d744c9)
* Improved cblas.h compatibility with Intel MKL.(#185)
* Fixed the overflowing bug in single thread cholesky factorization.
* Fixed the overflowing buffer bug of multithreading hbmv and sbmv.(#174)
* Added AMD Bulldozer x86-64 S/DGEMM AVX kernels. (Thank Werner Saar) We will tune the performance in future.
* Auto-detect Intel Xeon E7540.
* Fixed the overflowing buffer bug of gemv. (#173)
* Fixed the bug of s/cdot about invalid reading NAN on x86_64. (#189)
- rebase patch0 openblas-0.2.6-libs.patch
OBS-URL: https://build.opensuse.org/request/show/157145
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=18
* Fixed the SEGFAULT bug about hyper-threading
* Support AMD Bulldozer by using GotoBLAS2 AMD Barcelona codes
* Removed the limitation (64) of numbers of CPU cores.
Now, it supports 256 cores at max.
* Supported clang compiler.
* Fixed some build bugs on FreeBSD
* Optimized Level-3 BLAS on Intel Sandy Bridge x86-64 by AVX
instructions.
* Support AMD Bobcat by using GotoBLAS2 AMD Barcelona codes.
- update patch3
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=9
* Upgraded LAPACK to 3.4.1 version. (Thank Zaheer Chothia)
* Supported LAPACKE, a C interface to LAPACK. (Thank Zaheer Chothia)
* Fixed the build bug (MD5 and download) on Mac OSX.
* Auto download CUnit 2.1.2-2 from SF.net with UTEST_CHECK=1.
x86/x86_64:
* Auto-detect Intel Sandy Bridge Core i7-3xxx & Xeon E7 Westmere-EX.
* Test alpha=NaN in dscal.
* Fixed a SEGFAULT bug in samax on x86 windows.
OBS-URL: https://build.opensuse.org/package/show/science/openblas?expand=0&rev=5