-------------------------------------------------------------------
Tue Oct 27 21:11:50 UTC 2015 - dmitry_r@opensuse.org

- Update to version 0.2.15
  * Enable MAX_STACK_ALLOC flags by default.
  * Improve ger and gemv for small matrices.
  * Improve parallel gemv for the small m, large n case.
  * Improve ?imatcopy when lda==ldb
  * Add vecLib benchmarks
  * Fix LAPACK lantr for row major matrices
  * Fix LAPACKE lansy
  * Import bug fixes for LAPACKE s/dormlq, c/zunmlq
  * Raise the signal when pthread_create fails
  * Drop obsolete openblas-arm64-build.patch
  x86/x86-64:
  * Support pure C generic kernels for x86/x86-64.
  * Support Intel Broadwell and Skylake by Haswell kernels.
  * Support AMD Excavator by Steamroller kernels.
  * Optimize s/d/c/zdot for Intel SandyBridge and Haswell.
  * Optimize s/d/c/zdot for AMD Piledriver and Steamroller.
  * Optimize s/d/c/zaxpy for Intel SandyBridge and Haswell.
  * Optimize s/d/c/zaxpy for AMD Piledriver and Steamroller.
  * Optimize d/c/zscal for Intel Haswell, dscal for Intel SandyBridge.
  * Optimize d/c/zscal for AMD Bulldozer, Piledriver and Steamroller.
  * Optimize s/dger for Intel SandyBridge.
  * Optimize s/dsymv for Intel SandyBridge.
  * Optimize ssymv for Intel Haswell.
  * Optimize dgemv for Intel Nehalem and Haswell.
  * Optimize dtrmm for Intel Haswell.
  ARM:
  * Support Android NDK armeabi-v7a-hard ABI (-mfloat-abi=hard)
  * Fix lock, rpcc bugs
  POWER:
  * Support ppc64le platform (ELF ABI v2)
  * Support POWER7/8 by POWER6 kernels.

-------------------------------------------------------------------
Wed Jul 29 21:13:47 UTC 2015 - dmitry_r@opensuse.org

- Change library name suffix
  * drop openblas-soname.patch
- Add RPM %post script for manual BLAS/LAPACK update-alternatives
  configuration update
- Use update-alternatives mechanism for OpenBLAS variants (serial,
  openmp, pthreads). pthreads variant is default for x86 and x86_64,
  OpenMP for other architectures.
- Fix build on ARM64
  * openblas-arm64-build.patch
- Add update-alternatives mechanism for CBLAS
- Provide cmake module
- Delete info about host cpu from openblas_config.h for dynamic arch
- Add update-alternatives to 'preup' and 'post' requires list for
  libraries
- Add README.SUSE

-------------------------------------------------------------------
Wed Mar 25 08:05:20 UTC 2015 - dmitry_r@opensuse.org

- Update to version 0.2.14
  * Improve ger and gemv for small matrices by stack allocation.
    e.g. make -DMAX_STACK_ALLOC=2048
  * Introduce openblas_get_num_threads and openblas_get_num_procs.
  * Add ATLAS-style ?geadd function.
  * Fix c/zsyr bug with negative incx.
  * Fix race condition during shutdown causing a crash in
    gotoblas_set_affinity().
  x86/x86-64:
  * Support AMD Steamroller.
  ARM:
  * Add Cortex-A9 and Cortex-A15 targets.

-------------------------------------------------------------------
Wed Dec 3 16:06:49 UTC 2014 - dmitry_r@opensuse.org

- Update to version 0.2.13
  * Add SYMBOLPREFIX and SYMBOLSUFFIX makefile options
    for adding a prefix or suffix to all exported symbol names
    in the shared library.
  * Remove openblas-0.1.0-soname.patch
  * Add openblas-soname.patch
  * Rebase openblas-noexecstack.patch
  x86/x86-64:
  * Add generic kernel files for x86-64. make TARGET=GENERIC
  * Fix a bug in the sgemm kernel on Intel Sandy Bridge.
  * Fix c_check bug on some amd64 systems.
  ARM:
  * Support APM's X-Gene 1 AArch64 processors.
  * Optimize trmm and sgemm.

-------------------------------------------------------------------
Fri Oct 17 13:09:58 UTC 2014 - dmitry_r@opensuse.org

- Update to version 0.2.12
  * Added CBLAS interface for ?omatcopy and ?imatcopy.
  * Enable ?gemm3m functions.
  * Added benchmark for ?gemm3m.
  * Optimized multithreading lower limits.
  * Disabled SYMM3M and HEMM3M functions because of
    segmentation violations.
  x86/x86-64:
  * Improved axpy and symv performance on AMD Bulldozer.
  * Improved gemv performance on modern Intel and AMD CPUs.

-------------------------------------------------------------------
Mon Aug 18 12:43:10 UTC 2014 - dmitry_r@opensuse.org

- Update to version 0.2.11
  * Added some benchmark codes.
  x86/x86-64:
  * Improved s/c/zgemm performance for Intel Haswell.
  * Improved s/d/c/zgemv performance.
  * Support big NUMA machines. (EXPERIMENTAL)
  ARM:
  * Fix detection when cpuinfo uses "Processor".

-------------------------------------------------------------------
Thu Jul 17 20:44:58 UTC 2014 - dmitry_r@opensuse.org

- Update to version 0.2.10
  * Added the following BLAS extensions:
    s/d/c/zaxpby, s/d/c/zimatcopy, s/d/c/zomatcopy.
  * Added OPENBLAS_CORETYPE environment variable for dynamic_arch.
    (a86d34)
  * Support outputting the CPU corename at runtime. (#407)
  * Patched LAPACK to fix bugs 114, 117, 118.
    (http://www.netlib.org/lapack/bug_list.html)
  * Disabled ?gemm3m for a work-around fix. (#400)
  * Fixed lots of bugs for optimized kernels on Sandybridge, Haswell,
    Bulldozer, and Piledriver.
  * Remove obsolete openblas-0.2.9-gcc-warnings.patch

-------------------------------------------------------------------
Tue Jun 10 14:34:02 UTC 2014 - dmitry_r@opensuse.org

- Update to version 0.2.9
  * Update LAPACK to 3.5.0 version
  * Fixed compatibility issues with Clang and Pathscale compilers.
  * Added OPENBLAS_VERBOSE environment variable. (#338)
  * Make OpenBLAS thread-pool resilient to fork via pthread_atfork.
    (#294)
  * Rewrote rotmg
  * Fixed sdsdot bug.
  * Improved the result for LAPACK testing. (#372)
  x86/x86-64:
  * Optimization on Intel Haswell.
  * Enable optimization kernels on AMD Bulldozer and Piledriver.
  * Detect Intel Haswell for new Macbook.
  * To improve LAPACK testing, we fallback some kernels. (#372)
    https://github.com/xianyi/OpenBLAS/wiki/Fixed-optimized-kernels-To-do-List
  ARM:
  * Support ARMv6 and ARMv7 ISA.
  * Optimization on ARM Cortex-A9.
- Update patches:
  * openblas-0.2.8-libs.patch
  * openblas-0.2.8-noexecstack.patch
  to
  * openblas-libs.patch
  * openblas-noexecstack.patch
- Fix gcc warnings (#385)
  * openblas-0.2.9-gcc-warnings.patch

-------------------------------------------------------------------
Sat Apr 12 09:02:16 UTC 2014 - dmitry_r@opensuse.org

- Remove files with problematic licenses

-------------------------------------------------------------------
Fri Apr 4 20:32:24 UTC 2014 - dmitry_r@opensuse.org

- Update to version 0.2.8
  * Add executable stack markings.
  * Respect user's LDFLAGS
  * Rollback bulldozer and piledriver kernels to barcelona kernels
  * update openblas-0.2.6-libs.patch
  * update c_xerbla_no-void-return.patch
  * update openblas-0.2.7-noexecstack.patch

-------------------------------------------------------------------
Fri Jul 26 20:31:17 UTC 2013 - scorot@free.fr

- version 0.2.7
  * Support LSB (Linux Standard Base) 4.1.
    e.g. make CC=lsbcc
  * Include LAPACK 3.4.2 source code in the repo to avoid
    downloading at compile time.
  * Add NO_PARALLEL_MAKE flag to disable parallel make.
  * Add openblas_get_parallel to report which parallelization
    model OpenBLAS uses. (Thanks to grisuthedragon)
  * Detect LLVM/Clang compiler.
  * Work around the dtrti_U single-thread bug by replacing it
    with LAPACK code. (#191)
  * Optimize c/zgemm, trsm, dgemv_n, ddot, daxpy, dcopy on
    AMD Bulldozer. (Thanks to Werner Saar)
  * Add Intel Haswell support (using Sandybridge optimizations).
    (Thanks to Dan Luu)
  * Add AMD Piledriver support (using Bulldozer optimizations).
  * Fix the computational error in zgemm avx kernel on
    Sandybridge. (#237)
  * Fix the overflow bug in gemv.
  * Fix the overflow bug in multi-threaded BLAS3, getrf when
    NUM_THREADS is very large. (#214, #221, #246)
- rebase patch noexecstack.patch
- remove lapack source tarball since lapack sources are included
  in openblas sources
- increase NUM_THREAD from 32 to 64

-------------------------------------------------------------------
Sat Mar 2 16:08:16 UTC 2013 - scorot@free.fr

- version 0.2.6
  * Improved OpenMP performance slightly. (d744c9)
  * Improved cblas.h compatibility with Intel MKL. (#185)
  * Fixed the overflow bug in single-threaded Cholesky
    factorization.
  * Fixed the buffer overflow bug in multithreaded hbmv and
    sbmv. (#174)
  * Added AMD Bulldozer x86-64 S/DGEMM AVX kernels. (Thanks to
    Werner Saar) We will tune the performance in future.
  * Auto-detect Intel Xeon E7540.
  * Fixed the buffer overflow bug in gemv. (#173)
  * Fixed the s/cdot bug about invalid reading of NaN on x86_64.
    (#189)
- rebase patch0 openblas-0.2.6-libs.patch

-------------------------------------------------------------------
Sun Feb 17 14:10:55 UTC 2013 - jengelh@inai.de

- Remove redundant cleaning commands
- Do not create .so.0.2.5. SO versions are not package release
  numbers.

-------------------------------------------------------------------
Mon Jan 21 20:19:13 UTC 2013 - scorot@free.fr

- use Requires(post) and Requires(preun) instead of PreReq
- add patch markups in spec file

-------------------------------------------------------------------
Tue Jan 15 20:42:00 UTC 2013 - scorot@free.fr

- add update-alternatives support to allow easy switching between
  the different blas and lapack implementations

-------------------------------------------------------------------
Fri Nov 30 20:46:47 UTC 2012 - scorot@free.fr

- version 0.2.5
  * Export LAPACK 3.4.2 symbols in shared library. (#147)
  * Restore the original CPU affinity when calling
    openblas_set_num_threads(1) (#153)
  * Fixed a SEGFAULT bug in dgemv_t when m is very large. (#154)

-------------------------------------------------------------------
Mon Oct 8 19:12:49 UTC 2012 - scorot@free.fr

- version 0.2.4
  * Upgraded LAPACK to 3.4.2 version. (#145)
  * f77blas.h: compatibility for compilers without C99 complex
    number support. (#141)
  * Added NO_AVX flag. Check OS support for AVX at runtime. (#139)

-------------------------------------------------------------------
Mon Aug 20 21:30:03 UTC 2012 - scorot@free.fr

- version 0.2.3
  * Fixed LAPACK unstable bug about ?laswp. (#130)
  * Fixed the shared library bug about unloading the library on
    Linux (#132).

-------------------------------------------------------------------
Sun Jul 8 20:24:03 UTC 2012 - scorot@free.fr

- version 0.2.2
  * Support Intel Sandy Bridge 22nm desktop/mobile CPU

-------------------------------------------------------------------
Mon Jul 2 20:45:57 UTC 2012 - scorot@free.fr

- version 0.2.1
  * Fixed the SEGFAULT bug about hyper-threading
  * Support AMD Bulldozer by using GotoBLAS2 AMD Barcelona codes
  * Removed the 64-core limit on the number of CPU cores.
    Up to 256 cores are now supported.
  * Supported clang compiler.
  * Fixed some build bugs on FreeBSD
  * Optimized Level-3 BLAS on Intel Sandy Bridge x86-64 by AVX
    instructions.
  * Support AMD Bobcat by using GotoBLAS2 AMD Barcelona codes.
- update patch3

-------------------------------------------------------------------
Wed May 2 21:16:16 UTC 2012 - scorot@free.fr

- update patch0

-------------------------------------------------------------------
Wed May 2 20:45:18 UTC 2012 - scorot@free.fr

- again fix remaining library file name error in spec file

-------------------------------------------------------------------
Wed May 2 20:18:48 UTC 2012 - scorot@free.fr

- fix wrong library file name version

-------------------------------------------------------------------
Wed May 2 20:05:55 UTC 2012 - scorot@free.fr

- Update to version 0.1.1
  * Upgraded LAPACK to 3.4.1 version. (Thanks to Zaheer Chothia)
  * Supported LAPACKE, a C interface to LAPACK.
    (Thanks to Zaheer Chothia)
  * Fixed the build bug (MD5 and download) on Mac OSX.
  * Auto download CUnit 2.1.2-2 from SF.net with UTEST_CHECK=1.
  x86/x86_64:
  * Auto-detect Intel Sandy Bridge Core i7-3xxx & Xeon E7
    Westmere-EX.
  * Test alpha=NaN in dscal.
  * Fixed a SEGFAULT bug in samax on x86 windows.

-------------------------------------------------------------------
Wed Apr 25 21:46:07 UTC 2012 - scorot@free.fr

- version 0.1.0
- update openblas-0.1.0-soname.patch
- add openblas-0.1.0-noexecstack.patch
- spec file cleanup

-------------------------------------------------------------------
Mon Mar 12 22:19:17 UTC 2012 - scorot@free.fr

- version 0.1alpha2.5