-------------------------------------------------------------------
Tue Sep 20 08:26:43 UTC 2022 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Add patch to fix build with latest Arm Compute Library:
  * 1428.patch
  * fa93750.patch (dep for 1428.patch)

-------------------------------------------------------------------
Tue Sep 13 05:22:52 UTC 2022 - Paolo Stivanin <info@paolostivanin.com>

- Update to 2.6.2:
  * https://github.com/oneapi-src/oneDNN/releases
- Removed onednn-1045.patch.
- Removed onednn-xbyak-aarch64.patch.

-------------------------------------------------------------------
Tue Jun 15 12:10:39 UTC 2021 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Fix build on aarch64:
  * onednn-xbyak-aarch64.patch

-------------------------------------------------------------------
Tue Jun 15 08:31:16 UTC 2021 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Update to version 2.2.4:
  * Fixed build error with GCC 11 (eda1add)
  * Fixed an issue with reorder reporting unimplemented when
    quantizing f32 weights to s8 (4f05b76, 5d3d1e1, cc77eef)
  * Updated name for GPU gen12 architecture to xe (3d202c2)
- Drop upstream patch:
  * 0001-common-gpu-include-thread-and-limit-headers-to-fix-G.patch

-------------------------------------------------------------------
Thu Jun  3 01:38:56 UTC 2021 - Ferdinand Thiessen <rpm@fthiessen.de>

- Update to version 2.2.3:
  * Fixed a bug in int8 depthwise convolution primitive with groups
    and 1d spatial size for processors with AVX-512 and AVX2 support
  * Fixed correctness issue for PReLU primitive
  * Fixed correctness issue in reorder for blocked layouts with
    zero padding
  * Improved performance of weights reorders used by BRGEMM-based
    convolution primitive for processors with AVX-512 support
  * Added -fp-model=precise build flag for DPC++ code
  * Fixed potential memory leak in matmul primitive
  * Fixed performance of matmul primitive when fused with bias
    update and sum
  * Fixed a bug in matmul primitive when writing to non-contiguous
    destination buffer
- Add upstream patch for GCC 11 support:
  * 0001-common-gpu-include-thread-and-limit-headers-to-fix-G.patch

-------------------------------------------------------------------
Thu May 27 08:10:13 UTC 2021 - Jan Engelhardt <jengelh@inai.de>

- Update descriptions.

-------------------------------------------------------------------
Wed May 26 13:29:27 UTC 2021 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Update to 2.2.2, changes:
  * Fixed performance regression in fp32 forward inner product for
    shapes with number of output channels equal to 1 for processors
    with Intel AVX-512 support (714b1fd)
  * Fixed performance regression in forward convolutions with groups
    for processors with Intel AVX-512 support (3555d4a)
  * Removed -std=c++11 build flag for DPC++ headers (1fcb867)
  * Fixed buffer access in initializing workspace in RNN
    implementation on GPU (9b03091)
  * Fixed a bug in convolution with 1x1 kernel and mixed
    strides on processors with Intel AVX-512 support (d0b3e3f)
  * Used getauxval on Linux to get CPU features on AArch64
    systems (25c4cea)
  * Added -fp-model=precise build flag for DPC++ code (3e40e5e)
  * Fixed out-of-bounds writes in elementwise primitive on
    Intel Processor Graphics (bcf823c)
- Fix build with Arm Compute Library:
  * onednn-1045.patch

-------------------------------------------------------------------
Tue Apr 13 07:53:16 UTC 2021 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Update to 2.2.1, changes:
  * From 2.2:
    Fixed segfault for cases when primitive descriptor or attributes contain NaN (e6d05ec, dbca1e9, 0326b09)
    Fixed engine creation failure for GPU subdevices (4c3a114)
    Fixed long lines clipping in verbose output (70d70a8)
    Fixed segfault in bfloat16 convolution weight gradient implementation on processors with Intel AMX support (a3a73a3)
    Fixed performance regression in binary primitive with per_oc broadcast strategy (9ac85d8)
    Worked around a bug with Microsoft Visual C++ compiler version detection in CMake 3.19 (2f39155)
    Removed -std=c++11 build flag for DPC++ code to align with SYCL standard (1b026f5)
  * Changes between 2.1 and 2.2:
    Performance Optimizations
      Intel Architecture processors
        Improved performance of int8 compute functionality for future Intel Xeon Scalable processor (code name Sapphire Rapids). The functionality is disabled by default and should be enabled via CPU dispatcher control.
        Improved performance of compute functionality for future Intel Core processor with Intel AVX2 and Intel DL Boost instructions support (code name Alder Lake).
        Improved fp32 inner product forward propagation performance for processors with Intel AVX-512 support.
        Improved dnnl_gemm performance for cases with n=1 on all supported processors.
      Intel Graphics products
        Introduced NHWC format support for activations for int8 primitives.
      AArch64-based processors
        Improved performance of fp32 and int8 convolution, and softmax primitives for processors with SVE 512 support.
        Improved performance of fp32 convolution via Arm Compute Library (ACL).
        Improved performance of convolution with a combination of sum and relu post-ops via ACL.
    Functionality
      Extended eltwise primitive with support for mish and hardswish algorithms.
      Extended binary primitive with support for comparison operators.
      Introduced support for post-ops in GPU resampling implementation.
      Introduced asymmetric quantization support for int8 deconvolution.
      Introduced binary post-ops support for matmul primitive.
    Usability
      Improved presentation of oneDNN primitives in VTune Amplifier.
      Introduced Linux perf support for AArch64.
      Introduced support for Fujitsu C++ compiler.
      Introduced a build time check for minimal supported ACL version. Currently oneDNN requires ACL 21.02 or later.
      Added support for cuDNN 8.x

-------------------------------------------------------------------
Wed Feb 17 14:17:47 UTC 2021 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Update to 2.1
- Add Arm ComputeLibrary support on aarch64

-------------------------------------------------------------------
Mon Oct  5 06:16:30 UTC 2020 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Obsoletes mkl-dnn* <= %{version}

-------------------------------------------------------------------
Fri Oct  2 12:47:08 UTC 2020 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Rename mkl-dnn to onednn to follow upstream

-------------------------------------------------------------------
Wed Sep 23 13:36:02 UTC 2020 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Update to 1.6.3
- Drop upstream patch:
  * cmake-no-install-ocl-cmake.patch

-------------------------------------------------------------------
Wed Sep 23 13:16:39 UTC 2020 - Guillaume GARDET <guillaume.gardet@opensuse.org>

- Build on aarch64 and ppc64le which are now also supported
- Provide oneDNN and oneDNN-devel as it is the new official name

-------------------------------------------------------------------
Tue May  5 07:38:34 UTC 2020 - Tomáš Chvátal <tchvatal@suse.com>

- Update to 1.4:
  * Performance improvements across the board
- Rebase patch cmake-no-install-ocl-cmake.patch

-------------------------------------------------------------------
Tue Mar 24 10:50:57 UTC 2020 - Tomáš Chvátal <tchvatal@suse.com>

- Add constraints to not crash during testing on OOM

-------------------------------------------------------------------
Thu Feb 27 12:44:00 UTC 2020 - Tomáš Chvátal <tchvatal@suse.com>

- Do not disable LTO; there is no actual reason for that
- Export LD_LIBRARY_PATH to fix older releases build

-------------------------------------------------------------------
Wed Feb 26 10:36:26 UTC 2020 - Tomáš Chvátal <tchvatal@suse.com>

- There is no actual reason to not use github tag for tarball
  fetching -> remove the service
- Format with spec-cleaner
- Use proper %cmake macros everywhere
- Add configure options for cmake to set it up in a way we really
  want
- Add patch from Debian to not install OpenCL cmake finder:
  * cmake-no-install-ocl-cmake.patch

-------------------------------------------------------------------
Thu Feb 20 10:26:52 UTC 2020 - Christian Goll <cgoll@suse.com>

- enabled tests

-------------------------------------------------------------------
Thu Jan 30 14:20:22 UTC 2020 - Christian Goll <cgoll@suse.com>

- packaged separate benchnn package with its input files
- updated to v1.1.3 which includes
  * Fixed the mean and variance memory descriptors in layer
    normalization (65f1908)
  * Fixed the layer normalization formula (c176ceb)

-------------------------------------------------------------------
Wed Jan  8 15:21:54 UTC 2020 - Christian Goll <cgoll@suse.com>

- updated to v1.1.2
  * Fixed threading over the spatial in bfloat16 batched
    normalization (017b6c9)
  * Fixed read past end-of-buffer error for int8 convolution (7d6f45e)
  * Fixed condition for dispatching optimized channel blocking in
    fp32 backward convolution on Intel Xeon Phi(TM) processor (846eba1)
  * Fixed fp32 backward convolution for shapes with spatial strides
    over the depth dimension (002e3ab)
  * Fixed softmax with zero sizes on GPU (936bff4)
  * Fixed int8 deconvolution with dilation when ih <= dh (3e3bacb)
  * Re-enabled fp32 -> u8 reorder for RNN (a2c2507)
  * Fixed segmentation fault in bfloat16 backward convolution from
    kd_padding=0 computation (52d476c)
  * Fixed segmentation fault in bfloat16 forward convolution due
    to push/pop imbalance (4f6e3d5)
  * Fixed library version for OS X build (0d85005)
  * Fixed padding by channels in concat (a265c7d)
  * Added full text of third party licenses and
    copyright notices to LICENSE file (79f204c)
  * Added separate README for binary packages (28f4c96)
  * Fixed computing per-oc mask in RNN (ff3ffab)
  * Added workaround for number of cores calculation in Xbyak (301b088)

-------------------------------------------------------------------
Mon Feb 11 16:35:48 UTC 2019 - cgoll@suse.com

- added ARCH_OPT_FLAGS=""

-------------------------------------------------------------------
Tue Feb  5 07:45:53 UTC 2019 - Christian Goll <cgoll@suse.com>

- Initial checkin of the Intel(R) Math Kernel Library for
  Deep Neural Networks which can be used by:
  * TensorFlow
  * Caffe
  * PyTorch
  and other machine learning tools