- Update to 3.3.1:
- This is a patch release containing the following changes to v3.3:
* Fixed int8 convolution accuracy issue on Intel GPUs (09c87c7)
* Switched internal stream to in-order mode for NVIDIA and AMD GPUs to avoid synchronization issues (db01d62)
* Fixed runtime error for avgpool_bwd operation in Graph API (d025ef6, 9e0602a, e0dc1b3)
* Fixed benchdnn error reporting for some Graph API cases (98dc9db)
* Fixed accuracy issue in experimental Graph Compiler for int8 MHA variant from StarCoder model (5476ef7)
* Fixed incorrect results for layer normalization with trivial dimensions on Intel GPUs (a2ec0a0)
* Removed redundant synchronization for out-of-order SYCL queues (a96e9b1)
* Fixed runtime error in experimental Graph Compiler for int8 MLP subgraph from LLAMA model (595543d)
* Fixed SEGFAULT in experimental Graph Compiler for fp32 MLP subgraph (4207105)
* Fixed incorrect results in experimental Graph Compiler for MLP subgraph (57e14b5)
* Fixed the issue with f16 inner product primitive with s8 output returning unimplemented on Intel GPUs (bf12207, 800b5e9, ec7054a)
* Fixed incorrect results for int8 deconvolution with zero-points on processors with Intel AMX instructions support (55d2cec)
OBS-URL: https://build.opensuse.org/request/show/1130129
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/onednn?expand=0&rev=24
- Update to 2.2.2, changes:
* Fixed performance regression in fp32 forward inner product for
shapes with number of output channels equal to 1 for processors
with Intel AVX-512 support (714b1fd)
* Fixed performance regression in forward convolutions with groups
for processors with Intel AVX-512 support (3555d4a)
* Removed -std=c++11 build flag for DPC++ headers (1fcb867)
* Fixed buffer access in initializing workspace in RNN
implementation on GPU (9b03091)
* Fixed a bug in convolution with 1x1 kernel and mixed
strides on processors with Intel AVX-512 support (d0b3e3f)
* Used getauxval on Linux to get CPU features on AArch64
systems (25c4cea)
* Added -fp-model=precise build flag for DPC++ code (3e40e5e)
* Fixed out-of-bounds writes in elementwise primitive on
Intel Processor Graphics (bcf823c)
- Fix build with Arm Compute Library:
* onednn-1045.patch
OBS-URL: https://build.opensuse.org/request/show/895561
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/onednn?expand=0&rev=8