------------------------------------------------------------------- Fri Oct 18 06:59:35 UTC 2024 - Luigi Baldoni - Update to version 1.5.0 * WARNING: we removed some of the SSE2 optimizations, so if you care about systems without SSSE3, you should be careful when updating! * Optimize index offset calculations for decode_coefs * picture: copy HDR10+ and T35 metadata only to visible frames * SSSE3 new optimizations for 6-tap (8bit and hbd) * AArch64/SVE: Add HBD subpel filters using 128-bit SVE2 * AArch64: Add USMMLA implempentation for 6-tap H/HV * AArch64: Optimize Armv8.0 NEON for HBD horizontal filters and 6-tap filters * Power9: Optimized ITX till 16x4. * Loongarch: numerous optimizations * RISC-V optimizations for pal, cdef_filter, ipred, mc_blend, mc_bdir, itx * Allow playing videos in full-screen mode in dav1dplay ------------------------------------------------------------------- Wed Jun 12 15:13:55 UTC 2024 - Luigi Baldoni - Update to version 1.4.3 * AArch64: Fix potential out of bounds access in DotProd H/HV filters * cli: Prevent buffer over-read ------------------------------------------------------------------- Sat May 25 09:12:12 UTC 2024 - Luigi Baldoni - Update to version 1.4.2 * AVX2 optimizations for 8-tap and new variants for 6-tap * AVX-512 optimizations for 8-tap and new variants for 6-tap * Improve entropy decoding on ARM64 * New ARM64 optimizations for convolutions based on DotProd extension * New ARM64 optimizations for convolutions based on i8mm extension * New ARM64 optimizations for subpel and prep filters for i8mm * Misc improvements on existing ARM64 optimizations, notably for put/prep * New PowerPC9 optimizations for loopfilter * Support for macOS kperf API for benchmarking ------------------------------------------------------------------- Fri Mar 15 07:49:30 UTC 2024 - Luigi Baldoni - Update to version 1.4.1 * Optimizations for 6tap filters for NEON (ARM) * More RISC-V optimizations for itx (4x8, 8x4, 4x16, 16x4, 8x16, 16x8) * Reduction of binary size on ARM64, ARM32 and RISC-V * Fix out-of-bounds read in 8bpc SSE2/SSSE3 wiener_filter * Msac optimizations ------------------------------------------------------------------- Wed Feb 14 20:10:15 UTC 2024 - Luigi Baldoni - Update to version 1.4.0 * AVX-512 optimizations for z1, z2, z3 in 8bit and high-bitdepth * New architecture supported: loongarch * Loongarch optimizations for 8bit * New architecture supported: RISC-V * RISC-V optimizations for itx * Misc improvements in threading and in reducing binary size * Fix potential integer overflow with extremely large frame sizes (bsc#1220105, CVE-2024-1580) ------------------------------------------------------------------- Wed Oct 4 04:17:53 UTC 2023 - Luigi Baldoni - Update to version 1.3.0 * Reduce memory usage in numerous places * ABI break in Dav1dSequenceHeader, Dav1dFrameHeader, Dav1dContentLightLevel structures * new API function to check the API version: dav1d_version_api() * Rewrite of the SGR functions for ARM64 to be faster * NEON implemetation of save_tmvs for ARM32 and ARM64 * x86 palette DSP for pal_idx_finish function - Bump soversion to 7 ------------------------------------------------------------------- Thu Jun 1 14:49:36 UTC 2023 - Luigi Baldoni - Update to version 1.2.1 * Fix a threading race on task_thread.init_done * NEON z2 8bpc and high bit-depth optimizations * SSSE3 z2 high bit-depth optimziations * Fix a desynced luma/chroma planes issue with Film Grain * Reduce memory consumption * Improve dav1d_parse_sequence_header() speed * OBU: Improve header parsing and fix potential overflows * OBU: Improve ITU-T T.35 parsing speed * Misc buildsystems, CI and headers fixes ------------------------------------------------------------------- Wed May 31 21:02:21 UTC 2023 - Jan Engelhardt - Add to description some performance mentions that set it apart from other packages e.g. gav1. ------------------------------------------------------------------- Fri May 5 10:32:00 UTC 2023 - Bjørn Lie - Use ldconfig_scriptlets macro, minor spec clean-up. ------------------------------------------------------------------- Wed May 3 06:27:24 UTC 2023 - Luigi Baldoni - Update to version 1.2.0 * Improvements on attachments of props and T.35 entries on output pictures * NEON z1/z3 high bit-depth optimizations and improvements for 8bpc * SSSE3 z2/z3 8bpc and SSSE3 z1/z3 high bit-depth optimziations * refmvs.save_tmvs optimizations in SSSE3/AVX2/AVX-512 * AVX-512 optimizations for high bit-depth itx (16x64, 32x64, 64x16, 64x32, 64x64) * AVX2 optimizations for 12bpc for 16x32, 32x16, 32x32 itx * Includes fix for possible crash when decoding a frame (bsc#1211262 CVE-2023-32570). ------------------------------------------------------------------- Tue Mar 7 19:40:24 UTC 2023 - Michael Gorse - Revert last change. This is now handled in xxhash. ------------------------------------------------------------------- Wed Mar 1 16:22:14 UTC 2023 - Michael Gorse - Require gcc9 on SLE. Otherwise defaults to gcc7 and fails to build on ppc64le (boo#1208794). ------------------------------------------------------------------- Tue Feb 14 21:26:23 UTC 2023 - Luigi Baldoni - Update to version 1.1.0 * New function dav1d_get_frame_delay to query the decoder frame delay * Numerous fixes for strict conformity to the specs and samples * NEON and AVX-512 misc fixes and improvements * Partial AVX2 12bpc transform implementations * AVX-512 high bit-depth cdef_filter, loopfilter, itx * NEON z1/z3 optimization for 8bpc * SSSE3 z1 optimization for 8bpc ------------------------------------------------------------------- Thu Oct 13 06:52:06 UTC 2022 - Bjørn Lie - Drop _lto_cflags define, current version supports lto build. - Drop unneeded rpm BuildRequires. - Add pkgconfig(libxxhash) BuildRequires and stop passing xhash_muxer=disabled to meson, build hash_muxer support. - Add check section and meson_test macro, run tests during build. ------------------------------------------------------------------- Fri Mar 18 16:02:49 UTC 2022 - Luigi Baldoni - Update to version 1.0.0 * Automatic thread management. * Add support for AVX-512 acceleration. * x86 code speedup (from SSE2 to AVX2). * New grain API to ease acceleration on the GPU. * New API call to get information of which frame failed to decode, in error cases. * Numerous small bug fixes. - Bump soversion to 6 ------------------------------------------------------------------- Fri Sep 3 17:07:36 UTC 2021 - Luigi Baldoni - Update to version 0.9.2 * x86: SSE4 optimizations of inverse transforms for 10bit for all sizes * x86: mc.resize optimizations with AVX2/SSSE3 for 10/12b * x86: SSSE3 optimizations for cdef_filter in 10/12b and mc_w_mask_422/444 in 8b * ARM NEON optimizations for FilmGrain Gen_grain functions * Optimizations for splat_mv in SSE2/AVX2 and NEON * x86: SGR improvements for SSSE3 CPUs * x86: AVX2 optimizations for cfl_ac ------------------------------------------------------------------- Thu Jul 29 09:28:47 UTC 2021 - Luigi Baldoni - Update to version 0.9.1 * 10/12b SSSE3 optimizations for mc (avg, w_avg, mask, w_mask, emu_edge), prep/put_bilin, prep/put_8tap, ipred (dc/h/v, paeth, smooth, pal, filter), wiener, sgr (10b), warp8x8, deblock, film_grain, cfl_ac/pred for 32bit and 64bit x86 processors * Film grain NEON for fguv 10/12b, fgy/fguv 8b and fgy/fguv 10/12 arm32 * Fixes for filmgrain on ARM * itx 10bit optimizations for 4x4/x8/x16, 8x4/x8/x16 for SSE4 * Misc improvements on SSE2, SSE4 ------------------------------------------------------------------- Sun May 16 17:12:52 UTC 2021 - Luigi Baldoni - Update to version 0.9.0 * x86 (64bit) AVX2 implementation of most 10b/12b functions, which should provide a large boost for high-bitdepth decoding on modern x86 computers and servers. * ARM64 neon implementation of FilmGrain (4:2:0/4:2:2/4:4:4 8bit) * New API to signal events happening during the decoding process ------------------------------------------------------------------- Wed Mar 24 18:36:28 UTC 2021 - Luigi Baldoni - Disable LTO (fix boo#1183956) ------------------------------------------------------------------- Mon Feb 22 08:06:22 UTC 2021 - Luigi Baldoni - Update to version 0.8.2 * ARM32 optimizations for ipred and itx in 10/12bits, completing the 10b/12b work on ARM64 and ARM32 * Give the post-filters their own threads * ARM64: rewrite the wiener functions * Speed up coefficient decoding, 0.5%-3% global decoding gain * x86 optimizations for CDEF_filter and wiener in 10/12bit * x86: rewrite the SGR AVX2 asm * x86: improve msac speed on SSE2+ machines * ARM32: improve speed of ipred and warp * ARM64: improve speed of ipred, cdef_dir, cdef_filter, warp_motion and itx16 * ARM32/64: improve speed of looprestoration * Add seeking, pausing to the player * Update the player for rendering of 10b/12b * Misc speed improvements and fixes on all platforms * Add a xxh3 muxer in the dav1d application ------------------------------------------------------------------- Sat Jan 2 18:33:17 UTC 2021 - aloisio@gmx.com - Update to version 0.8.1 * Keep references to buffers valid after dav1d_close(). Fixes a regression caused by the picture buffer pool added in 0.8.0. * ARM32 optimizations for 10bit bitdepth for SGR * ARM32 optimizations for 16bit bitdepth for blend/w_masl/emu_edge * ARM64 optimizations for 10bit bitdepth for SGR * x86 optimizations for wiener in SSE2/SSSE3/AVX2 ------------------------------------------------------------------- Tue Nov 24 10:03:21 UTC 2020 - Luigi Baldoni - Update to version 0.8.0 * Improve the performance by using a picture buffer pool; * ARM32 optimizations for 8bit bitdepth for ipred paeth, smooth, cfl * ARM32 optimizations for 10/12/16bit bitdepth for mc_avg/mask/w_avg, put/prep 8tap/bilin, wiener and CDEF filters * ARM64 optimizations for cfl_ac 444 for all bitdepths * x86 optimizations for MC 8-tap, mc_scaled in AVX2 * x86 optimizations for CDEF in SSE and {put/prep}_{8tap/bilin} in SSSE3 - Bump soversion to 5 - Drop dav1d-nasm-2.15.patch (merged upstream) ------------------------------------------------------------------- Tue Sep 1 11:11:55 UTC 2020 - Dominique Leuenberger - Add dav1d-nasm-2.15.patch: Fix compilation with nasm 2.15. ------------------------------------------------------------------- Mon Jun 22 08:13:31 UTC 2020 - aloisio@gmx.com - Update to version 0.7.1 * ARM32 NEON optimizations for itxfm, which can give up to 28% speedup, and MSAC * SSE2 optimizations for prep_bilin and prep_8tap * AVX2 optimizations for MC scaled * Fix a clamping issue in motion vector projection * Fix an issue on some specific Haswell CPU on ipred_z AVX2 functions * Improvements on the dav1dplay utility player to support resizing ------------------------------------------------------------------- Wed May 20 16:50:41 UTC 2020 - Luigi Baldoni - Update to verison 0.7.0 * Faster refmv implementation gaining up to 12% speed while -25% of RAM (Single Thread) * 10b/12b ARM64 optimizations are mostly complete: + ipred (paeth, smooth, dc, pal, filter, cfl) + itxfm (only 10b) * AVX2/SSSE3 for non-4:2:0 film grain and for mc.resize * AVX2 for cfl4:4:4 * AVX-512 CDEF filter * ARM64 8b improvements for cfl_ac and itxfm * ARM64 implementation for emu_edge in 8b/10b/12b * ARM32 implementation for emu_edge in 8b * Improvements on the dav1dplay utility player to support 10 bit, non-4:2:0 pixel formats and film grain on the GPU ------------------------------------------------------------------- Fri Mar 6 07:20:04 UTC 2020 - Luigi Baldoni - Update to version 0.6.0 * New ARM64 optimizations for the 10/12bit depth: + mc_avg, mc_w_avg, mc_mask + mc_put/mc_prep 8tap/bilin + mc_warp_8x8 + mc_w_mask + mc_blend + wiener + SGR + loopfilter + cdef * New AVX-512 optimizations for prep_bilin, prep_8tap, cdef_filter, mc_avg/w_avg/mask * New SSSE3 optimizations for film grain * New AVX2 optimizations for msac_adapt16 * Fix rare mismatches against the reference decoder, notably because of clipping * Improvements on ARM64 on msac, cdef and looprestoration optimizations * Improvements on AVX2 optimizations for cdef_filter * Improvements in the C version for itxfm, cdef_filter - Bump sover to 4 ------------------------------------------------------------------- Wed Dec 4 19:03:37 UTC 2019 - Luigi Baldoni - Update to version 0.5.2 * ARM32 optimizations for loopfilter, ipred_dc|h|v * Add section-5 raw OBU demuxer * Improve the speed by reducing the L2 cache collisions * Fix minor issues ------------------------------------------------------------------- Sat Oct 26 05:39:14 UTC 2019 - Luigi Baldoni - Update to version 0.5.1 * SSE2 optimizations for CDEF, wiener and warp_affine * NEON optimizations for SGR on ARM32 * Fix mismatch issue in x86 asm in inverse identity transforms * Fix build issue in ARM64 assembly if debug info was enabled * Add a workaround for Xcode 11 -fstack-check bug - Dropped arm64_ipred_symbols_aligned.patch (merged upstream) ------------------------------------------------------------------- Fri Oct 11 09:43:36 UTC 2019 - Luigi Baldoni - Update to version 0.5.0 Medium release fixing regressions and minor issues, and improving speed significantly: * Export ITU T.35 metadata * Speed improvements on blend_ on ARM * Speed improvements on decode_coef and MSAC * NEON optimizations for blend*, w_mask_, ipred functions for ARM64 * NEON optimizations for CDEF and warp on ARM32 * SSE2 optimizations for MSAC hi_tok decoding * SSSE3 optimizations for deblocking loopfilters and warp_affine * AVX-2 optimizations for film grain and ipred_z2 * SSE4 optimizations for warp_affine * VSX optimizations for wiener * Fix inverse transform overflows in x86 and NEON asm * Fix integer overflows with large frames * Improve film grain generation to match reference code * Improve compatibility with older binutils for ARM * More advanced Player example in tools - Bump soversion to 3 - Added arm64_ipred_symbols_aligned.patch to fix aarch64 build ------------------------------------------------------------------- Mon Aug 5 14:55:40 UTC 2019 - Luigi Baldoni - Update to version 0.4.0 * Fix playback with unknown OBUs * Add an option to limit the maximum frame size * SSE2 and ARM64 optimizations for MSAC * Improve speed on 32bits systems * Optimization in obmc blend * Reduce RAM usage significantly * The initial PPC SIMD code, cdef_filter * NEON optimizations for blend functions on ARM * NEON optimizations for w_mask functions on ARM * NEON optimizations for inverse transforms on ARM64 * Improve handling of malloc failures * Simple Player example in tools - Dropped dav1d.armv6.patch (merged upstream) - Bumped SOVERSION to 2 ------------------------------------------------------------------- Mon May 13 19:48:51 UTC 2019 - olaf@aepfle.de - Added dav1d.armv6.patch (disables armv7 asm for armv6 builds) ------------------------------------------------------------------- Sat May 11 16:06:40 UTC 2019 - Luigi Baldoni - Update to version 0.3.1 * Fix a buffer overflow in frame-threading mode on SSSE3 CPUs * Reduce binary size, notably on Windows * SSSE3 optimizations for ipred_filter * ARM optimizations for MSAC ------------------------------------------------------------------- Mon Apr 29 18:07:47 UTC 2019 - Luigi Baldoni - Update to version 0.3.0 * Fixes an annoying crash on SSSE3 that happened in the itx functions ------------------------------------------------------------------- Fri Apr 19 07:48:06 UTC 2019 - Luigi Baldoni - Update to version 0.2.2 * Large improvement on MSAC decoding with SSE, bringing 4-6% speed increase The impact is important on SSSE3, SSE4 and AVX-2 cpus * SSSE3 optimizations for all blocks size in itx * SSSE3 optimizations for ipred_paeth and ipref_cfl (420, 422 and 444) * Speed improvements on CDEF for SSE4 CPUs * NEON optimizations for SGR and loop filter * Minor crashes, improvements and build changes ------------------------------------------------------------------- Tue Apr 2 06:43:21 UTC 2019 - Dominique Leuenberger - Add baselibs.conf: ffmpeg, which is the main consumer of Dav1d, produces -32bit packages that would be uninstallable otherwise. ------------------------------------------------------------------- Tue Mar 12 22:23:22 UTC 2019 - Luigi Baldoni - Update to version 0.2.1 * SSSE3 optimization for cdef_dir * AVX-2 improvements of the existing CDEF optimizations * NEON improvements of the existing CDEF and wiener optimizations * Clarification about the numbering/versionning scheme ------------------------------------------------------------------- Mon Mar 4 17:31:53 UTC 2019 - Luigi Baldoni - Update to version 0.2.0 * ARM64 and ARM optimizations using NEON instructions * SSSE3 optimizations for both 32 and 64bits * More AVX-2 assembly, reaching almost completion * Fix installation of includes * Rewrite inverse transforms to avoid overflows * Snap packaging for Linux * Updated API (ABI and API break) * Fixes for un-decodable samples ------------------------------------------------------------------- Thu Dec 13 13:21:36 UTC 2018 - Jan Engelhardt - Redo description and mention SIMD acceleration. ------------------------------------------------------------------- Thu Dec 13 11:53:50 UTC 2018 - Luigi Baldoni - Moved license file to library package ------------------------------------------------------------------- Tue Dec 11 18:25:04 UTC 2018 - Luigi Baldoni - Initial stable package (v0.1.0)