forked from pool/openvino

Update OpenVINO 2025.4 in Leap 16.0 #1

Merged
cabelo merged 5 commits from pool/openvino:factory into leap-16.0 2025-12-18 00:20:35 +01:00
6 changed files with 110 additions and 15 deletions


@@ -2,8 +2,8 @@
<service name="obs_scm" mode="manual">
<param name="url">https://github.com/openvinotoolkit/openvino.git</param>
<param name="scm">git</param>
<param name="revision">2025.2.0</param>
<param name="version">2025.2.0</param>
<param name="revision">2025.4.0</param>
<param name="version">2025.4.0</param>
<param name="submodules">enable</param>
<param name="filename">openvino</param>
<param name="exclude">.git</param>


@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:6c75c27293662056f9098ecf9b0dfbeacf948983df5807a63610313678024adf
-size 743258127


@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:deda1db3ae8e8acb506d8937ff4709332bfa0380de14393c6f030b88dd2fc5c4
+size 753350671


@@ -1,3 +1,102 @@
-------------------------------------------------------------------
Tue Dec 2 22:43:52 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>
- Update to 2025.4.0
- More GenAI coverage and framework integrations to minimize code
changes
* New models supported (usage sketch after this section):
+ On CPUs & GPUs: Qwen3-Embedding-0.6B, Qwen3-Reranker-0.6B,
Mistral-Small-24B-Instruct-2501.
+ On NPUs: Gemma-3-4b-it and Qwen2.5-VL-3B-Instruct.
* Preview: Mixture of Experts (MoE) models optimized for CPUs
and GPUs, validated for Qwen3-30B-A3B.
* GenAI pipeline integrations: Qwen3-Embedding-0.6B and
Qwen3-Reranker-0.6B for enhanced retrieval/ranking, and
Qwen2.5VL-7B for video pipeline.
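For reference, a minimal sketch of running one of the newly supported models through the OpenVINO GenAI LLMPipeline; the model directory name is a placeholder and is assumed to hold a model already exported to OpenVINO IR (for example via optimum-intel), with the device chosen freely among CPU, GPU, or NPU:

  import openvino_genai as ov_genai

  # Placeholder directory: e.g. Mistral-Small-24B-Instruct-2501 exported to
  # OpenVINO IR beforehand; "GPU" could equally be "CPU" or "NPU".
  pipe = ov_genai.LLMPipeline("mistral-small-24b-instruct-2501-ov", "GPU")
  print(pipe.generate("What is new in OpenVINO 2025.4?", max_new_tokens=64))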
- Broader LLM model support and more model compression
techniques
* The Neural Network Compression Framework (NNCF) ONNX backend
now supports INT8 static post-training quantization (PTQ)
and INT8/INT4 weight-only compression to ensure accuracy
parity with OpenVINO IR format models. SmoothQuant algorithm
support added for INT8 quantization (see the sketch after this section).
* Accelerated multi-token generation for GenAI, leveraging
optimized GPU kernels to deliver faster inference, smarter
KV-cache reuse, and scalable LLM performance.
* GPU plugin updates include improved performance with prefix
caching for chat history scenarios and enhanced LLM accuracy
with dynamic quantization support for INT8.
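A minimal sketch of the INT8 static PTQ path for ONNX models mentioned above, assuming the nncf and onnx Python packages; the model path, input name, and random calibration data are placeholders:

  import numpy as np
  import onnx
  import nncf

  model = onnx.load("model.onnx")  # placeholder ONNX model

  # Calibration samples; the transform function maps each sample to the
  # {input name: array} dict the ONNX model expects ("input" is assumed).
  samples = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(8)]
  calibration = nncf.Dataset(samples, lambda s: {"input": s})

  quantized = nncf.quantize(model, calibration)  # INT8 static post-training quantization
  onnx.save(quantized, "model_int8.onnx")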
- More portability and performance to run AI at the edge, in the
cloud, or locally.
* Announcing support for Intel® Core Ultra Processor Series 3.
* Encrypted blob format support added for secure model
deployment with OpenVINO GenAI. Model weights and artifacts
are stored and transmitted in an encrypted format, reducing
risks of IP theft during deployment. Developers can deploy
with minimal code changes using OpenVINO GenAI pipelines.
* OpenVINO™ Model Server and OpenVINO™ GenAI now extend
support for Agentic AI scenarios with new features such as
output parsing and improved chat templates for reliable
multi-turn interactions, and preview functionality for the
Qwen3-30B-A3B model. OVMS also introduces a preview for
audio endpoints.
* NPU deployment is simplified with batch support, enabling
seamless model execution across Intel® Core Ultra
processors while eliminating driver dependencies. Models
are reshaped to batch_size=1 before compilation (see the sketch after this section).
* The improved NVIDIA Triton Server* integration with
OpenVINO backend now enables developers to utilize Intel
GPUs or NPUs for deployment.
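A minimal sketch of the NPU flow described above, using the standard openvino Python API; the IR path and input shape are placeholders, and the explicit reshape simply mirrors the batch_size=1 behavior noted in the entry:

  import openvino as ov

  core = ov.Core()
  model = core.read_model("model.xml")         # placeholder OpenVINO IR
  model.reshape([1, 3, 224, 224])              # assumed single-input NCHW model, batch 1
  compiled = core.compile_model(model, "NPU")  # inference requests can then run concurrently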
-------------------------------------------------------------------
Sun Sep 7 01:21:19 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>
- Update to 2025.3.0
- More GenAI coverage and framework integrations to minimize code
changes
* New models supported: Phi-4-mini-reasoning, AFM-4.5B,
Gemma-3-1B-it, Gemma-3-4B-it, and Gemma-3-12B.
* NPU support added for: Qwen3-1.7B, Qwen3-4B, and Qwen3-8B.
* LLMs optimized for NPU are now available in the OpenVINO
Hugging Face collection.
- Broader LLM model support and more model compression techniques
* The NPU plug-in adds support for longer contexts of up to
8K tokens, dynamic prompts, and dynamic LoRA for improved
LLM performance.
* The NPU plug-in now supports dynamic batch sizes by reshaping
the model to a batch size of 1 and concurrently managing
multiple inference requests, enhancing performance and
optimizing memory utilization.
* Accuracy improvements for GenAI models on both built-in
and discrete graphics achieved through the implementation
of the key cache compression per channel technique, in
addition to the existing KV cache per-token compression
method.
* OpenVINO™ GenAI introduces TextRerankPipeline for improved
retrieval relevance and RAG pipeline accuracy, plus
Structured Output for enhanced response reliability and
function calling while ensuring adherence to predefined
formats.
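A minimal sketch of the TextRerankPipeline named above, assuming the openvino_genai Python package; the model directory is a placeholder and the (index, score) result shape is an assumption:

  import openvino_genai as ov_genai

  # Placeholder directory for a reranker model converted to OpenVINO format.
  pipe = ov_genai.TextRerankPipeline("qwen3-reranker-0.6b-ov", "CPU")
  query = "How do I deploy OpenVINO models on openSUSE Leap?"
  documents = [
      "openvino is packaged for openSUSE Leap 16.0.",
      "Tumbleweed is a rolling release distribution.",
  ]
  # Assumed to yield (document index, relevance score) pairs, most relevant first.
  for idx, score in pipe.rerank(query, documents):
      print(idx, score)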
- More portability and performance to run AI at the edge,
in the cloud, or locally.
* Announcing support for Intel® Arc™ Pro B-Series
(B50 and B60).
* Preview: Hugging Face models that are GGUF-enabled for
OpenVINO GenAI are now supported by the OpenVINO™ Model
Server for popular LLM model architectures such as
DeepSeek Distill, Qwen2, Qwen2.5, and Llama 3.
This functionality reduces memory footprint and
simplifies integration for GenAI workloads.
* With improved reliability and tool call accuracy,
the OpenVINO™ Model Server boosts support for
agentic AI use cases on AI PCs, while enhancing
performance on Intel CPUs, built-in GPUs, and NPUs.
* int4 data-aware weights compression, now supported in the
Neural Network Compression Framework (NNCF) for ONNX
models, reduces memory footprint while maintaining
accuracy and enables efficient deployment in
resource-constrained environments.
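A minimal sketch of the int4 data-aware weight compression mentioned above, assuming nncf.compress_weights accepts ONNX models as this entry states; the model, input name, ratio, and calibration data are placeholders:

  import numpy as np
  import onnx
  import nncf

  model = onnx.load("llm.onnx")  # placeholder ONNX LLM

  # A small calibration set makes the compression data-aware; the token ids
  # and the "input_ids" input name are placeholders.
  samples = [np.random.randint(0, 32000, (1, 128)).astype(np.int64) for _ in range(8)]
  dataset = nncf.Dataset(samples, lambda s: {"input_ids": s})

  compressed = nncf.compress_weights(
      model,
      mode=nncf.CompressWeightsMode.INT4_SYM,  # int4 weight-only compression
      ratio=0.8,                               # share of weights compressed to int4
      dataset=dataset,                         # enables the data-aware algorithms
  )
  onnx.save(compressed, "llm_int4.onnx")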
-------------------------------------------------------------------
Wed Jun 25 01:09:14 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>
@@ -186,7 +285,7 @@ Mon Apr 14 06:52:03 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org
cloud, or locally.
* Additional LLM performance optimizations on Intel® Core™ Ultra
200H series processors for improved 2nd token latency on
-Windows and Linux.
+Linux.
* Enhanced performance and efficient resource utilization with
the implementation of Paged Attention and Continuous Batching
by default in the GPU plugin.
@@ -241,10 +340,6 @@ Sun Feb 9 03:36:41 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org
directly within Keras 3 workflows for faster AI inference on
CPUs, built-in GPUs, discrete GPUs, and NPUs. This feature is
available with the latest Keras 3.8 release (see the sketch below).
-* The OpenVINO Model Server now supports native Windows Server
-deployments, allowing developers to leverage better
-performance by eliminating container overhead and simplifying
-GPU deployment.
- Support Change and Deprecation Notices
* Now deprecated:
+ Legacy prefixes l_, w_, and m_ have been removed from
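A minimal, inference-only sketch of the Keras 3 OpenVINO backend mentioned above; the toy model and input are placeholders, and the backend must be selected before keras is imported:

  import os
  os.environ["KERAS_BACKEND"] = "openvino"  # must be set before importing keras

  import keras
  import numpy as np

  model = keras.Sequential([
      keras.layers.Input(shape=(4,)),
      keras.layers.Dense(8, activation="relu"),
      keras.layers.Dense(2),
  ])
  x = np.random.rand(3, 4).astype("float32")
  print(model.predict(x))  # the forward pass runs through the OpenVINO backend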


@@ -1,4 +1,4 @@
name: openvino
-version: 2025.2.0
-mtime: 1749227913
-commit: c01cd93e24d1cd78bfbb401eed51c08fb93e0816
+version: 2025.4.0
+mtime: 1763052589
+commit: 7a975177ff432c687e5619e8fb22e4bf265e48b7


@@ -31,13 +31,13 @@
%define pythons python3
%endif
%define __builder ninja
-%define so_ver 2520
+%define so_ver 2540
%define shlib lib%{name}%{so_ver}
%define shlib_c lib%{name}_c%{so_ver}
%define prj_name OpenVINO
Name: openvino
-Version: 2025.2.0
+Version: 2025.4.0
Release: 0
Summary: A toolkit for optimizing and deploying AI inference
# Let's be safe and put all third party licenses here, no matter that we use specific thirdparty libs or not