-------------------------------------------------------------------
Sun Dec 28 02:04:43 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2025.4.0
  * Preview: Mixture of Experts (MoE) models optimized for CPUs
    and GPUs, validated for the GPT-OSS 20B model. To convert the
    model (a run sketch follows at the end of this entry):
    optimum-cli export openvino -m "openai/gpt-oss-20b" out_dir
    --weight-format int4
  * Fixed issue ID 174531: Accuracy regression of
    Mistral-7b-instruct-v0.2 and Mistral-7b-instruct-v0.3 on all
    devices when executed with OpenVINO GenAI. As a workaround,
    use the IR converted with OpenVINO 2025.3.
  * Fixed issue ID 176777: Using the callback parameter with the
    Python API call generate() in Text2ImagePipeline,
    Image2ImagePipeline, and InpaintingPipeline may cause the
    process to hang. As a workaround, do not use the callback
    parameter. The C++ implementation was not affected.
  * Resolved an issue in the NPU plugin where the Level Zero (L0)
    context was implemented as a static global object and only
    destroyed during DLL unload, even after unload_plugin() was
    called. This behavior prevented the driver from spawning
    threads required for certain optimizations and features.
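
  A minimal sketch of running the converted MoE model with the
  OpenVINO GenAI Python API ("out_dir" matches the optimum-cli
  command above; prompt and generation parameters are illustrative
  assumptions):

    import openvino_genai

    # Load the INT4 IR exported by optimum-cli; "CPU" can be
    # swapped for "GPU".
    pipe = openvino_genai.LLMPipeline("out_dir", "CPU")
    print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
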
-------------------------------------------------------------------
Tue Dec 2 22:43:52 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2025.4.0
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported:
    + On CPUs & GPUs: Qwen3-Embedding-0.6B, Qwen3-Reranker-0.6B,
      Mistral-Small-24B-Instruct-2501.
    + On NPUs: Gemma-3-4b-it and Qwen2.5-VL-3B-Instruct.
  * Preview: Mixture of Experts (MoE) models optimized for CPUs
    and GPUs, validated for Qwen3-30B-A3B.
  * GenAI pipeline integrations: Qwen3-Embedding-0.6B and
    Qwen3-Reranker-0.6B for enhanced retrieval/ranking, and
    Qwen2.5-VL-7B for video pipelines.
- Broader LLM model support and more model compression
  techniques
  * The Neural Network Compression Framework (NNCF) ONNX backend
    now supports INT8 static post-training quantization (PTQ)
    and INT8/INT4 weight-only compression to ensure accuracy
    parity with OpenVINO IR format models. SmoothQuant algorithm
    support added for INT8 quantization (see the sketch at the
    end of this entry).
  * Accelerated multi-token generation for GenAI, leveraging
    optimized GPU kernels to deliver faster inference, smarter
    KV-cache reuse, and scalable LLM performance.
  * GPU plugin updates include improved performance with prefix
    caching for chat history scenarios and enhanced LLM accuracy
    with dynamic quantization support for INT8.
- More portability and performance to run AI at the edge, in the
  cloud, or locally.
  * Announcing support for Intel® Core Ultra Processor Series 3.
  * Encrypted blob format support added for secure model
    deployment with OpenVINO GenAI. Model weights and artifacts
    are stored and transmitted in an encrypted format, reducing
    the risk of IP theft during deployment. Developers can deploy
    with minimal code changes using OpenVINO GenAI pipelines.
  * OpenVINO™ Model Server and OpenVINO™ GenAI now extend
    support for Agentic AI scenarios with new features such as
    output parsing and improved chat templates for reliable
    multi-turn interactions, and preview functionality for the
    Qwen3-30B-A3B model. OVMS also introduces a preview for
    audio endpoints.
  * NPU deployment is simplified with batch support, enabling
    seamless model execution across Intel® Core Ultra
    processors while eliminating driver dependencies. Models
    are reshaped to batch_size=1 before compilation.
  * The improved NVIDIA Triton Server* integration with the
    OpenVINO backend now enables developers to utilize Intel
    GPUs or NPUs for deployment.
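
  A minimal sketch of INT8 static PTQ through the NNCF ONNX
  backend (file name, input name, and shapes are illustrative
  assumptions):

    import numpy as np
    import onnx
    import nncf

    model = onnx.load("model.onnx")

    # Each calibration item must map the model's input names to
    # arrays; the name and shape below are placeholders.
    calibration = nncf.Dataset(
        [np.random.rand(1, 3, 224, 224).astype(np.float32)
         for _ in range(8)],
        lambda item: {"input": item},
    )

    quantized = nncf.quantize(model, calibration)
    onnx.save(quantized, "model_int8.onnx")
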
-------------------------------------------------------------------
Sun Sep 7 01:21:19 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2025.3.0
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported: Phi-4-mini-reasoning, AFM-4.5B,
    Gemma-3-1B-it, Gemma-3-4B-it, and Gemma-3-12B.
  * NPU support added for: Qwen3-1.7B, Qwen3-4B, and Qwen3-8B.
  * LLMs optimized for NPU are now available in the OpenVINO
    Hugging Face collection.
- Broader LLM model support and more model compression techniques
  * The NPU plug-in adds support for longer contexts of up to
    8K tokens, dynamic prompts, and dynamic LoRA for improved
    LLM performance.
  * The NPU plug-in now supports dynamic batch sizes by reshaping
    the model to a batch size of 1 and concurrently managing
    multiple inference requests, enhancing performance and
    optimizing memory utilization.
  * Accuracy improvements for GenAI models on both built-in
    and discrete graphics achieved through the implementation
    of the per-channel key cache compression technique, in
    addition to the existing per-token KV cache compression
    method.
  * OpenVINO™ GenAI introduces TextRerankPipeline for improved
    retrieval relevance and RAG pipeline accuracy, plus
    Structured Output for enhanced response reliability and
    function calling while ensuring adherence to predefined
    formats.
- More portability and performance to run AI at the edge,
  in the cloud, or locally.
  * Announcing support for Intel® Arc™ Pro B-Series
    (B50 and B60).
  * Preview: Hugging Face models that are GGUF-enabled for
    OpenVINO GenAI are now supported by the OpenVINO™ Model
    Server for popular LLM model architectures such as
    DeepSeek Distill, Qwen2, Qwen2.5, and Llama 3.
    This functionality reduces memory footprint and
    simplifies integration for GenAI workloads.
  * With improved reliability and tool call accuracy,
    the OpenVINO™ Model Server boosts support for
    agentic AI use cases on AI PCs, while enhancing
    performance on Intel CPUs, built-in GPUs, and NPUs.
  * INT4 data-aware weights compression, now supported in the
    Neural Network Compression Framework (NNCF) for ONNX
    models, reduces memory footprint while maintaining
    accuracy and enables efficient deployment in
    resource-constrained environments (see the sketch at the
    end of this entry).
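
  A minimal sketch of INT4 data-aware weight compression with NNCF
  on an ONNX model (file name, input name, shapes, and the ratio
  value are illustrative assumptions):

    import numpy as np
    import onnx
    import nncf

    model = onnx.load("llm.onnx")

    # A small calibration set makes the compression data-aware.
    dataset = nncf.Dataset(
        [{"input_ids": np.random.randint(0, 1000, (1, 128))}
         for _ in range(8)]
    )

    compressed = nncf.compress_weights(
        model,
        mode=nncf.CompressWeightsMode.INT4_SYM,
        ratio=0.8,        # share of weights compressed to INT4
        dataset=dataset,  # enables the data-aware algorithms
    )
    onnx.save(compressed, "llm_int4.onnx")
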
-------------------------------------------------------------------
Wed Jun 25 01:09:14 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- openSUSE Leap 16.0 compatibility

-------------------------------------------------------------------
Tue Jun 24 05:10:06 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Remove openvino-gcc5-compatibility.patch file

-------------------------------------------------------------------
Tue Jun 24 02:54:10 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2025.2.0
- Summary of major features and improvements
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported on CPUs & GPUs: Phi-4,
    Mistral-7B-Instruct-v0.3, SD-XL Inpainting 0.1, Stable
    Diffusion 3.5 Large Turbo, Phi-4-reasoning, Qwen3, and
    Qwen2.5-VL-3B-Instruct. Mistral 7B Instruct v0.3 is also
    supported on NPUs.
  * Preview: OpenVINO™ GenAI introduces a text-to-speech
    pipeline for the SpeechT5 TTS model, while the new RAG
    backend offers developers a simplified API that delivers
    reduced memory usage and improved performance.
  * Preview: OpenVINO™ GenAI offers a GGUF Reader for seamless
    integration of llama.cpp-based LLMs, with Python and C++
    pipelines that load GGUF models, build OpenVINO graphs,
    and run GPU inference on-the-fly (see the sketch at the end
    of this entry). Validated for popular models:
    DeepSeek-R1-Distill-Qwen (1.5B, 7B), Qwen2.5 Instruct
    (1.5B, 3B, 7B) & Llama-3.2 Instruct (1B, 3B, 8B).
- Broader LLM model support and more model compression
  techniques
  * Further optimization of LoRA adapters in OpenVINO GenAI
    for improved LLM, VLM, and text-to-image model performance
    on built-in GPUs. Developers can use LoRA adapters to
    quickly customize models for specialized tasks.
  * KV cache compression for CPUs is enabled by default for
    INT8, providing a reduced memory footprint while maintaining
    accuracy compared to FP16. Additionally, it delivers
    substantial memory savings for LLMs with INT4 support
    compared to INT8.
  * Optimizations for Intel® Core™ Ultra Processor Series 2
    built-in GPUs and Intel® Arc™ B Series Graphics with the
    Intel® XMX systolic platform to enhance the performance of
    VLM models and hybrid quantized image generation models, as
    well as improve first-token latency for LLMs through dynamic
    quantization.
- More portability and performance to run AI at the edge, in the
  cloud, or locally.
  * Enhanced Linux* support with the latest GPU driver for
    built-in GPUs on Intel® Core™ Ultra Processor Series 2
    (formerly codenamed Arrow Lake H).
  * Support for INT4 data-free weights compression for ONNX
    models implemented in the Neural Network Compression
    Framework (NNCF).
  * NPU support for FP16-NF4 precision on Intel® Core™ 200V
    Series processors for models with up to 8B parameters is
    enabled through symmetrical and channel-wise quantization,
    improving accuracy while maintaining performance efficiency.
- Support Change and Deprecation Notices
- Discontinued in 2025:
  * Runtime components:
    + The OpenVINO property of Affinity API is no longer
      available. It has been replaced with CPU binding
      configurations (ov::hint::enable_cpu_pinning).
    + The openvino-nightly PyPI module has been discontinued.
      End-users should proceed with the Simple PyPI nightly repo
      instead. More information in Release Policy.
  * Tools:
    + The OpenVINO™ Development Tools package (pip install
      openvino-dev) is no longer available for OpenVINO releases
      in 2025.
    + Model Optimizer is no longer available. Consider using the
      new conversion methods instead. For more details, see the
      model conversion transition guide.
    + Intel® Streaming SIMD Extensions (Intel® SSE) are currently
      not enabled in the binary package by default. They are
      still supported in the source code form.
    + Legacy prefixes: l_, w_, and m_ have been removed from
      OpenVINO archive names.
  * OpenVINO GenAI:
    + StreamerBase::put(int64_t token)
    + A bool return value for the callback streamer is no longer
      accepted; the callback must now return one of the three
      values of the StreamingStatus enum.
    + ChunkStreamerBase is deprecated. Use StreamerBase instead.
  * The NNCF create_compressed_model() method is now deprecated.
    The nncf.quantize() method is recommended for
    Quantization-Aware Training of PyTorch and TensorFlow models.
  * The OpenVINO Model Server (OVMS) benchmark client in C++
    using the TensorFlow Serving API.
- Deprecated and to be removed in the future:
  * Python 3.9 is now deprecated and will be unavailable after
    OpenVINO version 2025.4.
  * openvino.Type.undefined is now deprecated and will be removed
    with version 2026.0. openvino.Type.dynamic should be used
    instead.
  * APT & YUM Repositories Restructure: Starting with release
    2025.1, users can switch to the new repository structure
    for APT and YUM, which no longer uses year-based
    subdirectories (like “2025”). The old (legacy) structure
    will still be available until 2026, when the change will
    be finalized. Detailed instructions are available on the
    relevant documentation pages:
    + Installation guide - yum
    + Installation guide - apt
  * OpenCV binaries will be removed from Docker images in 2026.
  * Ubuntu 20.04 support will be deprecated in future OpenVINO
    releases due to the end of standard support.
  * “auto shape” and “auto batch size” (reshaping a model in
    runtime) will be removed in the future. OpenVINO’s dynamic
    shape models are recommended instead.
  * macOS x86_64 is no longer recommended for use due to the
    discontinuation of validation. Full support will be removed
    later in 2025.
  * The openvino namespace of the OpenVINO Python API has been
    redesigned, removing the nested openvino.runtime module.
    The old namespace is now considered deprecated and will be
    discontinued in 2026.0.
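
  A minimal sketch of the GGUF Reader preview through the GenAI
  Python API (the GGUF file name is an illustrative assumption,
  and preview behavior may change between releases):

    import openvino_genai

    # LLMPipeline can consume a llama.cpp GGUF file directly,
    # building the OpenVINO graph on the fly.
    pipe = openvino_genai.LLMPipeline(
        "qwen2.5-1.5b-instruct-q4_0.gguf", "CPU")
    print(pipe.generate("Hello!", max_new_tokens=64))
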
-------------------------------------------------------------------
Wed May 21 14:43:02 UTC 2025 - Andreas Schwab <schwab@suse.de>

- Fix file list for riscv64

-------------------------------------------------------------------
Mon May 5 07:47:30 UTC 2025 - Dominique Leuenberger <dimstar@opensuse.org>

- Do not force GCC15 on Tumbleweed just yet: follow the distro
  default compiler, like any other package.

-------------------------------------------------------------------
Sat May 3 19:19:07 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Add openvino-gcc5-compatibility.patch to resolve an
  incompatibility with gcc5

-------------------------------------------------------------------
Thu May 1 01:06:52 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Added gcc-14

-------------------------------------------------------------------
Mon Apr 14 06:52:03 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2025.1.0
- Downgrade from gcc13-c++ to 12 due to an incompatibility in the
  tbb compilation: C++ libraries (using libstdc++) failed with
  the error: libtbb.so.12: undefined reference to
  `__cxa_call_terminate@CXXABI_1.3.15'
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported: Phi-4 Mini, Jina CLIP v1, and Bce
    Embedding Base v1.
  * OpenVINO™ Model Server now supports VLM models, including
    Qwen2-VL, Phi-3.5-Vision, and InternVL2.
  * OpenVINO GenAI now includes image-to-image and inpainting
    features for transformer-based pipelines, such as Flux.1 and
    Stable Diffusion 3 models, enhancing their ability to
    generate more realistic content (see the sketch at the end
    of this entry).
  * Preview: AI Playground now utilizes the OpenVINO Gen AI
    backend to enable highly optimized inferencing performance
    on AI PCs.

- Broader LLM model support and more model compression techniques
  * Reduced binary size through optimization of the CPU plugin
    and removal of the GEMM kernel.
  * Optimization of new kernels for the GPU plugin significantly
    boosts the performance of Long Short-Term Memory (LSTM)
    models, used in many applications, including speech
    recognition, language modeling, and time series forecasting.
  * Preview: Token Eviction implemented in OpenVINO GenAI to
    reduce the memory consumption of KV Cache by eliminating
    unimportant tokens. This current Token Eviction
    implementation is beneficial for tasks where a long sequence
    is generated, such as chatbots and code generation.
  * NPU acceleration for text generation is now enabled in
    OpenVINO™ Runtime and OpenVINO™ Model Server to support the
    power-efficient deployment of VLM models on NPUs for AI PC
    use cases with low concurrency.

- More portability and performance to run AI at the edge, in the
  cloud, or locally.
  * Additional LLM performance optimizations on Intel® Core™
    Ultra 200H series processors for improved 2nd token latency
    on Linux.
  * Enhanced performance and efficient resource utilization with
    the implementation of Paged Attention and Continuous
    Batching by default in the GPU plugin.
  * Preview: The new OpenVINO backend for ExecuTorch will enable
    accelerated inference and improved performance on Intel
    hardware, including CPUs, GPUs, and NPUs.
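
  A minimal sketch of the new image-to-image feature with the
  GenAI Python API (model directory, file names, prompt, and the
  strength value are illustrative assumptions):

    import numpy as np
    import openvino as ov
    import openvino_genai
    from PIL import Image

    def read_image(path):
        # HWC uint8 array with a leading batch dimension,
        # wrapped as an OpenVINO tensor.
        data = np.array(Image.open(path).convert("RGB"))[None]
        return ov.Tensor(data)

    pipe = openvino_genai.Image2ImagePipeline("flux.1-ov-model", "CPU")
    out = pipe.generate("make it a watercolor painting",
                        read_image("input.png"), strength=0.8)
    Image.fromarray(out.data[0]).save("output.png")
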
-------------------------------------------------------------------
Tue Mar 4 00:38:30 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Disabled JAX plugin beta.

-------------------------------------------------------------------
Sun Feb 9 03:36:41 UTC 2025 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2025.0.0
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported: Qwen 2.5, DeepSeek-R1-Distill-Llama-8B,
    DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B,
    FLUX.1 Schnell, and FLUX.1 Dev.
  * Whisper Model: Improved performance on CPUs, built-in GPUs,
    and discrete GPUs with the GenAI API.
  * Preview: Introducing NPU support for torch.compile, giving
    developers the ability to use the OpenVINO backend to run the
    PyTorch API on NPUs (see the sketch at the end of this
    entry). 300+ deep learning models enabled from the
    TorchVision, Timm, and TorchBench repositories.
- Broader Large Language Model (LLM) support and more model
  compression techniques.
  * Preview: Addition of Prompt Lookup to the GenAI API improves
    2nd token latency for LLMs by effectively utilizing
    predefined prompts that match the intended use case.
  * Preview: The GenAI API now offers image-to-image inpainting
    functionality. This feature enables models to generate
    realistic content by inpainting specified modifications and
    seamlessly integrating them with the original image.
  * Asymmetric KV Cache compression is now enabled for INT8 on
    CPUs, resulting in lower memory consumption and improved 2nd
    token latency, especially when dealing with long prompts that
    require significant memory. The option should be explicitly
    specified by the user.
- More portability and performance to run AI at the edge, in the
  cloud, or locally.
  * Support for the latest Intel® Core™ Ultra 200H series
    processors (formerly codenamed Arrow Lake-H).
  * Integration of the OpenVINO™ backend with the Triton
    Inference Server allows developers to utilize the Triton
    server for enhanced model serving performance when deploying
    on Intel CPUs.
  * Preview: A new OpenVINO™ backend integration allows
    developers to leverage OpenVINO performance optimizations
    directly within Keras 3 workflows for faster AI inference on
    CPUs, built-in GPUs, discrete GPUs, and NPUs. This feature is
    available with the latest Keras 3.8 release.
- Support Change and Deprecation Notices
  * Now deprecated:
    + Legacy prefixes l_, w_, and m_ have been removed from
      OpenVINO archive names.
    + The runtime namespace for the Python API has been marked as
      deprecated and designated for removal in 2026.0. The new
      namespace structure has been delivered, and migration is
      possible immediately. Details will be communicated through
      warnings and via documentation.
    + The NNCF create_compressed_model() method is deprecated.
      The nncf.quantize() method is now recommended for
      Quantization-Aware Training of PyTorch and TensorFlow
      models.
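
  A minimal sketch of the torch.compile NPU preview (the
  torchvision model choice and input shape are illustrative
  assumptions):

    import torch
    import torchvision.models as models
    import openvino.torch  # registers the "openvino" backend

    model = models.resnet50(weights="DEFAULT").eval()
    # Route PyTorch compilation through OpenVINO, targeting the NPU.
    compiled = torch.compile(model, backend="openvino",
                             options={"device": "NPU"})
    with torch.no_grad():
        out = compiled(torch.rand(1, 3, 224, 224))
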
-------------------------------------------------------------------
Sun Dec 29 03:41:47 UTC 2024 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- openvino-onnx-ml-defines.patch and
  openvino-remove-npu-compile-tool.patch have been removed as
  they are no longer needed in this version.
- Update to 2024.6.0
- Summary of major features and improvements
  * The OpenVINO 2024.6 release includes updates for enhanced
    stability and improved LLM performance.
  * Introduced support for Intel® Arc™ B-Series Graphics
    (formerly known as Battlemage).
  * Implemented optimizations to improve the inference time and
    LLM performance on NPUs.
  * Improved LLM performance with GenAI API optimizations and
    bug fixes.
- Support Change and Deprecation Notices
  * Using deprecated features and components is not advised. They
    are available to enable a smooth transition to new solutions
    and will be discontinued in the future. To keep using
    discontinued features, you will have to revert to the last
    LTS OpenVINO version supporting them. For more details, refer
    to the OpenVINO Legacy Features and Components page.
  * Discontinued in 2024.0:
    + Runtime components:
      - Intel® Gaussian & Neural Accelerator (Intel® GNA).
        Consider using the Neural Processing Unit (NPU) for
        low-powered systems like Intel® Core™ Ultra or 14th
        generation and beyond.
      - OpenVINO C++/C/Python 1.0 APIs (see the 2023.3 API
        transition guide for reference).
      - All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)
      - 'PerformanceMode.UNDEFINED' property as part of the
        OpenVINO Python API
    + Tools:
      - Deployment Manager. See installation and deployment
        guides for current distribution options.
      - Accuracy Checker.
      - Post-Training Optimization Tool (POT). Neural Network
        Compression Framework (NNCF) should be used instead.
      - A Git patch for NNCF integration with
        huggingface/transformers. The recommended approach is
        to use huggingface/optimum-intel for applying NNCF
        optimization on top of models from Hugging Face.
      - Support for Apache MXNet, Caffe, and Kaldi model formats.
        Conversion to ONNX may be used as a solution.
  * Deprecated and to be removed in the future:
    + The macOS x86_64 debug bins will no longer be provided
      with the OpenVINO toolkit, starting with OpenVINO 2024.5.
    + Python 3.8 is no longer supported, starting with
      OpenVINO 2024.5.
    + As MXNet doesn't support Python versions higher than 3.8,
      according to the MXNet PyPI project, it is no longer
      supported by OpenVINO, either.
    + Discrete Keem Bay is no longer supported, starting
      with OpenVINO 2024.5.
    + Support for discrete devices (formerly codenamed Raptor
      Lake) is no longer available for NPU.

-------------------------------------------------------------------
Tue Dec 10 15:50:41 UTC 2024 - Giacomo Comes <gcomes.obs@gmail.com>

- fix build on tumbleweed
  * currently openvino does not support protobuf v22 or newer

-------------------------------------------------------------------
Tue Oct 15 00:56:54 UTC 2024 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Temporarily inserted gcc-13 in Tumbleweed/Factory/Slowroll,
  because the source code of the level-zero library and the NPU
  module is incompatible with gcc-14. I am working with Intel on
  tests to return to the native gcc.
- Update to 2024.4.0
- Summary of major features and improvements
  * More Gen AI coverage and framework integrations to minimize
    code changes
    + Support for GLM-4-9B Chat, MiniCPM-1B, Llama 3 and 3.1,
      Phi-3-Mini, Phi-3-Medium and YOLOX-s models.
    + Noteworthy notebooks added: Florence-2, NuExtract-tiny
      Structure Extraction, Flux.1 Image Generation, PixArt-α:
      Photorealistic Text-to-Image Synthesis, and Phi-3-Vision
      Visual Language Assistant.
  * Broader Large Language Model (LLM) support and more model
    compression techniques.
    + OpenVINO™ runtime optimized for Intel® Xe Matrix Extensions
      (Intel® XMX) systolic arrays on built-in GPUs for efficient
      matrix multiplication, resulting in a significant LLM
      performance boost with improved 1st and 2nd token latency,
      as well as a smaller memory footprint on Intel® Core™
      Ultra Processors (Series 2).
    + Memory sharing enabled for NPUs on Intel® Core™ Ultra
      Processors (Series 2) for efficient pipeline integration
      without memory copy overhead.
    + Addition of the PagedAttention feature for discrete GPUs*
      enables a significant boost in throughput for parallel
      inferencing when serving LLMs on Intel® Arc™ Graphics
      or Intel® Data Center GPU Flex Series.
  * More portability and performance to run AI at the edge,
    in the cloud, or locally.
    + OpenVINO™ Model Server now comes with production-quality
      support for the OpenAI-compatible API, which enables
      significantly higher throughput for parallel inferencing
      on Intel® Xeon® processors when serving LLMs to many
      concurrent users.
    + Improved performance and memory consumption with prefix
      caching, KV cache compression, and other optimizations
      for serving LLMs using OpenVINO™ Model Server.
    + Support for Python 3.12.
- Support Change and Deprecation Notices
  * Using deprecated features and components is not advised.
    They are available to enable a smooth transition to new
    solutions and will be discontinued in the future.
    To keep using discontinued features, you will have to
    revert to the last LTS OpenVINO version supporting them.
    For more details, refer to the OpenVINO Legacy Features
    and Components page.
  * Discontinued in 2024.0:
    + Runtime components:
      - Intel® Gaussian & Neural Accelerator (Intel® GNA).
        Consider using the Neural Processing Unit (NPU) for
        low-powered systems like Intel® Core™ Ultra or
        14th generation and beyond.
      - OpenVINO C++/C/Python 1.0 APIs (see the 2023.3 API
        transition guide for reference).
      - All ONNX Frontend legacy API (known as
        ONNX_IMPORTER_API)
      - 'PerformanceMode.UNDEFINED' property as part of the
        OpenVINO Python API
    + Tools:
      - Deployment Manager. See installation and deployment
        guides for current distribution options.
      - Accuracy Checker.
      - Post-Training Optimization Tool (POT). Neural Network
        Compression Framework (NNCF) should be used instead.
      - A Git patch for NNCF integration with huggingface/
        transformers. The recommended approach is to use
        huggingface/optimum-intel for applying NNCF
        optimization on top of models from Hugging Face.
      - Support for Apache MXNet, Caffe, and Kaldi model
        formats. Conversion to ONNX may be used as a
        solution.
  * Deprecated and to be removed in the future:
    + The macOS x86_64 debug bins will no longer be
      provided with the OpenVINO toolkit, starting with
      OpenVINO 2024.5.
    + Python 3.8 is now considered deprecated, and it will not
      be available beyond the 2024.4 OpenVINO version.
    + dKMB support is now considered deprecated and will be
      fully removed with OpenVINO 2024.5.
    + Intel® Streaming SIMD Extensions (Intel® SSE) will be
      supported in source code form, but not enabled in the
      binary package by default, starting with OpenVINO 2025.0.
    + The openvino-nightly PyPI module will soon be discontinued.
      End-users should proceed with the Simple PyPI nightly repo
      instead. More information in Release Policy.
    + The OpenVINO™ Development Tools package (pip install
      openvino-dev) will be removed from installation options and
      distribution channels beginning with OpenVINO 2025.0.
    + Model Optimizer will be discontinued with OpenVINO 2025.0.
      Consider using the new conversion methods instead. For more
      details, see the model conversion transition guide.
    + The OpenVINO property Affinity API will be discontinued
      with OpenVINO 2025.0. It will be replaced with CPU binding
      configurations (ov::hint::enable_cpu_pinning).
    + OpenVINO Model Server components:
      - “auto shape” and “auto batch size” (reshaping a model in
        runtime) will be removed in the future. OpenVINO’s
        dynamic shape models are recommended instead.
    + A number of notebooks have been deprecated. For an
      up-to-date listing of available notebooks, refer to the
      OpenVINO™ Notebook index (openvinotoolkit.github.io).

-------------------------------------------------------------------
Wed Oct 2 20:56:59 UTC 2024 - Giacomo Comes <gcomes.obs@gmail.com>

- Add Leap15 build
- Remove comment lines in the spec file that cause the insertion
  of extra lines during a commit

-------------------------------------------------------------------
Sat Aug 10 01:41:06 UTC 2024 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Remove NPU Compile Tool
  * openvino-remove-npu-compile-tool.patch
- Update to 2024.3.0
- Summary of major features and improvements
  * More Gen AI coverage and framework integrations to minimize
    code changes
    + OpenVINO pre-optimized models are now available on Hugging
      Face, making it easier for developers to get started with
      these models.
  * Broader Large Language Model (LLM) support and more model
    compression techniques.
    + Significant improvement in LLM performance on Intel
      discrete GPUs with the addition of Multi-Head Attention
      (MHA) and OneDNN enhancements.
  * More portability and performance to run AI at the edge, in
    the cloud, or locally.
    + Improved CPU performance when serving LLMs with the
      inclusion of vLLM and continuous batching in the OpenVINO
      Model Server (OVMS). vLLM is an easy-to-use open-source
      library that supports efficient LLM inferencing and model
      serving.
- Support Change and Deprecation Notices
  * Using deprecated features and components is not advised.
    They are available to enable a smooth transition to new
    solutions and will be discontinued in the future. To keep
    using discontinued features, you will have to revert to the
    last LTS OpenVINO version supporting them. For more details,
    refer to the OpenVINO Legacy Features and Components page.
  * Discontinued in 2024.0:
    + Runtime components:
      - Intel® Gaussian & Neural Accelerator (Intel® GNA).
        Consider using the Neural Processing Unit (NPU) for
        low-powered systems like Intel® Core™ Ultra or 14th
        generation and beyond.
      - OpenVINO C++/C/Python 1.0 APIs (see the 2023.3 API
        transition guide for reference).
      - All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)
      - 'PerformanceMode.UNDEFINED' property as part of the
        OpenVINO Python API
    + Tools:
      - Deployment Manager. See installation and deployment
        guides for current distribution options.
      - Accuracy Checker.
      - Post-Training Optimization Tool (POT). Neural Network
        Compression Framework (NNCF) should be used instead.
      - A Git patch for NNCF integration with huggingface/
        transformers. The recommended approach is to use
        huggingface/optimum-intel for applying NNCF optimization
        on top of models from Hugging Face.
      - Support for Apache MXNet, Caffe, and Kaldi model formats.
        Conversion to ONNX may be used as a solution.
  * Deprecated and to be removed in the future:
    + The OpenVINO™ Development Tools package (pip install
      openvino-dev) will be removed from installation options
      and distribution channels beginning with OpenVINO 2025.0.
    + Model Optimizer will be discontinued with OpenVINO 2025.0.
      Consider using the new conversion methods instead. For
      more details, see the model conversion transition guide.
    + The OpenVINO property Affinity API will be discontinued
      with OpenVINO 2025.0. It will be replaced with CPU binding
      configurations (ov::hint::enable_cpu_pinning).
    + OpenVINO Model Server components:
      - “auto shape” and “auto batch size” (reshaping a model
        in runtime) will be removed in the future. OpenVINO’s
        dynamic shape models are recommended instead.
    + A number of notebooks have been deprecated. For an
      up-to-date listing of available notebooks, refer to
      the OpenVINO™ Notebook index (openvinotoolkit.github.io).

-------------------------------------------------------------------
Sat Jun 22 12:01:23 UTC 2024 - Andreas Schwab <schwab@suse.de>

- Add riscv-cpu-plugin subpackage

-------------------------------------------------------------------
Wed Jun 19 21:36:01 UTC 2024 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Update to 2024.2.0
- More Gen AI coverage and framework integrations to minimize code
  changes
  * Llama 3 optimizations for CPUs, built-in GPUs, and discrete
    GPUs for improved performance and efficient memory usage.
  * Support for Phi-3-mini, a family of AI models that leverages
    the power of small language models for faster, more accurate
    and cost-effective text processing.
  * Python Custom Operation is now enabled in OpenVINO, making it
    easier for Python developers to code their custom operations
    instead of using C++ custom operations (also supported).
    Python Custom Operation empowers users to implement their own
    specialized operations into any model.
  * Notebooks expansion to ensure better coverage for new models.
    Noteworthy notebooks added: DynamiCrafter, YOLOv10, Chatbot
    notebook with Phi-3, and QWEN2.
- Broader Large Language Model (LLM) support and more model
  compression techniques.
  * GPTQ method for 4-bit weight compression added to NNCF for
    more efficient inference and improved performance of
    compressed LLMs.
  * Significant LLM performance improvements and reduced latency
    for both built-in GPUs and discrete GPUs.
  * Significant improvement in 2nd token latency and memory
    footprint of FP16 weight LLMs on AVX2 (13th Gen Intel® Core™
    processors) and AVX512 (3rd Gen Intel® Xeon® Scalable
    Processors) based CPU platforms, particularly for small
    batch sizes.
- More portability and performance to run AI at the edge, in the
  cloud, or locally.
  * Model Serving Enhancements:
    + Preview: OpenVINO Model Server (OVMS) now supports
      OpenAI-compatible API along with Continuous Batching and
      PagedAttention, enabling significantly higher throughput
      for parallel inferencing, especially on Intel® Xeon®
      processors, when serving LLMs to many concurrent users.
    + OpenVINO backend for Triton Server now supports built-in
      GPUs and discrete GPUs, in addition to dynamic
      shapes support.
    + Integration of TorchServe through the torch.compile
      OpenVINO backend for easy model deployment, provisioning
      to multiple instances, model versioning, and maintenance.
  * Preview: addition of the Generate API, a simplified API
    for text generation using large language models with only
    a few lines of code. The API is available through the newly
    launched OpenVINO GenAI package (see the sketch at the end
    of this entry).
  * Support for Intel Atom® Processor X Series. For more details,
    see System Requirements.
  * Preview: Support for Intel® Xeon® 6 processor.
- Support Change and Deprecation Notices
  * Using deprecated features and components is not advised.
    They are available to enable a smooth transition to new
    solutions and will be discontinued in the future.
    To keep using discontinued features, you will have to revert
    to the last LTS OpenVINO version supporting them. For more
    details, refer to the OpenVINO Legacy Features and
    Components page.
  * Discontinued in 2024.0:
    + Runtime components:
      - Intel® Gaussian & Neural Accelerator (Intel® GNA).
        Consider using the Neural Processing Unit (NPU) for
        low-powered systems like Intel® Core™ Ultra or 14th
        generation and beyond.
      - OpenVINO C++/C/Python 1.0 APIs (see the 2023.3 API
        transition guide for reference).
      - All ONNX Frontend legacy API (known as ONNX_IMPORTER_API)
      - 'PerformanceMode.UNDEFINED' property as part of the
        OpenVINO Python API
    + Tools:
      - Deployment Manager. See installation and deployment
        guides for current distribution options.
      - Accuracy Checker.
      - Post-Training Optimization Tool (POT). Neural Network
        Compression Framework (NNCF) should be used instead.
      - A Git patch for NNCF integration with
        huggingface/transformers. The recommended approach
        is to use huggingface/optimum-intel for applying NNCF
        optimization on top of models from Hugging Face.
      - Support for Apache MXNet, Caffe, and Kaldi model formats.
        Conversion to ONNX may be used as a solution.
  * Deprecated and to be removed in the future:
    + The OpenVINO™ Development Tools package (pip install
      openvino-dev) will be removed from installation options
      and distribution channels beginning with OpenVINO 2025.0.
    + Model Optimizer will be discontinued with OpenVINO 2025.0.
      Consider using the new conversion methods instead. For
      more details, see the model conversion transition guide.
    + The OpenVINO property Affinity API will be discontinued
      with OpenVINO 2025.0. It will be replaced with CPU binding
      configurations (ov::hint::enable_cpu_pinning).
    + OpenVINO Model Server components:
      - “auto shape” and “auto batch size” (reshaping a model in
        runtime) will be removed in the future. OpenVINO’s
        dynamic shape models are recommended instead.
    + A number of notebooks have been deprecated. For an
      up-to-date listing of available notebooks, refer to the
      OpenVINO™ Notebook index (openvinotoolkit.github.io).
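
  A minimal sketch of the newly launched Generate API (model
  directory and prompt are illustrative assumptions):

    import openvino_genai

    # Point at a directory containing an OpenVINO-converted LLM.
    pipe = openvino_genai.LLMPipeline("TinyLlama-1.1B-Chat-ov", "CPU")
    print(pipe.generate("The Sun is yellow because",
                        max_new_tokens=100))
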
-------------------------------------------------------------------
Thu May 9 22:56:53 UTC 2024 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Fix sample source path in build script:
  * openvino-fix-build-sample-path.patch
- Update to 2024.1.0
- More Generative AI coverage and framework integrations to
  minimize code changes.
  * Mixtral and URLNet models optimized for performance
    improvements on Intel® Xeon® processors.
  * Stable Diffusion 1.5, ChatGLM3-6B, and Qwen-7B models
    optimized for improved inference speed on Intel® Core™
    Ultra processors with integrated GPU.
  * Support for Falcon-7B-Instruct, a GenAI Large Language Model
    (LLM) ready-to-use chat/instruct model with superior
    performance metrics.
  * New Jupyter Notebooks added: YOLO V9, YOLO V8
    Oriented Bounding Boxes Detection (OBB), Stable Diffusion
    in Keras, MobileCLIP, RMBG-v1.4 Background Removal, Magika,
    TripoSR, AnimateAnyone, LLaVA-Next, and RAG system with
    OpenVINO and LangChain.
- Broader Large Language Model (LLM) support and more model
  compression techniques.
  * LLM compilation time reduced through additional optimizations
    with compressed embedding. Improved 1st token performance of
    LLMs on 4th and 5th generations of Intel® Xeon® processors
    with Intel® Advanced Matrix Extensions (Intel® AMX).
  * Better LLM compression and improved performance with oneDNN,
    INT4, and INT8 support for Intel® Arc™ GPUs.
  * Significant memory reduction for select smaller GenAI
    models on Intel® Core™ Ultra processors with integrated GPU.
- More portability and performance to run AI at the edge,
  in the cloud, or locally.
  * The preview NPU plugin for Intel® Core™ Ultra processors
    is now available in the OpenVINO open-source GitHub
    repository, in addition to the main OpenVINO package on PyPI.
  * The JavaScript API is now more easily accessible through
    the npm repository, enabling JavaScript developers’ seamless
    access to the OpenVINO API.
  * FP16 inference on ARM processors is now enabled for the
    Convolutional Neural Network (CNN) by default.
- Support Change and Deprecation Notices
  * Using deprecated features and components is not advised. They
    are available to enable a smooth transition to new solutions
    and will be discontinued in the future. To keep using
    discontinued features, you will have to revert to the last
    LTS OpenVINO version supporting them.
  * For more details, refer to the OpenVINO Legacy Features
    and Components page.
  * Discontinued in 2024.0:
    + Runtime components:
      - Intel® Gaussian & Neural Accelerator (Intel® GNA).
        Consider using the Neural Processing Unit (NPU)
        for low-powered systems like Intel® Core™ Ultra or
        14th generation and beyond.
      - OpenVINO C++/C/Python 1.0 APIs (see the 2023.3 API
        transition guide for reference).
      - All ONNX Frontend legacy API (known as
        ONNX_IMPORTER_API)
      - 'PerformanceMode.UNDEFINED' property as part of
        the OpenVINO Python API
    + Tools:
      - Deployment Manager. See installation and deployment
        guides for current distribution options.
      - Accuracy Checker.
      - Post-Training Optimization Tool (POT). Neural Network
        Compression Framework (NNCF) should be used instead.
      - A Git patch for NNCF integration with
        huggingface/transformers. The recommended approach
        is to use huggingface/optimum-intel for applying
        NNCF optimization on top of models from Hugging
        Face.
      - Support for Apache MXNet, Caffe, and Kaldi model
        formats. Conversion to ONNX may be used as
        a solution.
  * Deprecated and to be removed in the future:
    + The OpenVINO™ Development Tools package (pip install
      openvino-dev) will be removed from installation options
      and distribution channels beginning with OpenVINO 2025.0.
    + Model Optimizer will be discontinued with OpenVINO 2025.0.
      Consider using the new conversion methods instead. For
      more details, see the model conversion transition guide.
    + The OpenVINO property Affinity API will be discontinued
      with OpenVINO 2025.0. It will be replaced with CPU binding
      configurations (ov::hint::enable_cpu_pinning).
    + OpenVINO Model Server components:
      - “auto shape” and “auto batch size” (reshaping a model
        in runtime) will be removed in the future. OpenVINO’s
        dynamic shape models are recommended instead.

-------------------------------------------------------------------
Tue Apr 23 18:57:17 UTC 2024 - Atri Bhattacharya <badshah400@gmail.com>

- License update: play safe and list all third party licenses as
  part of the License tag.

-------------------------------------------------------------------
Tue Apr 23 12:42:32 UTC 2024 - Atri Bhattacharya <badshah400@gmail.com>

- Switch to _service file as tagged Source tarball does not
  include `./thirdparty` submodules.
- Update openvino-fix-install-paths.patch to fix python module
  install path.
- Enable python module and split it out into a python subpackage
  (for now default python3 only).
- Explicitly build python metadata (dist-info) and install it
  (needs simple sed hackery to support "officially" unsupported
  platform ppc64le).
- Specify ENABLE_JS=OFF to turn off javascript bindings as
  building these requires downloading npm stuff from the network.
- Build with system pybind11.
- Bump _constraints for updated disk space requirements.
- Drop empty %check section, rpmlint was misleading when it
  recommended adding this.

-------------------------------------------------------------------
Fri Apr 19 08:08:02 UTC 2024 - Atri Bhattacharya <badshah400@gmail.com>

- Numerous specfile cleanups:
  * Drop redundant `mv` commands and use `install` where
    appropriate.
  * Build with system protobuf.
  * Fix Summary tags.
  * Trim package descriptions.
  * Drop forcing CMAKE_BUILD_TYPE=Release, let macro default
    RelWithDebInfo be used instead.
  * Correct naming of shared library packages.
  * Separate out libopenvino_c.so.* into own shared lib package.
  * Drop rpmlintrc rule used to hide shlib naming mistakes.
  * Rename Source tarball to %{name}-%{version}.EXT pattern.
  * Use ldconfig_scriptlet macro for post(un).
- Add openvino-onnx-ml-defines.patch -- Define ONNX_ML at compile
  time when using system onnx to allow using 'onnx-ml.pb.h'
  instead of 'onnx.pb.h', the latter not being shipped with
  openSUSE's onnx-devel package (gh#onnx/onnx#3074).
- Add openvino-fix-install-paths.patch: Change hard-coded install
  paths in upstream cmake macro to standard Linux dirs.
- Add openvino-ComputeLibrary-include-string.patch: Include header
  for std::string.
- Add external devel packages as Requires for openvino-devel.
- Pass -Wl,-z,noexecstack to %build_ldflags to avoid an exec stack
  issue with intel CPU plugin.
- Use ninja for build.
- Adapt _constraints file for correct disk space and memory
  requirements.
- Add empty %check section.

-------------------------------------------------------------------
Mon Apr 15 03:18:33 UTC 2024 - Alessandro de Oliveira Faria <cabelo@opensuse.org>

- Initial package
- Version 2024.0.0
- Add openvino-rpmlintrc.