Alessandro de Oliveira Faria 3b365b8798

- Update to 2025.4.0
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported:
    + On CPUs & GPUs: Qwen3-Embedding-0.6B, Qwen3-Reranker-0.6B, 
      Mistral-Small-24B-Instruct-2501.
    + On NPUs: Gemma-3-4b-it and Qwen2.5-VL-3B-Instruct.
  * Preview: Mixture of Experts (MoE) models optimized for CPUs 
    and GPUs, validated for Qwen3-30B-A3B.
  * GenAI pipeline integrations: Qwen3-Embedding-0.6B and
    Qwen3-Reranker-0.6B for enhanced retrieval/ranking, and
    Qwen2.5-VL-7B for video pipelines (a retrieval sketch follows
    this changelog item).
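
A minimal retrieval/ranking sketch for the two pipelines above,
assuming the TextEmbeddingPipeline and TextRerankPipeline APIs of
recent openvino_genai releases; the model directories are
placeholders for Qwen3 models exported to OpenVINO IR, and the exact
return type of rerank() may differ by version:

  import openvino_genai as ov_genai

  # Embedding pipeline for Qwen3-Embedding-0.6B (exported to IR).
  embedder = ov_genai.TextEmbeddingPipeline("Qwen3-Embedding-0.6B-ov", "CPU")
  docs = ["OpenVINO runs on CPUs, GPUs and NPUs.",
          "Triton is an inference server."]
  doc_vectors = embedder.embed_documents(docs)   # one vector per document
  query_vector = embedder.embed_query("Which devices does OpenVINO target?")

  # Reranking pipeline for Qwen3-Reranker-0.6B: scores docs against a query.
  reranker = ov_genai.TextRerankPipeline("Qwen3-Reranker-0.6B-ov", "CPU")
  ranked = reranker.rerank("Which devices does OpenVINO target?", docs)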
- Broader LLM model support and more model compression 
  techniques
  * Gold support for Windows ML* enables developers to deploy AI 
    models and applications effortlessly across CPUs, GPUs, and 
    NPUs on Intel® Core™ Ultra processor-powered AI PCs.
  * The Neural Network Compression Framework (NNCF) ONNX backend
    now supports INT8 static post-training quantization (PTQ)
    and INT8/INT4 weight-only compression, ensuring accuracy
    parity with OpenVINO IR format models. SmoothQuant algorithm
    support has been added for INT8 quantization (a quantization
    sketch follows this changelog item).
  * Accelerated multi-token generation for GenAI, leveraging 
    optimized GPU kernels to deliver faster inference, smarter 
    KV-cache reuse, and scalable LLM performance.
  * GPU plugin updates include improved performance with prefix
    caching for chat-history scenarios and enhanced LLM accuracy
    with dynamic quantization support for INT8 (a prefix-caching
    sketch follows this changelog item).
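
A hedged sketch of the new NNCF ONNX workflows; the model paths,
input name, and calibration data below are placeholders, and
SmoothQuant is switched on through NNCF's advanced quantization
parameters (knob names vary across NNCF versions, so it is omitted):

  import numpy as np
  import onnx
  import nncf

  model = onnx.load("model.onnx")

  # INT8 static post-training quantization: NNCF consumes the ONNX
  # model plus a calibration dataset in the model's input format.
  calibration_items = [
      {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
      for _ in range(10)
  ]
  quantized = nncf.quantize(model, nncf.Dataset(calibration_items))
  onnx.save(quantized, "model_int8.onnx")

  # INT4 weight-only compression of the same ONNX model.
  compressed = nncf.compress_weights(
      onnx.load("model.onnx"),
      mode=nncf.CompressWeightsMode.INT4_SYM,
  )
  onnx.save(compressed, "model_int4.onnx")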
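And a prefix-caching sketch for the GPU chat-history scenario,
assuming the enable_prefix_caching flag on SchedulerConfig as exposed
by recent openvino_genai releases; the model path is a placeholder,
and dynamic INT8 quantization is tuned through separate runtime
properties not shown here:

  import openvino_genai as ov_genai

  # Prefix caching lets the plugin reuse KV-cache blocks shared by
  # successive chat turns instead of recomputing them.
  scheduler_config = ov_genai.SchedulerConfig()
  scheduler_config.enable_prefix_caching = True

  pipe = ov_genai.LLMPipeline("llm-ov", "GPU",
                              scheduler_config=scheduler_config)

  pipe.start_chat()
  print(pipe.generate("Hi, who are you?", max_new_tokens=64))
  # The second turn reuses the cached prefix of the conversation.
  print(pipe.generate("What hardware can you run on?", max_new_tokens=64))
  pipe.finish_chat()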
- More portability and performance to run AI at the edge, in the 
  cloud, or locally.
  * Announcing support for Intel® Core™ Ultra Processor Series 3.
  * Encrypted blob format support added for secure model
    deployment with OpenVINO GenAI. Model weights and artifacts
    are stored and transmitted in an encrypted format, reducing
    the risk of IP theft during deployment. Developers can deploy
    with minimal code changes using OpenVINO GenAI pipelines (see
    the decryption sketch after this list).
  * OpenVINO™ Model Server and OpenVINO™ GenAI now extend
    support for Agentic AI scenarios with new features such as
    output parsing and improved chat templates for reliable
    multi-turn interactions, plus preview functionality for the
    Qwen3-30B-A3B model. OVMS also introduces a preview of audio
    endpoints (see the REST sketch after this list).
  * NPU deployment is simplified with batch support, enabling
    seamless model execution across Intel® Core™ Ultra
    processors while eliminating driver dependencies. Models are
    reshaped to batch_size=1 before compilation (see the reshape
    sketch after this list).
  * The improved NVIDIA Triton Server* integration with 
    OpenVINO backend now enables developers to utilize Intel
    GPUs or NPUs for deployment.
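
The decryption sketch referenced above is a generic illustration of
the decrypt-in-memory idea, not the exact OpenVINO GenAI blob format:
encrypted artifacts are decrypted into RAM and handed to the runtime
without plaintext ever touching disk. The XOR cipher and file names
are placeholders:

  import numpy as np
  import openvino as ov

  def decrypt(data: bytes) -> bytes:
      # Hypothetical placeholder; a real deployment would use a
      # proper cipher such as AES-GCM keyed by the deployment.
      return bytes(b ^ 0x5A for b in data)

  core = ov.Core()
  xml_bytes = decrypt(open("model.xml.enc", "rb").read())
  bin_bytes = decrypt(open("model.bin.enc", "rb").read())

  # read_model accepts the IR as a string plus weights as a tensor,
  # so the plaintext model stays in memory only.
  weights = ov.Tensor(np.frombuffer(bin_bytes, dtype=np.uint8).copy())
  model = core.read_model(xml_bytes.decode(), weights)
  compiled = core.compile_model(model, "CPU")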
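The REST sketch: OVMS exposes an OpenAI-compatible chat completions
API under /v3; the port and served model name below are assumptions
for illustration:

  import requests

  resp = requests.post(
      "http://localhost:8000/v3/chat/completions",
      json={
          "model": "Qwen3-30B-A3B",  # whatever name OVMS serves
          "messages": [
              {"role": "user",
               "content": "Summarize the 2025.4 release in one line."},
          ],
          "max_tokens": 64,
      },
  )
  print(resp.json()["choices"][0]["message"]["content"])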
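And the reshape sketch: how the batch_size=1 rule can be applied
explicitly with the core OpenVINO API, reshaping before NPU
compilation and iterating over the batch in the application. The
model path and input shape are placeholders:

  import numpy as np
  import openvino as ov

  core = ov.Core()
  model = core.read_model("model.xml")
  model.reshape([1, 3, 224, 224])   # force batch_size=1 for the NPU
  compiled = core.compile_model(model, "NPU")
  out = compiled.output(0)

  # Feed an 8-image batch one element at a time.
  batch = np.random.rand(8, 3, 224, 224).astype(np.float32)
  results = [compiled([frame[None, ...]])[out] for frame in batch]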

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/openvino?expand=0&rev=44
2025-12-03 02:10:49 +00:00

openvino/_service

<services>
  <service name="obs_scm" mode="manual">
    <param name="url">https://github.com/openvinotoolkit/openvino.git</param>
    <param name="scm">git</param>
    <param name="revision">2025.4.0</param>
    <param name="version">2025.4.0</param>
    <param name="submodules">enable</param>
    <param name="filename">openvino</param>
    <param name="exclude">.git</param>
  </service>
  <service name="tar" mode="buildtime" />
  <service name="recompress" mode="buildtime">
    <param name="file">*.tar</param>
    <param name="compression">zstd</param>
  </service>
</services>