Files
openvino/_service
Alessandro de Oliveira Faria 63929e8cbb - Update to 2025.3.0
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported: Phi-4-mini-reasoning, AFM-4.5B,
    Gemma-3-1B-it, Gemma-3-4B-it, and Gemma-3-12B.
  * NPU support added for: Qwen3-1.7B, Qwen3-4B, and Qwen3-8B.
  * LLMs optimized for NPU are now available in the OpenVINO
    Hugging Face collection.
- Broader LLM model support and more model compression techniques
  * The NPU plug-in adds support for longer contexts of up to
    8K tokens, dynamic prompts, and dynamic LoRA for improved
    LLM performance (sketched after this list).
  * The NPU plug-in now supports dynamic batch sizes by reshaping
    the model to a batch size of 1 and concurrently managing
    multiple inference requests, enhancing performance and
    optimizing memory utilization.
  * Accuracy of GenAI models on both built-in and discrete
    graphics is improved through per-channel key cache
    compression, in addition to the existing per-token
    KV cache compression method.
  * OpenVINO™ GenAI introduces TextRerankPipeline for improved
    retrieval relevance and RAG pipeline accuracy, plus
    Structured Output, which makes responses and function
    calls more reliable by enforcing adherence to predefined
    formats (see the TextRerankPipeline sketch after this
    list).
- More portability and performance to run AI at the edge,
  in the cloud, or locally.
  * Announcing support for Intel® Arc™ Pro B-Series 
    (B50 and B60).
  * Preview: GGUF-format Hugging Face models enabled for
    OpenVINO GenAI are now supported by the OpenVINO™ Model
    Server for popular LLM architectures such as DeepSeek
    Distill, Qwen2, Qwen2.5, and Llama 3. This reduces memory
    footprint and simplifies integration for GenAI workloads
    (see the GGUF sketch after this list).
  * With improved reliability and tool call accuracy, 
    the OpenVINO™ Model Server boosts support for 
    agentic AI use cases on AI PCs, while enhancing 
    performance on Intel CPUs, built-in GPUs, and NPUs.
  * INT4 data-aware weights compression is now supported in
    the Neural Network Compression Framework (NNCF) for ONNX
    models, reducing memory footprint while maintaining
    accuracy and enabling efficient deployment in
    resource-constrained environments (see the NNCF sketch
    after this list).
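
The sketches below illustrate the items above. They are minimal
Python examples written for this changelog, not code shipped with
the release. First, the longer NPU context window with OpenVINO
GenAI; the model directory and the MAX_PROMPT_LEN property are
assumptions based on the GenAI NPU documentation:

  import openvino_genai as ov_genai

  # NPU-optimized, pre-converted model directory (hypothetical).
  model_dir = "Qwen3-8B-int4-ov"

  # MAX_PROMPT_LEN raises the static prompt limit on NPU
  # (assumed property name).
  pipe = ov_genai.LLMPipeline(model_dir, "NPU", MAX_PROMPT_LEN=8192)

  print(pipe.generate("Summarize this changelog.", max_new_tokens=256))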
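
Next, the new TextRerankPipeline; the constructor and rerank()
call mirror the other GenAI pipelines and are assumptions, as is
the reranker model directory:

  import openvino_genai as ov_genai

  # Reranker model exported to OpenVINO IR (hypothetical path).
  reranker = ov_genai.TextRerankPipeline("bge-reranker-base-ov", "CPU")

  query = "How do I enable KV cache compression?"
  docs = [
      "OpenVINO supports per-token and per-channel KV cache compression.",
      "The tar service repacks the fetched sources at build time.",
  ]

  # rerank() is assumed to return the documents scored by
  # relevance to the query.
  for result in reranker.rerank(query, docs):
      print(result)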
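
For the GGUF preview, a sketch of loading a GGUF checkpoint
directly with OpenVINO GenAI (the Model Server consumes such
models through its model repository); passing a .gguf file to
LLMPipeline and the file name are assumptions:

  import openvino_genai as ov_genai

  # GGUF checkpoint downloaded from Hugging Face (hypothetical name).
  pipe = ov_genai.LLMPipeline("qwen2.5-1.5b-instruct-q4_k_m.gguf", "CPU")

  print(pipe.generate("What is OpenVINO?", max_new_tokens=128))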
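
Finally, INT4 data-aware weights compression with NNCF on an ONNX
model; compress_weights, CompressWeightsMode, and nncf.Dataset are
existing NNCF APIs, but the ONNX specifics, file names, and
calibration samples here are assumptions:

  import onnx
  import nncf

  # ONNX model to compress (hypothetical file name).
  model = onnx.load("llama-3-8b-instruct.onnx")

  # A small calibration set makes the compression data-aware
  # (placeholder samples keyed by an assumed input name).
  dataset = nncf.Dataset([{"input_ids": [[1, 2, 3, 4]]}])

  # Compress most weights to INT4, keeping a share in INT8 to
  # preserve accuracy.
  compressed = nncf.compress_weights(
      model,
      mode=nncf.CompressWeightsMode.INT4_SYM,
      ratio=0.8,
      group_size=128,
      dataset=dataset,
  )

  onnx.save(compressed, "llama-3-8b-instruct-int4.onnx")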

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/openvino?expand=0&rev=42
2025-09-07 04:23:58 +00:00


<services>
  <service name="obs_scm" mode="manual">
    <param name="url">https://github.com/openvinotoolkit/openvino.git</param>
    <param name="scm">git</param>
    <param name="revision">2025.3.0</param>
    <param name="version">2025.3.0</param>
    <param name="submodules">enable</param>
    <param name="filename">openvino</param>
    <param name="exclude">.git</param>
  </service>
  <service name="tar" mode="buildtime" />
  <service name="recompress" mode="buildtime">
    <param name="file">*.tar</param>
    <param name="compression">zstd</param>
  </service>
</services>
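
Because the obs_scm service runs in manual mode, the sources are
fetched locally before committing, for example with (the exact
subcommand depends on the osc version):

  osc service manualrun

The tar and recompress services then run at build time, packing
the fetched git checkout into a zstd-compressed tarball.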