1a25d5953e Accepting request 1244529 from home:cabelo:branches:science:machinelearning
- Update to 2025.0.0
- More GenAI coverage and framework integrations to minimize code
  changes
  * New models supported: Qwen 2.5, DeepSeek-R1-Distill-Llama-8B,
    DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B,
    FLUX.1 Schnell, and FLUX.1 Dev.
  * Whisper Model: Improved performance on CPUs, built-in GPUs,
    and discrete GPUs with the GenAI API (see the sketches after
    these notes).
  * Preview: Introducing NPU support for torch.compile, giving
    developers the ability to use the OpenVINO backend to run the
    PyTorch API on NPUs. 300+ deep learning models enabled from
    the TorchVision, Timm, and TorchBench repositories (sketch
    below).
- Broader Large Language Model (LLM) support and more model 
  compression techniques.
  * Preview: Addition of Prompt Lookup to the GenAI API improves
    2nd token latency for LLMs by effectively utilizing predefined
    prompts that match the intended use case (sketch below).
  * Preview: The GenAI API now offers image-to-image inpainting
    functionality. This feature enables models to generate
    realistic content by inpainting specified modifications and
    seamlessly integrating them with the original image (sketch
    below).
  * Asymmetric KV Cache compression is now enabled for INT8 on
    CPUs, resulting in lower memory consumption and improved 2nd
    token latency, especially when dealing with long prompts that
    require significant memory. The option must be explicitly
    specified by the user (sketch below).
- More portability and performance to run AI at the edge, in the
  cloud, or locally.
  * Support for the latest Intel® Core™ Ultra 200H series
    processors (formerly codenamed Arrow Lake-H).
  * Integration of the OpenVINO™ backend with the Triton
    Inference Server allows developers to utilize the Triton
    server for enhanced model serving performance when deploying
    on Intel CPUs (config sketch below).
  * Preview: A new OpenVINO™ backend integration allows
    developers to leverage OpenVINO performance optimizations
    directly within Keras 3 workflows for faster AI inference on
    CPUs, built-in GPUs, discrete GPUs, and NPUs. This feature is
    available with the latest Keras 3.8 release (sketch below).
  * The OpenVINO Model Server now supports native Windows Server
    deployments, allowing developers to leverage better
    performance by eliminating container overhead and simplifying
    GPU deployment (example invocation below).
- Support Change and Deprecation Notices
  * Now deprecated:
    + Legacy prefixes l_, w_, and m_ have been removed from
      OpenVINO archive names.
    + The runtime namespace for the Python API has been marked as
      deprecated and is designated for removal in 2026.0. The new
      namespace structure has been delivered, and migration is
      possible immediately. Details will be communicated through
      warnings and via documentation.
    + The NNCF create_compressed_model() method is deprecated.
      The nncf.quantize() method is now recommended for
      Quantization-Aware Training of PyTorch and TensorFlow
      models (sketch below).
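
The sketches below illustrate selected items above. They are
minimal, hedged examples, not authoritative usage: model paths,
file names, and sample data are placeholders rather than part of
the release.

* Whisper via the GenAI API: assumes a Whisper model exported to
  OpenVINO format and 16 kHz mono audio; loading audio with
  librosa mirrors the official sample but is not part of the API.

    import librosa
    import openvino_genai

    # WhisperPipeline expects 16 kHz mono float samples
    raw_speech, _ = librosa.load("sample.wav", sr=16000)
    pipe = openvino_genai.WhisperPipeline("whisper-base-ov", "CPU")
    print(pipe.generate(raw_speech.tolist()))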
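
* torch.compile on NPU: importing openvino.torch registers the
  "openvino" backend; the target device is passed through the
  backend options. The model and input here are placeholders.

    import torch
    import openvino.torch  # registers the "openvino" backend
    import torchvision.models as models

    model = models.resnet50(weights="DEFAULT").eval()
    compiled = torch.compile(model, backend="openvino",
                             options={"device": "NPU"})
    out = compiled(torch.randn(1, 3, 224, 224))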
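
* Prompt Lookup in the GenAI API: enabled when constructing the
  pipeline and tuned via the generation config; field names follow
  the GenAI prompt-lookup sample, and the model directory is a
  placeholder.

    import openvino_genai

    pipe = openvino_genai.LLMPipeline("llm-model-dir", "CPU",
                                      prompt_lookup=True)
    config = openvino_genai.GenerationConfig()
    config.max_new_tokens = 100
    config.num_assistant_tokens = 5  # candidates proposed per step
    config.max_ngram_size = 3        # n-gram window for the lookup
    print(pipe.generate("What is OpenVINO?", config))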
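
* Image-to-image inpainting with the GenAI API: assumes an
  exported diffusion model; the to_tensor helper below is a local
  stand-in for the image-reading utility shipped with the samples.

    import numpy as np
    import openvino as ov
    import openvino_genai
    from PIL import Image

    def to_tensor(path):
        # HWC uint8 image with a leading batch dimension
        arr = np.array(Image.open(path).convert("RGB"))
        return ov.Tensor(arr[None])

    pipe = openvino_genai.InpaintingPipeline("inpainting-model-dir",
                                             "CPU")
    # White pixels in the mask mark the region to repaint
    result = pipe.generate("a red brick wall",
                           to_tensor("photo.png"),
                           to_tensor("mask.png"))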
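
* INT8 KV Cache on CPU: the cache precision is an explicit plugin
  property. The sketch sets the KV_CACHE_PRECISION hint when
  compiling a model; whether this alone selects the new asymmetric
  scheme is an assumption based on the note above.

    import openvino as ov

    core = ov.Core()
    compiled = core.compile_model("model.xml", "CPU",
                                  {"KV_CACHE_PRECISION": ov.Type.u8})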
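
* Triton Inference Server with the OpenVINO backend: serving is
  configured per model. A minimal config.pbtxt sketch; the model
  name, tensor names, and shapes are placeholders.

    name: "resnet"
    backend: "openvino"
    max_batch_size: 1
    input [
      { name: "input", data_type: TYPE_FP32, dims: [3, 224, 224] }
    ]
    output [
      { name: "output", data_type: TYPE_FP32, dims: [1000] }
    ]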
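
* Keras 3 on the OpenVINO backend (Keras >= 3.8, inference only):
  the backend must be selected before Keras is imported; the model
  is a placeholder.

    import os
    os.environ["KERAS_BACKEND"] = "openvino"  # set before import

    import numpy as np
    import keras

    model = keras.applications.MobileNetV2()
    x = np.random.rand(1, 224, 224, 3).astype("float32")
    preds = model.predict(x)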
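
* OpenVINO Model Server on native Windows: a sketch of a
  bare-metal start; the model name, path, and port numbers are
  placeholders using the standard OVMS flags.

    ovms.exe --model_name resnet --model_path c:\models\resnet ^
             --port 9000 --rest_port 8000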
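
* Migrating to nncf.quantize() for PyTorch QAT: quantize with a
  calibration set first, then fine-tune the returned model with
  the usual training loop. The random calibration data is a
  placeholder for a real loader.

    import nncf
    import torch
    import torchvision.models as models

    model = models.resnet18(weights="DEFAULT").eval()
    data = [torch.randn(1, 3, 224, 224) for _ in range(10)]
    calibration = nncf.Dataset(data)  # identity transform
    quantized = nncf.quantize(model, calibration)
    # fine-tune `quantized` as usual to complete QAT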

OBS-URL: https://build.opensuse.org/request/show/1244529
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/openvino?expand=0&rev=25
2025-02-10 07:49:40 +00:00