121 Commits

6eacdbbad9 Accepting request 1324424 from science:machinelearning
- Update to version 7540:
  * Major CUDA improvements including Blackwell native build fixes,
    experimental MXFP4 support, optimized CUMSUM paths, new ops
    (FILL, DIAG, TRI, CUMSUM), FA/MMA overflow fixes, better GPU
    utilization defaults, and multiple correctness and stability
    fixes.
  * Significant Vulkan backend work with new operators, faster
    FA/MMV/MMVQ paths, async tensor and event support, rope and MoE
    improvements, reduced data races, better logging, and numerous
    performance optimizations.
  * CPU and GGML backend enhancements covering ARM64, RVV, RISC-V,
    ZenDNN, and Hexagon, with new and optimized kernels, improved
    repack logic, allocator fixes, graph reuse, and better error
    handling.
  * Expanded support and fixes across Metal, HIP, SYCL, OpenCL,
    CANN, WebGPU, and Hexagon backends.
  * Added and improved support for many models and architectures
    including Qwen3-Next, Nemotron v2/v3, Llama 4 scaling, GLM4V,
    MiMo-V2-Flash, Granite Embeddings, KORMo, Rnj-1, LFM2 text/
    audio/MoE, Mistral and Mistral-Large variants, DeepSeek
    variants, ASR conformer models, and multimodal pipelines.
  * Fixed multiple model issues such as missing tensors,
    division-by-zero errors, rope scaling regressions, MoE edge
    cases, bidirectional architectures, and multimodal loading
    errors.
  * Server and router improvements including safer multithreading,
    race-condition fixes, multi-model routing, preset cascading,
    startup model loading, auto-sleep on idle, improved speculative
    decoding, better RPC validation, and friendlier error handling.
  * CLI and argument-parsing improvements with new flags, negated

OBS-URL: https://build.opensuse.org/request/show/1324424
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=24
2025-12-26 13:37:57 +00:00
846ae27b53 - Update to version 7540:
  * Major CUDA improvements including Blackwell native build fixes,
    experimental MXFP4 support, optimized CUMSUM paths, new ops
    (FILL, DIAG, TRI, CUMSUM), FA/MMA overflow fixes, better GPU
    utilization defaults, and multiple correctness and stability
    fixes.
  * Significant Vulkan backend work with new operators, faster
    FA/MMV/MMVQ paths, async tensor and event support, rope and MoE
    improvements, reduced data races, better logging, and numerous
    performance optimizations.
  * CPU and GGML backend enhancements covering ARM64, RVV, RISC-V,
    ZenDNN, and Hexagon, with new and optimized kernels, improved
    repack logic, allocator fixes, graph reuse, and better error
    handling.
  * Expanded support and fixes across Metal, HIP, SYCL, OpenCL,
    CANN, WebGPU, and Hexagon backends.
  * Added and improved support for many models and architectures
    including Qwen3-Next, Nemotron v2/v3, Llama 4 scaling, GLM4V,
    MiMo-V2-Flash, Granite Embeddings, KORMo, Rnj-1, LFM2 text/
    audio/MoE, Mistral and Mistral-Large variants, DeepSeek
    variants, ASR conformer models, and multimodal pipelines.
  * Fixed multiple model issues such as missing tensors,
    division-by-zero errors, rope scaling regressions, MoE edge
    cases, bidirectional architectures, and multimodal loading
    errors.
  * Server and router improvements including safer multithreading,
    race-condition fixes, multi-model routing, preset cascading,
    startup model loading, auto-sleep on idle, improved speculative
    decoding, better RPC validation, and friendlier error handling.
  * CLI and argument-parsing improvements with new flags, negated

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=120
2025-12-26 02:15:13 +00:00
763622b525 Accepting request 1321203 from science:machinelearning
- Switch to .so versioning, following upstream
- Update to version 7266:

OBS-URL: https://build.opensuse.org/request/show/1321203
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=23
2025-12-05 15:56:38 +00:00
0f17a147fa OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=118
2025-12-04 23:38:13 +00:00
f050e5debb - Switch to .so versioning, following upstream
- Update to version 7266:
  * Added support for several new and updated models including
    Ministral3, Qwen3 Next, RND1 Diffusion LM, AfmoeForCausalLM,
    openPangu-Embedded, and improved detection for
    GigaChat3-10-A1.8B.
  * Server improvements: multi-model API, Anthropic Messages API,
    task generator API, HTTP interface split, jinja enabled by
    default.
  * Chat and parsing improvements: generalized XML-style tool-call
    parsing, composable PEG parser combinators.
  * WebUI enhancements: restored HTML in Markdown tables, rehype
    plugin improvements, attachment-handling UX improvements,
    Harmony tool-call visualization, new keyboard shortcuts,
    clickability fixes, autoscroll toggle, and new “Continue”
    action.
  * CUDA backend improvements: FP16 restrictions, memory bandwidth
    improvements, stream-based concurrency, MMQ and fusion fixes,
    rope fusion corrections, improved handling of nb00/nb02, and
    various stability fixes.
  * Vulkan backend improvements: new operators, improved FA and
    MMVQ support, async graph_compute, conv2d spec constants, i32 copy
    support.
  * GGML and CPU backend updates: expanded RVV, ARM64, RISC-V
    feature detection; new CPU intrinsic implementations; improved
    GEMM/GEMV repack kernels; ops additions.
  * OpenCL, SYCL, HIP, MUSA, and Hexagon improvements: expanded
    operator support, new kernels, fallback logic for older SoCs,
    buffer handling fixes.
  * MTMD (multimodal) improvements: warmup toggles, CLI log-noise
    reduction, image embedding size fixes and audio model patch
    fixes.
  * General performance, stability, and correctness improvements
    across CPU, GPU, schedulers, memory management, kv-cache,
    async behavior, thread safety, and operator fusion.
  * Full commit log:
    https://github.com/ggml-org/llama.cpp/compare/b6937...b7266

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=117
2025-12-04 14:34:53 +00:00
a541da59b8 Accepting request 1315691 from science:machinelearning
- Update to version 6937:
  * New model: Janus Pro
  * New model: Minimax M2
  * New model: Granite Hybrid nano types
  * New model: support for qwen3vl series
  * New model: support for CogVLM model
  * New model: LightOnOCR-1B model
  * New model: BailingMoeV2 support
  * New model: Granite Hybrid types
  * New model: Support home-cooked Mistral Small Omni
  * New model: Support LiquidAI LFM2-MoE hybrid model
  * New model: Granite docling + Idefics3 preprocessing (SmolVLM)
  * New model: EmbeddingGemma, adding support for
    SentenceTransformers Dense Modules
  * Server improvements, OpenAI API compatibility, optimizations,
    and bug fixes
  * Vulkan backend improvements, optimizations, and bug fixes
  * OpenCL backend fixes
  * CPU backend optimizations
  * Multimodal (mtmd) improvements
  * WebUI enhancements
  * Architecture-specific improvements
  * llama core improvements
  * Memory management improvements
  * Conversion and quantization tools enhancements
  * Grammar and sampling improvements
  * Chat and prompts enhancements
  * General fixes and improvements
  * RPC improvements and bug fixes
  * Full commit log:

OBS-URL: https://build.opensuse.org/request/show/1315691
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=22
2025-11-06 17:12:47 +00:00
b89927e8a7 - Update to version 6937:
  * New model: Janus Pro
  * New model: Minimax M2
  * New model: Granite Hybrid nano types
  * New model: support for qwen3vl series
  * New model: support for CogVLM model
  * New model: LightOnOCR-1B model
  * New model: BailingMoeV2 support
  * New model: Granite Hybrid types
  * New model: Support home-cooked Mistral Small Omni
  * New model: Support LiquidAI LFM2-MoE hybrid model
  * New model: Granite docling + Idefics3 preprocessing (SmolVLM)
  * New model: EmbeddingGemma, adding support for
    SentenceTransformers Dense Modules
  * Server improvements, OpenAI API compatibility, optimizations,
    and bug fixes
  * Vulkan backend improvements, optimizations, and bug fixes
  * OpenCL backend fixes
  * CPU backend optimizations
  * Multimodal (mtmd) improvements
  * WebUI enhancements
  * Architecture-specific improvements
  * llama core improvements
  * Memory management improvements
  * Conversion and quantization tools enhancements
  * Grammar and sampling improvements
  * Chat and prompts enhancements
  * General fixes and improvements
  * RPC improvements and bug fixes
  * Full commit log:

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=115
2025-11-03 18:57:48 +00:00
12f7fce1c0 Accepting request 1309029 from science:machinelearning
- Update to version 6690:
  * Full commit log:
    https://github.com/ggml-org/llama.cpp/compare/b6605...b6690

OBS-URL: https://build.opensuse.org/request/show/1309029
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=21
2025-10-05 15:51:16 +00:00
a8cb1efc19 - Update to version 6690:
  * ggml: bump to v0.9.4; fix graph reallocation and multi-chunk
    dependencies
  * ggml webgpu: add soft_max op; optimize rms_norm; extend
    operator support
  * ggml-riscv: add Spacemit backend
  * vulkan: improve shader threading and incremental builds
  * vulkan: fix FA coopmat1 array indexing and quantized flash
    attention
  * vulkan: replace maxMemoryAllocationSize, improve header
    compatibility
  * vulkan: add bounds checks in flash attention; 64-bit im2col
    support
  * rpc: add multi-device support; validate src buffer copies
  * server: add context checkpointing for hybrid and recurrent
    models
  * chat: add Magistral thinking support; fix missing sibling
    messages
  * webui: fix payloads and routing; improve mobile and dialog
    behavior
  * model: implement Apertus; support GLM 4.6
  * llama: fix shapes for BERT/MPT q/k norm; improve PLaMo2
    loading
  * common: introduce http.h client; disable progress bar without
    tty
  * common: remove common_has_curl(); simplify etag tracking
  * opencl: support pad_ext and ne3 in get_rows
  * various minor fixes for scrolling, sampling, and chat block
    handling
  * Full commit log:

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=113
2025-10-04 22:06:53 +00:00
c1a7fde844 Accepting request 1307499 from science:machinelearning
OBS-URL: https://build.opensuse.org/request/show/1307499
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=20
2025-09-29 14:32:13 +00:00
7eecb82de8 - Update to b6605:
  * Added docker protocol support and resumable downloads for
    llama-server
  * New models: LLaDA-7b-MoE, Grok-2, GroveMoE, OLMo3, LiquidAI
    LFM2-2.6B
  * Added conversion support for GraniteHybrid (non-hybrid attn)
    and Llama4ForCausalLM
  * llama: support for qwen3 reranker, T5 unequal encoder-decoder
    layers, seq limit bumped 64 → 256
  * Bench improvements: list devices, multiple devices, n-cpu-moe
  * Vulkan: conv_transpose_2d, GET_ROWS, iGPU device selection,
    buffer optimizations, shader fixes, OOM handling
  * ggml: semantic versioning, backend/device extensions,
    optimizations, fixes for embedding, quantization, padding
  * ggml-cpu: SIMD support (MXFP4 for s390x), cpumask respect,
    ARM INT8 checks
  * Common: fixes for memory corruption, offline mode without curl,
    switch to cpp-httplib
  * Server: SSE/OpenAI error handling, usage stats opt-in, external
    test server, removed LLAMA_SERVER_SSL
  * WebUI: migrated to SvelteKit, hash-based routing, chunk
    handling fixes
  * Fixes across model-conversion, rpc, media, devops, embedding
    docs, typos
  * Full commit log:
    https://github.com/ggml-org/llama.cpp/compare/b6428...b6605

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=111
2025-09-27 17:34:31 +00:00
636fe3b65d Accepting request 1305191 from science:machinelearning
Automatic submission by obs-autosubmit

OBS-URL: https://build.opensuse.org/request/show/1305191
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=19
2025-09-16 16:20:03 +00:00
0ec4f68c01 - Update to version 6428
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=109
2025-09-09 12:29:13 +00:00
29895c65c7 Accepting request 1302234 from science:machinelearning
Automatic submission by obs-autosubmit

OBS-URL: https://build.opensuse.org/request/show/1302234
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=18
2025-09-02 15:58:24 +00:00
f6cca5429e Accepting request 1301212 from science:machinelearning
Automatic submission by obs-autosubmit

OBS-URL: https://build.opensuse.org/request/show/1301212
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=17
2025-08-25 18:38:58 +00:00
c5d8653d73 - Update to version 6269:
  * Model and conversion: support for Seed-OSS, GPT-OSS
    response_format, interns1-mini, Ernie 4.5, gpt-oss type
    strings, improved Mistral templates, new model conversion
    tool/example with torch-cpu.
  * Vulkan backend: multiple optimizations (rms_norm, mul_mat_id,
    synchronization, conv2d, subgroup ops), new ops (exp,
    conv_2d_dw f16, ggml_mean).
  * GGML/CPU: added conv3d op, WebGPU quantization support,
    Q5_0/Q5_1 on s390x, mxfp4 intrinsics on ppc64le.
  * Server and chat: multimodal completion and embeddings
    JSON support, improved OpenAI API compatibility and usage
    statistics, disabled context shift by default, fixed ordering
    of tasks, webui issues, debug assertions, clarified
    reasoning_format.
  * KV cache: unified handling improvements, support for reuse,
    removal of deprecated APIs, simplifications.
  * Miscellaneous: fixed logging of non-ASCII characters, removed
    deprecated or unused code and build artifacts.
  * Full commit log:
    https://github.com/ggml-org/llama.cpp/compare/b6188...b6269

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=106
2025-08-25 14:14:05 +00:00
3764c5b78a - Update to version 6188:
  * Vulkan backend improvements: larger workgroups, optimized
    argsort, fused adds, bounds checking, out-of-bounds and compile
    warning fixes, performance logging.
  * OpenCL backend: initial FA and mxfp4 support.
  * Model support: vision LiquidAI LFM2-VL family, 18-layer Gemma
    3-270m model type.
  * Common: fixed double BOS, improved chat templates, added
    override-tensor and CPU MoE draft parameters.
  * GGML: initial IBM zDNN backend, rope_multi update, conv_1d_dw
    bug fix, block_iq4_nlx8 repack, improved Mistral integration.
  * Server: SWA checkpoints, -td/-tbd parameters, harmony thought
    message filtering.
  * Perplexity: improved error hints and constraint reporting.
  * GPT-OSS: harmony parsing implemented.
- Add LLAMA_BUILD_NUMBER and LLAMA_VERSION to the build
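
How a consumer might read those two values is sketched below in C;
it assumes the package exports them as preprocessor definitions (the
exact mechanism, the macro spellings, and whether LLAMA_VERSION
expands to a bare token or a quoted string are assumptions of this
sketch, not documented upstream API):

    #include <stdio.h>

    // Stringify helper so the sketch works whether LLAMA_VERSION
    // expands to a bare token (6188) or to a quoted string ("6188").
    #define LLAMA_STR2(x) #x
    #define LLAMA_STR(x) LLAMA_STR2(x)

    int main(void) {
    #ifdef LLAMA_BUILD_NUMBER
        // Build number injected at compile time by the package build.
        printf("llama.cpp build number: %d\n", LLAMA_BUILD_NUMBER);
    #endif
    #ifdef LLAMA_VERSION
        printf("llama.cpp version: %s\n", LLAMA_STR(LLAMA_VERSION));
    #endif
        return 0;
    }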

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=105
2025-08-17 22:18:58 +00:00
8ee8f8134c Accepting request 1299150 from science:machinelearning
- Update to version 6139:
  * opencl: allow mixed f16/f32 `add` (#15140)
  * mtmd : Fix MinicpmV model converter and clip to avoid using
    hardcode. (#14750)
  * chat : hotfix gpt-oss jinja raising an exception (#15243)
  * server : allow specifying reasoning_format in HTTP request
    (#15238)
  * kv-cache : fix seq_rm with seq_id == -1 (#15226)
  * kv-cache : log (debug) all streams in find_slot (#15176)
  * convert : improve Mistral models integration (#14737)
  * kleidiai: fix unsigned overflow bug (#15150)

- Add LLAMA_BUILD_NUMBER and LLAMA_VERSION to the build 

- Update to version 6121:
  * Support intern-s1
  * opencl: add swiglu_oai and add_id
  * vulkan: support fattn sinks
  * vulkan: Add env var to disable host visible vidmem
  * ggml: Skip backend library linking code when GGML_BACKEND_DL=ON
  * ggml : fix fallback to CPU for unsupported ops
  * Various bug fixes
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b6100...b6121

OBS-URL: https://build.opensuse.org/request/show/1299150
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=16
2025-08-13 14:30:52 +00:00
755973372c - Update to version 6139:
  * opencl: allow mixed f16/f32 `add` (#15140)
  * mtmd : Fix MinicpmV model converter and clip to avoid using
    hardcode. (#14750)
  * chat : hotfix gpt-oss jinja raising an exception (#15243)
  * server : allow specifying reasoning_format in HTTP request
    (#15238); see the sketch after this list
  * kv-cache : fix seq_rm with seq_id == -1 (#15226)
  * kv-cache : log (debug) all streams in find_slot (#15176)
  * convert : improve Mistral models integration (#14737)
  * kleidiai: fix unsigned overflow bug (#15150)
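
The reasoning_format field can now be set per request rather than
only on the server command line; below is a hedged C/libcurl sketch
posting it to /v1/chat/completions. The endpoint path, the default
port 8080, and the "deepseek" value are assumptions based on
llama-server's documented options, not on this changelog:

    #include <curl/curl.h>

    int main(void) {
        const char * body =
            "{"
            "\"messages\":[{\"role\":\"user\",\"content\":\"hi\"}],"
            "\"reasoning_format\":\"deepseek\""
            "}";
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL * curl = curl_easy_init();
        if (!curl) {
            return 1;
        }
        struct curl_slist * hdrs =
            curl_slist_append(NULL, "Content-Type: application/json");
        curl_easy_setopt(curl, CURLOPT_URL,
                         "http://127.0.0.1:8080/v1/chat/completions");
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, hdrs);
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, body);
        // Response body (including the parsed reasoning) goes to stdout.
        CURLcode rc = curl_easy_perform(curl);
        curl_slist_free_all(hdrs);
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return rc == CURLE_OK ? 0 : 1;
    }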

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=103
2025-08-12 18:01:43 +00:00
0e11fa8fd1 Add LLAMA_BUILD_NUMBER and LLAMA_VERSION to the build
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=102
2025-08-12 17:38:59 +00:00
02bb0a433c - Update to version 6121:
  * Support intern-s1
  * opencl: add swiglu_oai and add_id
  * vulkan: support fattn sinks
  * vulkan: Add env var to disable host visible vidmem
  * ggml: Skip backend library linking code when GGML_BACKEND_DL=ON
  * ggml : fix fallback to CPU for unsupported ops
  * Various bug fixes
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b6100...b6121

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=101
2025-08-08 23:44:39 +00:00
6e32983ab7 Accepting request 1298007 from science:machinelearning
- Drop 0001-dl-load-path.patch: use GGML_BACKEND_DIR instead
- Enable loading backends dynamically
- Update to version 6100:
  * llama : add gpt-oss (#15091)
  * llama : add --n-cpu-moe option (#15077)
  * llama : enable LLAMA_SET_ROWS=1 by default (#14959)
  * server : add openai-style logit_bias support (#14946)
  * server : implement universal assisted decoding (#12635)
  * mtmd : support MiniCPM-V 4.0 (#14983)
  * opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)
  * model : add hunyuan dense (#14878)
  * model : add text-only support for Kimi-VL
  * model: support GLM 4.5 family of models (#14939)
  * model : support Qwen3-Embedding (#15023)
  * graph : Optimize graph operations
  * vulkan: various bug fixes and optimizations
  * Various bug fixes

- Update to version 6038:
  * chat : fix kimi-k2 chat template (#14852)
  * common : avoid logging partial messages (which can contain
    broken UTF-8 sequences) (#14937)
  * context : perform output reorder lazily upon access after sync
    (#14853)
  * context : restore preemptive sched reset when LLAMA_SET_ROWS=0
    (#14870)
  * convert : text-only support for GLM-4.1V-9B-Thinking (#14823)
  * embeddings: fix extraction of CLS pooling results (#14927)
  * ggml-cpu : deduplicate scalar implementations (#14897)
  * ggml-cpu : disable GGML_NNPA by default due to instability

OBS-URL: https://build.opensuse.org/request/show/1298007
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=15
2025-08-07 14:48:47 +00:00
031ccc2321 - Enable loading backends dynamically
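
A minimal C sketch of what "loading backends dynamically" means at
the API level, using the registry entry points from upstream's
ggml-backend.h; that the search path comes from the build-time
GGML_BACKEND_DIR setting (with an environment override) is an
assumption based on this changelog, not verified here:

    #include <stdio.h>
    #include "ggml-backend.h"

    int main(void) {
        // Scan the configured backend directory and dlopen every
        // ggml backend module found there.
        ggml_backend_load_all();

        // List whatever got registered (CPU, Vulkan, ...).
        for (size_t i = 0; i < ggml_backend_reg_count(); i++) {
            ggml_backend_reg_t reg = ggml_backend_reg_get(i);
            printf("registered backend: %s\n", ggml_backend_reg_name(reg));
        }
        return 0;
    }
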
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=99
2025-08-06 17:36:49 +00:00
65d2ea5770 - Drop 0001-dl-load-path.patch: use GGML_BACKEND_DIR instead
- Update to version 6100:
  * llama : add gpt-oss (#15091)
  * llama : add --n-cpu-moe option (#15077)
  * llama : enable LLAMA_SET_ROWS=1 by default (#14959)
  * server : add openai-style logit_bias support (#14946)
  * server : implement universal assisted decoding (#12635)
  * mtmd : support MiniCPM-V 4.0 (#14983)
  * opencl: add f16 for `add`, `sub`, `mul`, `div` (#14984)
  * model : add hunyuan dense (#14878)
  * model : add text-only support for Kimi-VL
  * model: support GLM 4.5 family of models (#14939)
  * model : support Qwen3-Embedding (#15023)
  * graph : Optimize graph operations
  * vulkan: various bug fixes and optimizations
  * Various bug fixes

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=98
2025-08-06 17:28:56 +00:00
2fd82620cc Accepting request 1296595 from science:machinelearning
Automatic submission by obs-autosubmit

OBS-URL: https://build.opensuse.org/request/show/1296595
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=14
2025-07-31 15:46:28 +00:00
5fb3a7e0e0 - Update to version 6038:
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=96
2025-07-30 20:42:07 +00:00
17f3fd085f - Update to version 5970:
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5889...b5970

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=95
2025-07-23 14:31:52 +00:00
5fd045ca7d Accepting request 1292534 from science:machinelearning
- Add GGML_NATIVE=OFF build flag
- Update to version 5889:
  * Remove Kompute support
  * Prevent integer overflow in gguf tensor size calculation
    (bsc#1246377) (CVE-2025-53630) (GHSA-vgg9-87g3-85w8); see the
    sketch after this list
  * Improved build-time messaging for ggml_set_rows.
  * Enhanced test coverage for LFM2 and added LFM2 to
    documentation.
  * Synchronized ggml updates and improved Vulkan backend
    (bilinear interpolation, ggml_roll, SET_ROWS, optimizations).
  * Fixed pooled embedding output in server and improved prompt
    processing.
  * Added support for LiquidAI LFM2 hybrid family and Falcon-H1
    models.
  * Improved HIP, OpenCL, and SYCL backend compatibility
    and features.
  * Added new vocabularies and model support
    (midm-2.0, skt/A.X-4.0, SmolLM3, hunyuan moe, Granite Four).
  * Various bug fixes, optimizations, and documentation improvements
    across backends and models.
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5812...b5889
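
The gguf tensor size fix (CVE-2025-53630) above is an overflow
guard; the C sketch below is not the upstream patch, only the
standard checked-multiplication pattern it belongs to, with
illustrative names:

    #include <stdint.h>
    #include <stdbool.h>

    // Illustrative helper: multiply two sizes, reporting overflow
    // instead of silently wrapping around.
    static bool mul_overflows(uint64_t a, uint64_t b, uint64_t * out) {
        if (a != 0 && b > UINT64_MAX / a) {
            return true;   // a * b would not fit in 64 bits
        }
        *out = a * b;
        return false;
    }

    // Tensor byte size = element size times each dimension, checked
    // at every step so a hostile gguf header cannot wrap the size to
    // a small value and cause an undersized allocation.
    static bool tensor_nbytes(const uint64_t ne[4], uint64_t type_size,
                              uint64_t * nbytes) {
        uint64_t n = type_size;
        for (int i = 0; i < 4; i++) {
            if (mul_overflows(n, ne[i], &n)) {
                return false;
            }
        }
        *nbytes = n;
        return true;
    }

    int main(void) {
        const uint64_t ne[4] = { UINT64_MAX / 2, 4, 1, 1 };
        uint64_t nbytes = 0;
        // This hostile shape overflows, so the checked path rejects it.
        bool ok = tensor_nbytes(ne, 4, &nbytes);
        return ok ? 1 : 0;   // expect rejection -> exit 0
    }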

OBS-URL: https://build.opensuse.org/request/show/1292534
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=13
2025-07-15 14:43:18 +00:00
54299e5777 * Remove Kompute support
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=93
2025-07-13 15:14:43 +00:00
21e3ba7e90 - Add GGML_NATIVE=OFF build flag
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=92
2025-07-13 15:12:56 +00:00
287ac0c443 - Add -GGML_NATIVE=OFF build flag
- Update to version 5889:
  * Prevent integer overflow in gguf tensor size calculation
    (bsc#1246377) (CVE-2025-53630) (GHSA-vgg9-87g3-85w8)
  * Improved build-time messaging for ggml_set_rows.
  * Enhanced test coverage for LFM2 and added LFM2 to
    documentation.
  * Synchronized ggml updates and improved Vulkan backend
    (bilinear interpolation, ggml_roll, SET_ROWS, optimizations).
  * Fixed pooled embedding output in server and improved prompt
    processing.
  * Added support for LiquidAI LFM2 hybrid family and Falcon-H1
    models.
  * Improved HIP, OpenCL, and SYCL backend compatibility
    and features.
  * Added new vocabularies and model support
    (midm-2.0, skt/A.X-4.0, SmolLM3, hunyuan moe, Granite Four).
  * Various bug fixes, optimizations, and documentation improvements
    across backends and models.
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5812...b5889

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=91
2025-07-13 15:11:35 +00:00
aee11711a1 Accepting request 1290235 from science:machinelearning
- Update to version 5812:
  * Mamba-2 Support: Initial integration of Mamba-2 architecture.
  * Added support for ERNIE 4.5 0.3B, NeoBERT, Arcee AI's AFM,
    Gemma3n text-only, and dots.llm1 architectures
  * Vulkan Improvements: Support for softmax/FlashAttention
    batch/broadcast, fused RMS_NORM+MUL, and better memory handling
  * GGML Backend: Added REGLU/GEGLU/SWIGLU ops, ggml_set_rows, and
    improved SYCL/OpenCL/Metal support
  * Server Improvements: Jinja template kwargs, draft model cache
    params, and Unix socket support
  * Quantization: User-defined layer pruning and KV override fixes
  * Optimizations: Batched Vulkan mul_mat_id splitting
    and ARM hsum reduction
  * Added GGML version function (see the sketch after this list)
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5699...b5812
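
The "GGML version function" bullet refers to a runtime version
query; a minimal C sketch follows, assuming the ggml_version() and
ggml_commit() entry points added in this range (the commit variant
in particular is an assumption of the sketch):

    #include <stdio.h>
    #include "ggml.h"

    int main(void) {
        // Report the ggml actually linked at runtime, which with
        // shared-library packaging may differ from the build headers.
        printf("ggml version: %s\n", ggml_version());
        printf("ggml commit:  %s\n", ggml_commit());
        return 0;
    }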

OBS-URL: https://build.opensuse.org/request/show/1290235
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=12
2025-07-06 15:07:53 +00:00
7027db2e08 - Update to version 5812:
  * Mamba-2 Support: Initial integration of Mamba-2 architecture.
  * Added support for ERNIE 4.5 0.3B, NeoBERT, Arcee AI's AFM,
    Gemma3n text-only, and dots.llm1 architectures
  * Vulkan Improvements: Support for softmax/FlashAttention
    batch/broadcast, fused RMS_NORM+MUL, and better memory handling
  * GGML Backend: Added REGLU/GEGLU/SWIGLU ops, ggml_set_rows, and
    improved SYCL/OpenCL/Metal support
  * Server Improvements: Jinja template kwargs, draft model cache
    params, and Unix socket support
  * Quantization: User-defined layer pruning and KV override fixes
  * Optimizations: Batched Vulkan mul_mat_id splitting
    and ARM hsum reduction
  * Added GGML version function
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5699...b5812

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=89
2025-07-03 00:33:30 +00:00
e84b2edce8 Accepting request 1286807 from science:machinelearning
- Update to 5699:
  * vocab : prevent integer overflow during load
    (bsc#1244714) (CVE-2025-49847)
  ...
- Update to 5657:
  ...

OBS-URL: https://build.opensuse.org/request/show/1286807
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=11
2025-06-20 14:48:56 +00:00
05fa0fbdf4 - Update to 5699:
  * vocab : prevent integer overflow during load
    (bsc#1244714) (CVE-2025-49847)
  * batch : add LLAMA_BATCH_DEBUG environment variable (see the
    sketch after this list)
  * batch : auto-gen positions + verify multi-sequence input
  * common : suggest --jinja when autodetection fails
  * ggml-cpu: fix uncaught underscore terminators
  * kv-cache : fix use-after-move of defrag info
  * llama : rework embeddings logic
  * llama-chat : do not throw when tool parsing fails
  * llama-chat : fix multiple system message for gemma, orion
  * model : Add support for Arcee AI's upcoming AFM model
  * model : add dots.llm1 architecture support
  * model : add NeoBERT
  * server : When listening on a unix domain socket don't print
    http:// and port
  * quantize : change int to unsigned int for KV overrides
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5657...b5699
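
LLAMA_BATCH_DEBUG above is an environment toggle; the sketch shows
one way to flip it from inside a C program before initializing the
library. The accepted values and the exact log output are
assumptions; this changelog only records that the variable exists:

    #include <stdlib.h>
    #include "llama.h"

    int main(void) {
        // Must be set before llama.cpp reads it; "1" as the enabling
        // value is an assumption of this sketch.
        setenv("LLAMA_BATCH_DEBUG", "1", /*overwrite=*/1);

        llama_backend_init();
        // ... load a model and call llama_decode(); batch contents
        // are then dumped to the log for inspection ...
        llama_backend_free();
        return 0;
    }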

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=87
2025-06-19 00:59:30 +00:00
ecc1b17ccf - Update to 5657:
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=86
2025-06-14 13:16:56 +00:00
c55a8cac44 Accepting request 1283892 from science:machinelearning
Automatic submission by obs-autosubmit

OBS-URL: https://build.opensuse.org/request/show/1283892
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=10
2025-06-10 07:05:25 +00:00
098bcc02b6 - Update to 5556:
  * mtmd : move helpers to dedicated library
  * server: fix remove 'image_url'/'input_audio' json-object
  * llama : add RobertaForSequenceClassification reranker support
  * ggml: aarch64: Implement SVE F32 kernels for Mamba Sequential
    Scan Algorithm
  * llama : add support for jina-reranker-v2
  * arm64: optimize q4_k_q8_k kernel with i8mm
  * llama : use llm_build_granite for minicpm
  * mtmd : drop _shared from libmtmd name, merge helpers into
    libmtmd
  * server: allow unclosed thinking tags
  * llama : use n_swa + n_ubatch cells for SWA cache
  * convert : fix rwkv bos/eos token
  * llama : add support for DistilBert

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=84
2025-05-31 23:45:53 +00:00
b9212f41bb Accepting request 1280718 from science:machinelearning
- Update to 5516:
  * llama : remove llama_kv_cache_view API
  * model : disable SWA for Phi models
  * kv-cache : simplify the interface
  * server : Add the endpoints /api/tags and /api/chat
  * ggml : add ggml_gelu_erf() (see the sketch after this list)
  * hparams : support models for which all layers use SWA
  * opencl: fix a couple of crashes
  * opencl: Add support for multiple devices
  * mtmd : add ultravox audio input
  * server : support audio input
  * server: streaming of tool calls and thoughts when jinja is on
  * mtmd : support Qwen 2.5 Omni
  * ggml : riscv: add xtheadvector support
  * opencl : various optimizations
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5426...b5516
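
ggml_gelu_erf() adds an exact (erf-based) GELU alongside the usual
tanh approximation; the C sketch below runs it through a one-tensor
graph on the CPU backend. The signature is assumed to match ggml's
other unary ops, and the header that declares
ggml_graph_compute_with_ctx() plus the context size are assumptions
of the sketch:

    #include <stdio.h>
    #include <string.h>
    #include "ggml.h"
    #include "ggml-cpu.h"   // ggml_graph_compute_with_ctx (assumed location)

    int main(void) {
        struct ggml_init_params params = {
            /*.mem_size   =*/ 16 * 1024 * 1024,
            /*.mem_buffer =*/ NULL,
            /*.no_alloc   =*/ false,
        };
        struct ggml_context * ctx = ggml_init(params);

        // One input vector, passed through the new exact GELU op.
        struct ggml_tensor * x = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 4);
        struct ggml_tensor * y = ggml_gelu_erf(ctx, x);

        const float in[4] = { -2.0f, -0.5f, 0.5f, 2.0f };
        memcpy(x->data, in, sizeof(in));

        struct ggml_cgraph * gf = ggml_new_graph(ctx);
        ggml_build_forward_expand(gf, y);
        ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

        for (int i = 0; i < 4; i++) {
            printf("gelu_erf(%+.1f) = %+.6f\n",
                   in[i], ((float *) y->data)[i]);
        }
        ggml_free(ctx);
        return 0;
    }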

OBS-URL: https://build.opensuse.org/request/show/1280718
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=9
2025-05-30 12:32:23 +00:00
d0e896b3f4 - Update to 5516:
  * llama : remove llama_kv_cache_view API
  * model : disable SWA for Phi models
  * kv-cache : simplify the interface
  * server : Add the endpoints /api/tags and /api/chat (see the
    sketch after this list)
  * ggml : add ggml_gelu_erf()
  * hparams : support models for which all layers use SWA
  * opencl: fix a couple of crashes
  * opencl: Add support for multiple devices
  * mtmd : add ultravox audio input
  * server : support audio input
  * server: streaming of tool calls and thoughts when jinja is on
  * mtmd : support Qwen 2.5 Omni
  * ggml : riscv: add xtheadvector support
  * opencl : various optimizations
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5426...b5516
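
/api/tags and /api/chat mirror the Ollama-style API; below is a
hedged C/libcurl sketch querying the model list from a locally
running llama-server. The host, the default port 8080, and the
response shape are assumptions of the sketch:

    #include <curl/curl.h>

    int main(void) {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL * curl = curl_easy_init();
        if (!curl) {
            return 1;
        }
        // GET /api/tags lists the models the server is serving, in
        // the Ollama-compatible format this release adds.
        curl_easy_setopt(curl, CURLOPT_URL,
                         "http://127.0.0.1:8080/api/tags");
        CURLcode rc = curl_easy_perform(curl);   // body -> stdout
        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return rc == CURLE_OK ? 0 : 1;
    }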

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=82
2025-05-27 22:55:37 +00:00
8dfa0f3a34 Accepting request 1278459 from science:machinelearning
- Update to 5426:
  * print a hint when loading a model if no backends are loaded
  * vulkan: use scalar FA rather than coopmat2 when N==1
  * mtmd : add vision support for llama 4
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5402...b5426
- Update to 5402
  * removed llava subpackage (#13460)
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5158...b5321
- Update to version 5332:
  * server : vision support via libmtmd

OBS-URL: https://build.opensuse.org/request/show/1278459
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=8
2025-05-20 10:19:52 +00:00
50ec7b4608 - Update to 5426:
  * print a hint when loading a model if no backends are loaded
  * vulkan: use scalar FA rather than coopmat2 when N==1
  * mtmd : add vision support for llama 4
  * Full changelog:
    https://github.com/ggml-org/llama.cpp/compare/b5402...b5426

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=80
2025-05-19 22:19:33 +00:00
3125cf9e4e - Update to 5402
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=79
2025-05-17 13:15:02 +00:00
f19318fe3f - Update to version 5332:
  * server : vision support via libmtmd

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=78
2025-05-09 21:18:48 +00:00
1dd356788c Accepting request 1276203 from science:machinelearning
- Use source urls instead of obs_scm
- Add libllava and libmtmd libraries
- Update to version 5327:
  * A new binary llama-mtmd-cli is introduced to replace llava-cli,
    minicpmv-cli, gemma3-cli (#13012) and qwen2vl-cli (#13141),
    libllava will be deprecated
  * Full changes here:
    https://github.com/ggml-org/llama.cpp/compare/b5158...b5321
- Delete patch 0002-build-main-cli.patch: build system changed
  upstream

OBS-URL: https://build.opensuse.org/request/show/1276203
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=7
2025-05-09 16:52:00 +00:00
6553a34765 OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=76
2025-05-09 11:08:31 +00:00
aee1933fe1 - Delete patch 0002-build-main-cli.patch: build system changed
  upstream

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=75
2025-05-09 11:08:05 +00:00
6c3349496a OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=74
2025-05-09 11:04:22 +00:00
68410760e1 OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=73
2025-05-09 11:02:25 +00:00
fce1fbe866 - Use source urls instead of obs_scm
- Update to version 5327:
  * A new binary llama-mtmd-cli is introduced to replace llava-cli,
    minicpmv-cli, gemma3-cli (#13012) and qwen2vl-cli (#13141),
    libllava will be deprecated
  * Full changes here:
    https://github.com/ggml-org/llama.cpp/compare/b5158...b5321
- Disable patch 0001-dl-load-path.patch

OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=72
2025-05-09 11:00:51 +00:00