- Update to version 7540:
* Major CUDA improvements including Blackwell native build fixes,
experimental MXFP4 support, optimized CUMSUM paths, new ops
(FILL, DIAG, TRI, CUMSUM; see the sketch after this list),
FA/MMA overflow fixes, better GPU utilization defaults, and
multiple correctness and stability fixes.
* Significant Vulkan backend work with new operators, faster
FA/MMV/MMVQ paths, async tensor and event support, rope and MoE
improvements, reduced data races, better logging, and numerous
performance optimizations.
* CPU and GGML backend enhancements covering ARM64, RVV, RISC-V,
ZenDNN, and Hexagon, with new and optimized kernels, improved
repack logic, allocator fixes, graph reuse, and better error
handling.
* Expanded support and fixes across Metal, HIP, SYCL, OpenCL,
CANN, WebGPU, and Hexagon backends.
* Added and improved support for many models and architectures
including Qwen3-Next, Nemotron v2/v3, Llama 4 scaling, GLM4V,
MiMo-V2-Flash, Granite Embeddings, KORMo, Rnj-1, LFM2
text/audio/MoE, Mistral and Mistral-Large variants, DeepSeek
variants, ASR conformer models, and multimodal pipelines.
* Fixed multiple model issues such as missing tensors,
division-by-zero errors, rope scaling regressions, MoE edge
cases, bidirectional architectures, and multimodal loading
errors.
* Server and router improvements including safer multithreading,
race-condition fixes, multi-model routing, preset cascading,
startup model loading, auto-sleep on idle, improved speculative
decoding, better RPC validation, and friendlier error handling.
* CLI and argument-parsing improvements, including new flags and
negated variants of existing flags.
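As referenced in the CUDA bullet above, a hedged sketch of one of
the new ops. The changelog does not give C signatures; the sketch
assumes CUMSUM follows ggml's standard unary-op pattern (a single
ggml_cumsum(ctx, tensor) graph op), which should be verified
against ggml.h of this release:

    // Hedged sketch only: ggml_cumsum() is assumed to follow the
    // standard ggml unary-op signature; check ggml.h before use.
    #include "ggml.h"

    struct ggml_tensor * prefix_sums(struct ggml_context * ctx,
                                     struct ggml_tensor * x) {
        // cumulative sum along the first dimension of x, one of
        // the ops that gained an optimized CUDA path here
        return ggml_cumsum(ctx, x);
    }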
OBS-URL: https://build.opensuse.org/request/show/1324424
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=24
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=120
- Update to version 6937:
* New model: Janus Pro
* New model: Minimax M2
* New model: Granite Hybrid nano types
* New model: support for qwen3vl series
* New model: support for CogVLM model
* New model: LightOnOCR-1B model
* New model: BailingMoeV2 support
* New model: Granite Hybrid types
* New model: Support home-cooked Mistral Small Omni
* New model: Support LiquidAI LFM2-MoE hybrid model
* New model: Granite docling + Idefics3 preprocessing (SmolVLM)
* New model: EmbeddingGemma, adding support for
SentenceTransformers Dense modules
* Server improvements, OpenAI API compatibility, optimizations,
and bug fixes
* Vulkan backend improvements, optimizations, and bug fixes
* OpenCL backend fixes
* CPU backend optimizations
* Multimodal (mtmd) improvements
* WebUI enhancements
* Architecture-specific improvements
* llama core improvements
* Memory management improvements
* Conversion and quantization tools enhancements
* Grammar and sampling improvements
* Chat and prompts enhancements
* General fixes and improvements
* RPC improvements and bug fixes
* Full commit log:
OBS-URL: https://build.opensuse.org/request/show/1315691
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=22
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=115
- Update to version 6269:
* Model and conversion: support for Seed-OSS, GPT-OSS
response_format, interns1-mini, Ernie 4.5, gpt-oss type
strings, improved Mistral templates, new model conversion
tool/example with torch-cpu.
* Vulkan backend: multiple optimizations (rms_norm, mul_mat_id,
synchronization, conv2d, subgroup ops) and new ops (exp,
conv_2d_dw f16, ggml_mean); see the sketch after this list.
* GGML/CPU: added conv3d op, WebGPU quantization support,
Q5_0/Q5_1 on s390x, mxfp4 intrinsics on ppc64le.
* Server and chat: multimodal completion and embeddings JSON
support, improved OpenAI API compatibility and usage statistics,
context shift disabled by default, fixes for task ordering,
webui issues, and debug assertions, and a clarified
reasoning_format.
* KV cache: unified handling improvements, support for reuse,
removal of deprecated APIs, simplifications.
* Miscellaneous: fixed logging of non-ASCII characters, removed
deprecated or unused code and build artifacts.
* Full commit log:
https://github.com/ggml-org/llama.cpp/compare/b6188...b6269
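As referenced in the Vulkan bullet above, a small hedged sketch
composing two of the ops that gained Vulkan kernels in this
update; ggml_exp() and ggml_mean() are existing ggml graph ops,
and ctx is assumed to be an ordinary ggml_context:

    // Hedged sketch: builds graph nodes for exp and mean, both
    // of which can now be offloaded to the Vulkan backend.
    #include "ggml.h"

    struct ggml_tensor * mean_exp(struct ggml_context * ctx,
                                  struct ggml_tensor * x) {
        // mean(exp(x)): ggml_mean() reduces along the first
        // dimension of its input
        return ggml_mean(ctx, ggml_exp(ctx, x));
    }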
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=106
- Add GGML_NATIVE=OFF build flag so binaries are built without
host-specific -march=native optimizations and remain portable
- Update to version 5889:
* Remove Kompute support
* Prevent integer overflow in gguf tensor size calculation
(bsc#1246377) (CVE-2025-53630) (GHSA-vgg9-87g3-85w8); see the
sketch after this entry.
* Improved build-time messaging for ggml_set_rows.
* Enhanced test coverage for LFM2 and added LFM2 to
documentation.
* Synchronized ggml updates and improved Vulkan backend
(bilinear interpolation, ggml_roll, SET_ROWS, optimizations).
* Fixed pooled embedding output in server and improved prompt
processing.
* Added support for LiquidAI LFM2 hybrid family and Falcon-H1
models.
* Improved HIP, OpenCL, and SYCL backend compatibility
and features.
* Added new vocabularies and model support
(midm-2.0, skt/A.X-4.0, SmolLM3, hunyuan moe, Granite Four).
* Various bug fixes, optimizations, and documentation improvements
across backends and models.
* Full changelog:
https://github.com/ggml-org/llama.cpp/compare/b5812...b5889
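As referenced in the CVE bullet above, a generic sketch of the
overflow-check pattern this class of fix uses. It illustrates the
idea only, not the actual upstream patch, and the helper name is
made up for this example:

    // Illustrative only -- not the upstream patch. Validate an
    // attacker-controlled element count against the type size
    // before multiplying, so the byte count cannot wrap around.
    #include <stdbool.h>
    #include <stdint.h>

    static bool tensor_nbytes_checked(uint64_t n_elements,
                                      uint64_t type_size,
                                      uint64_t * out_nbytes) {
        if (type_size == 0) {
            return false;
        }
        if (n_elements > UINT64_MAX / type_size) {
            return false; // n_elements * type_size would overflow
        }
        *out_nbytes = n_elements * type_size;
        return true;
    }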
OBS-URL: https://build.opensuse.org/request/show/1292534
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=13
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=91
- Update to version 5812:
* Mamba-2 Support: Initial integration of Mamba-2 architecture.
* Added support for ERNIE 4.5 0.3B, NeoBERT, Arcee AI's AFM,
Gemma3n text-only, and dots.llm1 architectures
* Vulkan Improvements: Support for softmax/FlashAttention
batch/broadcast, fused RMS_NORM+MUL, and better memory handling
* GGML Backend: Added REGLU/GEGLU/SWIGLU ops, ggml_set_rows, and
improved SYCL/OpenCL/Metal support
* Server Improvements: Jinja template kwargs, draft model cache
params, and Unix socket support
* Quantization: User-defined layer pruning and KV override fixes
* Optimizations: Batched Vulkan mul_mat_id splitting
and ARM hsum reduction
* Added GGML version function (see the sketch after this list)
* Full changelog:
https://github.com/ggml-org/llama.cpp/compare/b5699...b5812
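As referenced above, a minimal sketch of the new version
introspection. ggml_version() returning a version string is taken
from this update; ggml_commit() is an assumption believed to have
been added alongside it (check ggml.h):

    // Minimal sketch: print the ggml library version at runtime.
    // ggml_commit() is an assumption; drop it if ggml.h lacks it.
    #include <stdio.h>
    #include "ggml.h"

    int main(void) {
        printf("ggml version: %s\n", ggml_version());
        printf("ggml commit:  %s\n", ggml_commit());
        return 0;
    }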
OBS-URL: https://build.opensuse.org/request/show/1290235
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=12
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=89
- Update to version 5699:
* vocab : prevent integer overflow during load
(bsc#1244714) (CVE-2025-49847)
* batch : add LLAMA_BATCH_DEBUG environment variable
* batch : auto-gen positions + verify multi-sequence input
* common : suggest --jinja when autodetection fails
* ggml-cpu: fix uncaught underscore terminators
* kv-cache : fix use-after-move of defrag info
* llama : rework embeddings logic (see the sketch after this
list)
* llama-chat : do not throw when tool parsing fails
* llama-chat : fix multiple system message for gemma, orion
* model : Add support for Arcee AI's upcoming AFM model
* model : add dots.llm1 architecture support
* model : add NeoBERT
* server : when listening on a unix domain socket, don't print
http:// and port
* quantize : change int to unsigned int for KV overrides
* Full changelog:
https://github.com/ggml-org/llama.cpp/compare/b5657...b5699
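As referenced in the embeddings bullet above, a hedged sketch of
fetching a pooled per-sequence embedding after decode.
llama_get_embeddings_seq() is an existing llama.h call; the
surrounding setup (model load, a context created with embeddings
enabled) is omitted:

    // Hedged sketch: ctx is assumed to come from a context with
    // embeddings enabled; returns NULL when no pooled embedding
    // exists for the given sequence.
    #include "llama.h"

    const float * sequence_embedding(struct llama_context * ctx,
                                     llama_seq_id seq_id) {
        return llama_get_embeddings_seq(ctx, seq_id);
    }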
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=87
- Update to version 5516:
* llama : remove llama_kv_cache_view API
* model : disable SWA for Phi models
* kv-cache : simplify the interface
* server : Add the endpoints /api/tags and /api/chat
* ggml : add ggml_gelu_erf() (see the sketch after this list)
* hparams : support models for which all layers use SWA
* opencl: fix a couple of crashes
* opencl: Add support for multiple devices
* mtmd : add ultravox audio input
* server : support audio input
* server: streaming of tool calls and thoughts when jinja is on
* mtmd : support Qwen 2.5 Omni
* ggml : riscv: add xtheadvector support
* opencl : various optimizations
* Full changelog:
https://github.com/ggml-org/llama.cpp/compare/b5426...b5516
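As referenced above, a minimal CPU-only sketch (not from
upstream) of the new ggml_gelu_erf() op: the exact GELU,
0.5*x*(1 + erf(x/sqrt(2))), as opposed to the tanh approximation
used by ggml_gelu(). Buffer size and input values are
illustrative only:

    #include <stdio.h>
    #include "ggml.h"
    #include "ggml-cpu.h"

    int main(void) {
        struct ggml_init_params params = {
            /*.mem_size   =*/ 16*1024*1024, // 16 MiB arena
            /*.mem_buffer =*/ NULL,
            /*.no_alloc   =*/ false,
        };
        struct ggml_context * ctx = ggml_init(params);

        struct ggml_tensor * x =
            ggml_new_tensor_1d(ctx, GGML_TYPE_F32, 4);
        float * xd = (float *) x->data;
        xd[0] = -2.0f; xd[1] = -0.5f; xd[2] = 0.5f; xd[3] = 2.0f;

        // build and run a one-node graph: y = gelu_erf(x)
        struct ggml_tensor * y = ggml_gelu_erf(ctx, x);
        struct ggml_cgraph * gf = ggml_new_graph(ctx);
        ggml_build_forward_expand(gf, y);
        ggml_graph_compute_with_ctx(ctx, gf, /*n_threads=*/1);

        for (int i = 0; i < 4; i++) {
            printf("gelu_erf(%+.1f) = %+.6f\n",
                   xd[i], ((float *) y->data)[i]);
        }

        ggml_free(ctx);
        return 0;
    }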
OBS-URL: https://build.opensuse.org/request/show/1280718
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/llamacpp?expand=0&rev=9
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/llamacpp?expand=0&rev=82