- Update to version 0.15.2:
* New ollama launch clawdbot command for launching Clawdbot
using Ollama models
- Update to version 0.15.1:
  * GLM-4.7-Flash performance and correctness improvements, fixing
    repetitive answers and improving tool-calling quality
* Fixed performance issues on arm64
* Fixed issue where ollama launch would not detect claude and would
incorrectly update opencode configurations
- Update to version 0.15.0:
  * New ollama launch command to use Ollama's models with Claude
    Code, Codex, OpenCode, and Droid without separate configuration
* Fixed issue where creating multi-line strings with """ would not
work when using ollama run
* Ctrl+J and Shift+Enter now work for inserting newlines in ollama run
* Reduced memory usage for GLM-4.7-Flash models
- Updated to version 0.14.3:
Image generation:
  * Z-Image Turbo: a 6-billion-parameter text-to-image model from
    Alibaba’s Tongyi Lab. It generates high-quality photorealistic
    images.
  * Flux.2 Klein: Black Forest Labs’ fastest image-generation model
    to date.
New models:
* GLM-4.7-Flash: As the strongest model in the 30B
class, GLM-4.7-Flash offers a new option for lightweight
deployment that balances performance and efficiency.
  * LFM2.5-1.2B-Thinking: LFM2.5 is a new family of hybrid models.
OBS-URL: https://build.opensuse.org/request/show/1329648
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=140
- Update to version 0.14.2:
* New models: TranslateGemma
* Shift + Enter will now enter a newline in Ollama's CLI
  * Improve /v1/responses API to better conform to OpenResponses
    specification
- Update to version 0.14.1:
  * Experimental image generation models are available on Linux (CUDA)
`ollama run x/z-image-turbo`
- Update to version 0.14.0:
  * ollama run --experimental now opens a new Ollama CLI that
    includes an agent loop and the bash tool
* Anthropic API compatibility: support for the /v1/messages API
* A new REQUIRES command for the Modelfile allows declaring which
version of Ollama is required for the model
* For older models, Ollama will avoid an integer underflow on low
VRAM systems during memory estimation
* More accurate VRAM measurements for AMD iGPUs
  * An error will now be returned when embeddings return NaN or -Inf
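  The new REQUIRES Modelfile command mentioned above might be used as
  follows. This is a minimal sketch: the notes only say the command
  declares the required Ollama version, so the base model and exact
  version syntax here are assumptions, not documented behavior.

  ```
  # Hypothetical Modelfile using the REQUIRES command from 0.14.0;
  # the FROM target and the version-string syntax are assumed.
  FROM llama3.2
  # Declare the minimum Ollama version this model needs
  REQUIRES 0.14.0
  ```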
OBS-URL: https://build.opensuse.org/request/show/1328568
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/ollama?expand=0&rev=54
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=138
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=137
- Update vendored golang.org/x/net/html to v0.48.0
- Update to version 0.13.4:
* New models: Nemotron 3 Nano, Olmo 3, Olmo 3.1
  * Enable Flash Attention for models by default
* Fixed handling of long contexts with Gemma 3 models
* Fixed issue that would occur with Gemma 3 QAT models or
other models imported with the Gemma 3 architecture
- Update to version 0.13.3:
* New models: Devstral-Small-2, rnj-1, nomic-embed-text-v2
* Improved truncation logic when using /api/embed and
/v1/embeddings
* Extend Gemma 3 architecture to support rnj-1 model
* Fix error that would occur when running qwen2.5vl with image
input
- Update to version 0.13.2:
* New models: Qwen3-Next
* Flash attention is now enabled by default for vision models
such as mistral-3, gemma3, qwen3-vl and more. This improves
memory utilization and performance when providing images as
input.
* Fixed GPU detection on multi-GPU CUDA machines
  * Fixed issue where deepseek-v3.1 would always think even when
    thinking is disabled in Ollama's app
OBS-URL: https://build.opensuse.org/request/show/1323403
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/ollama?expand=0&rev=52
- Update to version 0.12.10
* Fixed errors when running qwen3-vl:235b and
qwen3-vl:235b-instruct
* Enable flash attention for Vulkan (currently needs to be built
from source)
* Add Vulkan memory detection for Intel GPU using DXGI+PDH
* Ollama will now return tool call IDs from the /api/chat API
* Fixed hanging due to CPU discovery
* Ollama will now show login instructions when switching to a
cloud model in interactive mode
* Fix reading stale VRAM data
* 'ollama run' now works with embedding models
OBS-URL: https://build.opensuse.org/request/show/1316467
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=124
- Fixed issue with duplicated libraries (/usr/lib, /usr/lib64)
- Update to version 0.12.9
* Fix performance regression on CPU-only systems
- Update to version 0.12.8
* qwen3-vl performance improvements, including flash attention
support by default
* qwen3-vl will now output less leading whitespace in the
response when thinking
* Fixed issue where deepseek-v3.1 thinking could not be disabled
in Ollama's new app
* Fixed issue where qwen3-vl would fail to interpret images with
transparent backgrounds
* Ollama will now stop running a model before removing it via
ollama rm
* Fixed issue where prompt processing would be slower on
Ollama's engine
- Update to version 0.12.7
  * New model: Qwen3-VL, now available in all parameter sizes
    ranging from 2B to 235B
  * New model: MiniMax-M2, a 230-billion-parameter model built for
    coding & agentic workflows, available on Ollama's cloud
* Model load failures now include more information on Windows
* Fixed embedding results being incorrect when running
embeddinggemma
* Fixed gemma3n on Vulkan backend
* Increased time allocated for ROCm to discover devices
* Fixed truncation error when generating embeddings
* Fixed request status code when running cloud models
* The OpenAI-compatible /v1/embeddings endpoint now supports
encoding_format parameter
* Ollama will now parse tool calls that don't conform to
{"name": name, "arguments": args} (thanks @rick-github!)
* Fixed prompt processing reporting in the llama runner
* Increase speed when scheduling models
* Fixed issue where FROM <model> would not inherit RENDERER or
PARSER commands
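  The encoding_format parameter noted above mirrors the OpenAI
  embeddings API, which accepts "float" (the default) or "base64".
  A minimal sketch of the request body — the model name and local
  endpoint are illustrative assumptions, and this only constructs
  the JSON rather than contacting a server:

  ```python
  import json

  # Example request body for the OpenAI-compatible /v1/embeddings
  # endpoint. "base64" and "float" are the encoding_format values
  # defined by the OpenAI API; the model name is an assumption.
  payload = {
      "model": "nomic-embed-text",
      "input": "The quick brown fox",
      "encoding_format": "base64",
  }

  body = json.dumps(payload)
  # POST this body to http://localhost:11434/v1/embeddings on a
  # running Ollama server with Content-Type: application/json.
  print(body)
  ```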
OBS-URL: https://build.opensuse.org/request/show/1315028
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=122
- Update vendored golang.org/x/net/html to v0.46.0
- Update to version 0.12.6
* Experimental Vulkan support
* Ollama's app now supports searching when running DeepSeek-V3.1,
Qwen3 and other models that support tool calling.
* Flash attention is now enabled by default for Gemma 3,
improving performance and memory utilization
* Fixed issue where Ollama would hang while generating responses
* Fixed issue where qwen3-coder would act in raw mode when using
/api/generate or ollama run qwen3-coder <prompt>
* Fixed qwen3-embedding providing invalid results
* Ollama will now evict models correctly when num_gpu is set
* Fixed issue where tool_index with a value of 0 would not be
sent to the model
- Add ollama user to render group
OBS-URL: https://build.opensuse.org/request/show/1312121
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=118
[boo#1251413] [CVE-2025-47911] [boo#1241757] [CVE-2025-22872]
- Update to version 0.12.5:
* Fixed issue where "think": false would show an error instead of
being silently ignored
* Fixed deepseek-r1 output issues
- Update to version 0.12.4:
* Flash attention is now enabled by default for Qwen 3 and Qwen 3
Coder
* Fixed an issue where keep_alive in the API would accept
different values for the /api/chat and /api/generate endpoints
* Fixed tool calling rendering with qwen3-coder
* More reliable and accurate VRAM detection
* OLLAMA_FLASH_ATTENTION can now be overridden to 0 for models
that have flash attention enabled by default
* Fixed crash where templates were not correctly defined
* openai: always provide reasoning
* Bug fixes
- Allow building for Package Hub for SLE-15-SP7
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=117
- Update to version 0.12.3:
  * New models: DeepSeek-V3.1-Terminus, Kimi K2-Instruct-0905
* Fixed issue where tool calls provided as stringified JSON
would not be parsed correctly
* ollama push will now provide a URL to follow to sign in
* Fixed issues where qwen3-coder would output unicode characters
incorrectly
* Fix issue where loading a model with /load would crash
- Update to version 0.12.2:
* A new web search API is now available in Ollama
* Models with Qwen3's architecture including MoE now run in
Ollama's new engine
* Fixed issue where built-in tools for gpt-oss were not being
rendered correctly
* Support multi-regex pretokenizers in Ollama's new engine
* Ollama's new engine can now load tensors by matching a prefix
or suffix
- Update to version 0.12.1:
  * New model: Qwen3 Embedding: state-of-the-art open embedding
    model by the Qwen team
* Qwen3-Coder now supports tool calling
* Fixed issue where Gemma3 QAT models would not output correct
tokens
* Fix issue where & characters in Qwen3-Coder would not be parsed
correctly when function calling
* Fixed issues where ollama signin would not work properly
- Update to version 0.12.0:
* Cloud models are now available in preview
* Models with the Bert architecture now run on Ollama's engine
* Models with the Qwen 3 architecture now run on Ollama's engine
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=115
- Update to version 0.10.1:
* No notable changes.
- Update to version 0.10.0:
* ollama ps will now show the context length of loaded models
* Improved performance in gemma3n models by 2-3x
* Parallel request processing now defaults to 1
* Fixed issue where tool calling would not work correctly with
granite3.3 and mistral-nemo models
  * Fixed issue where Ollama's tool calling would not work
    correctly if a tool's name was part of another one, such as
    add and get_address
* Improved performance when using multiple GPUs by 10-30%
* Ollama's OpenAI-compatible API will now support WebP images
* Fixed issue where ollama show would report an error
* ollama run will more gracefully display errors
OBS-URL: https://build.opensuse.org/request/show/1297591
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/ollama?expand=0&rev=41
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=109
- Update to version 0.9.5:
* No notable changes.
- Update to version 0.9.4:
* The directory in which models are stored can now be modified.
* Tool calling with empty parameters will now work correctly
* Fixed issue when quantizing models with the Gemma 3n
architecture
- Update to version 0.9.3:
* Ollama now supports Gemma 3n
* Ollama will now limit context length to what the model was
trained against to avoid strange overflow behavior
- Update to version 0.9.2:
* Fixed issue where tool calls without parameters would not be
returned correctly
  * Fixed "does not support generate" errors
* Fixed issue where some special tokens would not be tokenized
properly for some model architectures
OBS-URL: https://build.opensuse.org/request/show/1290234
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/ollama?expand=0&rev=40
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=107
  * Tool calling reliability and performance have been improved for
    the following models: Magistral, Llama 4, Mistral, and
    DeepSeek-R1-0528
* Magistral now supports disabling thinking mode
* Error messages that previously showed POST predict will now be
more informative
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=105
- Update to version 0.9.0:
  * Ollama now has the ability to enable or disable thinking.
    This gives users the flexibility to choose the model’s thinking
    behavior for different applications and use cases.
- Update to version 0.8.0:
* Ollama will now stream responses with tool calls
* Logs will now include better memory estimate debug information
when running models in Ollama's engine.
- Update to version 0.7.1:
* Improved model memory management to allocate sufficient memory
to prevent crashes when running multimodal models in certain
situations
* Enhanced memory estimation for models to prevent unintended
memory offloading
* ollama show will now show ... when data is truncated
* Fixed crash that would occur with qwen2.5vl
* Fixed crash on Nvidia's CUDA for llama3.2-vision
* Support for Alibaba's Qwen 3 and Qwen 2 architectures in
Ollama's new multimodal engine
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=103
- Update to version 0.7.0:
* Ollama now supports multimodal models via Ollama’s new engine,
starting with new vision multimodal models:
    ~ Meta Llama 4
    ~ Google Gemma 3
    ~ Qwen 2.5 VL
* Ollama now supports providing WebP images as input to
multimodal models
* Improved performance of importing safetensors models via
ollama create
* Various bug fixes and performance enhancements
OBS-URL: https://build.opensuse.org/request/show/1278142
OBS-URL: https://build.opensuse.org/package/show/openSUSE:Factory/ollama?expand=0&rev=36
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=97