8d7853a519  (Eyad Issa, 2025-08-07 23:21:08 +00:00)
- Update to version 0.11.4:
  * openai: allow for content and tool calls in the same message
  * openai: when converting role=tool messages, propagate the tool name
  * openai: always provide reasoning
  * Bug fixes
2084254cdb  (Eyad Issa, 2025-08-05 00:09:03 +00:00)
- Update to version 0.10.1:
  * No notable changes.
- Update to version 0.10.0:
  * ollama ps will now show the context length of loaded models
  * Improved performance in gemma3n models by 2-3x
  * Parallel request processing now defaults to 1
  * Fixed issue where tool calling would not work correctly with granite3.3 and mistral-nemo models
  * Fixed issue where Ollama's tool calling would not work correctly if a tool's name was part of another one, such as add and get_address
  * Improved performance when using multiple GPUs by 10-30%
  * Ollama's OpenAI-compatible API will now support WebP images
  * Fixed issue where ollama show would report an error
  * ollama run will more gracefully display errors
1bd364710e  (Ana Guerrero, 2025-07-06 15:07:50 +00:00)
Accepting request 1290234 from science:machinelearning
96bd58f7ae  (Eyad Issa, 2025-07-03 00:15:58 +00:00)
- Update to version 0.9.5:
  * No notable changes.
- Update to version 0.9.4:
  * The directory in which models are stored can now be modified
  * Tool calling with empty parameters will now work correctly
  * Fixed issue when quantizing models with the Gemma 3n architecture
- Update to version 0.9.3:
  * Ollama now supports Gemma 3n
  * Ollama will now limit context length to what the model was trained against to avoid strange overflow behavior
- Update to version 0.9.2:
  * Fixed issue where tool calls without parameters would not be returned correctly
  * Fixed "does not support generate" errors
  * Fixed issue where some special tokens would not be tokenized properly for some model architectures
6a5a5625ba  (Ana Guerrero, 2025-06-24 18:50:15 +00:00)
Accepting request 1288227 from science:machinelearning
98c8741f6d  (Eyad Issa, 2025-06-17 10:54:45 +00:00)
- Update to version 0.9.1:
  * Tool calling reliability and performance have been improved for the following models: Magistral, Llama 4, Mistral, DeepSeek-R1-0528
  * Magistral now supports disabling thinking mode
  * Error messages that previously showed "POST predict" will now be more informative
a1fb8916f0  (Ana Guerrero, 2025-06-10 07:05:27 +00:00)
Accepting request 1283893 from science:machinelearning
f6e82760be  (Eyad Issa, 2025-06-01 00:00:21 +00:00)
- Update to version 0.9.0:
  * Ollama now has the ability to enable or disable thinking. This gives users the flexibility to choose the model’s thinking behavior for different applications and use cases.
- Update to version 0.8.0:
  * Ollama will now stream responses with tool calls
  * Logs will now include better memory estimate debug information when running models in Ollama's engine
- Update to version 0.7.1:
  * Improved model memory management to allocate sufficient memory to prevent crashes when running multimodal models in certain situations
  * Enhanced memory estimation for models to prevent unintended memory offloading
  * ollama show will now show "..." when data is truncated
  * Fixed crash that would occur with qwen2.5vl
  * Fixed crash on Nvidia's CUDA for llama3.2-vision
  * Support for Alibaba's Qwen 3 and Qwen 2 architectures in Ollama's new multimodal engine
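The 0.9.0 entry above describes toggling thinking per request. A minimal sketch of what a chat request body with thinking disabled might look like; the "think" field name and the model name are assumptions for illustration, not taken from this changelog:

```python
import json

# Hypothetical /api/chat request body: "think" toggles whether the
# model emits its reasoning before the final answer.
request = {
    "model": "deepseek-r1",
    "messages": [{"role": "user", "content": "What is 17 * 23?"}],
    "think": False,  # set True to receive the model's thinking as well
}
body = json.dumps(request)
```

The same toggle applies per application: an interactive assistant might enable thinking for hard questions and disable it for latency-sensitive ones.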
5613668f67  (Ana Guerrero, 2025-05-26 16:32:37 +00:00)
Accepting request 1279778 from science:machinelearning
80920a6d90  (Eyad Issa, 2025-05-23 12:19:15 +00:00)
- Clean up the section of the spec file where the build for SLE-15-SP6 and above is defined, making the if condition more robust
12e7825191  (Eyad Issa, 2025-05-21 21:23:32 +00:00)
- Allow building for Package Hub for SLE-15-SP7 (openSUSE:Backports:SLE-15-SP7) with g++-12/gcc-12 by checking for sle_version >= 150600 in the spec file (bsc#1243438)
9319788c42  (Ana Guerrero, 2025-05-20 07:36:41 +00:00)
Accepting request 1278142 from science:machinelearning
a44b3e88be  (Eyad Issa, 2025-05-17 14:49:57 +00:00)
- Update to version 0.7.0:
  * Ollama now supports multimodal models via Ollama’s new engine, starting with new vision multimodal models:
    ~ Meta Llama 4
    ~ Google Gemma 3
    ~ Qwen 2.5 VL
  * Ollama now supports providing WebP images as input to multimodal models
  * Improved performance of importing safetensors models via ollama create
  * Various bug fixes and performance enhancements
2995d78773  (Ana Guerrero, 2025-05-14 15:01:10 +00:00)
Accepting request 1277233 from science:machinelearning
df3ef52b06  (Eyad Issa, 2025-05-06 10:13:41 +00:00)
- Update to version 0.6.8:
  * Performance improvements for Qwen 3 MoE models on NVIDIA and AMD GPUs
  * Fixed a memory leak that occurred when providing images as input
  * ollama show will now correctly label older vision models such as llava
  * Reduced out-of-memory errors by improving worst-case memory estimations
  * Fixed issue that resulted in a "context canceled" error
- Update to version 0.6.7:
  * New model: Qwen 3
  * New models: Phi 4 reasoning and Phi 4 mini reasoning
  * New model: Llama 4
  * Increased default context window to 4096 tokens
  * Fixed issue where image paths would not be recognized with ~ when being provided to ollama run
  * Improved output quality when using JSON mode in certain scenarios
  * Fixed issue where model would be stuck in the "Stopping..." state
- Use source url (https://en.opensuse.org/SourceUrls)
0b29c803c5  (Ana Guerrero, 2025-04-25 20:19:03 +00:00)
Accepting request 1272498 from science:machinelearning
9e7eb61a05  (Eyad Issa, 2025-04-24 16:37:52 +00:00)
- Update to version 0.6.6:
  * New model: IBM Granite 3.3
  * New model: DeepCoder
  * New, faster model downloading: OLLAMA_EXPERIMENT=client2 ollama serve will run Ollama using a new downloader with improved performance and reliability when running ollama pull
  * Fixed memory leak issues when running Gemma 3, Mistral Small 3.1 and other models on Ollama
  * Improved performance of ollama create when importing models from Safetensors
  * Ollama will now allow tool function parameters with either a single type or an array of types
  * Fixed certain out-of-memory issues caused by not reserving enough memory at startup
  * Fixed nondeterministic model unload order
  * Included the items and $defs fields to properly handle array types in the API
  * OpenAI-Beta headers are now included in the CORS safelist
  * Fixed issue where model tensor data would be corrupted when importing models from Safetensors
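The 0.6.6 note about tool function parameters refers to the two shapes the JSON Schema "type" keyword can take. A sketch of both forms and one way a consumer might normalize them; the helper is illustrative, not Ollama's actual implementation:

```python
# A tool parameter may declare a single JSON Schema type, or an
# array of acceptable types. Both are now accepted by Ollama.
single = {"type": "string"}
multi = {"type": ["string", "number"]}

def accepted_types(schema):
    """Return the parameter's types as a list, regardless of whether
    the schema used the single-type or array-of-types form."""
    types = schema["type"]
    return [types] if isinstance(types, str) else list(types)

print(accepted_types(single))  # ['string']
print(accepted_types(multi))   # ['string', 'number']
```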
8bfdb8d3f2  (Eyad Issa, 2025-04-19 22:15:43 +00:00)
- Add ollama to the video group
- Update to version 0.6.5:
  * Add support for mistral-small
  * Fix issues with spm tokenizer for Gemma 3 models
  * Add checks for values falling out of sliding window cache
  * Improve file descriptor management for tensors and Pull operations
  * Add gfx1200 & gfx1201 GPU support on Linux
  * Optimize sliding window attention and KV cache implementations
  * Implement loading tensors in 32KiB chunks for better performance
  * Add autotemplate for gemma3 models
  * Add benchmarking for ollama server performance
  * Fix file handling in /proc/cpuinfo discovery
  * Support heterogeneous KV cache layer sizes in memory estimation
  * Fix debug logging for memory estimates
  * Improve error handling for empty logits and tensor data reading
  * Return model capabilities from the show endpoint
- Update BuildRequires to go1.24
2b9b5e1f83  (Ana Guerrero, 2025-03-27 21:31:53 +00:00)
Accepting request 1256309 from science:machinelearning
c1ae73a3fa  (Eyad Issa, 2025-03-26 20:06:49 +00:00)
Re-wrap .changes file to 67 chars
bf6db9898e  (Eyad Issa, 2025-03-26 20:03:43 +00:00)
- Update to version 0.6.2:
  * Multiple images are now supported in Gemma 3
  * Fixed issue where running Gemma 3 would consume a large amount of system memory
  * ollama create --quantize now works when converting Gemma 3 from safetensors
  * Fixed issue where /save would not work if running a model with / in the name
  * Add support for AMD Strix Halo GPUs
ccaa3d1dfa  (Ana Guerrero, 2025-03-19 21:33:26 +00:00)
Accepting request 1254230 from science:machinelearning
98fb524bfe  (Eyad Issa, 2025-03-18 20:06:03 +00:00)
Only require git-core because we don't need git-web and the other stuff here
278fdfd811  (Ana Guerrero, 2025-03-14 22:52:08 +00:00)
Accepting request 1252927 from science:machinelearning
39cfaaa125  (Eyad Issa, 2025-03-14 01:22:52 +00:00)
- Update BuildRequires to go1.24
82d203a2c9  (Eyad Issa, 2025-03-14 01:19:14 +00:00)
- Update to version 0.6.0:
  * New model: Gemma 3
  * Fixed error that would occur when running snowflake-arctic-embed and snowflake-arctic-embed2 models
  * Various performance improvements and bug fixes
ecc5ce485f  (Eyad Issa, 2025-02-27 14:02:42 +00:00)
- Update to version 0.5.12:
  * New model: Perplexity R1 1776
  * The OpenAI-compatible API will now return tool_calls if the model called a tool
  * Performance on certain Intel Xeon processors should now be restored
  * Fixed permission denied issues after installing Ollama on Linux
  * Fixed issue where additional CPU libraries were included in the arm64 Linux install
  * The progress bar will no longer flicker when running ollama pull
  * Fixed issue where running a model would fail on Linux if Ollama was installed in a path with UTF-8 characters
  * X-Stainless-Timeout will now be accepted as a header in the OpenAI API endpoints
a9b6c46525  (Eyad Issa, 2025-02-15 02:49:21 +00:00)
- Use Ninja instead of Make and update the build script to match the new version
6334ea69a7  (Eyad Issa, 2025-02-15 01:36:40 +00:00)
- Update to version 0.5.11:
  * No notable changes for Linux
- Update to version 0.5.10:
  * Fixed issue on multi-GPU Windows and Linux machines where memory estimations would be incorrect
- Update to version 0.5.9:
  * New model: DeepScaleR
  * New model: OpenThinker
- Update to version 0.5.8:
  * Ollama will now use AVX-512 instructions where available for additional CPU acceleration
  * Fixed indexing error that would occur when downloading a model with ollama run or ollama pull
  * Fixed cases where download progress would reverse
0c42aadc96  (Ana Guerrero, 2025-01-29 15:10:09 +00:00)
Accepting request 1240594 from science:machinelearning
ef0c7cecdd  (Eyad Issa, 2025-01-27 16:15:11 +00:00)
- Make ollama configurable by the admin via /etc/sysconfig/ollama (boo#1236008)
- Clean up reproducible.patch
16f46deeb3  (Eyad Issa, 2025-01-17 00:02:27 +00:00)
- Removed 01-build-verbose.patch: embedded GOFLAG into the .spec file
- Disabled reproducible.patch: should not be needed, as .gz is not produced anymore
- Update to version 0.5.7:
c414c5711c  (Ana Guerrero, 2024-12-12 20:18:15 +00:00)
Accepting request 1230609 from science:machinelearning
80cfae2b5d  (Eyad Issa, 2024-12-12 14:52:49 +00:00)
Add reproducible.patch for deterministic .gz creation (boo#1047218)
785c029f70  (Eyad Issa, 2024-12-07 18:30:08 +00:00)
- Update to version 0.5.1:
- Update to version 0.5.0:
- Update to version 0.4.7:
46179bee73  (Eyad Issa, 2024-11-30 20:05:29 +00:00)
- Update to version 0.4.6:
- Update to version 0.4.5:
- Update to version 0.4.4:
- Update to version 0.4.3:
6de51226b7  (Ana Guerrero, 2024-11-24 10:04:51 +00:00)
Accepting request 1225993 from science:machinelearning
1e48ea9d8a  (Eyad Issa, 2024-11-11 14:46:13 +00:00)
- Add patch 01-build-verbose.patch to add the -v option to go build
- Update to version 0.4.1:
  * runner.go: Check for zero length images
  * docs: update langchainpy.md with proper model name (#7527)
  * Set macos min version for all architectures (#7579)
  * win: remove preview title from installer (#7529)
  * Workaround buggy P2P ROCm copy on windows (#7466)
  * Debug logging for nvcuda init (#7532)
  * Align rocm compiler flags (#7467)
  * Be explicit for gpu library link dir (#7560)
  * docs: OLLAMA_NEW_RUNNERS no longer exists
  * runner.go: Remove unused arguments
  * sched: Lift parallel restriction for multimodal models except mllama
65708a6764  (Guillaume GARDET, 2024-11-07 15:09:03 +00:00)
- Update to version 0.4.0:
  * Update README.md (#7516)
  * One corrupt manifest should not wedge model operations (#7515)
  * prompt: Use a single token when estimating mllama context size
  * readme: add Hexabot to the list of community integrations
  * Quiet down debug log of image payload (#7454)
da3e66a886  (Eyad Issa, 2024-11-01 02:20:51 +00:00)
- Update to version 0.4.0-rc6:
  * Refine default thread selection for NUMA systems (#7322)
  * runner.go: Better abstract vision model integration
  * Soften windows clang requirement (#7428)
  * Remove submodule and shift to Go server - 0.4.0 (#7157)
  * Move windows app out of preview (#7347)
  * windows: Support alt install paths, fit and finish (#6967)
  * add more tests for getting the optimal tiled canvas (#7411)
  * Switch windows to clang (#7407)
  * tests: Add test for Unicode processing
  * runner.go: Better handle return NULL values from llama.cpp
  * add mllama image processing to the generate handler (#7384)
  * Bump to latest Go 1.22 patch (#7379)
  * Fix deepseek deseret regex (#7369)
  * Better support for AMD multi-GPU on linux (#7212)
  * Fix unicode output on windows with redirect to file (#7358)
  * Fix incremental build file deps (#7361)
  * Improve dependency gathering logic (#7345)
  * fix#7247 - invalid image input (#7249)
  * integration: harden embedding test (#7306)
  * default to "FROM ." if a Modelfile isn't present (#7250)
  * Fix rocm windows build and clean up dependency gathering (#7305)
  * runner.go: Merge partial unicode characters before sending
  * readme: add Ollama for Swift to the community integrations (#7295)
  * server: allow vscode-webview origin (#7273)
  * image processing for llama3.2 (#6963)
  * llama: Decouple patching script from submodule (#7139)
  * llama: add compiler tags for cpu features (#7137)
5a882751e3  (Eyad Issa, 2024-10-31 01:55:17 +00:00)
- Update to version 0.3.14:
  * New models:
    + Granite 3 MoE: The IBM Granite 1B and 3B models are the first mixture of experts (MoE) Granite models from IBM, designed for low-latency usage.
    + Granite 3 Dense: The IBM Granite 2B and 8B models are designed to support tool-based use cases and retrieval-augmented generation (RAG), streamlining code generation, translation and bug fixing.
332862e2b8  (Guillaume GARDET, 2024-10-14 07:28:18 +00:00)
- Update to version 0.3.13:
  * New safety models:
    ~ Llama Guard 3: a series of models by Meta, fine-tuned for content safety classification of LLM inputs and responses
    ~ ShieldGemma: a set of instruction-tuned models from Google DeepMind for evaluating the safety of text prompt input and text output responses against a set of defined safety policies
  * Fixed issue where ollama pull would leave connections when encountering an error
  * ollama rm will now stop a model if it is running prior to deleting it
bd7fc28fe4  (Ana Guerrero, 2024-09-30 13:40:27 +00:00)
Accepting request 1204591 from science:machinelearning
2808304cf4  (Eyad Issa, 2024-09-29 21:30:54 +00:00)
- Update to version 0.3.12:
  * Llama 3.2: Meta's Llama 3.2 goes small with 1B and 3B models
  * Qwen 2.5 Coder: the latest series of code-specific Qwen models, with significant improvements in code generation, code reasoning, and code fixing
  * Ollama now supports ARM Windows machines
  * Fixed rare issue where Ollama would report a missing .dll file on Windows
  * Fixed performance issue for Windows without GPUs
f7aaf9b2af  (Ana Guerrero, 2024-09-22 09:06:09 +00:00)
Accepting request 1202264 from science:machinelearning
5bb20bbdee  (Eyad Issa, 2024-09-20 20:29:36 +00:00)
- Update to version 0.3.11:
  * llm: add solar pro (preview) (#6846)
  * server: add tool parsing support for nemotron-mini (#6849)
  * make patches git am-able
  * CI: dist directories no longer present (#6834)
  * CI: clean up naming, fix tagging latest (#6832)
  * CI: set platform build build_linux script to keep buildx happy (#6829)
  * readme: add Agents-Flex to community integrations (#6788)
  * fix typo in import docs (#6828)
  * readme: add vim-intelligence-bridge to Terminal section (#6818)
  * readme: add Obsidian Quiz Generator plugin to community integrations (#6789)
  * Fix incremental builds on linux (#6780)
  * Use GOARCH for build dirs (#6779)
  * Optimize container images for startup (#6547)
  * examples: updated requirements.txt for privategpt example
  * examples: polish loganalyzer example (#6744)
  * readme: add ollama_moe to community integrations (#6752)
  * runner: Flush pending responses before returning
  * add "stop" command (#6739)
  * refactor show ouput
  * readme: add QodeAssist to community integrations (#6754)
  * Verify permissions for AMD GPU (#6736)
  * add *_proxy for debugging
  * docs: update examples to use llama3.1 (#6718)
  * Quiet down dockers new lint warnings (#6716)
  * catch when model vocab size is set correctly (#6714)
  * readme: add crewAI to community integrations (#6699)
  * readme: add crewAI with mesop to community integrations
e5b1fec77c  (Ana Guerrero, 2024-09-19 19:17:44 +00:00)
Accepting request 1201962 from science:machinelearning
c97461a42d  (Eyad Issa, 2024-09-19 08:48:38 +00:00)
- Update to version 0.3.10:
  * openai: align chat temperature and frequency_penalty options with completion (#6688)
  * docs: improve linux install documentation (#6683)
  * openai: don't scale temperature or frequency_penalty (#6514)
  * readme: add Archyve to community integrations (#6680)
  * readme: add Plasmoid Ollama Control to community integrations (#6681)
  * Improve logging on GPU too small (#6666)
  * openai: fix "presence_penalty" typo and add test (#6665)
  * Fix gemma2 2b conversion (#6645)
  * Document uninstall on windows (#6663)
  * Revert "Detect running in a container (#6495)" (#6662)
  * llm: make load time stall duration configurable via OLLAMA_LOAD_TIMEOUT
  * Introduce GPU Overhead env var (#5922)
  * Detect running in a container (#6495)
  * readme: add AiLama to the list of community integrations (#4957)
  * Update gpu.md: Add RTX 3050 Ti and RTX 3050 Ti (#5888)
  * server: fix blob download when receiving a 200 response (#6656)
  * readme: add Gentoo package manager entry to community integrations (#5714)
  * Update install.sh:Replace "command -v" with encapsulated functionality (#6035)
  * readme: include Enchanted for Apple Vision Pro (#4949)
  * readme: add lsp-ai to community integrations (#5063)
  * readme: add ollama-php library to community integrations (#6361)
  * readme: add vnc-lm discord bot community integration (#6644)
  * llm: use json.hpp from common (#6642)
  * readme: add confichat to community integrations (#6378)
  * docs: add group to manual Linux isntructions and verify service is running (#6430)
  * readme: add gollm to the list of community libraries (#6099)
  * readme: add Cherry Studio to community integrations (#6633)
  * readme: add Go fun package (#6421)
  * docs: fix spelling error (#6391)
5a2110e469  (Eyad Issa, 2024-08-15 19:06:50 +00:00)
- Update to version 0.3.6:
  * Fixed issue where /api/embed would return an error instead of loading the model when the input field was not provided
  * ollama create can now import Phi-3 models from Safetensors
  * Added progress information to ollama create when importing GGUF files
  * Ollama will now import GGUF files faster by minimizing file copies
- Update to version 0.3.5:
  * Fixed issue where temporary files would not be cleaned up
  * Fix rare error when Ollama would start up due to invalid model data
aa82c484e7  (Eyad Issa, 2024-08-15 18:56:53 +00:00)
- Update to version 0.3.4:
  * New embedding models:
    - BGE-M3: a large embedding model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity
    - BGE-Large: a large embedding model trained in English
    - Paraphrase-Multilingual: a multilingual embedding model trained on parallel data for 50+ languages
  * New embedding API with batch support:
    - Ollama now supports a new API endpoint /api/embed for embedding generation
    - This API endpoint supports new features:
      ~ Batches: generate embeddings for several documents in one request
      ~ Normalized embeddings: embeddings are now normalized, improving similarity results
      ~ Truncation: a new truncate parameter that will error if set to false
      ~ Metrics: responses include load_duration, total_duration and prompt_eval_count metrics
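Why normalized embeddings "improve similarity results": once vectors have unit length, the plain dot product of two embeddings equals their cosine similarity, so comparisons no longer depend on vector magnitude. A minimal sketch with toy vectors (not real model output):

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def dot(a, b):
    """Dot product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))

v = l2_normalize([3.0, 4.0])  # -> [0.6, 0.8], length exactly 1
w = l2_normalize([4.0, 3.0])
similarity = dot(v, w)        # cosine similarity of the two vectors
```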
8b38454cf5  (Eyad Issa, 2024-07-28 11:46:59 +00:00)
- Update to version 0.3.0:
  * Ollama now supports tool calling with popular models such as Llama 3.1. This enables a model to answer a given prompt using tool(s) it knows about, making it possible for models to perform more complex tasks or interact with the outside world.
  * New models:
    ~ Llama 3.1
    ~ Mistral Large 2
    ~ Firefunction v2
    ~ Llama-3-Groq-Tool-Use
  * Fixed duplicate error message when running ollama create
808a0b582d  (Eyad Issa, 2024-07-25 11:03:50 +00:00)
- Update to version 0.2.8:
  * api embed docs (#5282)
  * convert: capture head_dim for mistral (#5818)
  * Update llama.cpp submodule commit to d94c6e0c (#5805)
  * server: collect nested tool call objects when parsing (#5824)
  * Remove no longer supported max vram var
  * Refine error reporting for subprocess crash
  * Remove out of space test temporarily (#5825)
  * llm: consider head_dim in llama arch (#5817)
  * Adjust windows ROCm discovery
  * add patch for tekken (#5807)
  * preserve last assistant message (#5802)
  * Fix generate test flakyness (#5804)
  * server: validate template (#5734)
  * OpenAI: Function Based Testing (#5752)
  * adjust openai chat msg processing (#5729)
  * fix parsing tool calls
  * server: check for empty tools array too (#5779)
  * always provide content even if empty (#5778)
  * server: only parse tool calls if tools are provided (#5771)
  * Fix context exhaustion integration test for small gpus
  * Refine scheduler unit tests for reliability
e1464e1fa0  (Ana Guerrero, 2024-07-19 13:27:51 +00:00)
Accepting request 1188404 from science:machinelearning
44981711f9  (Eyad Issa, 2024-07-18 13:09:42 +00:00)
- Fixed issue with shared libraries
3e72c81bf1  (Eyad Issa, 2024-07-18 12:28:24 +00:00)
- Added %check section
- Use -v when building
8d6b930083  (Eyad Issa, 2024-07-18 12:13:25 +00:00)
- Update to version 0.2.6:
  * New model: MathΣtral, a 7B model designed for math reasoning and scientific discovery by Mistral AI
  * Fixed issue where uppercase roles such as USER would no longer work in the chat endpoints
  * Fixed issue where empty system message would be included in the prompt
b2ca9b9e96  (Ana Guerrero, 2024-07-15 17:49:07 +00:00)
Accepting request 1187407 from science:machinelearning
3ddd383b3c  (Eyad Issa, 2024-07-14 18:09:05 +00:00)
- Update to version 0.2.5:
- Update to version 0.2.4:
- Update to version 0.2.3:
- Update to version 0.2.2:
- Update to version 0.2.1:
- Update to version 0.2.0:
1202fb05d0  (Ana Guerrero, 2024-07-08 17:08:25 +00:00)
Accepting request 1186033 from science:machinelearning
3eccb0320d  (Eyad Issa, 2024-07-07 19:20:28 +00:00)
- Update to version 0.1.48:
  * Fixed issue where Gemma 2 would continuously output when reaching context limits
  * Fixed out of memory and core dump errors when running Gemma 2
  * /show info will now show additional model information in ollama run
  * Fixed issue where ollama show would result in an error on certain vision models
- Update to version 0.1.47:
  * Added support for Google Gemma 2 models (9B and 27B)
  * Fixed issues with ollama create when importing from Safetensors