-------------------------------------------------------------------
Sun May 12 01:39:26 UTC 2024 - Eyad Issa

- Update to version 0.1.36:
  * Fixed exit status 0xc0000005 error with AMD graphics cards on Windows
  * Fixed rare out of memory errors when loading a model to run with CPU
- Update to version 0.1.35:
  * New model: Llama 3 ChatQA, a model from NVIDIA based on Llama 3 that excels at conversational question answering (QA) and retrieval-augmented generation (RAG)
  * Quantization: ollama create can now quantize models when importing them using the --quantize or -q flag
  * Fixed issue where inference subprocesses wouldn't be cleaned up on shutdown
  * Fixed a series of out of memory errors when loading models on multi-GPU systems
  * Ctrl+J characters will now properly add newlines in ollama run
  * Fixed issues when running ollama show for vision models
  * OPTIONS requests to the Ollama API will no longer result in errors
  * Fixed issue where partially downloaded files wouldn't be cleaned up
  * Added a new done_reason field in responses describing why generation stopped
  * Ollama will now more accurately estimate how much memory is available on multi-GPU systems, especially when running different models one after another
- Update to version 0.1.34:
  * New model: Llava Llama 3
  * New model: Llava Phi 3
  * New model: StarCoder2 15B Instruct
  * New model: CodeGemma 1.1
  * New model: StableLM2 12B
  * New model: Moondream 2
  * Fixed issues with LLaVa models where they would respond incorrectly after the first request
  * Fixed out of memory errors when running large models such as Llama 3 70B
  * Fixed various issues with Nvidia GPU discovery on Linux and Windows
  * Fixed a series of Modelfile errors when running ollama create
  * Fixed "no slots available" error that occurred when cancelling a request and then sending follow-up requests
  * Improved AMD GPU detection on Fedora
  * Improved reliability when using the experimental OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED flags
  * ollama serve will now shut down quickly, even if a model is loading
- Update to version 0.1.33:
  * New model: Llama 3
  * New model: Phi 3 Mini
  * New model: Moondream
  * New model: Llama 3 Gradient 1048K
  * New model: Dolphin Llama 3
  * New model: Qwen 110B
  * Fixed issues where the model would not terminate, causing the API to hang
  * Fixed a series of out of memory errors on Apple Silicon Macs
  * Fixed out of memory errors when running Mixtral architecture models
  * Added experimental concurrency features:
    ~ OLLAMA_NUM_PARALLEL: Handle multiple requests simultaneously for a single model
    ~ OLLAMA_MAX_LOADED_MODELS: Load multiple models simultaneously
-------------------------------------------------------------------
Tue Apr 23 02:26:34 UTC 2024 - rrahl0@disroot.org

- Update to version 0.1.32:
  * scale graph based on gpu count
  * Support unicode characters in model path (#3681)
  * darwin: no partial offloading if required memory greater than system
  * update llama.cpp submodule to `7593639` (#3665)
  * fix padding in decode
  * Revert "cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)" (#3662)
  * Added Solar example at README.md (#3610)
  * Update langchainjs.md (#2030)
  * Added MindsDB information (#3595)
  * examples: add more Go examples using the API (#3599)
  * Update modelfile.md
  * Add llama2 / torch models for `ollama create` (#3607)
  * Terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading (#3653)
  * app: gracefully shut down `ollama serve` on windows (#3641)
  * types/model: add path helpers (#3619)
  * update llama.cpp submodule to `4bd0f93` (#3627)
  * types/model: make ParseName variants less confusing (#3617)
  * types/model: remove (*Digest).Scan and Digest.Value (#3605)
  * Fix rocm deps with new subprocess paths
  * mixtral mem
  * Revert "types/model: remove (*Digest).Scan and Digest.Value (#3589)"
  * types/model: remove (*Digest).Scan and Digest.Value (#3589)
  * types/model: remove DisplayLong (#3587)
  * types/model: remove MarshalText/UnmarshalText from Digest (#3586)
  * types/model: init with Name and Digest types (#3541)
  * server: provide helpful workaround hint when stalling on pull (#3584)
  * partial offloading
  * refactor tensor query
  * api: start adding documentation to package api (#2878)
  * examples: start adding Go examples using api/ (#2879)
  * Handle very slow model loads
  * fix: rope
  * Revert "build.go: introduce a friendlier way to build Ollama (#3548)" (#3564)
  * build.go: introduce a friendlier way to build Ollama (#3548)
  * update llama.cpp submodule to `1b67731` (#3561)
  * ci: use go-version-file
  * Correct directory reference in macapp/README (#3555)
  * cgo quantize
  * no blob create if already exists
  * update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors (#3528)
  * Docs: Remove wrong parameter for Chat Completion (#3515)
  * no rope parameters
  * add command-r graph estimate
  * Fail fast if mingw missing on windows
  * use an older version of the mac os sdk in release (#3484)
  * Add test case for context exhaustion
  * CI missing archive
  * fix dll compress in windows building
  * CI subprocess path fix
  * Fix CI release glitches
  * update graph size estimate
  * Fix macOS builds on older SDKs (#3467)
  * cmd: provide feedback if OLLAMA_MODELS is set on non-serve command (#3470)
  * feat: add OLLAMA_DEBUG in ollama server help message (#3461)
  * Revert options as a ref in the server
  * default head_kv to 1
  * fix metal gpu
  * Bump to b2581
  * Refined min memory from testing
  * Release gpu discovery library after use
  * Safeguard for noexec
  * Detect too-old cuda driver
  * Integration test improvements
  * Apply 01-cache.diff
  * Switch back to subprocessing for llama.cpp
  * Simplify model conversion (#3422)
  * fix generate output
  * update memory calculations
  * refactor model parsing
  * Add chromem-go to community integrations (#3437)
  * Update README.md (#3436)
  * Community Integration: CRAG Ollama Chat (#3423)
  * Update README.md (#3378)
  * Community Integration: ChatOllama (#3400)
  * Update 90_bug_report.yml
  * Add gemma safetensors conversion (#3250)
  * CI automation for tagging latest images
  * Bump ROCm to 6.0.2 patch release
  * CI windows gpu builds
  * Update troubleshooting link
  * fix: trim quotes on OLLAMA_ORIGINS
- add set_version to automatically switch over to the newer version
-------------------------------------------------------------------
Tue Apr 16 10:52:25 UTC 2024 - bwiedemann@suse.com

- Update to version 0.1.31:
  * Backport MacOS SDK fix from main
  * Apply 01-cache.diff
  * fix: workflows
  * stub stub
  * mangle arch
  * only generate on changes to llm subdirectory
  * only generate cuda/rocm when changes to llm detected
  * Detect arrow keys on windows (#3363)
  * add license in file header for vendored llama.cpp code (#3351)
  * remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found (#3350)
  * change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` (#3347)
  * malformed markdown link (#3358)
  * Switch runner for final release job
  * Use Rocky Linux Vault to get GCC 10.2 installed
  * Revert "Switch arm cuda base image to centos 7"
  * Switch arm cuda base image to centos 7
  * Bump llama.cpp to b2527
  * Fix ROCm link in `development.md`
  * adds ooo to community integrations (#1623)
  * Add cliobot to ollama supported list (#1873)
  * Add Dify.AI to community integrations (#1944)
  * enh: add ollero.nvim to community applications (#1905)
  * Add typechat-cli to Terminal apps (#2428)
  * add new Web & Desktop link in readme for alpaca webui (#2881)
  * Add LibreChat to Web & Desktop Apps (#2918)
  * Add Community Integration: OllamaGUI (#2927)
  * Add Community Integration: OpenAOE (#2946)
  * Add Saddle (#3178)
  * tlm added to README.md terminal section. (#3274)
  * Update README.md (#3288)
  * Update README.md (#3338)
  * Integration tests conditionally pull
  * add support for libcudart.so for CUDA devices (adds Jetson support)
  * llm: prevent race appending to slice (#3320)
  * Bump llama.cpp to b2510
  * Add Testcontainers into Libraries section (#3291)
  * Revamp go based integration tests
  * rename `.gitattributes`
  * Bump llama.cpp to b2474
  * Add docs for GPU selection and nvidia uvm workaround
  * doc: faq gpu compatibility (#3142)
  * Update faq.md
  * Better tmpdir cleanup
  * Update faq.md
  * update `faq.md`
  * dyn global
  * llama: remove server static assets (#3174)
  * add `llm/ext_server` directory to `linguist-vendored` (#3173)
  * Add Radeon gfx940-942 GPU support
  * Wire up more complete CI for releases
  * llm,readline: use errors.Is instead of simple == check (#3161)
  * server: replace blob prefix separator from ':' to '-' (#3146)
  * Add ROCm support to linux install script (#2966)
  * .github: fix model and feature request yml (#3155)
  * .github: add issue templates (#3143)
  * fix: clip memory leak
  * Update README.md
  * add `OLLAMA_KEEP_ALIVE` to environment variable docs for `ollama serve` (#3127)
  * Default Keep Alive environment variable (#3094)
  * Use stdin for term discovery on windows
  * Update ollama.iss
  * restore locale patch (#3091)
  * token repeat limit for prediction requests (#3080)
  * Fix iGPU detection for linux
  * add more docs for the modelfile message command (#3087)
  * warn when json format is expected but not mentioned in prompt (#3081)
  * Adapt our build for imported server.cpp
  * Import server.cpp as of b2356
  * refactor readseeker
  * Add docs explaining GPU selection env vars
  * chore: fix typo (#3073)
  * fix gpu_info_cuda.c compile warning (#3077)
  * use `-trimpath` when building releases (#3069)
  * relay load model errors to the client (#3065)
  * Update troubleshooting.md
  * update llama.cpp submodule to `ceca1ae` (#3064)
  * convert: fix shape
  * Avoid rocm runner and dependency clash
  * fix `03-locale.diff`
  * Harden for deps file being empty (or short)
  * Add ollama executable peer dir for rocm
  * patch: use default locale in wpm tokenizer (#3034)
  * only copy deps for `amd64` in `build_linux.sh`
  * Rename ROCm deps file to avoid confusion (#3025)
  * add `macapp` to `.dockerignore`
  * add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh`
  * tidy cleanup logs
  * update llama.cpp submodule to `77d1ac7` (#3030)
  * disable gpu for certain model architectures and fix divide-by-zero on memory estimation
  * Doc how to set up ROCm builds on windows
  * Finish unwinding idempotent payload logic
  * update llama.cpp submodule to `c2101a2` (#3020)
  * separate out `isLocalIP`
  * simplify host checks
  * add additional allowed hosts
  * Update docs `README.md` and table of contents
  * add allowed host middleware and remove `workDir` middleware (#3018)
  * decode ggla
  * convert: fix default shape
  * fix: allow importing a model from name reference (#3005)
  * update llama.cpp submodule to `6cdabe6` (#2999)
  * Update api.md
  * Revert "adjust download and upload concurrency based on available bandwidth" (#2995)
  * cmd: tighten up env var usage sections (#2962)
  * default terminal width, height
  * Refined ROCm troubleshooting docs
  * Revamp ROCm support
  * update go to 1.22 in other places (#2975)
  * docs: Add LLM-X to Web Integration section (#2759)
  * fix some typos (#2973)
  * Convert Safetensors to an Ollama model (#2824)
  * Allow setting max vram for workarounds
  * cmd: document environment variables for serve command
  * Add Odin Runes, a Feature-Rich Java UI for Ollama, to README (#2440)
  * Update api.md
  * Add NotesOllama to Community Integrations (#2909)
  * Added community link for Ollama Copilot (#2582)
  * use LimitGroup for uploads
  * adjust group limit based on download speed
  * add new LimitGroup for dynamic concurrency
  * refactor download run
-------------------------------------------------------------------
Wed Mar 06 23:51:28 UTC 2024 - computersemiexpert@outlook.com

- Update to version 0.1.28:
  * Fix embeddings load model behavior (#2848)
  * Add Community Integration: NextChat (#2780)
  * prepend image tags (#2789)
  * fix: print usedMemory size right (#2827)
  * bump submodule to `87c91c07663b707e831c59ec373b5e665ff9d64a` (#2828)
  * Add ollama user to video group
  * Add env var so podman will map cuda GPUs
-------------------------------------------------------------------
Tue Feb 27 08:33:15 UTC 2024 - Jan Engelhardt

- Edit description: answer _what_ the package is, and use a nominal phrase.
  (https://en.opensuse.org/openSUSE:Package_description_guidelines)
-------------------------------------------------------------------
Fri Feb 23 21:13:53 UTC 2024 - Loren Burkholder

- Added the Ollama package
- Included a systemd service