* api embed docs (#5282)
* convert: capture `head_dim` for mistral (#5818)
* Update llama.cpp submodule commit to `d94c6e0c` (#5805)
* server: collect nested tool call objects when parsing (#5824)
* Remove no longer supported max vram var
* Refine error reporting for subprocess crash
* Remove out of space test temporarily (#5825)
* llm: consider `head_dim` in llama arch (#5817)
* Adjust Windows ROCm discovery
* add patch for tekken (#5807)
* preserve last assistant message (#5802)
* Fix generate test flakiness (#5804)
* server: validate template (#5734)
* OpenAI: Function Based Testing (#5752)
* adjust openai chat msg processing (#5729)
* fix parsing tool calls
* server: check for empty tools array too (#5779)
* always provide content even if empty (#5778)
* server: only parse tool calls if tools are provided (#5771)
* Fix context exhaustion integration test for small GPUs
* Refine scheduler unit tests for reliability
OBS-URL: https://build.opensuse.org/package/show/science:machinelearning/ollama?expand=0&rev=37