Merged
Size: M
Change Breakdown
  • Bug Fix: 65%
  • Feature: 25%
  • Testing: 10%
#62712 fix: enable thinking support for the ollama api

Ollama models gain working thinking display

Local Ollama models now properly show thinking content in the TUI and save it to session files when users enable `/think` — a capability that was completely broken before this fix.

Ollama users were locked out of thinking displays: setting `/think low` did nothing, the TUI showed no reasoning block, and session files stayed empty even when the model fully supported the feature. The native Ollama API stream was never wired to send `think: true` to the backend, and the streaming loop simply ignored the thinking fields in responses.

This fix completes the thinking pipeline for Ollama. When users enable any non-off thinking level, the API now sends `think: true` to Ollama, the streaming loop captures both `thinking` and `reasoning` deltas from responses, and the assistant message builder assembles structured thinking blocks alongside text content. The TUI already knew how to display `{ type: "thinking" }` blocks; it just never received them from the Ollama path.

In the Ollama API extension, the thinking wrapper function now accepts a boolean parameter that controls whether thinking is enabled. The streaming loop tracks thinking state separately from text state, emitting `thinking_start`, `thinking_delta`, and `thinking_end` events before transitioning to text events. A helper function manages the thinking-to-text transition cleanly when the model switches from reasoning to responding. Tests cover the new thinking injection, block assembly, and stream event behavior.
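As a rough illustration of the wrapper pattern described above, the sketch below injects `think: true` into an Ollama-style chat payload when thinking is enabled. The request shape, function signature, and all names other than the `think` field are assumptions for illustration, not the actual implementation:

```typescript
// Hypothetical request/message shapes; the real extension's types may differ.
interface ChatMessage {
  role: string;
  content: string;
}

interface OllamaChatRequest {
  model: string;
  messages: ChatMessage[];
  stream: boolean;
  think?: boolean;
}

// Sketch of the wrapper: given a base payload builder, inject `think: true`
// whenever the configured thinking level is anything other than "off".
function createThinkingWrapper(
  buildPayload: (model: string, messages: ChatMessage[]) => OllamaChatRequest,
  thinkingEnabled: boolean,
): (model: string, messages: ChatMessage[]) => OllamaChatRequest {
  return (model, messages) => {
    const payload = buildPayload(model, messages);
    return thinkingEnabled ? { ...payload, think: true } : payload;
  };
}

const basePayload = (model: string, messages: ChatMessage[]): OllamaChatRequest => ({
  model,
  messages,
  stream: true,
});

const withThinking = createThinkingWrapper(basePayload, true);
const request = withThinking("qwen3", [{ role: "user", content: "hi" }]);
// request.think === true; with thinkingEnabled = false, `think` is omitted entirely
```

The key design point is that the wrapper composes over an existing payload builder, so the `thinkingLevel === "off"` path can keep its current behavior untouched.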

Original GitHub Description

Summary

  • Problem: The native Ollama API stream had no thinking/reasoning support. Setting `/think low` (or any non-off level) had no effect: Ollama never received `think: true`, thinking fields in the stream response were ignored, and thinking content was excluded from the final assistant message.

  • Why it matters: Users running Ollama models with the default `api: "ollama"` could not use the thinking/reasoning feature at all. The TUI showed no `[thinking]` block, Telegram showed no `Reasoning:` prefix, and session files contained no structured thinking content even when the model fully supported it.

  • What changed:

    • Added `createOllamaThinkingWrapper`, which sends `think: true` to Ollama when `thinkingLevel` is any non-`"off"` value
    • The streaming loop in `createOllamaStreamFn` now reads `chunk.message.thinking` / `chunk.message.reasoning` and emits `thinking_start` → `thinking_delta` → `thinking_end` events
    • `buildAssistantMessage` now includes a `{ type: "thinking" }` content block when thinking/reasoning is present in the Ollama response
    • Added tests for `think: true` payload injection, `buildAssistantMessage` thinking blocks, stream thinking event emission, and edge cases
  • What did NOT change (scope boundary):

    • The OpenAI-compat path is untouched
    • The existing `think: false` wrapper for `thinkingLevel === "off"` is unchanged
    • No changes to the TUI, Telegram, or any other display/channel layer; they already handled `{ type: "thinking" }` blocks correctly
    • No changes to the `ThinkLevel` / `ReasoningLevel` types or the session patch logic
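To make the `buildAssistantMessage` change concrete, here is a minimal sketch of assembling a thinking block ahead of the text block. The function name and any block fields beyond `type` are illustrative assumptions; only the `{ type: "thinking" }` / text block ordering comes from the description above:

```typescript
// Minimal content-block types; the real session types likely carry more fields.
type ContentBlock =
  | { type: "thinking"; thinking: string }
  | { type: "text"; text: string };

// Sketch: place the thinking block (when present) ahead of the text block,
// so a session entry reads [{ type: "thinking" }, { type: "text" }].
function assembleContentBlocks(thinking: string | undefined, text: string): ContentBlock[] {
  const blocks: ContentBlock[] = [];
  if (thinking && thinking.length > 0) {
    blocks.push({ type: "thinking", thinking });
  }
  blocks.push({ type: "text", text });
  return blocks;
}

const withReasoning = assembleContentBlocks("model reasoning…", "final answer");
const withoutReasoning = assembleContentBlocks(undefined, "final answer");
// withReasoning has two blocks (thinking first); withoutReasoning has only the text block
```

Guarding on an empty string keeps models that return no reasoning from producing empty thinking blocks, one of the edge cases the PR's tests mention.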

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Root Cause (if applicable)

  • Root cause: The native Ollama stream implementation (`api: "ollama"`) was never wired to send `think: true` to the Ollama API, and the streaming loop never read the `thinking`/`reasoning` response fields. The `createConfiguredOllamaCompatStreamWrapper` only handled the `thinkingLevel === "off"` case.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: `extensions/ollama/src/stream.test.ts`, `extensions/ollama/index.test.ts`
  • Scenario the test should lock in: (1) `thinkingLevel: "low"` produces `think: true` in the payload, (2) Ollama stream chunks with a `thinking` field emit `thinking_start`/`thinking_delta`/`thinking_end` events, (3) `buildAssistantMessage` includes `{ type: "thinking" }` blocks
  • Existing test that already covers this (if any): only the `think: false` case had coverage prior to this PR
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • Ollama models using the default `api: "ollama"` now produce structured thinking content when `/think` is set to any non-off level
  • The TUI shows a `[thinking]` block (toggled via Ctrl+T) for Ollama models
  • Telegram/Discord show `Reasoning:` output for Ollama models when `/reasoning on` or `/reasoning stream` is set
  • No change to defaults: thinking remains off unless the user explicitly enables it

Diagram (if applicable)

Before:

  /think low
  → createConfiguredOllamaCompatStreamWrapper (no wrapper applied)
  → Ollama request: { model, messages, stream: true } (no think param)
  → Ollama response: { message: { content, thinking } }
  → stream loop reads content only, ignores thinking
  → final message: [{ type: "text" }]

After:

  /think low
  → createConfiguredOllamaCompatStreamWrapper
  → createOllamaThinkingWrapper(_, true)
  → Ollama request: { model, messages, stream: true, think: true }
  → Ollama response: { message: { content, thinking } }
  → stream loop reads both, emits thinking_start/delta/end + text events
  → final message: [{ type: "thinking" }, { type: "text" }]
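The after-path can be sketched as a small state machine over stream chunks. The chunk fields (`message.thinking` / `message.reasoning`) and the thinking event names follow the description above; the text-side event names and the overall structure are assumptions for illustration:

```typescript
// Event and chunk shapes assumed for this sketch; the real stream types differ.
type StreamEvent =
  | { type: "thinking_start" }
  | { type: "thinking_delta"; delta: string }
  | { type: "thinking_end" }
  | { type: "text_start" }
  | { type: "text_delta"; delta: string };

interface OllamaChunk {
  message: { content?: string; thinking?: string; reasoning?: string };
}

// Sketch: track thinking and text state separately, emitting thinking events
// first and closing the thinking block when the model transitions to text.
function streamToEvents(chunks: OllamaChunk[]): StreamEvent[] {
  const events: StreamEvent[] = [];
  let inThinking = false;
  let inText = false;
  for (const chunk of chunks) {
    const thinking = chunk.message.thinking ?? chunk.message.reasoning;
    if (thinking) {
      if (!inThinking) {
        events.push({ type: "thinking_start" });
        inThinking = true;
      }
      events.push({ type: "thinking_delta", delta: thinking });
    }
    if (chunk.message.content) {
      if (inThinking) {
        // The thinking-to-text transition helper: close thinking before text.
        events.push({ type: "thinking_end" });
        inThinking = false;
      }
      if (!inText) {
        events.push({ type: "text_start" });
        inText = true;
      }
      events.push({ type: "text_delta", delta: chunk.message.content });
    }
  }
  if (inThinking) events.push({ type: "thinking_end" });
  return events;
}

const events = streamToEvents([
  { message: { thinking: "hmm" } },
  { message: { thinking: " ok" } },
  { message: { content: "Answer" } },
]);
// Event order: thinking_start, thinking_delta ×2, thinking_end, text_start, text_delta
```

The trailing `thinking_end` after the loop covers the edge case where a stream ends while still inside a thinking block.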

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No): same Ollama `/api/chat` endpoint; only the boolean `think` field is added to the request body
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: macOS Darwin 25.3.0
  • Runtime/container: Node 25.6.1
  • Model/provider: Ollama / qwen3.5
  • Integration/channel (if any): TUI, Telegram
  • Relevant config (redacted): `models.providers.ollama.api: "ollama"`

Steps

  1. Start the gateway: `pnpm openclaw gateway run --bind loopback --port 19001 --force`
  2. Connect the TUI: `pnpm openclaw tui --url ws://127.0.0.1:19001`
  3. Set `/think low`, press Ctrl+T to enable the thinking display, and send a message

Expected

  • A `[thinking]` block appears in the TUI with the model's reasoning content
  • The session file contains a `{ type: "thinking", thinking: "..." }` block

Actual

  • Without the fix: no `[thinking]` block and no thinking content in the session
  • With the fix: the `[thinking]` block appears, and thinking content is stored as a structured block

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: TUI with Ctrl+T shows thinking; Telegram with `/reasoning stream` captures thinking; the session file contains structured thinking blocks; a global install (without the fix) produces no thinking, while the local dev build (with the fix) does
  • Edge cases checked: empty `thinking` field, undefined `thinkingLevel`, `reasoning` field fallback, and the transition from thinking to text in the stream
  • What you did not verify: `/reasoning on` separate-message delivery in Telegram (suppressed by the existing `shouldSuppressReasoningPayload`; pre-existing behavior), and Ollama models other than qwen3.5 and kimi-k2.5:cloud

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: