Merged · Size: M
Change Breakdown: Bug Fix 95% · Maintenance 5%
#62493 · fix(sessions): provider-qualified context limits (#62472)

Context token limits now provider-aware

When multiple providers expose the same model ID with different context windows, OpenClaw could persist the wrong limit into the session store, causing incorrect /status values, premature compaction, and off-target memory flush thresholds.

The root cause: OpenClaw looked up session context limits by bare model identifier, so the persisted context tokens could belong to a different provider than the one actually handling the conversation.

The fix threads the active provider through every hot path that resolves context window limits: session usage persistence in the auto-reply agent, inline directive handling, and memory-flush / preflight compaction sizing. Each now calls the provider-qualified resolution function, which scans the configuration per provider rather than by model ID alone.
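The difference between the two lookup styles can be sketched as follows. The config shape and both helpers here are hypothetical illustrations, not OpenClaw's actual internals; only the option names (`cfg`, `provider`, `model`, `allowAsyncLoad`) come from the PR text.

```typescript
// Hypothetical config shape for illustration; OpenClaw's real schema differs.
interface Cfg {
  providers: {
    [provider: string]: { models: { [model: string]: { contextWindow: number } } };
  };
}

// Bare-model lookup (the buggy pattern): the first provider listing the
// model wins, regardless of which provider is actually active.
function lookupContextTokens(cfg: Cfg, model: string): number | undefined {
  for (const p of Object.values(cfg.providers)) {
    const hit = p.models[model];
    if (hit) return hit.contextWindow;
  }
  return undefined;
}

// Provider-qualified lookup (the fixed pattern), loosely mirroring the
// options signature quoted in the PR; allowAsyncLoad is accepted but unused.
function resolveContextTokensForModel(args: {
  cfg: Cfg;
  provider: string;
  model: string;
  allowAsyncLoad: boolean;
}): number | undefined {
  return args.cfg.providers[args.provider]?.models[args.model]?.contextWindow;
}

// Two providers expose the same model id with different windows.
const demoCfg: Cfg = {
  providers: {
    providerA: { models: { "shared-model": { contextWindow: 128000 } } },
    providerB: { models: { "shared-model": { contextWindow: 32000 } } },
  },
};
```

With `demoCfg`, the bare lookup returns providerA's 128000-token window even when providerB is handling the conversation; the qualified lookup returns providerB's 32000.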

Users with multi-provider configurations will now see accurate /status output, correct compaction thresholds, and properly sized memory flush decisions. The session store reflects the actual active provider's context window rather than whichever provider was encountered first or cached last.

This issue lived in the @openclaw/auto-reply package, specifically in the reply state and memory management logic that writes to the session store.

Original GitHub Description

Summary

  • Problem: Session usage persistence, inline directive handling, and memory-flush / preflight compaction sizing used bare model ids (lookupContextTokens / lookupCachedContextTokens) for context limits. When several providers expose the same model id with different configured context windows, the wrong limit could be written to sessionEntry.contextTokens and affect /status, compaction, and flush behavior (#62472).
  • Why it matters: Persisted contextTokens drives usage display, compaction thresholds, and related safeguards; a stale or collided value can make the runtime overly conservative or inconsistent with the active provider.
  • What changed: Use resolveContextTokensForModel({ cfg, provider, model, allowAsyncLoad: false }) in the auto-reply agent and follow-up runners, in persistInlineDirectives return value, and in resolveMemoryFlushContextWindowTokens (threading cfg + provider from the follow-up run). Follow-up runs now resolve providerUsed from agent meta when present, matching the main reply path.
  • What did NOT change: CLI updateSessionStoreAfterAgentRun already used provider-qualified resolution; gateway session row building and status logic were left as-is. The global context cache structure is unchanged; resolution goes through the existing resolveContextTokensForModel contract.
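A rough sketch of the call-site pattern the "What changed" bullet describes: follow-up runs resolve `providerUsed` from agent meta when present, then pass the provider through to resolution before persisting. All types and the `persistContextTokens` helper are invented for illustration; only the resolver signature echoes the PR.

```typescript
// Invented types for illustration only.
interface AgentMeta { providerUsed?: string }
interface SessionEntry { contextTokens?: number }

// Stand-in for the real resolver's contract as quoted in the PR.
type ResolveFn = (args: {
  cfg: unknown;
  provider: string;
  model: string;
  allowAsyncLoad: boolean;
}) => number | undefined;

// Hypothetical persistence step: prefer the provider recorded in agent
// meta (matching the main reply path), fall back to the configured one,
// and only write contextTokens when resolution succeeds.
function persistContextTokens(
  resolve: ResolveFn,
  cfg: unknown,
  defaultProvider: string,
  model: string,
  meta: AgentMeta,
  entry: SessionEntry,
): void {
  const provider = meta.providerUsed ?? defaultProvider;
  const tokens = resolve({ cfg, provider, model, allowAsyncLoad: false });
  if (tokens !== undefined) entry.contextTokens = tokens;
}

// Demo: a stub resolver that returns different windows per provider.
const stubResolve: ResolveFn = ({ provider }) =>
  provider === "providerB" ? 32000 : 128000;
const demoEntry: SessionEntry = {};
persistContextTokens(stubResolve, {}, "providerA", "shared-model",
  { providerUsed: "providerB" }, demoEntry);
```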

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #62472
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: Mixed strategies: resolveContextTokensForModel correctly scans cfg.models.providers per provider, but several hot paths still used bare-model cache lookups or cached tokens keyed only by model id, so duplicate ids across providers could leak the wrong window into persisted session metadata.
  • Missing detection / guardrail: Unit coverage did not assert provider-qualified resolution for resolveMemoryFlushContextWindowTokens when two providers list the same model id with different limits.
  • Contributing context (if known): N/A
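The collision described in the root cause can be shown with a minimal cache sketch. The caches below are hypothetical (the PR notes the real global cache structure is unchanged); they only illustrate why a key of model id alone is lossy across providers.

```typescript
// Cache keyed by model id alone: the last provider written wins for
// everyone, so provider A silently reads provider B's window.
const byModelId = new Map<string, number>();
byModelId.set("shared-model", 128000); // provider A's window cached first
byModelId.set("shared-model", 32000);  // provider B overwrites the entry

// Cache keyed by provider + model: entries cannot collide.
const qualifiedCache = new Map<string, number>();
const cacheKey = (provider: string, model: string) => `${provider}:${model}`;
qualifiedCache.set(cacheKey("providerA", "shared-model"), 128000);
qualifiedCache.set(cacheKey("providerB", "shared-model"), 32000);
```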

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/auto-reply/reply/reply-state.test.ts
  • Scenario the test should lock in: Two providers in config share one model id with different contextWindow values; resolveMemoryFlushContextWindowTokens returns the limit for the requested provider.
  • Why this is the smallest reliable guardrail: Exercises the same resolveContextTokensForModel path used by reply usage persistence and memory sizing without standing up a full channel run.
  • Existing test that already covers this (if any): Partial — resolveContextTokensForModel tests in src/agents/context.lookup.test.ts; this adds coverage at the memory-flush entry point.
  • If no new test is added, why not: N/A — test added.
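The scenario this plan locks in could look roughly like the following. A stub stands in for the real `resolveMemoryFlushContextWindowTokens` (the actual test lives in `src/auto-reply/reply/reply-state.test.ts` against the real resolver), and the config shape is invented.

```typescript
// Invented config shape for the sketch; not OpenClaw's real schema.
interface FlushCfg {
  providers: Record<string, { models: Record<string, { contextWindow: number }> }>;
}

// Stand-in for resolveMemoryFlushContextWindowTokens: provider-qualified
// resolution, never a bare-model scan.
function resolveMemoryFlushContextWindowTokens(
  cfg: FlushCfg,
  provider: string,
  model: string,
): number | undefined {
  return cfg.providers[provider]?.models[model]?.contextWindow;
}

// Two providers in config share one model id with different
// contextWindow values, per the regression scenario.
const flushCfg: FlushCfg = {
  providers: {
    first: { models: { "dup-model": { contextWindow: 200000 } } },
    second: { models: { "dup-model": { contextWindow: 64000 } } },
  },
};

// The guardrail: the returned limit belongs to the requested provider,
// not to whichever provider happens to list the model first.
const flushLimit = resolveMemoryFlushContextWindowTokens(flushCfg, "second", "dup-model");
```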

User-visible / Behavior Changes

  • Session contextTokens and related usage / flush thresholds should match the active provider’s configured window when multiple providers reuse the same model id.

Diagram (if applicable)

N/A

Security Impact (required)

  • New permissions/capabilities? No
  • New network endpoints or trust boundaries? No

Testing

  • pnpm exec oxlint --type-aware on touched src/auto-reply/reply/*.ts files
  • pnpm test src/auto-reply/reply/reply-state.test.ts -t resolveMemoryFlushContextWindowTokens