Context token limits now provider-aware
When multiple providers expose the same model ID with different configured context windows, OpenClaw looked up session context limits by bare model identifier. The persisted context-token count could therefore belong to a different provider than the one actually handling the conversation, causing incorrect /status values, premature compaction, and off-target memory-flush thresholds.
The fix threads the active provider through all hot paths that resolve context window limits: session usage persistence in the auto-reply agent, inline directive handling, and memory-flush / preflight compaction sizing. Each path now calls the provider-qualified resolution function, which scans the configuration per provider rather than by model ID alone.
Users with multi-provider configurations will now see accurate /status output, correct compaction thresholds, and properly sized memory flush decisions. The session store reflects the actual active provider's context window rather than whichever provider was encountered first or cached last.
This issue lived in the @openclaw/auto-reply package, specifically in the reply state and memory management logic that writes to the session store.
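The collision described above can be sketched as follows. The config shape, field names, and function bodies here are illustrative stand-ins, not the real OpenClaw types; only the function names come from the PR itself.

```typescript
// Hypothetical config shape: two providers expose the same model id
// with different context windows (field names are illustrative).
type ProviderConfig = { models: Record<string, { contextWindow: number }> };
type Config = { models: { providers: Record<string, ProviderConfig> } };

const cfg: Config = {
  models: {
    providers: {
      "provider-a": { models: { "gpt-x": { contextWindow: 128_000 } } },
      "provider-b": { models: { "gpt-x": { contextWindow: 32_000 } } },
    },
  },
};

// Buggy pattern: whichever provider lists the model id first wins,
// regardless of which provider handles the session.
function lookupContextTokensByModelOnly(cfg: Config, model: string): number | undefined {
  for (const provider of Object.values(cfg.models.providers)) {
    const entry = provider.models[model];
    if (entry) return entry.contextWindow;
  }
  return undefined;
}

// Fixed pattern: resolution is qualified by the active provider.
function resolveContextTokensForModel(opts: { cfg: Config; provider: string; model: string }): number | undefined {
  return opts.cfg.models.providers[opts.provider]?.models[opts.model]?.contextWindow;
}

// A session handled by provider-b previously got provider-a's window:
console.log(lookupContextTokensByModelOnly(cfg, "gpt-x"));
console.log(resolveContextTokensForModel({ cfg, provider: "provider-b", model: "gpt-x" }));
```

With the bare lookup, a provider-b session persists 128,000 tokens; provider-qualified resolution returns provider-b's configured 32,000.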
Original GitHub Description
Summary
- Problem: Session usage persistence, inline directive handling, and memory-flush / preflight compaction sizing used bare model ids (`lookupContextTokens`/`lookupCachedContextTokens`) for context limits. When several providers expose the same model id with different configured context windows, the wrong limit could be written to `sessionEntry.contextTokens` and affect `/status`, compaction, and flush behavior (#62472).
- Why it matters: Persisted `contextTokens` drives usage display, compaction thresholds, and related safeguards; a stale or collided value can make the runtime overly conservative or inconsistent with the active provider.
- What changed: Use `resolveContextTokensForModel({ cfg, provider, model, allowAsyncLoad: false })` in the auto-reply agent and follow-up runners, in the `persistInlineDirectives` return value, and in `resolveMemoryFlushContextWindowTokens` (threading `cfg` + `provider` from the follow-up run). Follow-up runs now resolve `providerUsed` from agent meta when present, matching the main reply path.
- What did NOT change: CLI `updateSessionStoreAfterAgentRun` already used provider-qualified resolution; gateway session row building and status logic were left as-is. The global context cache structure is unchanged; resolution goes through the existing `resolveContextTokensForModel` contract.
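The "follow-up runs now resolve `providerUsed` from agent meta" change can be sketched as a simple fallback. The meta and session shapes below are assumptions for illustration; only the `providerUsed` field name comes from the PR.

```typescript
// Hypothetical shapes: agent meta may record which provider actually
// served the run; the session carries the configured default provider.
type AgentMeta = { providerUsed?: string };
type Session = { provider: string; contextTokens?: number };

// Prefer the provider recorded in agent meta when present, matching the
// main reply path; otherwise fall back to the session's provider.
function providerForRun(meta: AgentMeta | undefined, session: Session): string {
  return meta?.providerUsed ?? session.provider;
}

const session: Session = { provider: "provider-a" };
console.log(providerForRun({ providerUsed: "provider-b" }, session)); // "provider-b"
console.log(providerForRun(undefined, session)); // "provider-a"
```

The resolved provider is then what gets threaded into `resolveContextTokensForModel`, so follow-up runs persist the same window as the main reply path.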
Change Type (select all)
- Bug fix
- Feature
- Refactor required for the fix
- Docs
- Security hardening
- Chore/infra
Scope (select all touched areas)
- Gateway / orchestration
- Skills / tool execution
- Auth / tokens
- Memory / storage
- Integrations
- API / contracts
- UI / DX
- CI/CD / infra
Linked Issue/PR
- Closes #62472
- Related #
- This PR fixes a bug or regression
Root Cause (if applicable)
- Root cause: Mixed strategies: `resolveContextTokensForModel` correctly scans `cfg.models.providers` per provider, but several hot paths still used bare-model cache lookups or cached tokens keyed only by model id, so duplicate ids across providers could leak the wrong window into persisted session metadata.
- Missing detection / guardrail: Unit coverage did not assert provider-qualified resolution for `resolveMemoryFlushContextWindowTokens` when two providers list the same model id with different limits.
- Contributing context (if known): N/A
Regression Test Plan (if applicable)
- Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
- Target test or file: `src/auto-reply/reply/reply-state.test.ts`
- Scenario the test should lock in: Two providers in config share one model id with different `contextWindow` values; `resolveMemoryFlushContextWindowTokens` returns the limit for the requested `provider`.
- Why this is the smallest reliable guardrail: Exercises the same `resolveContextTokensForModel` path used by reply usage persistence and memory sizing without standing up a full channel run.
- Existing test that already covers this (if any): Partial — `resolveContextTokensForModel` tests in `src/agents/context.lookup.test.ts`; this adds coverage at the memory-flush entry point.
- If no new test is added, why not: N/A — test added.
User-visible / Behavior Changes
- Session `contextTokens` and related usage / flush thresholds should match the active provider's configured window when multiple providers reuse the same model id.
Diagram (if applicable)
N/A
Security Impact (required)
- New permissions/capabilities? No
- New network endpoints or trust boundaries? No
Testing
- `pnpm exec oxlint --type-aware` on touched `src/auto-reply/reply/*.ts` files
- `pnpm test src/auto-reply/reply/reply-state.test.ts -t resolveMemoryFlushContextWindowTokens`