Merged
Size
S
Change Breakdown
Bug Fix85%
Docs15%
#66446fix(media): remap AAC uploads to M4A

AAC uploads remapped to M4A for transcription compatibility

VI
vincentkoc
·Apr 14, 2026

Audio uploads in AAC format are now automatically normalized to M4A before hitting OpenAI-compatible transcription endpoints, fixing failures at MIME-sensitive providers.

When users uploaded AAC audio files for transcription, the files kept their original .aac extension instead of being normalized to .m4a. Downstream OpenAI-compatible audio paths often key off filename and container hints, and raw AAC files were being rejected or handled inconsistently at those endpoints.

A new normalization step now catches AAC uploads and remaps them to the widely-accepted M4A container format before the transcription request is built. The fix handles both explicit .aac extensions and cases where the file has no extension but carries an audio/aac MIME type — both now surface as M4A to downstream handlers.

This change lives in the media-understanding package's OpenAI-compatible audio path. No broader audio transcoding or codec conversion logic was modified.

View Original GitHub Description

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: AAC audio uploads could keep an incompatible extension instead of being remapped to a widely accepted M4A container name.
  • Why it matters: downstream OpenAI-compatible audio paths reject or mishandle raw AAC uploads more often than M4A-labelled equivalents.
  • What changed: remap AAC uploads to M4A in the media-understanding audio normalization path and carry the missing changelog entry.
  • What did NOT change (scope boundary): no broader audio transcoding policy or codec conversion logic changed.
  • Supersedes #61094.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #61094
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: the audio normalization path did not translate AAC inputs into the container/extension shape expected by downstream OpenAI-compatible audio handling.
  • Missing detection / guardrail: coverage did not lock in AAC-to-M4A remapping for uploads entering the OpenAI-compatible audio path.
  • Contributing context (if known): providers often key off filename/container hints even when the underlying audio bytes are otherwise usable.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/media-understanding/openai-compatible-audio.test.ts
  • Scenario the test should lock in: AAC inputs are surfaced as M4A for OpenAI-compatible audio handling.
  • Why this is the smallest reliable guardrail: the behavior is a local media normalization decision.
  • Existing test that already covers this (if any): none.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

AAC uploads now normalize to M4A where the downstream audio path expects that container label.

Diagram (if applicable)

N/A

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 / pnpm test:serial
  • Model/provider: OpenAI-compatible audio
  • Integration/channel (if any): Media understanding
  • Relevant config (redacted): targeted fixture/test doubles only

Steps

  1. Feed an AAC upload into the OpenAI-compatible audio normalization path.
  2. Run pnpm test:serial src/media-understanding/openai-compatible-audio.test.ts.
  3. Confirm the surfaced media metadata uses an M4A filename/container shape.

Expected

  • AAC uploads normalize to an M4A-labelled file shape before the downstream request is built.

Actual

  • Before the fix, AAC could stay in a shape that downstream handlers rejected or handled inconsistently.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: pnpm test:serial src/media-understanding/openai-compatible-audio.test.ts.
  • Edge cases checked: AAC inputs entering the OpenAI-compatible audio path.
  • What you did not verify: live provider upload behavior outside the targeted unit coverage.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: narrow fix may miss adjacent provider-specific edge cases not covered by the focused regression.
    • Mitigation: keep the patch scoped and ship targeted regression coverage for the implicated path only.
© 2026 · via Gitpulse