Release Edition
v0.20.0

Gemma 4 arrives with vision and audio. The tokenizer finally remembers accented letters exist.

v0.20.0 · April 2, 2026

Ollama v0.20.0 introduces support for Google's Gemma 4 model family, bringing local audio transcription and vision understanding to the open-source runtime. Users can now run these multimodal models entirely offline, processing images and audio without sending data to external servers.
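For illustration, images are passed to a local Ollama server as base64 strings in the JSON body of the `/api/generate` endpoint. The model tag `gemma4:4b` below is an assumption for the sake of the example; substitute whichever Gemma 4 variant you have pulled. A minimal sketch that only builds the request body:

```python
import base64
import json

def build_generate_request(model: str, prompt: str, image_bytes: bytes) -> str:
    """Build a JSON body for Ollama's /api/generate endpoint.

    Images travel as base64-encoded strings in the "images" list.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return json.dumps(payload)

# "gemma4:4b" is a hypothetical tag; POST the body to
# http://localhost:11434/api/generate on a running Ollama instance.
body = build_generate_request("gemma4:4b", "Describe this image.", b"\x89PNG")
```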

The tokenizer received a quiet but significant fix. Models using SentencePiece BPE encoding no longer silently discard accented characters and diacritics. French, Spanish, and other language output should render with proper accents intact, rather than arriving as plain ASCII casualties.
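The failure mode is easy to reproduce in miniature. Byte-level BPE emits raw UTF-8 bytes, and any decode path that filters out non-ASCII bytes drops accented characters wholesale. The sketch below illustrates this class of bug, not Ollama's actual tokenizer code:

```python
# Accented characters encode to multi-byte UTF-8 sequences.
token_bytes = "café".encode("utf-8")  # b'caf\xc3\xa9'

# Buggy path: discarding bytes outside the ASCII range loses the 'é' entirely.
buggy = bytes(b for b in token_bytes if b < 0x80).decode("ascii")

# Correct path: decode the full byte sequence as UTF-8.
fixed = token_bytes.decode("utf-8")
```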

Four Gemma 4 variants are available: the efficient 2B and 4B sizes, a 26B mixture-of-experts model with 4B active parameters, and the dense 31B variant for those with more capable hardware.

2 PRs Merged · 1 Contributor · +7.7k Additions · -100 Deletions · 0 Highlights · 25% Coverage
Top Stories
1. Feature · #15214 · by dhiltgen · +7196 / -51
Ollama now runs Google Gemma 4 models locally, adding audio transcription and vision understanding to the open-source LLM runtime.

2. Bugfix · #15232 · by dhiltgen · +513 / -49
Models using SentencePiece BPE encoding (like Gemma4:31b) no longer silently drop accented characters and diacritics; foreign-language output should now render correctly.

© 2026 · via Gitpulse