models / vision-language · coming soon

gemma-4-26B-A4B-it-AWQ-4bit

cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

A popular open vision-language model, with 5.1M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.

est. price

~$1.341

/ 1k images · estimated, set at launch

API providers

downloads / mo

5.1M

license

apache-2.0

about this model

Overview

The cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit model is a Mixture-of-Experts (MoE) multimodal vision-language model built by Google DeepMind and optimized for text+image understanding. It processes text and images with variable aspect ratios and resolutions, generates text output, and supports a 256K token context window. The model uses 25.2B total parameters with 3.8B active parameters per token (8 active experts out of 128, plus one shared expert), enabling fast inference relative to its total size. The vocabulary size is 262K tokens, and the vision encoder contains approximately 550M parameters.

The model is instruction-tuned and includes native system prompt support, function calling for agentic workflows, and a configurable thinking mode for step-by-step reasoning. It supports interleaved multimodal input (text and images freely mixed) and video analysis via frame sequences. Multilingual coverage includes 35+ languages out of the box, with pretraining on 140+ languages.

Key Benchmarks

The table below shows results for the 26B A4B instruction-tuned variant compared to other Gemma 4 sizes and the previous Gemma 3 27B (no thinking). Figures are from the model card.

Benchmark	26B A4B	31B	E4B	E2B	Gemma 3 27B
MMLU Pro	82.6%	85.2%	69.4%	60.0%	67.6%
AIME 2026 (no tools)	88.3%	89.2%

not yet live

We're benchmarking and onboarding gemma-4-26B-A4B-it-AWQ-4bit as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.