gemma-4-26B-A4B-it-AWQ-4bit
cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit
A popular open vision-language model, with 5.1M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
about this model
Overview
The cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit model is a Mixture-of-Experts (MoE) multimodal vision-language model built by Google DeepMind and optimized for text+image understanding. It processes text and images with variable aspect ratios and resolutions, generates text output, and supports a 256K token context window. The model uses 25.2B total parameters with 3.8B active parameters per token (8 active experts out of 128, plus one shared expert), enabling fast inference relative to its total size. The vocabulary size is 262K tokens, and the vision encoder contains approximately 550M parameters.
The model is instruction-tuned and includes native system prompt support, function calling for agentic workflows, and a configurable thinking mode for step-by-step reasoning. It supports interleaved multimodal input (text and images freely mixed) and video analysis via frame sequences. Multilingual coverage includes 35+ languages out of the box, with pretraining on 140+ languages.
Key Benchmarks
The table below shows results for the 26B A4B instruction-tuned variant compared to other Gemma 4 sizes and the previous Gemma 3 27B (no thinking). Figures are from the model card.
| Benchmark | 26B A4B | 31B | E4B | E2B | Gemma 3 27B |
|---|---|---|---|---|---|
| MMLU Pro | 82.6% | 85.2% | 69.4% | 60.0% | 67.6% |
| AIME 2026 (no tools) | 88.3% | 89.2% |
We're benchmarking and onboarding gemma-4-26B-A4B-it-AWQ-4bit as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.