Qwen3-0.6B
Qwen/Qwen3-0.6B
A popular open text-gen model. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
about this model
Overview
Qwen3-0.6B is a 0.6-billion-parameter causal language model optimized for chat. As part of the Qwen3 series, it delivers reasoning, instruction-following, agent capabilities, and multilingual support within a single model. It uniquely supports seamless switching between a thinking mode (for complex reasoning, math, and coding) and a non-thinking mode (for efficient general-purpose dialogue), enabling optimal performance across varied use cases.
Key Capabilities
- Thinking / non-thinking switching: A hard switch (
enable_thinking) and a soft switch via/thinkor/no_thinkin prompts allow per-turn control. Recommended sampling parameters: temperature 0.6, top-p 0.95, top-k 20, min-p 0 for thinking mode; temperature 0.7, top-p 0.8, top-k 20, min-p 0 for non-thinking mode. Greedy decoding is strongly discouraged in thinking mode. - Enhanced reasoning: Surpasses previous QwQ (thinking mode) and Qwen2.5 instruct models (non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
- Human preference alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
- Agent capabilities: Supports tool calling in both thinking and non-thinking modes, with leading performance among open-source models on complex agent-based tasks.
- Multilingual support: Covers 100+ languages and dialects with strong instruction following and translation abilities.
Model Specifications
| Specification | Value |
|---|---|
| Type | Causal Language Model |
| Training Stage | Pretraining & Post-training |
| Parameters (Total) | 0.6B |
| Parameters (Non-Embedding) | 0.44B |
| Layers | 28 |
| Attention Heads (GQA) | 16 (Q) / 8 (KV) |
| Context Length | 32,768 tokens |
Usage Notes
For best results, avoid greedy decoding in thinking mode. If endless repetitions occur, set presence_penalty to 1.5. Refer to the blog, GitHub, and documentation for further details.
We're benchmarking and onboarding Qwen3-0.6B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.