skip to content
gigarouter gigarouter
models / text-gen · coming soon

Qwen3-0.6B

Qwen/Qwen3-0.6B

A popular open text-gen model. gigarouter benchmarks and hosts it as an OpenAI-compatible API.

status
coming soon

about this model

Overview

Qwen3-0.6B is a 0.6-billion-parameter causal language model optimized for chat. As part of the Qwen3 series, it delivers reasoning, instruction-following, agent capabilities, and multilingual support within a single model. It uniquely supports seamless switching between a thinking mode (for complex reasoning, math, and coding) and a non-thinking mode (for efficient general-purpose dialogue), enabling optimal performance across varied use cases.

Key Capabilities

  • Thinking / non-thinking switching: A hard switch (enable_thinking) and a soft switch via /think or /no_think in prompts allow per-turn control. Recommended sampling parameters: temperature 0.6, top-p 0.95, top-k 20, min-p 0 for thinking mode; temperature 0.7, top-p 0.8, top-k 20, min-p 0 for non-thinking mode. Greedy decoding is strongly discouraged in thinking mode.
  • Enhanced reasoning: Surpasses previous QwQ (thinking mode) and Qwen2.5 instruct models (non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
  • Human preference alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following.
  • Agent capabilities: Supports tool calling in both thinking and non-thinking modes, with leading performance among open-source models on complex agent-based tasks.
  • Multilingual support: Covers 100+ languages and dialects with strong instruction following and translation abilities.

Model Specifications

SpecificationValue
TypeCausal Language Model
Training StagePretraining & Post-training
Parameters (Total)0.6B
Parameters (Non-Embedding)0.44B
Layers28
Attention Heads (GQA)16 (Q) / 8 (KV)
Context Length32,768 tokens

Usage Notes

For best results, avoid greedy decoding in thinking mode. If endless repetitions occur, set presence_penalty to 1.5. Refer to the blog, GitHub, and documentation for further details.

not yet live

We're benchmarking and onboarding Qwen3-0.6B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.