models / text-gen · coming soon

Qwen2.5-0.5B-Instruct

Qwen/Qwen2.5-0.5B-Instruct

A popular open text-gen model. gigarouter benchmarks and hosts it as an OpenAI-compatible API.

est. price

~$0.12

/ 1M tokens · estimated, set at launch

about this model

Model Overview

Qwen2.5-0.5B-Instruct is a causal language model from the Qwen2.5 series, optimized for chat and instruction-following tasks. It is available as a hosted, OpenAI-compatible API on gigarouter.

Key Capabilities

Improved coding and mathematics performance relative to Qwen2.
Strong instruction following, with resilience to diverse system prompts for role-play and condition-setting.
Proficient in generating long texts (up to 8,192 tokens) and understanding structured data such as tables.
Supports structured output generation, particularly JSON.
Full context length of 32,768 tokens, with generation context up to 8,192 tokens.
Multilingual support for over 29 languages including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.

Architecture

Parameters: 0.49B total (0.36B non-embedding).
Layers: 24.
Attention heads: 14 for Q, 2 for KV (Grouped Query Attention).
Activation: SwiGLU; normalization: RMSNorm; includes Attention QKV bias and tied word embeddings.
Training: pretraining followed by post-training (instruction tuning).

Evaluation

Detailed evaluation results and speed benchmarks are available in the official blog and documentation.

Best For

This model is suited for applications requiring efficient, lightweight instruction-tuned chat, structured data handling, and multilingual support within a constrained parameter budget. It is particularly effective for tasks that benefit from long context or structured output generation.

not yet live

We're benchmarking and onboarding Qwen2.5-0.5B-Instruct as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.