Qwen3-Embedding-0.6B
Qwen/Qwen3-Embedding-0.6B
A hosted embedding model: call it over an OpenAI-compatible API, with no GPU required on your side.
About this model
Qwen3-Embedding-0.6B is a dense text embedding model hosted by gigarouter as a managed, OpenAI-compatible API. It is part of the Qwen3 Embedding series, built on the Qwen3 foundational models and designed for text embedding and ranking tasks.
Capabilities
- Multilingual: Supports over 100 languages, including programming languages, enabling robust multilingual, cross-lingual, and code retrieval.
- Long context: Handles sequences up to 32K tokens.
- Flexible embedding dimensions: User-defined output dimensions from 32 to 1024 (MRL support).
- Instruction-aware: Both the embedding and reranking models accept task-specific instructions, which typically improve performance by 1% to 5% compared with using no instruction.
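As a rough illustration of the instruction-aware capability, the sketch below prefixes a query with a one-sentence task description before it is sent for embedding. The `Instruct: ...` / `Query:` template follows the convention shown in the upstream Qwen3-Embedding model card as best understood here; treat it as an assumption and confirm it against that card. The helper name `format_query` and the example task text are hypothetical.

```python
def format_query(task: str, query: str) -> str:
    # Assumed template (taken from the upstream model card, not from this page):
    # a task instruction line followed by the query text.
    return f"Instruct: {task}\nQuery:{query}"


# Hypothetical task description; write one that matches your own use case.
task = "Given a web search query, retrieve relevant passages that answer the query"

# Queries carry the instruction prefix.
queries = [format_query(task, "What is the capital of China?")]

# Documents are usually embedded as plain text, without an instruction.
documents = ["The capital of China is Beijing."]

print(queries[0])
```

The formatted strings can then be passed as the `input` list of an embeddings request in place of the raw query text.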
Performance
The larger 8B variant of the series ranks No. 1 on the MTEB multilingual leaderboard (score 70.58 as of June 5, 2025). The 0.6B model delivers efficient, production-ready embeddings suitable for retrieval, classification, clustering, and bitext mining across diverse languages and domains.
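For the retrieval use case mentioned above, a common approach is to rank documents by cosine similarity between the query embedding and each document embedding. The following is a minimal sketch using toy 3-dimensional vectors in place of real model output; the function names `cosine` and `rank` are illustrative and not part of any API.

```python
import math


def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the L2 norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    # Return (document index, score) pairs, best match first.
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings returned by the API.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]

ranking = rank(query, docs)
print([i for i, _ in ranking])  # [1, 2, 0]
```

With real embeddings the same ranking logic applies; for larger collections a vector index would normally replace the linear scan.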
Model Specifications
| Model | Size | Layers | Context Length | Embedding Dimension | MRL Support | Instruction Aware |
|---|---|---|---|---|---|---|
| Qwen3-Embedding-0.6B | 0.6B | 28 | 32K | 1024 | Yes | Yes |
Larger embedding variants (4B and 8B) and corresponding reranker models are also available in the Qwen3 Embedding series.
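Because the model supports MRL (Matryoshka Representation Learning), a full 1024-dimensional vector can in principle be shortened by keeping its leading components and re-normalizing. The sketch below does this on the client side; whether the hosted endpoint also accepts a server-side dimension parameter is not stated here, so none is assumed. The helper `truncate_embedding` is a hypothetical name, and the 32 lower bound comes from the range given in the capabilities list.

```python
import math


def truncate_embedding(vec, dim):
    # Keep the first `dim` components, then rescale to unit L2 length.
    if not 32 <= dim <= len(vec):
        raise ValueError("dim must be between 32 and the full vector length")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero prefix: nothing to normalize
    return [x / norm for x in head]


# Toy stand-in for a 1024-dimensional embedding returned by the API.
full = [float(i % 7) for i in range(1024)]

small = truncate_embedding(full, 256)
print(len(small))  # 256
```

Shorter vectors reduce storage and speed up similarity search (256 float32 values take 1 KiB instead of 4 KiB for 1024), usually at some cost in retrieval quality, so the chosen size is worth validating on your own data.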
For detailed benchmarks, refer to the Qwen3 Embedding blog and GitHub repository.
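```python
# OpenAI client - just change base_url
from openai import OpenAI

KEY = "YOUR_API_KEY"  # placeholder, not a real key: substitute your own

client = OpenAI(base_url="https://gigarouter.ai/v1", api_key=KEY)
v = client.embeddings.create(model="Qwen/Qwen3-Embedding-0.6B", input=["hello world"])
# Print the first four components of the returned vector
print(v.data[0].embedding[:4])
```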