Qwen3-Embedding-0.6B

Qwen/Qwen3-Embedding-0.6B

A hosted embedding model: call it over an OpenAI-compatible API, with no GPU to run yourself.

price: $0.008 / 1M tokens

throughput: 581 embeds/s
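
As a rough guide to what the price and throughput figures mean for a batch job, the arithmetic can be sketched as below. The corpus size is a made-up example, the helper names are hypothetical, and the estimate assumes the listed throughput is sustained for your workload, which this page does not promise.

```python
# Figures copied from this page; treat them as a snapshot, not a guarantee.
PRICE_PER_MILLION_TOKENS_USD = 0.008
THROUGHPUT_EMBEDS_PER_SEC = 581


def estimate_cost_usd(total_tokens: int) -> float:
    """Cost of embedding `total_tokens` tokens at the listed price."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


def estimate_seconds(num_inputs: int) -> float:
    """Wall-clock time for `num_inputs` embeddings at the listed throughput."""
    return num_inputs / THROUGHPUT_EMBEDS_PER_SEC


# Hypothetical corpus: 100,000 documents averaging 500 tokens each.
docs, avg_tokens = 100_000, 500
print(f"cost: ${estimate_cost_usd(docs * avg_tokens):.2f}")
print(f"time: {estimate_seconds(docs) / 60:.1f} min")
```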

about this model

Qwen3-Embedding-0.6B is a dense text embedding model hosted by gigarouter as a managed, OpenAI-compatible API. It is part of the Qwen3 Embedding series, built on the Qwen3 foundational models and designed for text embedding and ranking tasks.

Capabilities

Multilingual: Supports over 100 languages, including programming languages, enabling robust multilingual, cross-lingual, and code retrieval.
Long context: Handles sequences up to 32K tokens.
Flexible embedding dimensions: User-defined output dimensions from 32 to 1024 (MRL support).
Instruction-aware: Supports task-specific instructions on both embedding and reranking models, typically improving performance by 1% to 5%.
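
The last two capabilities can be approximated on the client side. The sketch below makes two assumptions: a shorter MRL embedding is the leading slice of the full 1024-dimensional vector, rescaled to unit length, and queries carry an instruction prefix in the `Instruct: ...` / `Query:...` form used in the upstream Qwen3 Embedding examples. Both helper names are hypothetical. This page does not say whether the hosted API also accepts a server-side dimension parameter, so check the prompt format and API options before relying on either.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` values and rescale to unit length (MRL-style)."""
    if not 32 <= dim <= len(vec):
        raise ValueError("dim must be between 32 and the full embedding size")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


def with_instruction(task, query):
    """Prefix a query with a task instruction (assumed upstream format)."""
    return f"Instruct: {task}\nQuery:{query}"


# Stand-in for a real 1024-dimensional embedding returned by the API.
full = [1.0] * 1024
small = truncate_embedding(full, 256)
print(len(small))
print(with_instruction("Retrieve passages that answer the question",
                       "what is MRL?"))
```

Documents are usually embedded without an instruction; only the query side carries the task description.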

Performance

The larger 8B variant of the series ranked No. 1 on the MTEB multilingual leaderboard as of June 5, 2025 (score 70.58). The 0.6B model delivers efficient, production-ready embeddings suitable for retrieval, classification, clustering, and bitext mining across diverse languages and domains.

Model Specifications

Model	Size	Layers	Context Length	Embedding Dimension	MRL Support	Instruction Aware
Qwen3-Embedding-0.6B	0.6B	28	32K	1024	Yes	Yes

Larger embedding variants (4B and 8B) and corresponding reranker models are also available in the Qwen3 Embedding series.

For detailed benchmarks, refer to the Qwen3 Embedding blog and GitHub repository.

call it

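# OpenAI client - just change base_url
import os
from openai import OpenAI

# Read the key from the environment; the variable name here is only an example
KEY = os.environ["GIGAROUTER_API_KEY"]
client = OpenAI(base_url="https://gigarouter.ai/v1", api_key=KEY)
v = client.embeddings.create(model="Qwen/Qwen3-Embedding-0.6B", input=["hello world"])
print(v.data[0].embedding[:4])  # first 4 values of the vector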
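
Once vectors come back, retrieval is typically a nearest-neighbor search by cosine similarity. Here is a minimal sketch, with hand-made 3-dimensional vectors standing in for real 1024-dimensional embeddings so that it runs without a network call. `cosine` and `rank` are illustrative helper names, not part of any client library.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def rank(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors; in practice these would come from the embeddings endpoint.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
print(rank(query, docs))  # [1, 2, 0]
```

With a real response, the document vectors would be `[d.embedding for d in v.data]` from a single `client.embeddings.create` call that passes all documents as `input`.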
