
all-MiniLM-L6-v2-onnx

Qdrant/all-MiniLM-L6-v2-onnx

A popular open embedding model, with 1.3M downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

status: coming soon
API providers:
downloads / mo: 1.3M
license: apache-2.0

about this model

This ONNX port of sentence-transformers/all-MiniLM-L6-v2 is a lightweight embedding model that maps sentences and short paragraphs to 384-dimensional vectors, optimized for text classification and similarity search. Once live, it will be available on gigarouter as a managed, OpenAI-compatible API, with no local installation required.
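As a rough illustration of what "OpenAI-compatible" usually means for embeddings, the sketch below builds a request in the OpenAI embeddings format (a JSON body with `model` and `input`, sent to an `/embeddings` path with a bearer token) and parses a response of the matching shape. The base URL, the model id string as accepted by the API, and the helper names are assumptions made for illustration; the hosted endpoint is not live yet, so nothing here is sent over the network.

```python
import json
import urllib.request

# Hypothetical values: the real base URL and accepted model id are not published yet.
BASE_URL = "https://api.example.invalid/v1"
MODEL_ID = "Qdrant/all-MiniLM-L6-v2-onnx"


def build_embeddings_request(texts, api_key, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings_response(raw_json):
    """Return the embedding vectors, ordered by each item's `index` field."""
    items = json.loads(raw_json)["data"]
    return [item["embedding"] for item in sorted(items, key=lambda d: d["index"])]
```

Once an endpoint exists, the request object could be passed to `urllib.request.urlopen` and the response body handed to `parse_embeddings_response`; any client library that speaks the OpenAI embeddings format should work equally well.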

Key strengths

Compact and fast: the MiniLM-L6 architecture balances speed and quality, making it suitable for high‑throughput embedding pipelines.
Portable: the ONNX format lets the model run on any ONNX-compatible runtime, typically with efficient inference.
Designed for semantic textual similarity, clustering, and retrieval tasks.
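Similarity and retrieval with an embedding model usually come down to comparing vectors by cosine similarity. The sketch below ranks a few documents against a query that way. The three-dimensional vectors are made-up stand-ins for real embeddings (which would have 384 dimensions for this model), and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs, top_k=3):
    """Return up to top_k (doc_index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for embeddings returned by the model.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_by_similarity(query, docs, top_k=2)
print([index for index, _ in ranking])  # [1, 2]
```

For larger collections, a vector index would normally replace the linear scan; the scoring idea stays the same.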

What it is best for

Sentence and paragraph‑level embedding for semantic search.
Zero‑shot text classification using embedding‑based approaches.
Applications where low latency and modest resource usage are critical.
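One common embedding-based approach to zero-shot classification is to embed a short description of each label, embed the input text, and pick the label whose vector is closest. The sketch below shows only that selection step; the two-dimensional vectors are invented placeholders for model output, and `classify_zero_shot` is a hypothetical helper name.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length; return it unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def classify_zero_shot(text_vec, label_vecs):
    """Return (best_label, scores), scoring each label by cosine similarity."""
    t = _normalize(text_vec)
    scores = {
        label: sum(a * b for a, b in zip(t, _normalize(vec)))
        for label, vec in label_vecs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Placeholder embeddings: in practice these would come from the model.
label_vecs = {"sports": [0.9, 0.1], "finance": [0.1, 0.9]}
label, scores = classify_zero_shot([0.8, 0.3], label_vecs)
print(label)  # sports
```

Label wording tends to matter with this approach, so it is worth trying a few phrasings per label and checking results on a small labeled sample.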

Performance

As a port of the original all‑MiniLM‑L6‑v2, the model should produce near-identical embeddings and therefore comparable benchmark results. The original model achieves competitive scores on the STS Benchmark (e.g., a Spearman rank correlation, scaled by 100, of approximately 80–82 on the test set) and is widely used in production for general‑purpose embedding tasks.
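STS-style scores are typically reported as the Spearman rank correlation between the model's similarity scores and human similarity ratings for the same sentence pairs, multiplied by 100. The sketch below computes that statistic in plain Python; the four score pairs are invented toy data, not benchmark results.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: model cosine scores vs. human ratings for four sentence pairs.
model_scores = [0.91, 0.35, 0.62, 0.10]
human_ratings = [4.8, 2.0, 1.5, 0.4]
rho = spearman(model_scores, human_ratings)
print(round(rho * 100, 1))  # 80.0
```

Because only ranks matter, the statistic is unaffected by the scale of either score list, which is why cosine similarities can be compared directly with ratings on a different numeric scale.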

not yet live

We're benchmarking and onboarding all-MiniLM-L6-v2-onnx as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.