
all-MiniLM-L6-v2

Xenova/all-MiniLM-L6-v2

A popular open embeddings model, with 2.8M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.

status: coming soon
API providers:
downloads / mo: 2.8M
license: apache-2.0

about this model

all-MiniLM-L6-v2 is a sentence embedding model that maps sentences and short paragraphs to fixed-size 384-dimensional vectors. This listing is the ONNX export of sentence-transformers/all-MiniLM-L6-v2, packaged for inference in web and edge environments via the Transformers.js library.

Key Strengths

Produces 384-dimensional dense embeddings from text input using mean pooling and normalization.
Compact 6-layer MiniLM architecture balances speed and accuracy, making it suitable for latency-sensitive applications.
ONNX format enables efficient execution in JavaScript runtimes and CPU/GPU-accelerated environments.
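
The mean pooling and normalization steps mentioned above can be sketched in a few lines. This is a minimal illustration in plain Python, not the model's or Transformers.js's actual code: the helper names `mean_pool` and `l2_normalize` are hypothetical, and the toy token vectors use 4 dimensions instead of 384.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask value is 1 (padding is skipped)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    count = max(count, 1)  # avoid dividing by zero on an all-padding input
    return [t / count for t in total]


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


# Toy 4-dimensional token vectors (assumption: the real model emits 384 dims).
tokens = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 2.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding position, masked out below
]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
```

Because the result has unit length, comparing two such embeddings with a plain dot product gives their cosine similarity.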

Best For

Semantic textual similarity, clustering, and information retrieval where fixed-size embeddings are required.
Building search or recommendation systems that compare sentence-level meaning.
Deployments requiring a lightweight, fast embedding model with minimal computational overhead.
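
As a rough illustration of the search use case, the sketch below ranks a small corpus by cosine similarity to a query vector. The vectors are made-up 3-dimensional stand-ins for real 384-dimensional embeddings, and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, corpus):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up embeddings for illustration only.
corpus = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.0],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, corpus)
top_doc_id = ranking[0][0]
```

For larger corpora an approximate nearest-neighbor index would usually replace the linear scan, but the scoring idea stays the same.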

Benchmark Results

This model card does not include specific benchmark numbers. The underlying all-MiniLM-L6-v2 is known for strong performance relative to its size on sentence embedding benchmarks (e.g., STS Benchmark, SentEval), but only the ONNX export is provided here, so users should evaluate it on their own datasets.

not yet live

We're benchmarking and onboarding all-MiniLM-L6-v2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.
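
Since the endpoint is not live yet, the following is only a sketch of what a request in the OpenAI embeddings format typically looks like. The base URL, API key, and model identifier are placeholders and assumptions, not confirmed gigarouter values, and the code builds the request without sending it.

```python
import json
import urllib.request

# Placeholders and assumptions: none of these values come from gigarouter.
BASE_URL = "https://api.example.com/v1"
API_KEY = "YOUR_API_KEY"
MODEL_ID = "Xenova/all-MiniLM-L6-v2"  # assumed to match the listing's model id


def build_embeddings_request(texts):
    """Build (but do not send) an OpenAI-style POST request for embeddings."""
    payload = {"model": MODEL_ID, "input": texts}
    return urllib.request.Request(
        url=f"{BASE_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_embeddings_response(body):
    """Extract vectors from an OpenAI-style response body, ordered by index."""
    items = sorted(json.loads(body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


request = build_embeddings_request(["hello world", "embeddings are vectors"])
```

Once a real endpoint and key are available, passing `request` to `urllib.request.urlopen` and feeding the response body to `parse_embeddings_response` should yield one vector per input string, assuming the service follows the OpenAI response shape.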