ms-marco-MiniLM-L12-v2
cross-encoder/ms-marco-MiniLM-L12-v2
A popular open reranker model, with 2.3M downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About This Model
This cross-encoder model is trained on the MS Marco Passage Ranking dataset and is designed specifically for passage reranking in information retrieval pipelines. Given a query and a set of candidate passages (e.g., retrieved by a first-stage retrieval system), the model outputs relevance scores that can be used to reorder the passages. It is hosted as a managed, OpenAI-compatible API on gigarouter, requiring no local installation or transformer library setup.
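The score-then-reorder step described above can be sketched as follows. This is a minimal illustration, not gigarouter's API: `rerank` and `toy_score_pairs` are hypothetical names, and the toy scorer (simple word overlap) only stands in for the cross-encoder so that the example runs without a model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (query, passage), the input shape a cross-encoder scores


def rerank(
    query: str,
    passages: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages by descending score."""
    pairs = [(query, passage) for passage in passages]
    scores = score_pairs(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]


def toy_score_pairs(pairs: List[Pair]) -> List[float]:
    """Stand-in scorer (assumption): fraction of query words found in the passage."""
    scores = []
    for query, passage in pairs:
        query_words = set(query.lower().split())
        passage_words = set(passage.lower().split())
        scores.append(len(query_words & passage_words) / max(len(query_words), 1))
    return scores


query = "how many people live in berlin"
passages = [
    "Berlin is known for its museums.",
    "About 3.5 million people live in Berlin today.",
    "New York City is famous for its skyline.",
]
ranked = rerank(query, passages, toy_score_pairs)
for passage, score in ranked:
    print(f"{score:.2f}  {passage}")
```

Any callable that maps a list of (query, passage) pairs to one score per pair can replace `toy_score_pairs`. For example, when the model is run locally with the sentence-transformers library, `CrossEncoder.predict` accepts such a list of pairs.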
Key Strengths
- Optimized for the MS Marco passage reranking task, a standard benchmark for search and retrieval.
- Balances relevance accuracy with inference speed: 74.31 NDCG@10 on TREC DL 19 at about 960 docs/sec on a V100 GPU.
- Directly usable as a reranking stage after an initial retrieval step (e.g., Elasticsearch).
Benchmark Performance
The following table shows the model's performance on the TREC Deep Learning 2019 and MS Marco Passage Reranking datasets. Runtime was measured on a V100 GPU.
| Model | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / sec (V100) |
|---|---|---|---|
| cross-encoder/ms-marco-MiniLM-L12-v2 | 74.31 | 39.02 | 960 |
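For readers unfamiliar with the two metrics, the sketch below computes MRR@10 and NDCG@10 for a single query. It is a simplified illustration under stated assumptions: it uses the linear-gain form of DCG and builds the ideal ranking only from the items in the ranked list, whereas official evaluation tools may use a different gain function and consider all judged documents. The table values appear to be per-query scores of this kind averaged over all queries and scaled by 100.

```python
import math
from typing import Sequence


def mrr_at_k(relevant_flags: Sequence[bool], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item in the top k, or 0.0 if none."""
    for rank, is_relevant in enumerate(relevant_flags[:k], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def dcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain with linear gain (an assumption of this sketch)."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: Sequence[float], k: int = 10) -> float:
    """DCG normalized by the DCG of the same gains sorted in the best possible order."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# One query: the first relevant passage sits at rank 3.
print(mrr_at_k([False, False, True]))

# One query with graded relevance labels listed in ranked order.
print(round(ndcg_at_k([0, 2, 1]), 4))
```

Both metrics reward placing relevant passages near the top: MRR@10 looks only at the first relevant hit, while NDCG@10 accounts for graded relevance across the whole top 10.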
When to Use
This model is best suited for search and question-answering pipelines where a fast first-pass retrieval (e.g., dense or sparse) is followed by a more accurate reranking step. It is not intended for standalone retrieval but for ranking a limited set of pre-selected candidates.
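How large the candidate set can be depends on the latency you can afford. The back-of-the-envelope sketch below uses the 960 docs/sec figure from the benchmark table. It assumes throughput stays constant regardless of batch size and passage length, and it ignores network and queueing overhead, so treat the results as rough estimates for V100-class hardware rather than measurements of the hosted API. The function names are hypothetical.

```python
DOCS_PER_SEC_V100 = 960  # throughput from the benchmark table (V100 GPU)


def rerank_seconds(num_candidates: int, docs_per_sec: int = DOCS_PER_SEC_V100) -> float:
    """Estimated time to score num_candidates passages for one query."""
    return num_candidates / docs_per_sec


def max_candidates(budget_ms: int, docs_per_sec: int = DOCS_PER_SEC_V100) -> int:
    """Largest candidate set that fits in a latency budget given in milliseconds."""
    return (budget_ms * docs_per_sec) // 1000


for n in (10, 100, 1000):
    print(f"{n:>5} candidates -> ~{rerank_seconds(n) * 1000:.0f} ms")

print(max_candidates(50))  # candidates that fit in a 50 ms reranking budget
```

Under these assumptions, reranking 100 candidates takes roughly 104 ms, which is why the first-stage retriever is used to narrow the pool before the cross-encoder runs.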
We're benchmarking and onboarding ms-marco-MiniLM-L12-v2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.