ms-marco-TinyBERT-L2-v2
cross-encoder/ms-marco-TinyBERT-L2-v2
A popular open reranker model, with 283.3K downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About This Model
This cross-encoder model is trained on the MS Marco Passage Ranking task for information retrieval reranking. Given a query and a set of candidate passages (e.g., retrieved by Elasticsearch or a bi-encoder), it produces one relevance score per query-passage pair, so the candidates can be sorted from most to least relevant. It is the smallest and fastest model in the cross-encoder/ms-marco version 2 family and is designed for low-latency applications.
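The scoring-and-sorting step described above can be sketched as follows. This is a minimal illustration, not the hosted API: `rerank`, `cross_encoder_scorer`, and `overlap_score` are hypothetical helper names introduced here. `cross_encoder_scorer` assumes the `sentence-transformers` package is installed and relies on its `CrossEncoder.predict` method, which takes a list of (query, passage) pairs. `overlap_score` is a toy stand-in so the sketch runs without downloading any weights.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps a list of (query, passage) pairs to one relevance score per pair.
Scorer = Callable[[List[Tuple[str, str]]], Sequence[float]]


def rerank(query: str, passages: List[str], score_fn: Scorer) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return passages sorted by descending score."""
    pairs = [(query, passage) for passage in passages]
    scores = score_fn(pairs)
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked]


def cross_encoder_scorer(model_name: str = "cross-encoder/ms-marco-TinyBERT-L2-v2") -> Scorer:
    """Build a scorer backed by the actual model (assumes sentence-transformers is installed)."""
    # Imported lazily so the rest of the sketch runs without the dependency.
    from sentence_transformers import CrossEncoder

    model = CrossEncoder(model_name)  # downloads the weights on first use
    return lambda pairs: model.predict(pairs)


def overlap_score(pairs: List[Tuple[str, str]]) -> List[float]:
    """Toy stand-in scorer (NOT the model): counts query words that also occur in the passage."""
    return [
        float(len(set(query.lower().split()) & set(passage.lower().split())))
        for query, passage in pairs
    ]


candidates = [
    "Berlin is well known for its museums.",
    "About 3.5 million people live in Berlin.",
]
for passage, score in rerank("how many people live in berlin", candidates, overlap_score):
    print(f"{score:.1f}  {passage}")
```

To use the real model, pass `cross_encoder_scorer()` instead of `overlap_score`. Keeping the scorer as a parameter means the sorting logic can be checked offline before the model is wired in.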
Key Strengths
- Inference speed: Processes up to 9,000 documents per second on a V100 GPU, making it suitable for high-throughput reranking pipelines.
- Effective accuracy: Despite having only 2 layers, it reaches an NDCG@10 of 69.84 on TREC Deep Learning 2019 and an MRR@10 of 32.56 on the MS Marco dev set.
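To put the throughput figure in latency terms, the back-of-the-envelope calculation below converts documents per second into time per reranked query. It is a rough sketch that assumes the quoted V100 throughput is sustained and ignores tokenization, batching, and network overhead. The helper name `rerank_latency_ms` and the choice of 100 candidates per query are illustrative assumptions.

```python
def rerank_latency_ms(num_candidates: int, docs_per_sec: float) -> float:
    """Estimated time in milliseconds to score num_candidates passages at a steady throughput."""
    return 1000.0 * num_candidates / docs_per_sec


# Throughput values are the V100 figures quoted in this model card.
for name, docs_per_sec in [
    ("ms-marco-TinyBERT-L2-v2", 9000),
    ("ms-marco-MiniLM-L6-v2", 1800),
]:
    latency = rerank_latency_ms(100, docs_per_sec)
    print(f"{name}: about {latency:.1f} ms per 100 candidates")
```

Under these assumptions, reranking 100 candidates takes roughly 11.1 ms with this model versus roughly 55.6 ms with ms-marco-MiniLM-L6-v2.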
Benchmark Performance
Models are evaluated on TREC Deep Learning 2019 (NDCG@10) and on the MS Marco Passage Reranking dev set (MRR@10). Throughput (Docs / Sec) was measured on a V100 GPU.
| Model Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |
|---|---|---|---|
| cross-encoder/ms-marco-TinyBERT-L2-v2 | 69.84 | 32.56 | 9000 |
| cross-encoder/ms-marco-MiniLM-L2-v2 | 71.01 | 34.85 | 4100 |
| cross-encoder/ms-marco-MiniLM-L4-v2 | 73.04 | 37.70 | 2500 |
| cross-encoder/ms-marco-MiniLM-L6-v2 | 74.30 | 39.01 | 1800 |
| cross-encoder/ms-marco-MiniLM-L12-v2 | 74.31 | 39.02 | 960 |
| Version 1 models | | | |
| cross-encoder/ms-marco-TinyBERT-L2 | 67.43 | 30.15 | 9000 |
| cross-encoder/ms-marco-TinyBERT-L4 | 68.09 | 34.50 | 2900 |
| cross-encoder/ms-marco-TinyBERT-L6 | 69.57 | 36.13 | 680 |
| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340 |
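For readers unfamiliar with the two metrics in the table, the sketch below shows how MRR@10 and NDCG@10 are commonly computed for a single ranked list. It is an illustration under stated assumptions, not the evaluation script behind the numbers above: it uses linear gain for NDCG (some evaluators use an exponential gain instead), it builds the ideal ranking from the same list of gains, and the function names are hypothetical. The table appears to report both metrics multiplied by 100.

```python
import math
from typing import List


def reciprocal_rank_at_k(relevant_flags: List[bool], k: int = 10) -> float:
    """1 / rank of the first relevant passage within the top k, or 0.0 if there is none."""
    for rank, is_relevant in enumerate(relevant_flags[:k], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mrr_at_k(per_query_flags: List[List[bool]], k: int = 10) -> float:
    """Mean of the per-query reciprocal ranks."""
    total = sum(reciprocal_rank_at_k(flags, k) for flags in per_query_flags)
    return total / len(per_query_flags)


def dcg_at_k(gains: List[float], k: int = 10) -> float:
    """Discounted cumulative gain: linear gain with a log2(rank + 1) discount."""
    return sum(gain / math.log2(rank + 1) for rank, gain in enumerate(gains[:k], start=1))


def ndcg_at_k(gains: List[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the same gains sorted ideally."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Three queries: first relevant hit at rank 2, at rank 1, and not found at all.
print(mrr_at_k([[False, True, False], [True], [False, False]]))
# Graded relevance labels in ranked order; a perfectly ordered list scores 1.0.
print(ndcg_at_k([3.0, 2.0, 1.0]))
```

MRR only rewards the position of the first relevant passage, while NDCG accounts for graded relevance across the whole top 10, which is why the two columns can rank models differently.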
Best Use Case
Ideal for reranking in a two-stage retrieval pipeline where low latency and high throughput are priorities. It is particularly effective when paired with a fast first-stage ret