rate card
Models & pricing
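Prices are published in each model's native billing unit: documents for rerankers, tokens for embedding models. Realtime is the on-demand rate; batch is a discounted, flexible tier, selected by sending the `X-Tier: batch` request header. More models are added as we validate and price them.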
| model | task | tier | realtime | batch |
|---|---|---|---|---|
| cross-encoder/ms-marco-MiniLM-L6-v2 | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| jinaai/jina-reranker-v2-base-multilingual | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| Qwen/Qwen3-Reranker-0.6B | reranker | A | $0.008/1k docs | $0.0025/1k docs |
| Qwen/Qwen3-Embedding-0.6B | embeddings | A | $0.008/1M tok | $0.0025/1M tok |
| BAAI/bge-small-en-v1.5 | embeddings | A | $0.008/1M tok | $0.0025/1M tok |
| BAAI/bge-reranker-base | reranker | A | $0.008/1k docs | $0.0025/1k docs |
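
As a rough illustration of how these rates translate into a bill, here is a minimal Python sketch. The rates and the `X-Tier: batch` header come from this page. Everything else is an assumption made for the example: the helper names `estimate_cost` and `tier_headers` are hypothetical, usage is assumed to be billed proportionally (no rounding up to whole units, no minimum charge), and a request without the header is assumed to be billed at the realtime rate.

```python
from decimal import Decimal

# Rates copied from the table above, in USD per billing unit.
# Every listed model of the same task currently shares the same price.
RATES = {
    # rerankers are priced per 1k documents
    "reranker": {"unit_size": 1_000,
                 "realtime": Decimal("0.008"), "batch": Decimal("0.0025")},
    # embedding models are priced per 1M tokens
    "embeddings": {"unit_size": 1_000_000,
                   "realtime": Decimal("0.008"), "batch": Decimal("0.0025")},
}


def estimate_cost(task: str, quantity: int, tier: str = "realtime") -> Decimal:
    """Estimate the USD cost of `quantity` docs (reranker) or tokens (embeddings).

    Assumption: billing is strictly proportional to usage.
    """
    rate = RATES[task]
    return rate[tier] * Decimal(quantity) / Decimal(rate["unit_size"])


def tier_headers(tier: str = "realtime") -> dict:
    """Return the extra request headers for a tier.

    Assumption: realtime needs no header; batch sends X-Tier: batch.
    """
    return {"X-Tier": "batch"} if tier == "batch" else {}


if __name__ == "__main__":
    docs = 50_000
    print("realtime:", estimate_cost("reranker", docs, "realtime"))
    print("batch:   ", estimate_cost("reranker", docs, "batch"))
    print("headers: ", tier_headers("batch"))
```

Under these assumptions, reranking 50,000 documents comes to $0.40 at the realtime rate and $0.125 on the batch tier. `Decimal` is used instead of floats so that small per-unit prices do not pick up binary rounding error.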