MinerU2.5 2509 1.2B
opendatalab/MinerU2.5-2509-1.2B
published Sep 2025 · updated Apr 2026
A popular open vision-language model, with 21.2K downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
about this model
MinerU2.5-2509-1.2B is a 1.2B-parameter vision-language model for document parsing, hosted by gigarouter as a managed API. It achieves state-of-the-art accuracy with high computational efficiency through a coarse-to-fine, two-stage parsing strategy. The model first performs efficient global layout analysis on downsampled images to identify structural elements, then conducts fine-grained content recognition on native-resolution crops for text, formulas, and tables. A large-scale, diverse data engine supports both pretraining and fine-tuning, enabling robust performance across diverse document types.
Key Improvements
- Comprehensive and Granular Layout Analysis: Preserves non-body elements (headers, footers, page numbers) and uses a refined labeling schema for clearer representation of lists, references, and code blocks.
- Breakthroughs in Formula Parsing: Delivers high-quality parsing of complex, lengthy mathematical formulae and accurately recognizes mixed-language (Chinese‑English) equations.
- Enhanced Robustness in Table Parsing: Handles challenging cases such as rotated tables, borderless tables, and tables with partial borders.
Benchmark Performance
On the olmOCR-bench benchmark, MinerU2.5-2509-1.2B achieves the following scores:
| Benchmark | Score |
|---|---|
| Overall | 75.2 |
| Arxiv Math | 76.6 |
| Old Scans Math | 54.6 |
| Table Tests | 84.9 |
It ranks 14th overall, 15th for Arxiv Math, and 15th for Old Scans Math in the olmOCR-bench leaderboard.

We're benchmarking and onboarding MinerU2.5 2509 1.2B as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.