skip to content
gigarouter gigarouter
models / vision-language · coming soon

Surya OCR 2

datalab-to/surya-ocr-2

published May 2026 · updated May 2026

Surya OCR 2 is a VLM model that performs OCR, layout analysis, and table recognition on documents.

est. price
~$0.235
/ 1k images · estimated, set at launch
API providers
0
downloads / mo
407K
license
openrail

specs

TaskOCR, Layout Analysis, Table Recognition
ArchitectureVLM (Vision Language Model)
Parameters650M
LicenseCode: Apache 2.0, Weights: Modified AI Pubs Open Rail-M (free for research, personal use, and startups under $5M)

about this model

Surya is a 650M parameter vision-language model (VLM) for document OCR, text detection, layout analysis, and table recognition, hosted by Gigarouter as a managed API.

Key benchmarks and capabilities

Surya scores 83.3% on olmOCR-bench, ranking 4th overall and first among models under 3B parameters. It achieves 87.2% on an internal 91-language multilingual benchmark.

  • Speed: 5 pages/second on an RTX 5090.
  • Layout analysis: classifies blocks (text, table, image, header, etc.) and provides reading order.
  • Table recognition: extracts rows and columns from tables.
  • Multilingual support: 91 languages covered in internal evaluation.

Output formats

Per-block OCR returns HTML (tables as <table>, math in <math>), bounding polygons, confidence scores, and raw labels. Layout output includes canonical labels and 0-indexed reading order.

Visual examples

Datalab logo

The model handles a wide range of documents, from newspapers to handwritten notes to corporate reports, as shown below.

TaskExample output
DetectionText detection bounding boxes
OCRRecognized text overlays
LayoutLayout analysis with labeled regions
Table recognitionTable row and column extraction

Additional images from the model card illustrate detection, OCR, layout, reading order, and table recognition on newspaper, textbook, tax form, handwritten notes, and corporate documents. The model is optimized for accurate, structured document understanding with fast throughput on modern GPUs.

best for

FAQ

What is Surya OCR 2 best for?

Surya OCR 2 is best for document OCR with layout analysis, table recognition, and reading order extraction, especially for multilingual documents.

How does Surya OCR 2 compare in accuracy to other OCR models?

Surya OCR 2 scores 83.3% on olmOCR-bench, ranking 4th overall and top under 3B parameters.

What are the license terms for Surya OCR 2?

The code is Apache 2.0. The model weights use a modified AI Pubs Open Rail-M license, free for research, personal use, and startups under $5M funding/revenue. Broader commercial use requires a license from Datalab.

How do I call Surya OCR 2 via the API on gigarouter?

Use the gigarouter OpenAI-compatible endpoint with an API key. Send an image or PDF as input and receive JSON output with text, layout, and table data.

What input formats does Surya OCR 2 support?

It supports images (JPEG, PNG, etc.) and PDF files. Output is a structured JSON with per-block text, bounding boxes, layout labels, and confidence scores.

not yet live

We're benchmarking and onboarding Surya OCR 2 as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.

related vision-language models

compare all →