Question 1

What is Unlimited-OCR best for?

Accepted Answer

It is best for one-shot document parsing, converting images to Markdown, extracting text with bounding boxes, and locating specific text in documents.

Question 2

What quantizations are available?

Accepted Answer

K-quants (BF16, Q8_0, Q6_K, Q5_K_M, Q5_K_S, Q4_K_M, Q4_K_S, Q3_K_M) and i-quants (IQ4_XS, IQ4_NL, IQ3_M, IQ3_XXS, IQ2_M) are available. The recommended default is Q4_K_M.

Question 3

How do I call Unlimited-OCR via the gigarouter API?

Accepted Answer

Use the gigarouter OpenAI-compatible endpoint with your API key. Send a chat completion request with an image URL (base64 data URL) and the appropriate prompt (e.g., "Convert the document to markdown.").

Question 4

What license does Unlimited-OCR use?

Accepted Answer

It is released under the MIT license (inherited from the base model).

Question 5

What input and output formats are supported?

Accepted Answer

Input: an image (supported formats like PNG/JPEG) plus a text instruction. Output: Markdown text with optional bounding boxes in tokens like ... when using the  prefix.

Question 6

What is Unlimited-OCR best for?

Accepted Answer

It is best for one-shot document parsing, converting images to Markdown, extracting text with bounding boxes, and locating specific text in documents.

Question 7

What quantizations are available?

Accepted Answer

K-quants (BF16, Q8_0, Q6_K, Q5_K_M, Q5_K_S, Q4_K_M, Q4_K_S, Q3_K_M) and i-quants (IQ4_XS, IQ4_NL, IQ3_M, IQ3_XXS, IQ2_M) are available. The recommended default is Q4_K_M.

Question 8

How do I call Unlimited-OCR via the gigarouter API?

Accepted Answer

Use the gigarouter OpenAI-compatible endpoint with your API key. Send a chat completion request with an image URL (base64 data URL) and the appropriate prompt (e.g., "Convert the document to markdown.").

Question 9

What license does Unlimited-OCR use?

Accepted Answer

It is released under the MIT license (inherited from the base model).

Question 10

What input and output formats are supported?

Accepted Answer

Input: an image (supported formats like PNG/JPEG) plus a text instruction. Output: Markdown text with optional bounding boxes in tokens like ... when using the  prefix.

Task	OCR / document parsing
Architecture	DeepSeek-OCR (DeepEncoder vision + DeepSeek-V2 MoE text decoder)
Parameters	3B
License	MIT
Task	OCR / document parsing
Architecture	DeepSeek-OCR (DeepEncoder vision + DeepSeek-V2 MoE text decoder)
Parameters	3B
License	MIT

Metric	Score	Rank
Mean	46.17	13
Text Content	86.81	9
Layout	71.52	6
Table	70.21	12

Unlimited-OCR

specs

about this model

Key Capabilities and Strengths

Benchmark Performance

Technical Details

best for

FAQ

related vision-language models