
PP-OCRv6_medium_det

PaddlePaddle/PP-OCRv6_medium_det

A popular open image-to-text model with about 89K downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

status: coming soon
API providers: 0
downloads / mo: 89K
license: apache-2.0

about this model

PP-OCRv6_medium_det is a text detection model for image-to-text tasks, part of the PP-OCRv6 family developed by PaddlePaddle. It locates text regions in images and serves as the detection stage of an OCR pipeline; a separate recognition stage then reads the text inside each detected region.
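To make the detector's role in a pipeline concrete, here is a minimal sketch of the step that typically sits between detection and recognition: turning detected regions into crop boxes in reading order. It assumes the detector returns each region as four (x, y) corner points, which is a common convention but not confirmed here for this model; `quad_to_bbox`, `reading_order`, and `line_tolerance` are hypothetical names used for illustration only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Quad = List[Point]              # assumed detector output: four (x, y) corners
BBox = Tuple[int, int, int, int]  # (left, top, right, bottom)


def quad_to_bbox(quad: Quad) -> BBox:
    """Return the axis-aligned box that fully contains a quadrilateral."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


def reading_order(quads: List[Quad], line_tolerance: int = 10) -> List[BBox]:
    """Order boxes top-to-bottom, then left-to-right within a text line.

    Boxes whose top edge lies within `line_tolerance` pixels of the first
    box in the current line are treated as belonging to that line.
    """
    boxes = sorted((quad_to_bbox(q) for q in quads), key=lambda b: b[1])
    lines: List[List[BBox]] = []
    for box in boxes:
        if lines and abs(box[1] - lines[-1][0][1]) <= line_tolerance:
            lines[-1].append(box)
        else:
            lines.append([box])
    return [b for line in lines for b in sorted(line, key=lambda b: b[0])]


if __name__ == "__main__":
    detected = [
        [(120, 12), (200, 12), (200, 40), (120, 40)],
        [(10, 10), (100, 10), (100, 38), (10, 38)],
        [(10, 60.2), (150, 60.2), (150, 90.7), (10, 90.7)],
    ]
    for left, top, right, bottom in reading_order(detected):
        # Each box would be cropped from the image and passed to a recognizer.
        print(left, top, right, bottom)
```

Axis-aligned crops are the simplest choice; rotated or curved text would usually need a perspective warp of each quadrilateral instead.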

Key strengths

  • Compact model size: the PP-OCRv6 family ranges from 1.5M to 34.5M parameters, small enough for deployment on resource-constrained devices.
  • Competitive accuracy: the model card claims that PP-OCRv6 surpasses billion-parameter vision-language models on OCR tasks, based on the authors' internal benchmarks.
  • Optimized for real-world text detection: trained on diverse datasets to handle various fonts, scales, and orientations.
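As a rough illustration of what the 1.5M to 34.5M parameter range means for deployment, the sketch below estimates weight storage at a few numeric precisions. It counts weights only (no activations or runtime overhead), the precisions are assumptions rather than formats confirmed for this model, and the helper name `weight_megabytes` is hypothetical.

```python
# Bytes needed to store one parameter at each assumed precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_megabytes(num_params: int, dtype: str) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    # The two ends of the family's stated parameter range.
    for num_params in (1_500_000, 34_500_000):
        for dtype in BYTES_PER_PARAM:
            size = weight_megabytes(num_params, dtype)
            print(f"{num_params:>10,} params @ {dtype}: {size:.1f} MB")
```

At 32-bit precision the stated range works out to roughly 6 MB at the small end and 138 MB at the large end.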

Best for

  • OCR pipelines requiring fast, lightweight text detection.
  • Scenarios where computational resources are limited (e.g., mobile, edge).
  • Applications needing high detection precision without the overhead of large models.

Benchmark comparisons

The model card compares the performance of PP-OCRv6 models with that of larger vision-language models. The comparison table and detailed metrics are available in the original model card.

Once live, the model will be hosted on gigarouter as a managed API, providing low-latency inference without requiring local installation or model management.

not yet live

We're benchmarking and onboarding PP-OCRv6_medium_det as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.