PP-OCRv6_medium_det
PaddlePaddle/PP-OCRv6_medium_det
A popular open image-to-text model with 89K downloads a month. Gigarouter benchmarks and hosts it as an OpenAI-compatible API.
About this model
PP-OCRv6_medium_det is a text detection model for image-to-text tasks, part of the PP-OCRv6 family developed by PaddlePaddle. It is designed to locate text regions in images with high accuracy, serving as the detection component in an OCR pipeline.
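As a rough illustration of where a detector sits in an OCR pipeline, the sketch below takes detected text regions and puts them in reading order before they would be handed to a recognition model. The input format (four corner points per region) is an assumption made for illustration, not the documented output of PP-OCRv6_medium_det, and the names `polygon_to_box` and `reading_order` are hypothetical helpers.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]              # assumed detector output: corner points of one text region
Box = Tuple[int, int, int, int]    # (left, top, right, bottom)


def polygon_to_box(polygon: Polygon) -> Box:
    """Return the smallest integer axis-aligned box enclosing a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))


def reading_order(boxes: List[Box], line_tolerance: int = 10) -> List[Box]:
    """Group boxes into lines by their top edge, then sort each line left to right."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        # A box joins the current line if its top edge is close to that line's first box.
        if lines and abs(box[1] - lines[-1][0][1]) <= line_tolerance:
            lines[-1].append(box)
        else:
            lines.append([box])
    return [box for line in lines for box in sorted(line, key=lambda b: b[0])]


# Hypothetical detections, deliberately out of order.
detections: List[Polygon] = [
    [(120, 12), (200, 10), (201, 30), (121, 32)],  # second region on the first line
    [(10, 10), (100, 10), (100, 30), (10, 30)],    # first region on the first line
    [(10, 50), (150, 50), (150, 70), (10, 70)],    # region on the second line
]

ordered = reading_order([polygon_to_box(p) for p in detections])
for left, top, right, bottom in ordered:
    # In a full pipeline, each box would be cropped and passed to a recognizer.
    print(f"crop x={left}..{right}, y={top}..{bottom}")
```

The fixed `line_tolerance` is a simplification. Rotated or curved text would need the original polygons rather than axis-aligned boxes.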
Key strengths
- Compact model size: the PP-OCRv6 family ranges from 1.5M to 34.5M parameters, enabling deployment on resource-constrained devices.
- Competitive accuracy: according to the model card's internal benchmarks, PP-OCRv6 models surpass billion-scale vision-language models on OCR tasks.
- Optimized for real-world text detection: trained on diverse datasets to handle various fonts, scales, and orientations.
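To put the 1.5M to 34.5M parameter range in concrete terms, the back-of-the-envelope sketch below estimates weight storage as parameters times bytes per parameter. The precisions listed are assumptions for illustration, not formats the model is confirmed to ship in, and the estimate ignores activations and runtime overhead.

```python
# Bytes per parameter for common numeric precisions (illustrative assumptions).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_megabytes(num_params: int, precision: str) -> float:
    """Approximate weight storage in MB, using 1 MB = 10**6 bytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1_000_000


# The two ends of the parameter range quoted for the family.
for label, params in (("smallest", 1_500_000), ("largest", 34_500_000)):
    for precision in BYTES_PER_PARAM:
        size = weight_megabytes(params, precision)
        print(f"{label} ({params:,} params) @ {precision}: {size:.1f} MB")
```

Even at full float32 precision, the largest size in the range comes to roughly 138 MB of weights under these assumptions, which is consistent with the claim of suitability for mobile and edge deployment.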
Best for
- OCR pipelines requiring fast, lightweight text detection.
- Scenarios where computational resources are limited (e.g., mobile, edge).
- Applications needing high detection precision without the overhead of large models.
Benchmark comparisons
The model card includes a table comparing the performance of PP-OCRv6 models with larger vision-language models; that table is not reproduced here.
Detailed metrics are available in the original model card. The model is hosted on Gigarouter as a managed API, providing low-latency inference without requiring local installation or model management.
We're benchmarking and onboarding PP-OCRv6_medium_det as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.