manga-ocr-base
kha-white/manga-ocr-base
A popular open image-to-text model, with 389.4K downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.
about this model
kha-white/manga-ocr-base is an image-to-text model that performs optical character recognition (OCR) for printed Japanese text, with a primary focus on Japanese manga.
Key Strengths
- Robust recognition of both vertical and horizontal text layouts.
- Handles text with furigana (phonetic annotations) accurately.
- Works reliably on text overlaid on images, such as speech bubbles.
- Supports a wide variety of fonts, styles, and low-quality image inputs.
Best for
Developers requiring high-quality Japanese text extraction from manga pages, scanned comics, or any printed Japanese material where conventional OCR may struggle with layout complexity or image degradation.
Architecture
The model uses a Vision Encoder Decoder framework, built on Hugging Face Transformers. Underlying code and training details are available in the official repository.
Hosted by Gigarouter
This model is offered as a managed, OpenAI-compatible API. No installation or local setup is required; simply call the endpoint with an image and receive text output.
We're benchmarking and onboarding manga-ocr-base as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.