skip to content
gigarouter gigarouter
models / image-to-text · coming soon

manga-ocr-base

kha-white/manga-ocr-base

A popular open image-to-text model, with 389.4K downloads a month. gigarouter benchmarks and hosts it as an OpenAI-compatible API.

status
coming soon
API providers
0
downloads / mo
389.4K
license
apache-2.0

about this model

kha-white/manga-ocr-base is an image-to-text model that performs optical character recognition (OCR) for printed Japanese text, with a primary focus on Japanese manga.

Key Strengths

  • Robust recognition of both vertical and horizontal text layouts.
  • Handles text with furigana (phonetic annotations) accurately.
  • Works reliably on text overlaid on images, such as speech bubbles.
  • Supports a wide variety of fonts, styles, and low-quality image inputs.

Best for

Developers requiring high-quality Japanese text extraction from manga pages, scanned comics, or any printed Japanese material where conventional OCR may struggle with layout complexity or image degradation.

Architecture

The model uses a Vision Encoder Decoder framework, built on Hugging Face Transformers. Underlying code and training details are available in the official repository.

Hosted by Gigarouter

This model is offered as a managed, OpenAI-compatible API. No installation or local setup is required; simply call the endpoint with an image and receive text output.

not yet live

We're benchmarking and onboarding manga-ocr-base as a hosted, OpenAI-compatible API. Sign in for free credit and be ready when it lands, or tell us you want it and we'll prioritize it.