
The best models for captioning images

37 models & services · 0 callable here now

No public community benchmark covers captioning quality yet, so models are ranked by adoption. Prices are our live per-call rates; a leading ~ marks an estimate that applies until the model is onboarded.
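As a rough illustration of how a listed price translates into a budget, the sketch below parses a price string in the format used on this page and scales it to a given number of images. The helper name `estimate_cost` is hypothetical, and the calculation assumes one call per image and strictly linear pricing, neither of which is confirmed here; prices marked with ~ are estimates, so the result is an estimate as well.

```python
from decimal import Decimal


def estimate_cost(price_per_1k: str, n_images: int) -> Decimal:
    """Scale a listed price such as '~$0.094 / 1k images' to n_images.

    Hypothetical helper: assumes one call per image and linear pricing.
    """
    # Drop the estimate marker and currency sign, keep the number before "/".
    amount = price_per_1k.lstrip("~").strip().lstrip("$").split("/")[0].strip()
    return Decimal(amount) * n_images / 1000


# Listed estimate for Salesforce/blip-image-captioning-large, 25,000 images.
cost = estimate_cost("~$0.094 / 1k images", 25_000)
print(f"approx. ${cost:.2f}")
```

`Decimal` is used instead of `float` so that small per-image amounts do not pick up binary rounding noise when summed across many models.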

| # | model | score | price | params | status |
|---|-------|-------|-------|--------|--------|
| 1 | Salesforce/blip-image-captioning-base | - | - | - | coming soon |
| 2 | Salesforce/blip-image-captioning-large | - | ~$0.094 / 1k images | 469.7M | coming soon |
| 3 | PaddlePaddle/PP-OCRv5_server_det | - | - | - | coming soon |
| 4 | numind/NuExtract3 | - | ~$1.341 / 1k images | 4539.3M | coming soon |
| 5 | PaddlePaddle/UVDoc | - | - | - | coming soon |
| 6 | microsoft/trocr-small-handwritten | - | - | - | coming soon |
| 7 | PaddlePaddle/PP-LCNet_x1_0_doc_ori | - | - | - | coming soon |
| 8 | kha-white/manga-ocr-base | - | - | - | coming soon |
| 9 | ibm-granite/granite-vision-3.3-2b | - | ~$0.626 / 1k images | 2975.4M | coming soon |
| 10 | PaddlePaddle/PP-LCNet_x1_0_textline_ori | - | - | - | coming soon |
| 11 | microsoft/trocr-base-printed | - | ~$0.094 / 1k images | 333.3M | coming soon |
| 12 | lightonai/LightOnOCR-1B-1025 | - | ~$0.235 / 1k images | 1161.2M | coming soon |
| 13 | PaddlePaddle/PP-OCRv5_server_rec | - | - | - | coming soon |
| 14 | microsoft/trocr-large-handwritten | - | - | - | coming soon |
| 15 | microsoft/kosmos-2-patch14-224 | - | ~$0.626 / 1k images | 1664.5M | coming soon |
| 16 | naver-clova-ix/donut-base | - | - | - | coming soon |
| 17 | microsoft/trocr-base-stage1 | - | ~$0.094 / 1k images | 384.3M | coming soon |
| 18 | facebook/nougat-base | - | ~$0.094 / 1k images | 348.7M | coming soon |
| 19 | microsoft/trocr-large-printed | - | ~$0.235 / 1k images | 608.1M | coming soon |
| 20 | PaddlePaddle/PP-OCRv5_mobile_det | - | - | - | coming soon |