skip to content
gigarouter gigarouter
tasks / vision-language

Hosted vision-language models

32 models · 0 live as APIs · benchmarked & compared

Vision-language models process both images and text, enabling tasks such as extracting structured data from scanned documents, answering questions about photographs, and generating captions for accessibility. For example, deepseek-ai/DeepSeek-OCR-2 is specialised for optical character recognition, while series like Qwen/Qwen2.5-VL-7B-Instruct and Qwen/Qwen2-VL-2B-Instruct support visual question answering and image-to-text generation.

  • Document digitisation and invoice parsing
  • Automated content moderation on visual platforms
  • Visual search and retrieval-augmented generation (RAG) pipelines

In production, these models are often integrated into RAG workflows or multimodal chatbots. Choosing among the 32 models listed here involves balancing latency, accuracy, and cost: larger architectures such as Qwen/Qwen3.6-35B-A3B-FP8 yield higher quality on complex reasoning but require more compute, while quantised or smaller models like cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit or Qwen/Qwen3-VL-4B-Instruct serve well at lower throughputs. For most call volumes, calling a hosted API eliminates infrastructure overhead and enables elastic scaling — benefits gigarouter provides through its benchmarked, OpenAI-compatible endpoints. (Currently 0 models are live; the remainder are being onboarded.)

compare

modelparamsdownloads/mopricestatus
Qwen/Qwen2.5-VL-7B-Instruct8292.2M9.8M~$1.341 / 1k imagescoming soon
Qwen/Qwen3.6-35B-A3B-FP835953.9M6.2M~$1.341 / 1k imagescoming soon
Qwen/Qwen2.5-VL-3B-Instruct3754.6M5.3M~$0.626 / 1k imagescoming soon
cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit26554.3M5.1M~$1.341 / 1k imagescoming soon
Qwen/Qwen3.6-27B-FP827782.9M4.9M~$1.341 / 1k imagescoming soon
Qwen/Qwen3-VL-4B-Instruct4437.8M3.7M~$1.341 / 1k imagescoming soon
Qwen/Qwen2-VL-2B-Instruct2209M3.6M~$0.626 / 1k imagescoming soon
deepseek-ai/DeepSeek-OCR-23389.1M3.3M~$0.626 / 1k imagescoming soon
llava-hf/llava-1.5-7b-hf7063.4M3.2M~$1.341 / 1k imagescoming soon
RedHatAI/gemma-4-31B-it-FP8-block31274.9M3.2M~$1.341 / 1k imagescoming soon
HauhauCS/Qwen3.6-35B-A3B-Uncensored-HauhauCS-Aggressive-3Mat launchcoming soon
microsoft/Florence-2-base231.6M2.6M~$0.094 / 1k imagescoming soon
Qwen/Qwen3.5-0.8B873.4M2.5M~$0.235 / 1k imagescoming soon
Qwen/Qwen3-VL-2B-Instruct2127.5M2.1M~$0.626 / 1k imagescoming soon
RedHatAI/gemma-4-26B-A4B-it-FP8-Dynamic26560.9M2M~$1.341 / 1k imagescoming soon
cyankiwi/Qwen3.6-35B-A3B-AWQ-4bit35951.8M1.8M~$1.341 / 1k imagescoming soon
Qwen/Qwen2-VL-7B-Instruct8291.4M1.8M~$1.341 / 1k imagescoming soon
Qwen/Qwen2-VL-7B-Instruct-AWQ8291.4M1.8M~$1.341 / 1k imagescoming soon
unsloth/Qwen3.6-27B-MTP-GGUF-1.8Mat launchcoming soon
Qwen/Qwen2.5-VL-7B-Instruct-AWQ8292.2M1.7M~$1.341 / 1k imagescoming soon
vikhyatk/moondream21927.2M1.6M~$0.626 / 1k imagescoming soon
unsloth/gemma-4-26B-A4B-it-GGUF-1.5Mat launchcoming soon
OpenGVLab/InternVL2-2B2205.8M1.5M~$0.626 / 1k imagescoming soon
empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF-1.4Mat launchcoming soon
baidu/Unlimited-OCR3336.1M885K~$0.626 / 1k imagescoming soon
unsloth/Qwen3.6-35B-A3B-GGUF-874.6Kat launchcoming soon
unsloth/Qwen3.6-35B-A3B-MTP-GGUF-734.7Kat launchcoming soon
DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking-NEO-CODE-Di-IMatrix-MAX-GGUF-519.4Kat launchcoming soon
HauhauCS/Gemma4-12B-QAT-Uncensored-HauhauCS-Balanced-71.7Kat launchcoming soon
Jackrong/Qwopus3.6-35B-A3B-Coder-MTP-GGUF-44.8Kat launchcoming soon
HauhauCS/Gemma4-26B-A4B-QAT-Uncensored-HauhauCS-Balanced-MTP-44.5Kat launchcoming soon
sahilchachra/Unlimited-OCR-GGUF-43.7Kat launchcoming soon