Hosted text generation models

29 models · 0 live as APIs · benchmarked & compared

Text generation models produce coherent, contextually relevant text from a given prompt. They are used for drafting, summarization, translation, and dialogue: for example, generating customer support replies, powering code assistants, or creating product descriptions at scale. In production, these models are typically integrated via REST APIs that accept a prompt and return generated tokens, often with controls for temperature, top-p, and maximum output length to tune the output.
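As a rough illustration of the request shape described above, the following sketch builds a generation request without sending it. The endpoint URL, the field names (`model`, `prompt`, `temperature`, `top_p`, `max_tokens`), and the bearer-token header are all assumptions made for illustration; none of the models listed here is live as an API yet, so the real schema may differ.

```python
import json
from urllib import request

# Hypothetical endpoint: placeholder only, not a real service URL.
API_URL = "https://api.example.com/v1/generate"


def build_payload(model, prompt, temperature=0.7, top_p=0.9, max_tokens=128):
    """Validate the sampling controls and return a JSON-serializable body.

    The field names are assumed, not taken from a published schema.
    """
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    return {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }


def build_request(payload, api_key):
    """Wrap the payload in a POST request; the caller decides when to send it."""
    return request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


payload = build_payload(
    "google/gemma-3-270m",
    "Write a one-line product description for a steel water bottle.",
    temperature=0.2,
    max_tokens=64,
)
req = build_request(payload, api_key="YOUR_API_KEY")
print(req.get_method(), req.full_url)
print(json.dumps(payload, indent=2))
```

Lower `temperature` and `top_p` values make output more deterministic, which tends to suit support replies and product copy; higher values trade consistency for variety. Sending the request would be a call to `urllib.request.urlopen(req)` once a real endpoint exists.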

Choosing among text generation models involves the classic size/quality/speed trade‑off. Smaller models (e.g., facebook/opt-125m, google/gemma-3-270m) offer low latency and modest hardware requirements but produce less coherent or creative output on complex tasks. Larger models (e.g., nvidia/Qwen3.6-35B-A3B-NVFP4, dphn/dolphin-2.9.1-yi-1.5-34b) generate more nuanced and accurate responses but require more compute and incur higher per‑token latency. For most workflows, the appropriate model balances acceptable generation quality against the throughput and cost constraints of the application.
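One simple way to make this trade-off concrete is to pick the largest model that fits a parameter budget. The sketch below uses parameter counts copied from this page's listing and treats parameter count as a crude stand-in for both quality and cost; that proxy is an assumption, and the `pick_model` helper and its budget argument are hypothetical. Real selection would also weigh measured latency, price, and task-specific evaluation.

```python
# Parameter counts in millions, as reported in this page's listing.
CATALOG = {
    "openai-community/gpt2": 137.0,
    "LiquidAI/LFM2.5-230M": 229.7,
    "google/gemma-3-270m": 268.1,
    "nvidia/Qwen3.6-27B-NVFP4": 18164.6,
    "dphn/dolphin-2.9.1-yi-1.5-34b": 34388.9,
}


def pick_model(catalog, max_params_m):
    """Return the largest model whose parameter count fits the budget, or None."""
    fitting = {name: params for name, params in catalog.items() if params <= max_params_m}
    if not fitting:
        return None
    # Largest parameter count wins: a crude proxy for output quality.
    return max(fitting, key=fitting.get)


print(pick_model(CATALOG, max_params_m=300))     # google/gemma-3-270m
print(pick_model(CATALOG, max_params_m=20_000))  # nvidia/Qwen3.6-27B-NVFP4
print(pick_model(CATALOG, max_params_m=100))     # None
```

Returning `None` when nothing fits keeps the "no acceptable model" case explicit instead of silently falling back to an oversized model.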

Calling a hosted API avoids the operational burden of managing inference infrastructure, enabling teams to focus on integration and product logic rather than GPU provisioning and model serving.

| model | params | downloads/mo | price | status |
| --- | --- | --- | --- | --- |
| facebook/opt-125m | - | 13.7M | at launch | coming soon |
| openai-community/gpt2 | 137M | 13.3M | at launch | coming soon |
| trl-internal-testing/tiny-Qwen2ForCausalLM-2.5 | 2.4M | 9.2M | at launch | coming soon |
| antirez/deepseek-v4-gguf | - | 6.4M | at launch | coming soon |
| nvidia/Qwen3.6-35B-A3B-NVFP4 | 18683.9M | 6.2M | at launch | coming soon |
| google/gemma-3-270m | 268.1M | 5.1M | at launch | coming soon |
| dphn/dolphin-2.9.1-yi-1.5-34b | 34388.9M | 4.6M | at launch | coming soon |
| yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF | - | 628.2K | at launch | coming soon |
| yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF | - | 329.4K | at launch | coming soon |
| deepreinforce-ai/Ornith-1.0-35B-GGUF | - | 322.8K | at launch | coming soon |
| deepreinforce-ai/Ornith-1.0-9B-GGUF | - | 287.9K | at launch | coming soon |
| unsloth/GLM-5.2-GGUF | - | 264.6K | at launch | coming soon |
| unsloth/Qwen-AgentWorld-35B-A3B-GGUF | - | 259.8K | at launch | coming soon |
| nvidia/GLM-5.2-NVFP4 | 380989.1M | 190K | at launch | coming soon |
| nvidia/Qwen3.6-27B-NVFP4 | 18164.6M | 94.5K | at launch | coming soon |
| deepreinforce-ai/Ornith-1.0-397B-FP8 | 396991.8M | 65K | at launch | coming soon |
| deepreinforce-ai/Ornith-1.0-9B | 1.5M | 64.1K | at launch | coming soon |
| huihui-ai/Huihui-Qwythos-9B-Claude-Mythos-5-1M-abliterated-GGUF | - | 49.8K | at launch | coming soon |
| Qwen/Qwen-AgentWorld-35B-A3B | 34660.6M | 45.5K | at launch | coming soon |
| deepseek-ai/DeepSeek-V4-Flash-DSpark | 165265.5M | 32.7K | at launch | coming soon |
| LiquidAI/LFM2.5-230M | 229.7M | 29.6K | at launch | coming soon |
| SC117/Ornith-1.0-35B-MTP-APEX-GGUF | - | 19.3K | at launch | coming soon |
| protoLabsAI/Ornith-1.0-9B-MTP-GGUF | - | 16.8K | at launch | coming soon |
| nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16 | 63185.9M | 10.3K | at launch | coming soon |
| deepseek-ai/DeepSeek-V4-Pro-DSpark | 889484.9M | 9.4K | at launch | coming soon |
| AEON-7/Ornith-1.0-35B-AEON-Ultimate-Uncensored-NVFP4 | 20959.3M | 8.7K | at launch | coming soon |
| deepreinforce-ai/Ornith-1.0-397B | 396802.4M | 8.1K | at launch | coming soon |
| huihui-ai/Huihui-GLM-5.2-abliterated-GGUF | - | 3.7K | at launch | coming soon |
| InternScience/Agents-A1 | 35107.2M | 3.5K | at launch | coming soon |