
Hosted text generation models

29 models · 0 live as APIs · benchmarked & compared

Text generation models produce coherent, contextually relevant text from a given prompt. They are used for drafting, summarization, translation, and dialogue: for example, generating customer support replies, powering code assistants, or creating product descriptions at scale. In production, these models are typically integrated via REST APIs that accept a prompt and return generated text, often with controls such as temperature, top-p, and maximum output length to tune output characteristics.
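As a rough illustration of that request shape, here is a minimal Python sketch. The endpoint URL, the field names (`model`, `prompt`, `temperature`, `top_p`, `max_tokens`), and the bearer-token header are assumptions made for the example, not a documented API; check the provider's docs for the real names. It validates the sampling controls locally, then posts the payload as JSON.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL from your provider's documentation.
API_URL = "https://api.example.com/v1/generate"


def build_payload(model, prompt, temperature=0.7, top_p=0.9, max_tokens=128):
    """Validate sampling controls and return a JSON-serializable request body.

    The field names are assumed for illustration; real APIs may differ.
    """
    if not prompt:
        raise ValueError("prompt must be non-empty")
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if max_tokens < 1:
        raise ValueError("max_tokens must be >= 1")
    return {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }


def generate(payload, api_key, url=API_URL, timeout=30):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer authentication is an assumption; some providers use other schemes.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Build a request body; no network call happens until generate() is invoked.
payload = build_payload(
    "google/gemma-3-270m",
    "Write a one-line product description for a steel water bottle.",
    temperature=0.2,
    max_tokens=64,
)
```

A lower temperature, as in the example payload, generally makes output more deterministic, which tends to suit templated text such as product descriptions. Calling `generate(payload, api_key)` would send the request once a real endpoint and key are in place.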

Choosing among text generation models involves a trade-off among size, quality, and speed. Smaller models (e.g., facebook/opt-125m, google/gemma-3-270m) offer low latency and modest hardware requirements but tend to produce less coherent or creative output on complex tasks. Larger models (e.g., nvidia/Qwen3.6-35B-A3B-NVFP4, dphn/dolphin-2.9.1-yi-1.5-34b) generally produce more nuanced and accurate responses but require more compute and incur higher per-token latency. For most workflows, the appropriate model is the one that delivers acceptable generation quality within the application's throughput and cost constraints.
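One way to make this trade-off concrete is to shortlist models by a parameter budget. The sketch below uses a few parameter counts from the comparison table and picks the largest model that fits. It treats parameter count as a crude proxy for both quality and cost, which is an assumption of the example. The `Candidate` type, the `pick_model` helper, and the budget values are hypothetical. In practice you would still need to measure quality and latency on your own workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    params_millions: float  # parameter count in millions, as listed in the table


# A subset of the models in the comparison table that report a parameter count.
CANDIDATES = [
    Candidate("openai-community/gpt2", 137.0),
    Candidate("LiquidAI/LFM2.5-230M", 229.7),
    Candidate("google/gemma-3-270m", 268.1),
    Candidate("dphn/dolphin-2.9.1-yi-1.5-34b", 34388.9),
    Candidate("Qwen/Qwen-AgentWorld-35B-A3B", 34660.6),
]


def pick_model(candidates, max_params_millions):
    """Return the largest candidate within the parameter budget, or None.

    Assumes (crudely) that more parameters means higher quality and higher cost.
    """
    affordable = [c for c in candidates if c.params_millions <= max_params_millions]
    return max(affordable, key=lambda c: c.params_millions, default=None)


# Hypothetical budgets: a latency-sensitive service and a quality-focused batch job.
small = pick_model(CANDIDATES, max_params_millions=250)
large = pick_model(CANDIDATES, max_params_millions=40000)
print(small.name if small else "no model fits")
print(large.name if large else "no model fits")
```

A budget of 250M selects LiquidAI/LFM2.5-230M, and a budget of 40,000M selects Qwen/Qwen-AgentWorld-35B-A3B. If nothing fits the budget, the helper returns `None` rather than falling back to a default.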

Calling a hosted API avoids the operational burden of managing inference infrastructure, enabling teams to focus on integration and product logic rather than GPU provisioning and model serving.

compare

model	params	downloads/mo	price	status
facebook/opt-125m	-	13.7M	at launch	coming soon
openai-community/gpt2	137M	13.3M	at launch	coming soon
trl-internal-testing/tiny-Qwen2ForCausalLM-2.5	2.4M	9.2M	at launch	coming soon
antirez/deepseek-v4-gguf	-	6.4M	at launch	coming soon
nvidia/Qwen3.6-35B-A3B-NVFP4	18683.9M	6.2M	at launch	coming soon
google/gemma-3-270m	268.1M	5.1M	at launch	coming soon
dphn/dolphin-2.9.1-yi-1.5-34b	34388.9M	4.6M	at launch	coming soon
yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF	-	628.2K	at launch	coming soon
yuxinlu1/gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF	-	329.4K	at launch	coming soon
deepreinforce-ai/Ornith-1.0-35B-GGUF	-	322.8K	at launch	coming soon
deepreinforce-ai/Ornith-1.0-9B-GGUF	-	287.9K	at launch	coming soon
unsloth/GLM-5.2-GGUF	-	264.6K	at launch	coming soon
unsloth/Qwen-AgentWorld-35B-A3B-GGUF	-	259.8K	at launch	coming soon
nvidia/GLM-5.2-NVFP4	380989.1M	190K	at launch	coming soon
nvidia/Qwen3.6-27B-NVFP4	18164.6M	94.5K	at launch	coming soon
deepreinforce-ai/Ornith-1.0-397B-FP8	396991.8M	65K	at launch	coming soon
deepreinforce-ai/Ornith-1.0-9B	1.5M	64.1K	at launch	coming soon
huihui-ai/Huihui-Qwythos-9B-Claude-Mythos-5-1M-abliterated-GGUF	-	49.8K	at launch	coming soon
Qwen/Qwen-AgentWorld-35B-A3B	34660.6M	45.5K	at launch	coming soon
deepseek-ai/DeepSeek-V4-Flash-DSpark	165265.5M	32.7K	at launch	coming soon
LiquidAI/LFM2.5-230M	229.7M	29.6K	at launch	coming soon
SC117/Ornith-1.0-35B-MTP-APEX-GGUF	-	19.3K	at launch	coming soon
protoLabsAI/Ornith-1.0-9B-MTP-GGUF	-	16.8K	at launch	coming soon
nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16	63185.9M	10.3K	at launch	coming soon
deepseek-ai/DeepSeek-V4-Pro-DSpark	889484.9M	9.4K	at launch	coming soon
AEON-7/Ornith-1.0-35B-AEON-Ultimate-Uncensored-NVFP4	20959.3M	8.7K	at launch	coming soon
deepreinforce-ai/Ornith-1.0-397B	396802.4M	8.1K	at launch	coming soon
huihui-ai/Huihui-GLM-5.2-abliterated-GGUF	-	3.7K	at launch	coming soon
InternScience/Agents-A1	35107.2M	3.5K	at launch	coming soon
