yoloe
jameslahm/yoloe
A popular open object detection model, with 4.3K downloads a month. Gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.
status: coming soon
API providers: 0
downloads / mo: 4.3K
license: agpl-3.0
about this model
YOLOE is a unified object detection and segmentation model that supports text prompts, visual prompts, and prompt-free inference within a single architecture, optimized for real-time performance.
Key capabilities
- Text prompts – Re-parameterizable Region-Text Alignment (RepRTA) refines textual embeddings with zero inference overhead.
- Visual prompts – Semantic-Activated Visual Prompt Encoder (SAVPE) improves visual embedding and accuracy with minimal complexity.
- Prompt-free mode – Lazy Region-Prompt Contrast (LRPC) uses a built-in vocabulary to identify all objects without a language model.
Benchmark highlights
- On LVIS minival, YOLOE-v8-S surpasses YOLO-Worldv2-S by 3.5 AP with 3× less training cost and a 1.4× inference speedup.
- Transferring to COCO, YOLOE-v8-L gains 0.6 box AP and 0.4 mask AP over closed-set YOLOv8-L with nearly 4× less training time.
- After re-parameterization, YOLOE becomes a standard YOLO model with no added inference or transfer overhead.
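The re-parameterization point can be illustrated with a toy example. The sketch below is not YOLOE's actual code: it assumes the final layer of the embedding head acts as a plain linear map `K` and that the prompt embeddings `P` are fixed once the prompts are chosen. All names, shapes, and values are hypothetical. Under those assumptions, the product `P K` can be computed once offline, leaving a single ordinary linear classifier at inference time.

```python
# Simplified sketch: folding fixed prompt embeddings into a linear classifier.
# All shapes and values are made up for illustration; this is not YOLOE's code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Multiply a matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

# K: last layer of a hypothetical embedding head (3-dim features -> 2-dim embeddings).
K = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]

# P: prompt embeddings for 2 classes (one row per class), fixed after prompt selection.
P = [[0.5, 1.0],
     [2.0, -1.0]]

x = [1.0, 2.0, 3.0]  # one region feature vector

# Two-step path: embed the region, then score it against every prompt embedding.
logits_two_step = matvec(P, matvec(K, x))

# Folded path: precompute K' = P K offline, then apply one plain linear layer.
K_folded = matmul(P, K)
logits_folded = matvec(K_folded, x)

print(logits_two_step, logits_folded)
```

Both paths produce the same class scores, so in this simplified setting the prompt-handling step adds no work at inference time.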
Zero-shot detection on LVIS (text / visual prompts)
| Model | Size | Params | AP | AP_r | AP_c | AP_f |
|---|---|---|---|---|---|---|
| YOLOE-v8-S | 640 | 12M / 13M | 27.9 / 26.2 | 22.3 / 21.3 | 27.8 / 27.7 | 29.0 / 25.7 |
| YOLOE-v8-M | 640 | 27M / 30M | 32.6 / 31.0 | 26.9 / 27.0 | 31.9 / 31.7 | 34.4 / 31.1 |
| YOLOE-v8-L | 640 | 45M / 50M | 35.9 / 34.2 | 33.2 / 33.2 | 34.8 / 34.6 | 37.3 / 34.1 |
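To make the text-versus-visual comparison in the table easier to read, the short sketch below computes the gap in overall AP (the first AP column) for each model size. The numbers are copied from the table; the dictionary layout and the `prompt_gap` helper are illustrative only.

```python
# Overall LVIS AP copied from the table above: (text prompt, visual prompt).
lvis_ap = {
    "YOLOE-v8-S": (27.9, 26.2),
    "YOLOE-v8-M": (32.6, 31.0),
    "YOLOE-v8-L": (35.9, 34.2),
}

def prompt_gap(scores):
    """Return text-prompt AP minus visual-prompt AP, rounded to one decimal."""
    text_ap, visual_ap = scores
    return round(text_ap - visual_ap, 1)

gaps = {model: prompt_gap(scores) for model, scores in lvis_ap.items()}
for model, gap in gaps.items():
    print(f"{model}: text prompts lead by {gap} AP")
```

On these figures, text prompts score between 1.6 and 1.7 AP higher than visual prompts at every model size.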
YOLOE is best suited for applications requiring real-time detection and segmentation across diverse, open-ended object categories without retraining. Gigarouter plans to host it as a managed, OpenAI-compatible API.