Hosted object detection models
36 models · 0 live as APIs · benchmarked & compared
Object detection models locate and classify objects within images or document pages. Typical uses include finding tables in scanned documents (microsoft/table-transformer-detection, TahaDouaji/detr-doc-table-detection), recovering the row and column structure of a detected table (microsoft/table-transformer-structure-recognition, microsoft/table-transformer-structure-recognition-v1.1-all), and detecting everyday objects in general scenes with a compact model (hustvl/yolos-small). Other models target document layout parsing (PaddlePaddle/PP-DocLayoutV3_safetensors) or high-accuracy detection across diverse categories (PekingU/rtdetr_v2_r50vd, PekingU/rtdetr_r50vd_coco_o365).
In production, these models are often deployed as part of a document processing pipeline, a real-time video analysis system, or a batch annotation service. They are typically called via an API that accepts an image or a document page and returns bounding boxes with class labels and confidence scores. Integration involves preprocessing inputs, handling model inference, and post-processing outputs for downstream tasks such as OCR, data extraction, or automation.
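The post-processing step can be sketched as follows. This is a minimal sketch, not the hosted API's actual contract: the response shape (a list of objects with `label`, `score`, and a `box` holding `xmin`, `ymin`, `xmax`, `ymax`) is an assumption modeled on the output of the Hugging Face `transformers` object-detection pipeline, and `Detection`, `parse_detections`, and `filter_detections` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in pixels


def parse_detections(raw: list[dict]) -> list[Detection]:
    """Convert an ASSUMED response shape into typed records."""
    parsed = []
    for item in raw:
        b = item["box"]
        parsed.append(
            Detection(
                label=item["label"],
                score=float(item["score"]),
                box=(b["xmin"], b["ymin"], b["xmax"], b["ymax"]),
            )
        )
    return parsed


def filter_detections(
    detections: list[Detection],
    min_score: float = 0.5,
    labels: set[str] | None = None,
) -> list[Detection]:
    """Keep confident detections, optionally restricted to some labels.

    Results are sorted top-to-bottom, then left-to-right, which is a
    convenient order for handing regions to OCR or data extraction.
    """
    kept = [
        d for d in detections
        if d.score >= min_score and (labels is None or d.label in labels)
    ]
    return sorted(kept, key=lambda d: (d.box[1], d.box[0]))


if __name__ == "__main__":
    # Hand-written sample payload, not real model output.
    sample = [
        {"label": "table", "score": 0.97,
         "box": {"xmin": 10, "ymin": 20, "xmax": 200, "ymax": 300}},
        {"label": "table", "score": 0.30,
         "box": {"xmin": 5, "ymin": 400, "xmax": 90, "ymax": 450}},
        {"label": "figure", "score": 0.88,
         "box": {"xmin": 220, "ymin": 20, "xmax": 400, "ymax": 180}},
    ]
    for det in filter_detections(parse_detections(sample), labels={"table"}):
        print(det)
```

The confidence threshold of 0.5 is only a starting point; the right value depends on how costly false positives are for the downstream task.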
Choosing among object detection models involves a trade-off between model size, inference speed, and detection quality. Larger backbones (e.g., r50vd-based RT-DETR models) tend to achieve higher accuracy but require more compute and add latency. Smaller models such as yolos-small trade some accuracy for faster inference and a lower memory footprint. Domain-specific models (like the table transformers) are purpose-built for particular use cases and generally outperform general-purpose models on their target task. The right choice depends on the throughput you need, your hardware budget, and the precision requirements of your application.
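One way to ground the speed side of this trade-off is to time each candidate on your own images. The harness below is a sketch under stated assumptions: `detect` stands for any callable you supply that runs one inference (a local model or an API client), and `measure_latency` is a hypothetical helper, not part of any library.

```python
import statistics
import time
from typing import Any, Callable, Iterable


def measure_latency(
    detect: Callable[[Any], Any],
    inputs: Iterable[Any],
    warmup: int = 2,
) -> dict:
    """Time `detect` on each input and summarize per-call latency."""
    items = list(inputs)
    if not items:
        raise ValueError("need at least one input to measure")

    # Warm-up calls are not timed (model loading, caches, connection setup).
    for item in items[:warmup]:
        detect(item)

    samples_ms = []
    for item in items:
        start = time.perf_counter()
        detect(item)
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    samples_ms.sort()
    # Nearest-rank style index for the 95th percentile.
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    total_s = sum(samples_ms) / 1000.0
    return {
        "n": len(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "images_per_s": len(samples_ms) / total_s if total_s > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real inference call.
    def fake_detect(image):
        return sum(range(10_000))

    print(measure_latency(fake_detect, range(50)))
```

Running the same image set through each shortlisted model gives comparable median and tail latencies; accuracy still has to be checked separately against labeled samples from your own domain.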
For most call volumes, a hosted API removes the overhead of provisioning GPUs, managing inference frameworks, and scaling for variable demand, which usually makes it simpler and often more cost-effective than self-hosting.
compare
| model | params | downloads/mo | price | status |
|---|---|---|---|---|
| microsoft/table-transformer-structure-recognition | 28.8M | 1.8M | ~$0.047 / 1k images | coming soon |
| microsoft/table-transformer-detection | 28.8M | 1.5M | ~$0.047 / 1k images | coming soon |
| hustvl/yolos-small | 30.7M | 713.6K | ~$0.047 / 1k images | coming soon |
| PaddlePaddle/PP-DocLayoutV3_safetensors | 33.3M | 341.1K | ~$0.047 / 1k images | coming soon |
| PekingU/rtdetr_v2_r50vd | 43M | 309.8K | ~$0.047 / 1k images | coming soon |
| PekingU/rtdetr_r50vd_coco_o365 | 43M | 254.5K | ~$0.047 / 1k images | coming soon |
| microsoft/table-transformer-structure-recognition-v1.1-all | 28.8M | 239.5K | ~$0.047 / 1k images | coming soon |
| TahaDouaji/detr-doc-table-detection | 41.6M | 208.1K | ~$0.047 / 1k images | coming soon |
| keremberke/yolov8m-table-extraction | - | 176.4K | at launch | coming soon |
| hustvl/yolos-tiny | 6.5M | 100.9K | ~$0.047 / 1k images | coming soon |
| PekingU/rtdetr_r101vd_coco_o365 | 76.8M | 99.4K | ~$0.047 / 1k images | coming soon |
| PekingU/rtdetr_v2_r18vd | 20.2M | 97.1K | ~$0.047 / 1k images | coming soon |
| Anzhc/Anzhcs_YOLOs | - | 75.5K | at launch | coming soon |
| PekingU/rtdetr_r50vd | 43M | 63.7K | ~$0.047 / 1k images | coming soon |
| foduucom/stockmarket-pattern-detection-yolov8 | - | 42.1K | at launch | coming soon |
| morsetechlab/yolov11-license-plate-detection | - | 26.5K | at launch | coming soon |
| keremberke/yolov5m-license-plate | - | 23.8K | at launch | coming soon |
| valentinafevu/yolos-fashionpedia | - | 21.4K | at launch | coming soon |
| microsoft/conditional-detr-resnet-50 | 43.5M | 18.5K | ~$0.047 / 1k images | coming soon |
| PekingU/rtdetr_r18vd_coco_o365 | 20.2M | 17.3K | ~$0.047 / 1k images | coming soon |
| Ultralytics/YOLOv8 | - | 10.4K | at launch | coming soon |
| iitolstykh/YOLO-Face-Person-Detector | - | 10.2K | at launch | coming soon |
| Ultralytics/YOLO11 | - | 9.8K | at launch | coming soon |
| PekingU/rtdetr_r18vd | 20.2M | 9K | ~$0.047 / 1k images | coming soon |
| Ultralytics/YOLO26 | - | 8.5K | at launch | coming soon |
| SenseTime/deformable-detr | 40.2M | 8.1K | ~$0.047 / 1k images | coming soon |
| Fuyucchi/yolov8_animeface | - | 7.8K | at launch | coming soon |
| facebook/detr-resnet-101-dc5 | 60.7M | 7.1K | ~$0.047 / 1k images | coming soon |
| PekingU/rtdetr_v2_r101vd | 76.8M | 6.9K | ~$0.047 / 1k images | coming soon |
| Xenova/detr-resnet-50 | - | 6.5K | at launch | coming soon |
| tech4humans/conditional-detr-50-signature-detector | 43.5M | 6.2K | ~$0.047 / 1k images | coming soon |
| mosesb/best-comic-panel-detection | - | 4.6K | at launch | coming soon |
| mudler/locate-anything.cpp-gguf | - | 4.6K | at launch | coming soon |
| ustc-community/dfine-small-coco | 10.4M | 4.5K | ~$0.047 / 1k images | coming soon |
| jameslahm/yoloe | - | 4.3K | at launch | coming soon |
| Armaggheddon/yolo11-document-layout | - | 4.2K | at launch | coming soon |
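
The listed price of roughly $0.047 per 1,000 images can be turned into a monthly estimate with simple arithmetic. The sketch below assumes that price holds at launch (the table marks it as approximate); `monthly_cost_usd` and `break_even_images` are hypothetical helpers, and the fixed monthly cost passed to `break_even_images` is a figure you supply for your own self-hosted setup.

```python
PRICE_PER_1K_IMAGES_USD = 0.047  # approximate listed price; may differ at launch


def monthly_cost_usd(
    images_per_month: int,
    price_per_1k: float = PRICE_PER_1K_IMAGES_USD,
) -> float:
    """Estimated hosted-API cost for a given monthly image volume."""
    if images_per_month < 0:
        raise ValueError("images_per_month must be non-negative")
    return images_per_month / 1000 * price_per_1k


def break_even_images(
    fixed_monthly_cost_usd: float,
    price_per_1k: float = PRICE_PER_1K_IMAGES_USD,
) -> float:
    """Monthly volume at which API spend equals a fixed self-hosting cost.

    The fixed cost is an input you provide; no self-hosting price is
    assumed here.
    """
    if price_per_1k <= 0:
        raise ValueError("price_per_1k must be positive")
    return fixed_monthly_cost_usd / price_per_1k * 1000


if __name__ == "__main__":
    # 250,000 images at ~$0.047 per 1,000 images
    print(f"${monthly_cost_usd(250_000):.2f}")  # $11.75
```

Below the break-even volume the per-image price is the cheaper option on raw spend; above it, self-hosting may win, although engineering and operations time are not captured by this calculation.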