Hosted object detection models

36 models · 0 live as APIs · benchmarked & compared

Object detection models locate and classify objects within images or document pages. They solve problems such as detecting table regions in scanned documents (microsoft/table-transformer-detection, TahaDouaji/detr-doc-table-detection), recognizing the row and column structure inside a detected table (microsoft/table-transformer-structure-recognition, microsoft/table-transformer-structure-recognition-v1.1-all), and detecting everyday objects in general scenes with a compact model (hustvl/yolos-small). Other models target document layout parsing (PaddlePaddle/PP-DocLayoutV3_safetensors) or higher-accuracy detection across diverse categories (PekingU/rtdetr_v2_r50vd, PekingU/rtdetr_r50vd_coco_o365).

In production, these models are often deployed as part of a document processing pipeline, a real-time video analysis system, or a batch annotation service. They are typically called via an API that accepts an image or a document page and returns bounding boxes with class labels and confidence scores. Integration involves preprocessing inputs, handling model inference, and post-processing outputs for downstream tasks such as OCR, data extraction, or automation.
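The post-processing step described above can be sketched in a few lines of Python. This is a minimal illustration, not the hosted API's actual contract: the response shape (a list of items with `label`, `score`, and a `box` holding `xmin`/`ymin`/`xmax`/`ymax` in pixels) is an assumption, and the names `Detection`, `parse_detections`, `keep_confident`, and `clamp_box` are hypothetical helpers introduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    score: float  # confidence in [0, 1]
    box: tuple    # (xmin, ymin, xmax, ymax) in pixels


def parse_detections(payload):
    """Convert JSON-like dicts (assumed response shape) into Detection objects."""
    return [
        Detection(
            label=item["label"],
            score=float(item["score"]),
            box=(
                float(item["box"]["xmin"]),
                float(item["box"]["ymin"]),
                float(item["box"]["xmax"]),
                float(item["box"]["ymax"]),
            ),
        )
        for item in payload
    ]


def keep_confident(detections, min_score=0.5, labels=None):
    """Drop low-confidence detections and, optionally, unwanted labels."""
    return [
        d for d in detections
        if d.score >= min_score and (labels is None or d.label in labels)
    ]


def clamp_box(box, width, height):
    """Expand a box to whole pixels and clip it to the image bounds."""
    xmin, ymin, xmax, ymax = box
    return (
        max(0, math.floor(xmin)),
        max(0, math.floor(ymin)),
        min(width, math.ceil(xmax)),
        min(height, math.ceil(ymax)),
    )


# Made-up example response for a 600x800 page image.
payload = [
    {"label": "table", "score": 0.97,
     "box": {"xmin": -3.2, "ymin": 40.0, "xmax": 610.5, "ymax": 380.1}},
    {"label": "table", "score": 0.21,
     "box": {"xmin": 10.0, "ymin": 500.0, "xmax": 200.0, "ymax": 560.0}},
    {"label": "figure", "score": 0.88,
     "box": {"xmin": 50.0, "ymin": 600.0, "xmax": 300.0, "ymax": 780.0}},
]

tables = keep_confident(parse_detections(payload), min_score=0.5, labels={"table"})
crops = [clamp_box(d.box, width=600, height=800) for d in tables]
print(crops)  # [(0, 40, 600, 381)]
```

The clamped boxes can then be handed to a downstream step such as OCR on each cropped region. The 0.5 threshold is only a starting point; the right value depends on how costly false positives are for the downstream task.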

Choosing among object detection models involves a trade-off between model size, inference speed, and detection quality. Models with larger backbones (e.g., the r50vd- and r101vd-based RT-DETR models) tend to achieve higher accuracy but need more compute and add latency. Smaller models such as yolos-small trade some accuracy for faster inference and a lower memory footprint. Domain-specific models (like the table transformers) are purpose-built for particular use cases and generally outperform general-purpose models on their target task. The right choice depends on the throughput, hardware budget, and precision your application requires.
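As a rough illustration of the size trade-off, the sketch below picks the largest model that fits a parameter budget, using parameter counts copied from the comparison table on this page. Parameter count is only a coarse proxy for latency and accuracy, since architectures differ, so treat the result as a shortlist to benchmark rather than a recommendation. The function name `largest_within_budget` and the 35M budget are hypothetical.

```python
# (model, parameters in millions), copied from the comparison table.
MODELS = [
    ("hustvl/yolos-tiny", 6.5),
    ("ustc-community/dfine-small-coco", 10.4),
    ("PekingU/rtdetr_v2_r18vd", 20.2),
    ("hustvl/yolos-small", 30.7),
    ("PekingU/rtdetr_v2_r50vd", 43.0),
    ("PekingU/rtdetr_v2_r101vd", 76.8),
]


def largest_within_budget(models, max_params_m):
    """Return the (name, params) pair with the most parameters under the budget,
    or None if no model fits."""
    candidates = [m for m in models if m[1] <= max_params_m]
    return max(candidates, key=lambda m: m[1]) if candidates else None


choice = largest_within_budget(MODELS, max_params_m=35.0)
print(choice)  # ('hustvl/yolos-small', 30.7)
```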

For most call volumes, a hosted API removes the overhead of provisioning GPUs, managing inference frameworks, and scaling for variable demand, which makes it a simpler and more cost-effective option than self-hosting.

compare

model	params	downloads/mo	price	status
microsoft/table-transformer-structure-recognition	28.8M	1.8M	~$0.047 / 1k images	coming soon
microsoft/table-transformer-detection	28.8M	1.5M	~$0.047 / 1k images	coming soon
hustvl/yolos-small	30.7M	713.6K	~$0.047 / 1k images	coming soon
PaddlePaddle/PP-DocLayoutV3_safetensors	33.3M	341.1K	~$0.047 / 1k images	coming soon
PekingU/rtdetr_v2_r50vd	43M	309.8K	~$0.047 / 1k images	coming soon
PekingU/rtdetr_r50vd_coco_o365	43M	254.5K	~$0.047 / 1k images	coming soon
microsoft/table-transformer-structure-recognition-v1.1-all	28.8M	239.5K	~$0.047 / 1k images	coming soon
TahaDouaji/detr-doc-table-detection	41.6M	208.1K	~$0.047 / 1k images	coming soon
keremberke/yolov8m-table-extraction	-	176.4K	at launch	coming soon
hustvl/yolos-tiny	6.5M	100.9K	~$0.047 / 1k images	coming soon
PekingU/rtdetr_r101vd_coco_o365	76.8M	99.4K	~$0.047 / 1k images	coming soon
PekingU/rtdetr_v2_r18vd	20.2M	97.1K	~$0.047 / 1k images	coming soon
Anzhc/Anzhcs_YOLOs	-	75.5K	at launch	coming soon
PekingU/rtdetr_r50vd	43M	63.7K	~$0.047 / 1k images	coming soon
foduucom/stockmarket-pattern-detection-yolov8	-	42.1K	at launch	coming soon
morsetechlab/yolov11-license-plate-detection	-	26.5K	at launch	coming soon
keremberke/yolov5m-license-plate	-	23.8K	at launch	coming soon
valentinafevu/yolos-fashionpedia	-	21.4K	at launch	coming soon
microsoft/conditional-detr-resnet-50	43.5M	18.5K	~$0.047 / 1k images	coming soon
PekingU/rtdetr_r18vd_coco_o365	20.2M	17.3K	~$0.047 / 1k images	coming soon
Ultralytics/YOLOv8	-	10.4K	at launch	coming soon
iitolstykh/YOLO-Face-Person-Detector	-	10.2K	at launch	coming soon
Ultralytics/YOLO11	-	9.8K	at launch	coming soon
PekingU/rtdetr_r18vd	20.2M	9K	~$0.047 / 1k images	coming soon
Ultralytics/YOLO26	-	8.5K	at launch	coming soon
SenseTime/deformable-detr	40.2M	8.1K	~$0.047 / 1k images	coming soon
Fuyucchi/yolov8_animeface	-	7.8K	at launch	coming soon
facebook/detr-resnet-101-dc5	60.7M	7.1K	~$0.047 / 1k images	coming soon
PekingU/rtdetr_v2_r101vd	76.8M	6.9K	~$0.047 / 1k images	coming soon
Xenova/detr-resnet-50	-	6.5K	at launch	coming soon
tech4humans/conditional-detr-50-signature-detector	43.5M	6.2K	~$0.047 / 1k images	coming soon
mosesb/best-comic-panel-detection	-	4.6K	at launch	coming soon
mudler/locate-anything.cpp-gguf	-	4.6K	at launch	coming soon
ustc-community/dfine-small-coco	10.4M	4.5K	~$0.047 / 1k images	coming soon
jameslahm/yoloe	-	4.3K	at launch	coming soon
Armaggheddon/yolo11-document-layout	-	4.2K	at launch	coming soon
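
To turn the listed price into a budget figure, multiply the number of images by the per-1k rate. The sketch below assumes the approximate ~$0.047 / 1k images shown in the table applies uniformly, with no volume tiers or minimum charges; the function name `estimate_cost` and the example volume are hypothetical.

```python
def estimate_cost(num_images, price_per_1k=0.047):
    """Estimated cost in USD at a flat per-1k-image rate (approximate list price)."""
    return num_images / 1000 * price_per_1k


# Example: one million images in a month at ~$0.047 per 1k images.
monthly = estimate_cost(1_000_000)
print(f"${monthly:.2f}")  # $47.00
```

Models priced "at launch" have no listed rate yet, so this estimate applies only to rows with a stated price.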