marqo-fashionSigLIP

Marqo/marqo-fashionSigLIP

A popular open zero-shot image model with 642.8K downloads a month. gigarouter is benchmarking it and plans to host it as an OpenAI-compatible API.

est. price: ~$0.094 / 1k images (estimated, set at launch)
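
As a rough budgeting aid, the estimated rate scales linearly with image count. A minimal sketch, assuming the ~$0.094 per 1k images figure holds at launch (it is an estimate and may change); `estimate_cost` is a hypothetical helper, not part of any gigarouter SDK:

```python
# Estimated rate quoted on this page; final pricing is set at launch (assumption).
EST_PRICE_PER_1K_IMAGES = 0.094  # USD


def estimate_cost(num_images: int, price_per_1k: float = EST_PRICE_PER_1K_IMAGES) -> float:
    """Return the estimated cost in USD for processing num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Round to 4 decimals to hide floating-point noise in the product.
    return round(num_images / 1000 * price_per_1k, 4)


print(estimate_cost(250_000))  # 250 thousand-image units at the estimated rate
```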

downloads / mo: 642.8K

license: apache-2.0

about this model

Marqo-FashionSigLIP is a multimodal embedding model for fashion search and retrieval. It provides up to a 57% improvement in mean reciprocal rank (MRR) and recall over FashionCLIP. The model leverages Generalized Contrastive Learning (GCL), allowing it to be trained on text descriptions, categories, style, colors, materials, keywords, and fine details to deliver highly relevant results for fashion products. It is fine-tuned from ViT-B-16-SigLIP (webli).
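
The "up to 57%" figure can be sanity-checked against the text-to-image table in the Benchmark Results section. A minimal sketch, assuming the claim refers to relative improvement over FashionCLIP2.0 (the page does not say which metric or baseline variant it uses, so this is only one reading); `relative_improvement` is a hypothetical helper:

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Relative gain of `new` over `baseline`, as a percentage."""
    return (new - baseline) / baseline * 100


# Text-to-image scores copied from the benchmark table:
# (Marqo-FashionSigLIP, FashionCLIP2.0)
scores = {
    "Recall@1": (0.121, 0.077),
    "Recall@10": (0.340, 0.249),
    "MRR": (0.239, 0.165),
}

for metric, (marqo, fashionclip) in scores.items():
    print(f"{metric}: {relative_improvement(marqo, fashionclip):.1f}%")
```

Under this reading, the largest gain (about 57%) is on Recall@1, while the MRR gain in the same table is about 45%.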

This model is best suited for zero-shot image retrieval and classification tasks in the fashion domain, including text-to-image, category-to-product, and sub-category-to-product matching. A newer version, marqo-fashion-SigLip-2, is available with a further 78% improvement in MRR and recall.
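
Zero-shot retrieval with an embedding model reduces to ranking product embeddings by their similarity to a query embedding. A minimal sketch using cosine similarity, with tiny hand-made vectors standing in for real model outputs (the vectors, product ids, and the `rank_products` helper are illustrative assumptions, not the model's actual embeddings or API):

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_products(query_vec: list[float], product_vecs: dict[str, list[float]]) -> list[str]:
    """Return product ids sorted from most to least similar to the query."""
    return sorted(
        product_vecs,
        key=lambda pid: cosine(query_vec, product_vecs[pid]),
        reverse=True,
    )


# Toy 3-d embeddings; a real model produces much higher-dimensional vectors.
products = {
    "red-dress": [0.9, 0.1, 0.0],
    "blue-jeans": [0.1, 0.9, 0.1],
    "red-scarf": [0.7, 0.0, 0.6],
}
# Stands in for the embedding of a text query such as "red dress".
query = [1.0, 0.0, 0.1]

print(rank_products(query, products))
```

The same ranking step applies to category-to-product and sub-category-to-product matching; only the query text changes.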

Benchmark Results

Average evaluation results across six public multimodal fashion datasets (Atlas, DeepFashion In-shop, DeepFashion Multimodal, Fashion200k, KAGL, Polyvore) are shown below.

Text-To-Image (Averaged across 6 datasets)

Model	AvgRecall	Recall@1	Recall@10	MRR
Marqo-FashionSigLIP	0.231	0.121	0.340	0.239
FashionCLIP2.0	0.163	0.077	0.249	0.165
OpenFashionCLIP	0.132	0.060	0.204	0.135
ViT-B-16-laion2b_s34b_b88k	0.174	0.088	0.261	0.180
ViT-B-16-SigLIP-webli	0.212	0.111	0.314	0.214
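
For reference, Recall@k and MRR can be computed from ranked results as sketched below. This is a generic illustration on made-up rankings, not the evaluation code behind the table; the function names and toy data are assumptions. The AvgRecall column is consistent with the mean of Recall@1 and Recall@10 (for example, (0.121 + 0.340) / 2 is about 0.231), though the page does not state that definition.

```python
def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


# Toy queries: (ranked result ids, set of relevant ids)
queries = [
    (["a", "b", "c"], {"a"}),  # first relevant item at rank 1
    (["d", "e", "f"], {"e"}),  # first relevant item at rank 2
    (["g", "h", "i"], {"z"}),  # relevant item never retrieved
]

mean_recall_at_1 = sum(recall_at_k(r, rel, 1) for r, rel in queries) / len(queries)
mrr = sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)
print(mean_recall_at_1, mrr)
```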

Category-To-Product (Averaged across 5 datasets)

Model	AvgP	P@1	P@10	MRR
Marqo-FashionSigLIP	0.737	0.758	0.716	0.812
FashionCLIP2.0	0.684	0.681	0.686	0.741
OpenFashionCLIP	0.646	0.653	0.639	0.720
ViT-B-16-laion2b_s34b_b88k	0.662	0.673	0.652	0.743
ViT-B-16-SigLIP-webli	0.688	0.690	0.685	0.751
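
A category query usually has many relevant products, which is why this table reports precision rather than recall. A minimal sketch of P@k on toy data (the helper name and product ids are assumptions, not the benchmark's own code); the AvgP column is consistent with the mean of P@1 and P@10, for example (0.758 + 0.716) / 2 = 0.737:

```python
def precision_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k


# Toy example: products returned for the category query "dresses"
ranked = ["dress-1", "skirt-7", "dress-4", "dress-9", "coat-2"]
relevant = {"dress-1", "dress-4", "dress-9", "dress-12"}

p_at_1 = precision_at_k(ranked, relevant, 1)
p_at_5 = precision_at_k(ranked, relevant, 5)
print(p_at_1, p_at_5)
```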

Sub-Category-To-Product (Averaged across 4 datasets)

Model	AvgP	P@1	P@10	MRR
Marqo-FashionSigLIP	0.725	0.767	0.683	0.811
FashionCLIP2.0	0.657	0.676	0.638	0.733
OpenFashionCLIP	0.598	0.619	0.578	0.689
Vi

not yet live

We're benchmarking and onboarding marqo-fashionSigLIP as a hosted, OpenAI-compatible API.