Docs
Base URL: https://gigarouter.ai/v1. Authenticate every request with the header Authorization: Bearer <key>. Get a key (free credit included).
Rerank (curl)
    # Scores documents by relevance to the query; billed per document.
    curl https://gigarouter.ai/v1/rerank \
      -H "Authorization: Bearer $GR_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model":"cross-encoder/ms-marco-MiniLM-L6-v2",
           "query":"how do I reset my password",
           "documents":["Password reset steps...","Billing FAQ..."]}'
Rerank (Python)
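    # Plain requests, no SDK needed.
    import os
    import requests

    KEY = os.environ["GR_KEY"]  # assumes the key is exported as GR_KEY, as in the curl example
    query = "how do I reset my password"
    docs = ["Password reset steps...", "Billing FAQ..."]

    r = requests.post("https://gigarouter.ai/v1/rerank",
        headers={"Authorization": f"Bearer {KEY}"},
        json={"model": "cross-encoder/ms-marco-MiniLM-L6-v2",
              "query": query, "documents": docs})
    r.raise_for_status()  # fail loudly on HTTP errors such as a 503 cold start
    for hit in r.json()["results"]:
        print(hit["index"], hit["relevance_score"])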
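Each result carries an index and a relevance_score. The sketch below maps results back to the original texts, on the assumption that index is a position in the documents list that was sent. top_documents is a hypothetical helper, not part of the API, and the sample results are made up for illustration.

```python
def top_documents(docs, results, k=3):
    """Return up to k (score, document) pairs, highest score first.

    Assumes each result looks like {"index": int, "relevance_score": float},
    where index is a position in the docs list sent with the request.
    """
    ranked = sorted(results, key=lambda hit: hit["relevance_score"], reverse=True)
    return [(hit["relevance_score"], docs[hit["index"]]) for hit in ranked[:k]]


# Made-up response data, shaped like r.json()["results"].
docs = ["Password reset steps...", "Billing FAQ..."]
sample_results = [
    {"index": 1, "relevance_score": 0.02},
    {"index": 0, "relevance_score": 0.91},
]

for score, text in top_documents(docs, sample_results, k=1):
    print(f"{score:.2f}  {text}")
```

Sorting on the client keeps the helper correct whether or not the service already returns results in score order.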
Embeddings (OpenAI client)
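    # The OpenAI SDK works; just change base_url.
    import os
    from openai import OpenAI

    KEY = os.environ["GR_KEY"]  # assumes the key is exported as GR_KEY, as in the curl example
    client = OpenAI(base_url="https://gigarouter.ai/v1", api_key=KEY)
    v = client.embeddings.create(
        model="Qwen/Qwen3-Embedding-0.6B",
        input=["hello world"])
    print(v.data[0].embedding[:4])  # first four dimensions of the vector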
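A common next step with embedding vectors is to compare two of them. The sketch below computes cosine similarity in plain Python. cosine_similarity is an illustrative helper rather than anything the service provides, and the toy vectors stand in for real v.data[i].embedding lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings returned by the API.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(round(same_direction, 6), round(orthogonal, 6))
```

For larger batches a vectorized library would be faster; the pure-Python version is only meant to show the calculation.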
Notes
- Pricing: rerank is priced per 1k documents, embeddings per 1M tokens. Send the header X-Tier: batch to use the discounted flexible tier.
- Cold start: a rarely used model may be asleep. The first call then returns 503 with a Retry-After header while the model warms up (about 90s). Retry after that delay, or send Prefer: wait=30 to hold the request open instead. Popular models stay warm.
- Balance: prepaid. Check it at GET /account?key=.... Out of credit? Email us to top up (self-serve billing is coming).
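The cold-start behavior can be handled with a small retry loop. The sketch below is one possible approach, not an official client. call_with_warmup and retry_delay are hypothetical helpers. It assumes Retry-After arrives as a number of seconds, and the 5-second fallback and 90-second cap are illustrative choices (the cap mirrors the "about 90s" warm-up). The demo uses fake responses so it runs offline.

```python
import time
from types import SimpleNamespace


def retry_delay(headers, default=5.0, cap=90.0):
    """Seconds to wait before retrying, read from a Retry-After header.

    Only the delta-seconds form is handled; a missing or non-numeric
    value falls back to `default`. The result is clamped to [0, cap].
    """
    try:
        seconds = float(headers.get("Retry-After", ""))
    except (TypeError, ValueError):
        seconds = default
    return max(0.0, min(seconds, cap))


def call_with_warmup(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns something other than 503.

    `send` is any zero-argument callable returning an object with
    .status_code and .headers, for example a lambda wrapping requests.post.
    The last response is returned even if it is still a 503.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code != 503 or attempt == max_attempts:
            return response
        sleep(retry_delay(response.headers))


# Offline demo: one cold-start 503, then success. Waits are recorded, not slept.
fake = iter([
    SimpleNamespace(status_code=503, headers={"Retry-After": "2"}),
    SimpleNamespace(status_code=200, headers={}),
])
waits = []
final = call_with_warmup(lambda: next(fake), sleep=waits.append)
print(final.status_code, waits)
```

With requests, send could be a lambda that wraps the requests.post call from the rerank example. Adding "X-Tier": "batch" or "Prefer": "wait=30" to the headers dict of that call is how the other two options in these notes would be sent.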