AI Gateway
Key Features
Everything you need to integrate AI Gateway into your production systems.
Universal Provider Access
Route to OpenAI, Anthropic, Google Gemini, Mistral, Groq, Together AI, DeepSeek, and Ollama through a single OpenAI-compatible API. Switch models by changing one parameter — no code changes required.
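A minimal sketch of what "switch models by changing one parameter" means in practice: the request body is the same OpenAI-style shape for every provider, and only the `model` field changes. The model names below are illustrative.

```python
# The gateway accepts one OpenAI-compatible payload shape for all providers;
# switching providers means changing only the "model" field.
def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# Same call shape, different providers -- only the model name differs.
openai_req = chat_payload("gpt-4o-mini", "Summarize this ticket.")
claude_req = chat_payload("claude-3-5-sonnet", "Summarize this ticket.")
```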
Automatic Fallback & Retry
Define fallback chains per model. If the primary provider fails or times out, AI Gateway automatically retries with exponential backoff and falls back to the next provider in the chain. Zero downtime for your application.
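The retry-then-fallback behavior described above can be sketched in a few lines. This is an illustrative stand-in, not the gateway's actual implementation: each "provider" is a callable, failures are retried with exponential backoff, and the chain moves on when retries are exhausted.

```python
import time

def call_with_fallback(providers, request, max_retries=3, base_delay=0.5):
    """Try each provider in order; retry transient failures with
    exponential backoff before falling back to the next provider."""
    last_error = None
    for provider in providers:
        for attempt in range(max_retries):
            try:
                return provider(request)
            except Exception as err:  # in practice: timeouts, 5xx, rate limits
                last_error = err
                time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise last_error

# Toy chain: the primary always times out, the fallback succeeds.
def flaky(req):
    raise TimeoutError("primary down")

def stable(req):
    return {"provider": "fallback", "echo": req}

result = call_with_fallback([flaky, stable], "hello", max_retries=2, base_delay=0)
```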
Real-Time Cost Tracking
Every API call is tracked with precise token counts and cost calculations per provider. Set monthly budget limits per user or team. View spend breakdowns by provider, model, and time period.
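Per-call cost tracking comes down to token counts multiplied by per-token rates. A sketch, with illustrative prices (real rates vary by provider and change over time):

```python
# Illustrative (input, output) USD prices per 1M tokens -- not live rates.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call in USD, from token counts and per-million rates."""
    inp, out = PRICES[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

# 1,000 prompt tokens + 500 completion tokens on the cheap model.
cost = call_cost("gpt-4o-mini", 1_000, 500)
```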
Semantic Response Caching
Exact-match and semantic caching powered by pgvector. Similar prompts return cached responses instantly, cutting costs and latency. TTL-based expiry keeps cache fresh. Cache hit/miss reported in response headers.
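The semantic half of the cache can be sketched as a nearest-neighbor lookup over prompt embeddings: if a new prompt's embedding is close enough (by cosine similarity) to a cached one, the stored response is returned. This toy in-memory version stands in for the pgvector-backed store; the threshold value is illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

class SemanticCache:
    """Toy stand-in for a pgvector-backed semantic cache."""
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response)

    def get(self, embedding):
        for cached_emb, response in self.entries:
            if cosine(embedding, cached_emb) >= self.threshold:
                return response  # semantic cache hit
        return None  # cache miss

    def put(self, embedding, response):
        self.entries.append((embedding, response))

cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0, 0.0], "cached answer")
hit = cache.get([0.98, 0.1, 0.0])   # nearly the same direction -> hit
miss = cache.get([0.0, 1.0, 0.0])   # orthogonal -> miss
```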
API Endpoints
Production-ready REST API endpoints. All requests require a valid API key in the Authorization header.
/api/v1/gateway/chat/completions
OpenAI-compatible chat completions. Send a messages array with a model name and get back a response in standard OpenAI format. Supports streaming via SSE. Works with any OpenAI SDK or library.
/api/v1/gateway/embeddings
Generate embeddings using any supported provider. Returns vectors in OpenAI-compatible format. Supports text-embedding-3-small, text-embedding-3-large, and provider-specific embedding models.
/api/v1/gateway/models
List all available models across all configured providers with pricing information, context window sizes, and current availability status.
/api/v1/gateway/usage
Get your LLM spend breakdown by provider, model, and time period. Includes total cost, token counts, cache hit rates, and average latency per provider.
/api/v1/gateway/health
Check the health and availability of all configured LLM providers. Returns status, latency, and error rates for each provider.
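All five endpoints share the same auth scheme, so a request can be built with nothing but the standard library. A sketch that constructs (but does not send) an authenticated request; the API key is a placeholder:

```python
import json
import urllib.request

BASE = "https://bolor-intelligence.com/api/v1/gateway"

def gateway_request(path, api_key, payload=None):
    """Build an authenticated request to a gateway endpoint.
    GET when there is no body, POST with a JSON body otherwise."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        BASE + path,
        data=data,
        method="POST" if data else "GET",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = gateway_request("/models", "sk-your-api-key")
```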
Example Request
curl -X POST \
  https://bolor-intelligence.com/api/v1/gateway/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Your input here"}
    ]
  }'
Use Cases
See how teams are using AI Gateway in production today.
Multi-Provider Resilience
Production applications use AI Gateway to eliminate single-provider dependency. When OpenAI has an outage, traffic automatically falls back to Anthropic or Google — users never notice. Teams report 99.99% effective uptime across providers.
Cost Optimization
Engineering teams route different workloads to the most cost-effective provider. Simple queries go to Groq for speed, complex reasoning to Claude, and embeddings to OpenAI. Teams typically save 40-60% on LLM costs compared to single-provider usage.
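The routing strategy described above can be sketched as a simple lookup from workload class to provider/model. The table entries are illustrative names, not a recommended configuration:

```python
# Hypothetical routing table matching the workloads described above.
ROUTES = {
    "simple": "groq/llama-3.1-8b-instant",       # fast, cheap queries
    "reasoning": "anthropic/claude-3-5-sonnet",  # complex reasoning
    "embedding": "openai/text-embedding-3-small",
}

def pick_model(workload: str) -> str:
    """Route a workload class to a model, defaulting to the cheap path."""
    return ROUTES.get(workload, ROUTES["simple"])

model = pick_model("reasoning")
```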
OpenAI SDK Drop-In Replacement
Teams using the OpenAI Python or JS SDK point their base URL to AI Gateway and instantly gain access to every provider. No code changes, no new SDKs to learn. Existing tools like LangChain and LlamaIndex work out of the box.
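The drop-in pattern relies on the SDK building every endpoint URL from one configurable base, so pointing that base at the gateway is the only change. A sketch of the mechanism (the SDK constructor call in the comment assumes the `openai` package):

```python
# With the official OpenAI Python SDK, the change is one constructor argument
# (sketch -- requires `pip install openai`, not imported here):
#
#   client = OpenAI(api_key="sk-your-api-key",
#                   base_url="https://bolor-intelligence.com/api/v1/gateway")
#
# Everything else the SDK builds -- paths, headers, body -- stays identical:
def completions_url(base_url: str) -> str:
    """Endpoint URLs are derived from the base; swap the base, keep the rest."""
    return base_url.rstrip("/") + "/chat/completions"

direct = completions_url("https://api.openai.com/v1")
gateway = completions_url("https://bolor-intelligence.com/api/v1/gateway")
```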
Start Building with AI Gateway
Get your API key and make your first call in under 5 minutes. Free tier includes 100 requests per hour.