Routing policies

Automatically route AI requests across providers to optimize cost, performance, and reliability.

Routing policies control which AI provider and model handles each request. Instead of hardcoding a single provider, policies automatically select the best option based on your optimization goals — whether that’s minimizing cost, maximizing uptime, or balancing both with ML-powered routing.

Strategy comparison

| Strategy | API Config | Description | Detail Page |
| --- | --- | --- | --- |
| Single | `type: "fallback"` (1 provider) | Always route to one provider | Single Provider |
| Priority | `type: "fallback"` | Try providers in priority order with automatic failover | Priority |
| Least Latency | Dashboard only | Route to the fastest provider | Performance |
| Lowest Cost | Dashboard only | Route to the cheapest provider | Performance |
| Cost Optimized | `type: "intelligent", axis: "cost"` | ML-based routing; sends ~70% of traffic to cheaper models | Intelligent |
| Balanced | `type: "intelligent", axis: "performance"` | ML-based routing; splits traffic evenly between cost and quality | Intelligent |
| Quality First | `type: "intelligent", axis: "intelligence"` | ML-based routing; sends ~70% of traffic to more capable models | Intelligent |
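As a sketch, the two API-configurable policy shapes from the table can be modeled like this. Only the `type` and `axis` values are taken from the table; the `providers` field, the provider slugs, and the overall shape are illustrative assumptions, not a documented schema:

```typescript
// Sketch of the two API-configurable policy shapes. Only "type" and "axis"
// appear in the table above; everything else here is an assumption.
type RoutingPolicy =
  | { type: "fallback"; providers: string[] } // one provider = Single, several = Priority
  | { type: "intelligent"; axis: "cost" | "performance" | "intelligence" };

// Cost Optimized: ML-based routing biased toward cheaper models.
const costOptimized: RoutingPolicy = { type: "intelligent", axis: "cost" };

// Priority: try providers in order, failing over automatically.
const priorityPolicy: RoutingPolicy = {
  type: "fallback",
  providers: ["provider-a", "provider-b"], // hypothetical provider slugs
};
```

The discriminated `type` field is what distinguishes failover-style policies from ML-routed ones in the API.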

Choosing the right strategy

| Your Priority | Recommended Strategy |
| --- | --- |
| Simplicity / dev environment | `single` |
| High availability / failover | `priority` |
| Fastest response time | `least_latency` |
| Lowest cost (same model) | `lowest_cost` |
| Lowest cost (mixed models) | `cost_optimized` |
| General production optimization | `balanced` |
| Maximum output quality | `quality_first` |

Tag-based routing

You can attach tags to requests — like user tier, region, or environment — and use them to route to different policies. Rules are evaluated in priority order: the first matching rule applies, and unmatched requests fall through to the default strategy.

Conditions support AND/OR logic and operators like eq, gt, in, contains, starts_with, and exists. Configure tag-based routing through the dashboard or the API.
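As a sketch under an assumed schema, first-match evaluation of tag-based rules might look like this. The operator names and the AND/OR, priority-order, first-match semantics come from the text above; the `Rule` and `Condition` shapes, field names, and the `route` function are illustrative assumptions, not the actual API:

```typescript
// Hypothetical rule/condition shapes; only the operator names and the
// AND/OR first-match semantics come from the documentation above.
type Op = "eq" | "gt" | "in" | "contains" | "starts_with" | "exists";
interface Condition { tag: string; op: Op; value?: unknown }
interface Rule {
  logic: "AND" | "OR";
  conditions: Condition[];
  policy: string; // policy to route to when the rule matches
}

function matches(c: Condition, tags: Record<string, unknown>): boolean {
  const v = tags[c.tag];
  switch (c.op) {
    case "eq": return v === c.value;
    case "gt": return typeof v === "number" && v > (c.value as number);
    case "in": return Array.isArray(c.value) && c.value.includes(v);
    case "contains": return typeof v === "string" && v.includes(String(c.value));
    case "starts_with": return typeof v === "string" && v.startsWith(String(c.value));
    case "exists": return v !== undefined;
  }
}

// Rules are evaluated in priority order; the first match wins, and
// unmatched requests fall through to the default strategy.
function route(rules: Rule[], tags: Record<string, unknown>, fallback: string): string {
  for (const r of rules) {
    const hits = r.conditions.map((c) => matches(c, tags));
    if (r.logic === "AND" ? hits.every(Boolean) : hits.some(Boolean)) return r.policy;
  }
  return fallback;
}
```

For example, a one-condition AND rule on tier = premium placed at the top of the list would send premium traffic to a quality-first policy while everything else falls through to the default.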

FAQs

How much latency does intelligent routing add?

The complexity scoring step adds ~1–4ms, which is negligible compared to LLM inference time.

How accurate is the complexity scoring?

Scores separate cleanly below 0.4 (simple) and above 0.6 (complex). Edge cases around 0.5 route conservatively to more capable models.
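The conservative handling of the ambiguous middle band can be sketched in a few lines. The 0.4/0.6 thresholds come from the answer above; the function and its names are hypothetical, not Gateway's implementation:

```typescript
// Hypothetical sketch of conservative tie-breaking: clear scores route by
// tier, and the ambiguous middle band errs toward the more capable model.
function pickTier(score: number): "cheap" | "capable" {
  if (score < 0.4) return "cheap";   // clearly simple prompt
  if (score > 0.6) return "capable"; // clearly complex prompt
  return "capable";                  // edge case near 0.5: err on the side of quality
}
```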

What happens if the complexity scorer fails?

Gateway falls back to the most capable model in your policy, so quality is never compromised by a scorer failure.

Does intelligent routing support newly released models?

Yes. New models work immediately; capabilities are inferred from pricing data.

Can the router select a model outside my policy?

No. The router only selects from models in your policy, never outside of it.

What happens when a provider is down?

Gateway returns an error after all failover attempts are exhausted. Provider health is tracked automatically, so requests skip providers that are currently down.
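The behavior described above can be illustrated with a minimal failover loop: unhealthy providers are skipped, the rest are tried in priority order, and an error surfaces only once every attempt is exhausted. The function name, signature, and health check are assumptions for illustration, not Gateway's actual implementation:

```typescript
// Illustrative failover loop, not Gateway's internals: skip providers
// marked down, try the rest in order, and only throw after all attempts.
async function callWithFailover<T>(
  providers: string[],
  isHealthy: (p: string) => boolean,
  call: (p: string) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no healthy providers available");
  for (const p of providers) {
    if (!isHealthy(p)) continue; // health tracking: skip providers currently down
    try {
      return await call(p);
    } catch (err) {
      lastError = err; // remember the failure and fall through to the next provider
    }
  }
  throw lastError; // all failover attempts exhausted
}
```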

Can I create multiple default policies?

No. There is one org-level default, plus up to one additional policy per project for project-scoped routing.