TokenRouter Overview - Intelligent AI Routing Platform

TokenRouter is an intelligent LLM routing platform that sits between your application and AI providers. It automatically selects the best provider and model for each request based on your optimization preferences, routing rules, and firewall policies.

  • Cost Savings: Automatically route to the most cost-effective provider for each request
  • Improved Reliability: Failover to backup providers when your primary choice is unavailable
  • Better Performance: Route based on latency, quality, or balanced optimization
  • Simplified Integration: One API for multiple providers (OpenAI, Anthropic, Google, Mistral, DeepSeek, Meta)
  • Enhanced Security: Built-in firewall rules and content filtering
  • Full Observability: Track usage, costs, and performance across all providers

TokenRouter wraps the OpenAI Responses API with an intelligent routing layer:

  1. Send a Request: Your application sends a standard request to https://api.tokenrouter.io/v1/responses
  2. Intelligent Routing: TokenRouter analyzes your request, routing rules, and provider availability
  3. Provider Selection: The optimal provider is selected based on your routing mode (cost, quality, latency, or balance)
  4. Firewall Enforcement: Your firewall rules are applied to filter or modify the request
  5. Response Delivery: A fully OpenAI-compatible response is returned to your application
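The flow above can be sketched from the client's perspective with plain `fetch`. The endpoint URL is the one documented above; the `Bearer` auth header shape is an assumption based on the API's OpenAI compatibility.

```typescript
// Minimal sketch of calling the TokenRouter responses endpoint directly.
// The Authorization header format is assumed (Bearer token, as in the
// OpenAI API); the endpoint URL is taken from the docs above.
type ResponsesPayload = { model: string; input: string };

function buildResponsesPayload(model: string, input: string): ResponsesPayload {
  return { model, input };
}

async function createResponse(apiKey: string, payload: ResponsesPayload) {
  const res = await fetch('https://api.tokenrouter.io/v1/responses', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  });
  return res.json(); // OpenAI-compatible response body
}
```

In practice most applications will use the OpenAI SDK instead of raw `fetch`, but the wire format is the same.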

The router monitors live provider pricing, historical latency, and your account limits to ensure you always get the best trade-off for each request.

You can adjust the routing strategy per request, define deterministic rules in the console, or force a specific provider when compliance requires it.
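As a sketch, per-request strategy selection could be wrapped in a small helper. The `auto:` mode strings are the documented ones; pinning a provider by passing a concrete model name instead of a mode is an assumption based on the platform's drop-in OpenAI compatibility.

```typescript
// Hypothetical helper: choose a routing target for each request.
// A compliance-pinned model name (e.g. 'gpt-4o') bypasses automatic
// routing; otherwise one of the documented auto: modes is used.
type RoutingMode = 'balance' | 'cost' | 'quality' | 'latency';

function pickModel(opts: { compliancePinned?: string; mode?: RoutingMode }): string {
  if (opts.compliancePinned) return opts.compliancePinned; // forced provider/model
  return `auto:${opts.mode ?? 'balance'}`;                 // automatic routing
}
```

The resulting string is what you pass as the `model` field of `responses.create()`.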

Use automatic routing modes for intelligent provider selection:

const response = await client.responses.create({
  model: 'auto:balance', // Balanced optimization
  input: 'Explain quantum computing'
});

Available modes:

  • auto:balance - Balanced trade-off
  • auto:cost - Minimize costs
  • auto:quality - Maximize quality
  • auto:latency - Minimize latency

TokenRouter supports the following AI providers:

| Provider  | Models                                           | Special Features                              |
|-----------|--------------------------------------------------|-----------------------------------------------|
| OpenAI    | GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo               | Function calling, vision, structured outputs  |
| Anthropic | Claude 3.7 Sonnet, Claude 3 Opus, Claude 3 Haiku | Extended context, vision, tool use            |
| Google    | Gemini 1.5 Pro, Gemini 1.5 Flash                 | Multimodal, long context                      |
| Mistral   | Mistral Large, Mistral Medium                    | European hosting, function calling            |
| DeepSeek  | DeepSeek V3                                      | Cost-effective, code generation               |
| Meta      | Llama 4 (special access required)                | Open weights, on-premises options             |

TokenRouter is designed as a drop-in replacement for the OpenAI API. If you're currently using OpenAI, your code probably looks like this, and migrating requires minimal changes:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }]
});

The key differences:

  • Use a TokenRouter API key instead of an OpenAI key
  • Use responses.create() instead of chat.completions.create()
  • Use input parameter instead of messages array (simplified interface)
  • Access to multiple providers through routing modes
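Putting those differences together, a minimal migrated setup might look like the sketch below. The base URL is inferred (an assumption) from the `/v1/responses` endpoint documented above, and the environment variable name `TOKENROUTER_API_KEY` is hypothetical.

```typescript
// Migrated client options: same OpenAI SDK, pointed at TokenRouter.
// Pass these to `new OpenAI(clientOptions)` and call
// `client.responses.create(request)`.
const clientOptions = {
  apiKey: process.env.TOKENROUTER_API_KEY,   // TokenRouter key, not an OpenAI key
  baseURL: 'https://api.tokenrouter.io/v1',  // assumed from the /v1/responses endpoint
};

const request = {
  model: 'auto:balance',   // routing mode instead of a fixed model name
  input: 'Hello!',         // `input` string replaces the `messages` array
};
```

Because the response shape is OpenAI-compatible, the rest of your code should work unchanged.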
To get started:

  1. Create an account at TokenRouter
  2. Generate an API key in the console
  3. Add your provider keys (OpenAI, Anthropic, etc.)
  4. Install the SDK and make your first request