OpenAI Compatibility

TokenRouter is designed as a drop-in replacement for the OpenAI API. This means you can use TokenRouter with existing OpenAI client libraries and tools with minimal code changes.

Switch from OpenAI to TokenRouter with a simple base URL change:

// Before: OpenAI
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// After: TokenRouter
const client = new OpenAI({
  apiKey: process.env.TOKENROUTER_API_KEY,
  baseURL: 'https://api.tokenrouter.io/v1'
});

TokenRouter implements the following OpenAI-compatible endpoints:

Chat Completions

The primary endpoint for generating conversational responses.

Endpoint: POST /v1/chat/completions

OpenAI Equivalent: https://api.openai.com/v1/chat/completions

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'You are a helpful assistant' },
    { role: 'user', content: 'Hello!' }
  ],
  temperature: 0.7,
  max_tokens: 1000
});

console.log(response.choices[0].message.content);

Stream responses in real-time using Server-Sent Events (SSE).

const stream = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  process.stdout.write(content);
}

TokenRouter supports the following OpenAI chat completion parameters:

| Parameter | Type | Support | Notes |
|-----------|------|---------|-------|
| model | string | ✅ Full | Supports OpenAI models + auto-routing (auto:balance, etc.) |
| messages | array | ✅ Full | Standard message format with roles |
| temperature | number | ✅ Full | 0.0 to 2.0 |
| max_tokens | integer | ✅ Full | Maximum tokens in response |
| top_p | number | ✅ Full | Nucleus sampling (0.0 to 1.0) |
| stream | boolean | ✅ Full | Server-Sent Events streaming |
| stop | string/array | ✅ Full | Stop sequences |
| presence_penalty | number | ✅ Full | -2.0 to 2.0 |
| frequency_penalty | number | ✅ Full | -2.0 to 2.0 |
| logit_bias | object | ⚠️ Provider-specific | Only the OpenAI provider supports it |
| user | string | ✅ Full | User identifier for tracking |
| seed | integer | ⚠️ Provider-specific | Only the OpenAI provider supports it |
| tools | array | ✅ Full | Function calling / tool use |
| tool_choice | string/object | ✅ Full | Control tool selection |
| response_format | object | ⚠️ Provider-specific | JSON mode (OpenAI, DeepSeek only) |
| n | integer | ❌ Not supported | Multiple completions not yet supported |
| logprobs | boolean | ❌ Not supported | Log probabilities not yet supported |

TokenRouter extends the OpenAI API with additional capabilities:

Use intelligent routing instead of hardcoding models:

// Standard OpenAI
const response = await client.chat.completions.create({
  model: 'gpt-4o', // Fixed model
  messages: [...]
});

// TokenRouter auto-routing
const response = await client.chat.completions.create({
  model: 'auto:balance', // Intelligently routes to the best provider
  messages: [...]
});

Auto-routing modes:

  • auto:fast - Prioritize speed
  • auto:balance - Balance speed, cost, and quality
  • auto:cost - Minimize cost
  • auto:quality - Maximize quality
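If your application decides the trade-off per request, the four modes above map naturally onto a small selector. A sketch (the `Priority` type and function name are illustrative, only the `auto:` mode strings come from the list above):

```typescript
// Illustrative mapping from a request's priority to a TokenRouter
// auto-routing mode (the mode strings come from the list above).
type Priority = 'speed' | 'cost' | 'quality' | 'default';

function pickRoutingMode(priority: Priority): string {
  switch (priority) {
    case 'speed':   return 'auto:fast';
    case 'cost':    return 'auto:cost';
    case 'quality': return 'auto:quality';
    default:        return 'auto:balance';
  }
}
```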

Explicitly route to specific providers:

const response = await client.chat.completions.create({
  model: 'anthropic:claude-3-5-sonnet-20241022', // Force Anthropic
  messages: [...]
});

Provider prefixes:

  • openai: - OpenAI models
  • anthropic: - Anthropic Claude models
  • gemini: - Google Gemini models
  • mistral: - Mistral AI models
  • deepseek: - DeepSeek models
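The `provider:model` convention is easy to handle in client code. A hedged sketch of a parser (the function is illustrative, not a TokenRouter API; note that `auto:` is a routing mode, not a provider prefix):

```typescript
// Illustrative parser (not part of the TokenRouter SDK): split a model
// string into its provider prefix and model name.
function parseModel(model: string): { provider: string | null; model: string } {
  const idx = model.indexOf(':');
  if (idx === -1) {
    return { provider: null, model }; // bare model name: TokenRouter picks the provider
  }
  return { provider: model.slice(0, idx), model: model.slice(idx + 1) };
}
```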

Add custom metadata to requests for analytics:

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [...],
  // TokenRouter extension: pack metadata into the user field
  user: JSON.stringify({
    user_id: '12345',
    task: 'code_review',
    environment: 'production'
  })
});

TokenRouter returns responses in standard OpenAI format:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

TokenRouter adds extra fields for observability:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "gpt-4o",
  "choices": [...],
  "usage": {...},

  // TokenRouter extensions
  "provider": "openai",        // Which provider handled the request
  "latency_ms": 850,           // Request latency in milliseconds
  "routing_mode": "balance",   // Routing mode used
  "x_request_id": "req_abc123" // TokenRouter request ID
}
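Because the extra fields are additive, they are easy to model as optional properties. A hedged sketch of a type for them (the interface and helper names are hypothetical, not shipped by TokenRouter):

```typescript
// Hypothetical typing for the TokenRouter-specific response fields shown
// above. Marking them optional keeps code compatible with plain OpenAI
// responses that lack them.
interface TokenRouterExtensions {
  provider?: string;
  latency_ms?: number;
  routing_mode?: string;
  x_request_id?: string;
}

function describeRouting(resp: TokenRouterExtensions): string {
  const provider = resp.provider ?? 'unknown';
  const latency = resp.latency_ms != null ? `${resp.latency_ms}ms` : 'n/a';
  return `${provider} (${latency})`;
}
```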
A basic, non-streaming completion through TokenRouter:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOKENROUTER_API_KEY,
  baseURL: 'https://api.tokenrouter.io/v1'
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Explain quantum computing' }
  ]
});

console.log(response.choices[0].message.content);
Streaming in Python:

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("TOKENROUTER_API_KEY"),
    base_url="https://api.tokenrouter.io/v1",
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content or ""
    print(content, end="", flush=True)
Function calling uses the standard OpenAI tools format:

const tools = [{
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Get current weather',
    parameters: {
      type: 'object',
      properties: {
        location: { type: 'string' }
      },
      required: ['location']
    }
  }
}];

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather in NYC?' }],
  tools
});

While TokenRouter maintains OpenAI compatibility, different providers have unique characteristics:

OpenAI

  • ✅ Full OpenAI API compatibility
  • ✅ All parameters supported
  • ✅ JSON schema mode
  • ✅ Function calling
  • ✅ Streaming

Anthropic

  • ✅ Message format (converted automatically)
  • ✅ Function calling (via tool use)
  • ✅ Streaming
  • ❌ JSON mode not supported
  • ❌ logit_bias not supported
  • ⚠️ Different token counting

Gemini

  • ✅ Message format (converted automatically)
  • ✅ Function calling
  • ✅ Streaming
  • ❌ JSON mode not supported
  • ❌ logit_bias not supported
  • ⚠️ Different system message handling

Mistral

  • ✅ OpenAI-compatible format
  • ✅ Function calling
  • ✅ Streaming
  • ❌ JSON mode not supported
  • ⚠️ Limited model selection

DeepSeek

  • ✅ OpenAI-compatible format
  • ✅ Basic JSON mode
  • ✅ Streaming
  • ❌ JSON schema not supported
  • ⚠️ Reasoning models have unique behavior
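If you route explicitly rather than via `auto:`, it can help to check a capability before attaching a provider-specific parameter. A sketch of such a table, derived from the notes above (the map and function are illustrative; TokenRouter itself may expose capabilities differently):

```typescript
// Illustrative capability table derived from the provider notes above;
// not an official TokenRouter data structure.
const PROVIDER_CAPS: Record<string, { jsonMode: boolean; logitBias: boolean }> = {
  openai:    { jsonMode: true,  logitBias: true  },
  anthropic: { jsonMode: false, logitBias: false },
  gemini:    { jsonMode: false, logitBias: false },
  mistral:   { jsonMode: false, logitBias: false },
  deepseek:  { jsonMode: true,  logitBias: false },
};

function supportsJsonMode(provider: string): boolean {
  // Unknown providers conservatively report no JSON mode support
  return PROVIDER_CAPS[provider]?.jsonMode ?? false;
}
```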

Problem: Some parameters only work with specific providers.

Solution: Use auto-routing or check provider capabilities:

// Option 1: Use auto-routing (handles capabilities automatically)
const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [...],
  response_format: { type: 'json_object' } // Only works with OpenAI/DeepSeek
});

// Option 2: Explicitly specify a compatible provider
const response = await client.chat.completions.create({
  model: 'openai:gpt-4o',
  messages: [...],
  response_format: { type: 'json_object' }
});

Problem: Different providers have slightly different message requirements.

Solution: TokenRouter automatically transforms messages:

// Works across all providers
const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [
    { role: 'system', content: 'You are helpful' }, // Auto-transformed for each provider
    { role: 'user', content: 'Hello' }
  ]
});

Problem: Different providers count tokens differently.

Solution: Use TokenRouter’s unified usage tracking:

const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [...]
});

// Consistent across providers
console.log(`Tokens used: ${response.usage.total_tokens}`);
console.log(`Provider: ${response.provider}`);
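Because usage reporting is consistent, totals can be summed directly from stored responses, optionally grouped by the `provider` extension field. A minimal sketch (the aggregator is illustrative, not a TokenRouter API):

```typescript
// Illustrative aggregator: sum token usage across responses, grouped by
// the TokenRouter `provider` extension field.
interface UsageRecord {
  usage: { total_tokens: number };
  provider?: string; // TokenRouter extension field
}

function totalUsage(responses: UsageRecord[]) {
  const byProvider: Record<string, number> = {};
  let total = 0;
  for (const r of responses) {
    total += r.usage.total_tokens;
    const p = r.provider ?? 'unknown';
    byProvider[p] = (byProvider[p] ?? 0) + r.usage.total_tokens;
  }
  return { total, byProvider };
}
```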

Use TokenRouter with LangChain:

import { ChatOpenAI } from '@langchain/openai';

const model = new ChatOpenAI({
  openAIApiKey: process.env.TOKENROUTER_API_KEY,
  configuration: {
    baseURL: 'https://api.tokenrouter.io/v1'
  },
  modelName: 'auto:balance'
});

const response = await model.invoke([
  { role: 'user', content: 'Hello!' }
]);

Use TokenRouter with LlamaIndex:

import os

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_key=os.getenv("TOKENROUTER_API_KEY"),
    api_base="https://api.tokenrouter.io/v1",
    model="auto:balance",
)

response = llm.complete("What is the capital of France?")

Use TokenRouter with Vercel AI SDK:

import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

const tokenrouter = createOpenAI({
  apiKey: process.env.TOKENROUTER_API_KEY,
  baseURL: 'https://api.tokenrouter.io/v1'
});

const { text } = await generateText({
  model: tokenrouter('auto:balance'),
  prompt: 'Explain machine learning'
});
Verify your setup with a quick compatibility check:

import OpenAI from 'openai';

async function testCompatibility() {
  const client = new OpenAI({
    apiKey: process.env.TOKENROUTER_API_KEY,
    baseURL: 'https://api.tokenrouter.io/v1'
  });

  try {
    const response = await client.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Hello' }]
    });
    console.log('✅ Basic compatibility confirmed');
    console.log(`Provider: ${response.provider}`);
    console.log(`Response: ${response.choices[0].message.content}`);
  } catch (error) {
    console.error('❌ Compatibility issue:', error);
  }
}
  1. Use Auto-Routing - Let TokenRouter pick the best provider automatically
  2. Handle Provider Differences - Test with multiple providers if using provider-specific features
  3. Check Response Metadata - Use TokenRouter’s additional fields for debugging
  4. Gradual Migration - Start with one endpoint, then expand
  5. Monitor Usage - Use TokenRouter dashboard to track provider usage
  6. Set Up Fallbacks - Use routing rules for automatic fallback between providers
  7. Test Streaming - Verify streaming works with your framework
  8. Validate Tool Calls - Test function calling across different providers
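Best practice 6 (fallbacks) can be implemented without TokenRouter-specific machinery: try each candidate model in order until one request succeeds. A hedged sketch (the wrapper is illustrative; the runner is any async function that issues the actual request):

```typescript
// Illustrative fallback wrapper: try each model in order until one
// request succeeds; rethrow the last error if all of them fail.
async function withFallback<T>(
  models: string[],
  run: (model: string) => Promise<T>
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await run(model);
    } catch (err) {
      lastError = err; // this model/provider failed; try the next one
    }
  }
  throw lastError;
}
```

For example, `withFallback(['auto:balance', 'openai:gpt-4o'], (model) => client.chat.completions.create({ model, messages }))` would retry with an explicit OpenAI model if auto-routing fails.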

Problem: The response structure looks slightly different from OpenAI's.

Solution: TokenRouter follows the OpenAI format exactly; additional fields such as provider are additive and don't break compatibility:

// This works identically
const content = response.choices[0].message.content;
// These are TokenRouter enhancements (optional)
const provider = response.provider; // New field
const latency = response.latency_ms; // New field

Problem: Function calls don’t work with certain models.

Solution: Use auto-routing or OpenAI/Anthropic models:

// Works across OpenAI and Anthropic
const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [...],
  tools: [...]
});

Problem: JSON mode returns error with some providers.

Solution: Use OpenAI or DeepSeek explicitly:

const response = await client.chat.completions.create({
  model: 'openai:gpt-4o', // Explicitly use OpenAI
  messages: [...],
  response_format: { type: 'json_object' }
});