OpenAI Compatibility

TokenRouter is designed as a drop-in replacement for the OpenAI API. This means you can use TokenRouter with existing OpenAI client libraries and tools with minimal code changes.

Switch from OpenAI to TokenRouter with a simple base URL change:

// Before: OpenAI
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

// After: TokenRouter
const client = new OpenAI({
  apiKey: process.env.TOKENROUTER_API_KEY,
  baseURL: 'https://api.tokenrouter.io/v1'
});

TokenRouter implements the following OpenAI-compatible endpoints:

Chat Completions

The primary endpoint for generating conversational responses.

Endpoint: POST /v1/chat/completions

OpenAI Equivalent: https://api.openai.com/v1/chat/completions

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'system', content: 'You are a helpful assistant' },
    { role: 'user', content: 'Hello!' }
  ],
  temperature: 0.7,
  max_tokens: 1000
});

console.log(response.choices[0].message.content);

Stream responses in real-time using Server-Sent Events (SSE).

const stream = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  process.stdout.write(content);
}

TokenRouter supports the following OpenAI chat completion parameters:

| Parameter | Type | Support | Notes |
|-----------|------|---------|-------|
| model | string | ✅ Full | Supports OpenAI models + auto-routing (auto:balance, etc.) |
| messages | array | ✅ Full | Standard message format with roles |
| temperature | number | ✅ Full | 0.0 to 2.0 |
| max_tokens | integer | ✅ Full | Maximum tokens in response |
| top_p | number | ✅ Full | Nucleus sampling (0.0 to 1.0) |
| stream | boolean | ✅ Full | Server-Sent Events streaming |
| stop | string/array | ✅ Full | Stop sequences |
| presence_penalty | number | ✅ Full | -2.0 to 2.0 |
| frequency_penalty | number | ✅ Full | -2.0 to 2.0 |
| logit_bias | object | ⚠️ Provider-specific | Only the OpenAI provider supports it |
| user | string | ✅ Full | User identifier for tracking |
| seed | integer | ⚠️ Provider-specific | Only the OpenAI provider supports it |
| tools | array | ✅ Full | Function calling / tool use |
| tool_choice | string/object | ✅ Full | Control tool selection |
| response_format | object | ⚠️ Provider-specific | JSON mode (OpenAI, DeepSeek only) |
| n | integer | ❌ Not supported | Multiple completions not yet supported |
| logprobs | boolean | ❌ Not supported | Log probabilities not yet supported |

TokenRouter extends the OpenAI API with additional capabilities:

Use intelligent routing instead of hardcoding models:

// Standard OpenAI
const response = await client.chat.completions.create({
  model: 'gpt-4o', // Fixed model
  messages: [...]
});

// TokenRouter auto-routing
const response = await client.chat.completions.create({
  model: 'auto:balance', // Intelligently routes to the best provider
  messages: [...]
});

Auto-routing modes:

  • auto:fast - Prioritize speed
  • auto:balance - Balance speed, cost, and quality
  • auto:cost - Minimize cost
  • auto:quality - Maximize quality
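If your application decides the trade-off per request, the four modes above map naturally onto a small selector. A sketch (the `Priority` type and function name are illustrative, only the `auto:` mode strings come from the list above):

```typescript
// Illustrative mapping from a request's priority to a TokenRouter
// auto-routing mode (the mode strings come from the list above).
type Priority = 'speed' | 'cost' | 'quality' | 'default';

function pickRoutingMode(priority: Priority): string {
  switch (priority) {
    case 'speed':   return 'auto:fast';
    case 'cost':    return 'auto:cost';
    case 'quality': return 'auto:quality';
    default:        return 'auto:balance';
  }
}
```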

Explicitly route to specific providers:

const response = await client.chat.completions.create({
  model: 'anthropic:claude-3-5-sonnet-20241022', // Force Anthropic
  messages: [...]
});

Provider prefixes:

  • openai: - OpenAI models
  • anthropic: - Anthropic Claude models
  • gemini: - Google Gemini models
  • mistral: - Mistral AI models
  • deepseek: - DeepSeek models
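The `provider:model` convention is easy to handle in client code. A hedged sketch of a parser (the function is illustrative, not a TokenRouter API; note that `auto:` is a routing mode, not a provider prefix):

```typescript
// Illustrative parser (not part of the TokenRouter SDK): split a model
// string into its provider prefix and model name.
function parseModel(model: string): { provider: string | null; model: string } {
  const idx = model.indexOf(':');
  if (idx === -1) {
    return { provider: null, model }; // bare model name: TokenRouter picks the provider
  }
  return { provider: model.slice(0, idx), model: model.slice(idx + 1) };
}
```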

Add custom metadata to requests for analytics:

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [...],
  // TokenRouter extension: pack metadata into the user field
  user: JSON.stringify({
    user_id: '12345',
    task: 'code_review',
    environment: 'production'
  })
});

TokenRouter returns responses in standard OpenAI format:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

TokenRouter adds extra fields for observability:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699000000,
  "model": "gpt-4o",
  "choices": [...],
  "usage": {...},

  // TokenRouter extensions
  "provider": "openai",        // Which provider handled the request
  "latency_ms": 850,           // Request latency in milliseconds
  "routing_mode": "balance",   // Routing mode used
  "x_request_id": "req_abc123" // TokenRouter request ID
}
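Because the extra fields are additive, they are easy to model as optional properties. A hedged sketch of a type for them (the interface and helper names are hypothetical, not shipped by TokenRouter):

```typescript
// Hypothetical typing for the TokenRouter-specific response fields shown
// above. Marking them optional keeps code compatible with plain OpenAI
// responses that lack them.
interface TokenRouterExtensions {
  provider?: string;
  latency_ms?: number;
  routing_mode?: string;
  x_request_id?: string;
}

function describeRouting(resp: TokenRouterExtensions): string {
  const provider = resp.provider ?? 'unknown';
  const latency = resp.latency_ms != null ? `${resp.latency_ms}ms` : 'n/a';
  return `${provider} (${latency})`;
}
```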
A basic, non-streaming completion through TokenRouter:

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.TOKENROUTER_API_KEY,
  baseURL: 'https://api.tokenrouter.io/v1'
});

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [
    { role: 'user', content: 'Explain quantum computing' }
  ]
});

console.log(response.choices[0].message.content);
Streaming in Python:

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("TOKENROUTER_API_KEY"),
    base_url="https://api.tokenrouter.io/v1",
)

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    content = chunk.choices[0].delta.content or ""
    print(content, end="", flush=True)
Function calling uses the standard OpenAI tools format:

const tools = [{
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Get current weather',
    parameters: {
      type: 'object',
      properties: {
        location: { type: 'string' }
      },
      required: ['location']
    }
  }
}];

const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'What is the weather in NYC?' }],
  tools
});

While TokenRouter maintains OpenAI compatibility, different providers have unique characteristics:

OpenAI

  • ✅ Full OpenAI API compatibility
  • ✅ All parameters supported
  • ✅ JSON schema mode
  • ✅ Function calling
  • ✅ Streaming

Anthropic

  • ✅ Message format (converted automatically)
  • ✅ Function calling (via tool use)
  • ✅ Streaming
  • ❌ JSON mode not supported
  • ❌ logit_bias not supported
  • ⚠️ Different token counting

Gemini

  • ✅ Message format (converted automatically)
  • ✅ Function calling
  • ✅ Streaming
  • ❌ JSON mode not supported
  • ❌ logit_bias not supported
  • ⚠️ Different system message handling

Mistral

  • ✅ OpenAI-compatible format
  • ✅ Function calling
  • ✅ Streaming
  • ❌ JSON mode not supported
  • ⚠️ Limited model selection

DeepSeek

  • ✅ OpenAI-compatible format
  • ✅ Basic JSON mode
  • ✅ Streaming
  • ❌ JSON schema not supported
  • ⚠️ Reasoning models have unique behavior
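If you route explicitly rather than via `auto:`, it can help to check a capability before attaching a provider-specific parameter. A sketch of such a table, derived from the notes above (the map and function are illustrative; TokenRouter itself may expose capabilities differently):

```typescript
// Illustrative capability table derived from the provider notes above;
// not an official TokenRouter data structure.
const PROVIDER_CAPS: Record<string, { jsonMode: boolean; logitBias: boolean }> = {
  openai:    { jsonMode: true,  logitBias: true  },
  anthropic: { jsonMode: false, logitBias: false },
  gemini:    { jsonMode: false, logitBias: false },
  mistral:   { jsonMode: false, logitBias: false },
  deepseek:  { jsonMode: true,  logitBias: false },
};

function supportsJsonMode(provider: string): boolean {
  // Unknown providers conservatively report no JSON mode support
  return PROVIDER_CAPS[provider]?.jsonMode ?? false;
}
```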

Problem: Some parameters only work with specific providers.

Solution: Use auto-routing or check provider capabilities:

// Option 1: Use auto-routing (handles capabilities automatically)
const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [...],
  response_format: { type: 'json_object' } // Only works with OpenAI/DeepSeek
});

// Option 2: Explicitly specify a compatible provider
const response = await client.chat.completions.create({
  model: 'openai:gpt-4o',
  messages: [...],
  response_format: { type: 'json_object' }
});

Problem: Different providers have slightly different message requirements.

Solution: TokenRouter automatically transforms messages:

// Works across all providers
const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [
    { role: 'system', content: 'You are helpful' }, // Auto-transformed for each provider
    { role: 'user', content: 'Hello' }
  ]
});

Problem: Different providers count tokens differently.

Solution: Use TokenRouter’s unified usage tracking:

const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [...]
});

// Consistent across providers
console.log(`Tokens used: ${response.usage.total_tokens}`);
console.log(`Provider: ${response.provider}`);
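Because usage reporting is consistent, totals can be summed directly from stored responses, optionally grouped by the `provider` extension field. A minimal sketch (the aggregator is illustrative, not a TokenRouter API):

```typescript
// Illustrative aggregator: sum token usage across responses, grouped by
// the TokenRouter `provider` extension field.
interface UsageRecord {
  usage: { total_tokens: number };
  provider?: string; // TokenRouter extension field
}

function totalUsage(responses: UsageRecord[]) {
  const byProvider: Record<string, number> = {};
  let total = 0;
  for (const r of responses) {
    total += r.usage.total_tokens;
    const p = r.provider ?? 'unknown';
    byProvider[p] = (byProvider[p] ?? 0) + r.usage.total_tokens;
  }
  return { total, byProvider };
}
```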

Use TokenRouter with LangChain:

import { ChatOpenAI } from '@langchain/openai';

const model = new ChatOpenAI({
  openAIApiKey: process.env.TOKENROUTER_API_KEY,
  configuration: {
    baseURL: 'https://api.tokenrouter.io/v1'
  },
  modelName: 'auto:balance'
});

const response = await model.invoke([
  { role: 'user', content: 'Hello!' }
]);

Use TokenRouter with LlamaIndex:

import os

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    api_key=os.getenv("TOKENROUTER_API_KEY"),
    api_base="https://api.tokenrouter.io/v1",
    model="auto:balance",
)

response = llm.complete("What is the capital of France?")

Use TokenRouter with Vercel AI SDK:

import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

const tokenrouter = createOpenAI({
  apiKey: process.env.TOKENROUTER_API_KEY,
  baseURL: 'https://api.tokenrouter.io/v1'
});

const { text } = await generateText({
  model: tokenrouter('auto:balance'),
  prompt: 'Explain machine learning'
});
Verify your setup with a quick compatibility check:

import OpenAI from 'openai';

async function testCompatibility() {
  const client = new OpenAI({
    apiKey: process.env.TOKENROUTER_API_KEY,
    baseURL: 'https://api.tokenrouter.io/v1'
  });

  try {
    const response = await client.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Hello' }]
    });
    console.log('✅ Basic compatibility confirmed');
    console.log(`Provider: ${response.provider}`);
    console.log(`Response: ${response.choices[0].message.content}`);
  } catch (error) {
    console.error('❌ Compatibility issue:', error);
  }
}
  1. Use Auto-Routing - Let TokenRouter pick the best provider automatically
  2. Handle Provider Differences - Test with multiple providers if using provider-specific features
  3. Check Response Metadata - Use TokenRouter’s additional fields for debugging
  4. Gradual Migration - Start with one endpoint, then expand
  5. Monitor Usage - Use TokenRouter dashboard to track provider usage
  6. Set Up Fallbacks - Use routing rules for automatic fallback between providers
  7. Test Streaming - Verify streaming works with your framework
  8. Validate Tool Calls - Test function calling across different providers
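Best practice 6 (fallbacks) can be implemented without TokenRouter-specific machinery: try each candidate model in order until one request succeeds. A hedged sketch (the wrapper is illustrative; the runner is any async function that issues the actual request):

```typescript
// Illustrative fallback wrapper: try each model in order until one
// request succeeds; rethrow the last error if all of them fail.
async function withFallback<T>(
  models: string[],
  run: (model: string) => Promise<T>
): Promise<T> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await run(model);
    } catch (err) {
      lastError = err; // this model/provider failed; try the next one
    }
  }
  throw lastError;
}
```

For example, `withFallback(['auto:balance', 'openai:gpt-4o'], (model) => client.chat.completions.create({ model, messages }))` would retry with an explicit OpenAI model if auto-routing fails.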

Problem: The response structure looks slightly different from OpenAI's.

Solution: TokenRouter follows the OpenAI format exactly; additional fields such as provider are additive and don't break compatibility:

// This works identically
const content = response.choices[0].message.content;
// These are TokenRouter enhancements (optional)
const provider = response.provider; // New field
const latency = response.latency_ms; // New field

Problem: Function calls don’t work with certain models.

Solution: Use auto-routing or OpenAI/Anthropic models:

// Works across OpenAI and Anthropic
const response = await client.chat.completions.create({
  model: 'auto:balance',
  messages: [...],
  tools: [...]
});

Problem: JSON mode returns error with some providers.

Solution: Use OpenAI or DeepSeek explicitly:

const response = await client.chat.completions.create({
  model: 'openai:gpt-4o', // Explicitly use OpenAI
  messages: [...],
  response_format: { type: 'json_object' }
});