
Error Handling

TokenRouter uses standard HTTP status codes and a consistent error response format across all endpoints. This guide covers all error types, their meanings, and how to handle them properly.

Error Response Format

All errors follow a consistent JSON structure:

{
  "error": {
    "type": "string",        // Error type identifier
    "message": "string",     // Human-readable error message
    "code": "string|null",   // Provider-specific error code (if applicable)
    "provider": "string",    // Provider name (e.g., "openai", "anthropic")
    "http_status": integer,  // HTTP status code
    "raw": "object|null"     // Raw provider response (debug mode only)
  }
}
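For stricter handling in TypeScript, the envelope above can be modeled as a type with a small guard. This is a sketch: the field names follow the structure shown, but the type and guard are not official SDK exports.

```typescript
interface TokenRouterError {
  error: {
    type: string;
    message: string;
    code?: string | null;
    provider?: string;
    http_status: number;
    raw?: object | null;
  };
}

// Narrow an unknown response body to the error envelope shown above.
function isTokenRouterError(body: unknown): body is TokenRouterError {
  if (typeof body !== "object" || body === null) return false;
  const err = (body as Record<string, unknown>)["error"];
  return (
    typeof err === "object" &&
    err !== null &&
    typeof (err as Record<string, unknown>)["type"] === "string" &&
    typeof (err as Record<string, unknown>)["http_status"] === "number"
  );
}
```

With the guard in place, a `fetch` handler can branch on `body.error.type` without unchecked casts.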

HTTP Status Codes

TokenRouter uses the following HTTP status codes:

Status Code | Meaning               | Common Causes
200         | Success               | Request completed successfully
201         | Created               | Resource created (API keys, webhooks)
400         | Bad Request           | Malformed request, invalid parameters
401         | Unauthorized          | Missing or invalid API key, expired token
403         | Forbidden             | Insufficient permissions, free plan limitation
404         | Not Found             | Resource doesn't exist or not accessible
422         | Unprocessable Entity  | Validation failed, firewall blocked, routing error
429         | Too Many Requests     | Rate limit exceeded
500         | Internal Server Error | Unexpected server error
503         | Service Unavailable   | Provider unavailable, maintenance mode
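A common pattern is to decide retry behavior from the status code alone. One reasonable mapping of the table above (a suggestion, not a TokenRouter-mandated rule):

```typescript
// 429 and 5xx responses are typically transient; other 4xx codes indicate
// a request problem that retrying will not fix.
function isRetryableStatus(status: number): boolean {
  return status === 429 || status === 500 || status === 502 ||
         status === 503 || status === 504;
}
```

This matches the status list used in the retry example later in this guide.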

Authentication Errors (401)

Returned when the API key is missing, invalid, or expired.

{
  "error": {
    "type": "unauthorized_error",
    "message": "Invalid API key",
    "http_status": 401
  }
}

Common Causes:

  • Missing Authorization header
  • Invalid API key format
  • Expired API key
  • Revoked API key

Solutions:

  • Verify API key is correct
  • Check API key hasn’t expired
  • Generate a new API key from dashboard
  • Ensure proper Bearer token format
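To rule out header-format mistakes, build the Authorization header in one place. A minimal sketch; the Bearer scheme is standard HTTP auth, and the validation here is illustrative:

```typescript
// Construct request headers with a properly formatted Bearer token.
function buildAuthHeaders(apiKey: string): Record<string, string> {
  const key = apiKey.trim();
  if (key.length === 0) {
    throw new Error("API key is empty - check your environment configuration");
  }
  return {
    "Authorization": `Bearer ${key}`,
    "Content-Type": "application/json",
  };
}
```

Trimming catches a surprisingly common cause of 401s: whitespace copied along with the key.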

Authorization Errors (403)

Returned when attempting to access features not available on your plan.

{
  "error": {
    "type": "authorization_error",
    "message": "Routing rules are not available on the free plan",
    "http_status": 403
  }
}

Common Causes:

  • Free plan accessing paid features (routing rules, firewall rules)
  • Insufficient user permissions
  • Resource belongs to different user

Solutions:

  • Upgrade to a paid plan
  • Verify plan includes requested feature
  • Check resource ownership

Validation Errors (422)

Returned when request data fails validation.

{
  "error": {
    "type": "validation_error",
    "message": "Validation failed",
    "http_status": 422,
    "errors": {
      "priority": ["The priority must be between -1000 and 1000"],
      "match_json": ["The match_json field is required"]
    }
  }
}

Common Causes:

  • Missing required fields
  • Invalid field values
  • Out-of-range values
  • Invalid data types

Solutions:

  • Review field requirements
  • Validate data before sending
  • Check value constraints
  • Ensure proper data types
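Catching these problems client-side avoids a wasted round trip. A sketch of pre-validation for the routing-rule payload shown in the example above; the field names and the -1000..1000 range are taken from that error response, so treat them as illustrative rather than a complete schema:

```typescript
interface RulePayload {
  priority?: number;
  match_json?: object;
}

// Mirror the server-side checks locally, returning the same
// field -> messages shape the API uses in its "errors" object.
function validateRule(payload: RulePayload): Record<string, string[]> {
  const errors: Record<string, string[]> = {};
  if (payload.match_json === undefined) {
    errors.match_json = ["The match_json field is required"];
  }
  if (
    payload.priority !== undefined &&
    (payload.priority < -1000 || payload.priority > 1000)
  ) {
    errors.priority = ["The priority must be between -1000 and 1000"];
  }
  return errors;
}
```

Only send the request when the returned object is empty.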

Firewall Errors (422)

Returned when a request is blocked by a firewall rule.

{
  "error": {
    "type": "firewall_rule",
    "message": "Request blocked by firewall rule \"Block PII\"",
    "http_status": 422,
    "meta": {
      "rule_id": 123,
      "rule_name": "Block PII",
      "action": "block",
      "reason": "Request blocked by firewall rule"
    }
  }
}

Common Causes:

  • Input contains sensitive data (SSN, credit cards, etc.)
  • Input matches firewall regex pattern
  • Output matches firewall pattern (response blocking)

Solutions:

  • Remove sensitive data from input
  • Use mask action instead of block
  • Adjust firewall rule patterns
  • Disable rule if not needed
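If you control the client, masking sensitive values before they reach the firewall keeps requests flowing. A minimal sketch: the patterns below cover only US-style SSNs and 16-digit card numbers, and real PII detection needs far broader rules than two regexes:

```typescript
// Replace obvious SSN and card-number patterns with placeholders
// before sending the prompt.
function maskPII(input: string): string {
  return input
    .replace(/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]")
    .replace(/\b(?:\d[ -]?){15}\d\b/g, "[CARD]");
}
```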

Rate Limit Errors (429)

Returned when you exceed rate limits.

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Please try again in 30 seconds",
    "http_status": 429,
    "retry_after": 30
  }
}

Rate Limit Types:

  • Requests per minute (RPM) - Number of requests
  • Tokens per minute (TPM) - Token usage
  • Daily quota - Total tokens per day
  • Monthly quota - Total tokens per month

Solutions:

  • Implement exponential backoff
  • Use retry_after header value
  • Reduce request frequency
  • Upgrade plan for higher limits
  • Monitor usage via dashboard
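When the response carries retry_after, prefer it over a computed backoff. A sketch combining the two (the field name follows the 429 payload above; the jitter amount is a common convention, not a TokenRouter requirement):

```typescript
// Use the server-provided retry_after (seconds) when present,
// otherwise fall back to exponential backoff with jitter.
function getRetryDelayMs(
  retryAfterSeconds: number | undefined,
  attempt: number,
  baseMs = 1000
): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds > 0) {
    return retryAfterSeconds * 1000;
  }
  const backoff = baseMs * Math.pow(2, attempt);
  return backoff + Math.floor(Math.random() * baseMs); // jitter avoids thundering herd
}
```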

Provider Errors

Returned when the underlying LLM provider encounters an error.

{
  "error": {
    "type": "provider_error",
    "message": "The model is currently overloaded. Please try again",
    "code": "model_overloaded",
    "provider": "openai",
    "http_status": 503
  }
}

Common Provider Error Codes:

  • model_overloaded - Provider infrastructure overloaded
  • server_error - Internal provider error
  • timeout - Request timed out
  • invalid_api_key - Provider API key invalid or expired

Solutions:

  • Implement retry logic with exponential backoff
  • Use auto-routing for automatic fallback
  • Switch to different provider manually
  • Check provider status page

Routing Errors (422)

Returned when routing logic encounters an issue.

{
  "error": {
    "type": "routing_error",
    "message": "No suitable provider available for requested model",
    "http_status": 422,
    "meta": {
      "requested_model": "gpt-4o",
      "available_providers": []
    }
  }
}

Common Causes:

  • No provider keys configured
  • All providers unavailable
  • Model not supported by any provider
  • All routing rules failed to match

Solutions:

  • Add provider keys in dashboard
  • Use auto-routing for flexibility
  • Check model availability
  • Review routing rule configuration

Streaming Errors

Errors during streaming are delivered as SSE events:

event: error
data: {"type":"provider_error","message":"Rate limit exceeded","http_status":429}
event: done
data: null
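A stream consumer has to watch for these error events in-band. A sketch of parsing one SSE frame; it assumes the event/data layout shown above rather than any particular SDK:

```typescript
interface StreamError {
  type: string;
  message: string;
  http_status: number;
}

// Parse a raw SSE frame; return the error payload if it is an
// `event: error` frame, otherwise null.
function parseSSEErrorFrame(frame: string): StreamError | null {
  const lines = frame.split("\n");
  const eventLine = lines.find((l) => l.startsWith("event:"));
  const dataLine = lines.find((l) => l.startsWith("data:"));
  if (!eventLine || !dataLine) return null;
  if (eventLine.slice("event:".length).trim() !== "error") return null;
  return JSON.parse(dataLine.slice("data:".length).trim()) as StreamError;
}
```

A `done` frame (or any non-error event) yields null, so normal chunks pass through untouched.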

Retry with Exponential Backoff

Implement exponential backoff for transient errors:

async function makeRequestWithBackoff(
  maxRetries = 3,
  baseDelay = 1000
) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await client.responses.create({
        model: 'gpt-4o',
        input: 'Your prompt'
      });
    } catch (error: any) {
      const isRetryable = [429, 500, 502, 503, 504].includes(error.status);
      const isLastAttempt = attempt === maxRetries - 1;
      if (!isRetryable || isLastAttempt) {
        throw error;
      }
      const delay = baseDelay * Math.pow(2, attempt);
      console.log(`Attempt ${attempt + 1} failed. Retrying in ${delay}ms...`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}

Graceful Degradation

Fall back to simpler models or cached responses:

async function makeRequestWithDegradation(input: string) {
  const modelFallbacks = [
    'gpt-4o',      // Try premium model first
    'gpt-4o-mini', // Fall back to mini
    'auto:cost'    // Use cheapest available
  ];
  for (const model of modelFallbacks) {
    try {
      return await client.responses.create({ model, input });
    } catch (error: any) {
      console.warn(`${model} failed:`, error.message);
      // Try next model
    }
  }
  // All models failed - return cached or default response
  throw new Error('All models unavailable');
}

Circuit Breaker

Prevent cascading failures with a circuit breaker pattern:

class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private readonly threshold = 5;
  private readonly timeout = 60000; // 60 seconds

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.isOpen()) {
      throw new Error('Circuit breaker is open');
    }
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private isOpen(): boolean {
    if (this.failures >= this.threshold) {
      const elapsed = Date.now() - this.lastFailure;
      return elapsed < this.timeout;
    }
    return false;
  }

  private onSuccess() {
    this.failures = 0;
  }

  private onFailure() {
    this.failures++;
    this.lastFailure = Date.now();
  }
}

const breaker = new CircuitBreaker();

async function makeProtectedRequest() {
  return breaker.call(() =>
    client.responses.create({
      model: 'gpt-4o',
      input: 'Your prompt'
    })
  );
}

Error Logging

Implement comprehensive error logging:

async function makeRequestWithLogging(input: string) {
  try {
    const response = await client.responses.create({
      model: 'gpt-4o',
      input
    });
    // Log success
    console.log({
      timestamp: new Date().toISOString(),
      level: 'info',
      message: 'Request succeeded',
      model: 'gpt-4o',
      tokens: response.usage?.total_tokens
    });
    return response;
  } catch (error: any) {
    // Log error with full context
    console.error({
      timestamp: new Date().toISOString(),
      level: 'error',
      message: 'Request failed',
      error_type: error.type,
      error_message: error.message,
      status_code: error.status,
      provider: error.provider,
      model: 'gpt-4o',
      input_length: input.length
    });
    throw error;
  }
}

Error Metrics

Track error rates and patterns:

class ErrorMetrics {
  private errors: Map<string, number> = new Map();
  private total = 0;

  record(errorType: string) {
    this.total++;
    this.errors.set(errorType, (this.errors.get(errorType) || 0) + 1);
  }

  getErrorRate(errorType: string): number {
    if (this.total === 0) return 0; // avoid division by zero
    return (this.errors.get(errorType) || 0) / this.total;
  }

  getMostCommon(): [string, number][] {
    return Array.from(this.errors.entries())
      .sort((a, b) => b[1] - a[1])
      .slice(0, 5);
  }
}

const metrics = new ErrorMetrics();

// Use in your error handlers
try {
  // ... make request
} catch (error: any) {
  metrics.record(error.type);
  // ... handle error
}
Best Practices

  1. Always Handle Errors - Never leave errors unhandled
  2. Use Specific Error Types - Check error.type not just status codes
  3. Implement Retries - Use exponential backoff for transient errors
  4. Log with Context - Include request details in error logs
  5. Monitor Error Rates - Track error patterns over time
  6. Fail Gracefully - Provide fallback behavior when possible
  7. Respect Rate Limits - Use retry_after header values
  8. Circuit Breakers - Prevent cascading failures
  9. User-Friendly Messages - Don’t expose technical errors to end users
  10. Alert on Patterns - Set up alerts for unusual error rates
Common Scenarios

Provider repeatedly failing - use auto-routing so TokenRouter selects a healthy provider:

// Use auto-routing to automatically fall back
const response = await client.responses.create({
  model: 'auto:balance', // Automatically picks a healthy provider
  input: 'Your prompt'
});

Daily quota exhausted - surface an upgrade prompt instead of retrying:

try {
  const response = await client.responses.create({
    model: 'gpt-4o',
    input: 'Your prompt'
  });
} catch (error: any) {
  if (error.status === 429 && error.message.includes('quota')) {
    // Notify user to upgrade plan
    console.log('Daily quota exceeded. Upgrade plan for higher limits.');
  }
}

Validation failure - inspect the errors object, fix the input, and retry:

try {
  const response = await client.responses.create({
    model: 'gpt-4o',
    input: '' // Empty input
  });
} catch (error: any) {
  if (error.status === 422) {
    console.log('Validation error:', error.errors);
    // Fix input and retry
  }
}