Routing API

The core endpoint for intelligent LLM routing. Automatically selects the optimal model based on your request characteristics.

Endpoint

POST https://api.tokenrouter.com/route

Send your prompt and preferences to get an intelligent response from the optimal AI model.
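
As a quick illustration, here is a minimal sketch of calling the endpoint from Python with the requests library. The library choice, variable names, and placeholder key are ours, not part of the API.

import requests

# Placeholder API key; replace with your own TokenRouter key.
API_KEY = "tr_xxx..."

response = requests.post(
    "https://api.tokenrouter.com/route",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"prompt": "Explain quantum computing in simple terms"},
)
response.raise_for_status()

# The response follows the OpenAI chat completion shape (see Response Format below).
print(response.json()["choices"][0]["message"]["content"])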

Request Headers

Header | Value | Required | Description
Authorization | Bearer tr_xxx... | Required | Your TokenRouter API key
Content-Type | application/json | Required | Request content type

Request Parameters

Parameter | Type | Required | Description
prompt | string | Required | The input text to send to the AI model
model_preferences | array | Optional | Preferred models in order of preference (e.g., ["gpt-4", "claude-3-opus"])
temperature | number | Optional | Sampling temperature between 0 and 2. Default: 0.7
max_tokens | integer | Optional | Maximum number of tokens to generate. Default: 1000
tools | array | Optional | Function calling tools available to the model
tool_choice | string | Optional | How the model should use tools: "auto", "none", or a specific tool
response_format | object | Optional | Response format specification (e.g., {"type": "json_object"})
user_id | string | Optional | User identifier for tracking and analytics
cost_priority | string | Optional | Routing priority: "cost", "quality", "speed", or "balanced". Default: "balanced"

Request Examples

Basic Request
Simple prompt with default settings
{
  "prompt": "Explain quantum computing in simple terms",
  "temperature": 0.7,
  "max_tokens": 1000
}
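
The same payload can be sent from Python together with the optional routing parameters described in the table above. This is a sketch assuming the requests library; the values chosen for model_preferences and cost_priority are illustrative.

import requests

headers = {
    "Authorization": "Bearer tr_xxx...",  # placeholder API key
    "Content-Type": "application/json",
}

# Basic prompt plus optional routing hints from the request parameter table.
payload = {
    "prompt": "Explain quantum computing in simple terms",
    "temperature": 0.7,
    "max_tokens": 1000,
    "model_preferences": ["gpt-4", "claude-3-opus"],
    "cost_priority": "balanced",
}

response = requests.post("https://api.tokenrouter.com/route", headers=headers, json=payload)
response.raise_for_status()

data = response.json()
print(data["model"], data["usage"]["total_tokens"])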

Response Format

Responses follow the OpenAI chat completion format, with additional TokenRouter routing metadata returned in a tokenrouter field.

Success Response
HTTP 200 - Successful completion
{
  "id": "tr_req_abc123def456",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a revolutionary approach to computation..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 150,
    "total_tokens": 162
  },
  "tokenrouter": {
    "selected_provider": "openai",
    "selected_model": "gpt-4",
    "routing_reason": "optimal_quality_cost_ratio",
    "cost_savings": "23%",
    "response_time_ms": 1250,
    "estimated_cost": 0.0032,
    "routing_score": 0.87
  }
}

Response Fields

Standard OpenAI Fields

Field | Type | Description
id | string | Unique request identifier
object | string | Always "chat.completion"
created | integer | Unix timestamp of creation
model | string | The model that generated the response
choices | array | Array of completion choices
usage | object | Token usage statistics

TokenRouter Metadata

Field | Type | Description
selected_provider | string | Provider used (openai, anthropic, mistral, together)
selected_model | string | Specific model used for generation
routing_reason | string | Why this model was selected
cost_savings | string | Percentage saved vs. the most expensive option
response_time_ms | integer | Total response time in milliseconds
estimated_cost | number | Estimated cost in USD for this request
routing_score | number | Confidence score for the routing decision (0 to 1)
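
For example, this metadata can be pulled out of a parsed response for logging or cost tracking. A small sketch, assuming the body has already been decoded (e.g., with response.json()); the function name is ours.

def log_routing_metadata(data: dict) -> None:
    """Print TokenRouter routing metadata alongside standard usage fields."""
    meta = data.get("tokenrouter", {})
    usage = data.get("usage", {})
    print(
        f"provider={meta.get('selected_provider')} "
        f"model={meta.get('selected_model')} "
        f"reason={meta.get('routing_reason')} "
        f"cost_usd={meta.get('estimated_cost')} "
        f"savings={meta.get('cost_savings')} "
        f"latency_ms={meta.get('response_time_ms')} "
        f"total_tokens={usage.get('total_tokens')}"
    )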

Error Responses

400 Bad Request
Invalid request parameters
{
  "error": {
    "type": "invalid_request_error",
    "message": "Missing required parameter: prompt",
    "code": "missing_parameter",
    "param": "prompt"
  }
}
401 Unauthorized
Invalid API key
{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key provided",
    "code": "invalid_api_key"
  }
}
429 Too Many Requests
Rate limit exceeded
{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Try again in 60 seconds",
    "code": "rate_limit_exceeded",
    "retry_after": 60
  }
}
503 Service Unavailable
All providers unavailable
{
  "error": {
    "type": "service_unavailable_error",
    "message": "All configured providers are currently unavailable",
    "code": "no_providers_available",
    "retry_after": 30
  }
}
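
Because both the 429 and 503 error bodies include a retry_after hint, clients can back off and retry. Below is a minimal sketch in Python using the requests library; the retry count, default wait, and function name are illustrative, not part of the API.

import time
import requests

def route_with_retry(payload: dict, api_key: str, max_attempts: int = 3) -> dict:
    """POST to /route, retrying on 429/503 using the server's retry_after hint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    for _ in range(max_attempts):
        response = requests.post(
            "https://api.tokenrouter.com/route", headers=headers, json=payload
        )
        if response.status_code in (429, 503):
            # Honor the retry_after field from the error body, defaulting to 30 seconds.
            wait = response.json().get("error", {}).get("retry_after", 30)
            time.sleep(wait)
            continue
        response.raise_for_status()  # raise on other 4xx/5xx errors
        return response.json()
    raise RuntimeError("Exhausted retries against /route")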