Routing API

The core endpoint for intelligent LLM routing. Automatically selects the optimal model based on your request characteristics.

Endpoint

POST https://api.tokenrouter.com/route

Send your prompt and preferences to get an intelligent response from the optimal AI model.
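
As a quick illustration, here is a minimal sketch of calling the endpoint from Python with the requests library. The library choice, variable names, and placeholder key are ours, not part of the API.

import requests

# Placeholder API key; replace with your own TokenRouter key.
API_KEY = "tr_xxx..."

response = requests.post(
    "https://api.tokenrouter.com/route",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"prompt": "Explain quantum computing in simple terms"},
)
response.raise_for_status()

# The response follows the OpenAI chat completion shape (see Response Format below).
print(response.json()["choices"][0]["message"]["content"])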

Request Headers

Header | Value | Required | Description
Authorization | Bearer tr_xxx... | Required | Your TokenRouter API key
Content-Type | application/json | Required | Request content type

Request Parameters

Parameter | Type | Required | Description
prompt | string | Required | The input text to send to the AI model
model_preferences | array | Optional | Preferred models in order of preference (e.g., ["gpt-4", "claude-3-opus"])
temperature | number | Optional | Sampling temperature between 0 and 2. Default: 0.7
max_tokens | integer | Optional | Maximum number of tokens to generate. Default: 1000
tools | array | Optional | Function calling tools available to the model
tool_choice | string | Optional | How the model should use tools: "auto", "none", or a specific tool
response_format | object | Optional | Response format specification (e.g., {"type": "json_object"})
user_id | string | Optional | User identifier for tracking and analytics
cost_priority | string | Optional | Routing priority: "cost", "quality", "speed", or "balanced". Default: "balanced"

Request Examples

Basic Request
Simple prompt with default settings
{
  "prompt": "Explain quantum computing in simple terms",
  "temperature": 0.7,
  "max_tokens": 1000
}
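
The same payload can be sent from Python together with the optional routing parameters described in the table above. This is a sketch assuming the requests library; the values chosen for model_preferences and cost_priority are illustrative.

import requests

headers = {
    "Authorization": "Bearer tr_xxx...",  # placeholder API key
    "Content-Type": "application/json",
}

# Basic prompt plus optional routing hints from the request parameter table.
payload = {
    "prompt": "Explain quantum computing in simple terms",
    "temperature": 0.7,
    "max_tokens": 1000,
    "model_preferences": ["gpt-4", "claude-3-opus"],
    "cost_priority": "balanced",
}

response = requests.post("https://api.tokenrouter.com/route", headers=headers, json=payload)
response.raise_for_status()

data = response.json()
print(data["model"], data["usage"]["total_tokens"])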

Response Format

Responses follow the OpenAI chat completion format, with additional TokenRouter routing metadata returned in a tokenrouter field.

Success Response
HTTP 200 - Successful completion
{
  "id": "tr_req_abc123def456",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing is a revolutionary approach to computation..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 150,
    "total_tokens": 162
  },
  "tokenrouter": {
    "selected_provider": "openai",
    "selected_model": "gpt-4",
    "routing_reason": "optimal_quality_cost_ratio",
    "cost_savings": "23%",
    "response_time_ms": 1250,
    "estimated_cost": 0.0032,
    "routing_score": 0.87
  }
}

Response Fields

Standard OpenAI Fields

Field | Type | Description
id | string | Unique request identifier
object | string | Always "chat.completion"
created | integer | Unix timestamp of creation
model | string | The model that generated the response
choices | array | Array of completion choices
usage | object | Token usage statistics

TokenRouter Metadata

Field | Type | Description
selected_provider | string | Provider used (openai, anthropic, mistral, together)
selected_model | string | Specific model used for generation
routing_reason | string | Why this model was selected
cost_savings | string | Percentage saved vs. the most expensive option
response_time_ms | integer | Total response time in milliseconds
estimated_cost | number | Estimated cost in USD for this request
routing_score | number | Confidence score for the routing decision (0 to 1)
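
For example, this metadata can be pulled out of a parsed response for logging or cost tracking. A small sketch, assuming the body has already been decoded (e.g., with response.json()); the function name is ours.

def log_routing_metadata(data: dict) -> None:
    """Print TokenRouter routing metadata alongside standard usage fields."""
    meta = data.get("tokenrouter", {})
    usage = data.get("usage", {})
    print(
        f"provider={meta.get('selected_provider')} "
        f"model={meta.get('selected_model')} "
        f"reason={meta.get('routing_reason')} "
        f"cost_usd={meta.get('estimated_cost')} "
        f"savings={meta.get('cost_savings')} "
        f"latency_ms={meta.get('response_time_ms')} "
        f"total_tokens={usage.get('total_tokens')}"
    )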

Error Responses

400 Bad Request
Invalid request parameters
{
  "error": {
    "type": "invalid_request_error",
    "message": "Missing required parameter: prompt",
    "code": "missing_parameter",
    "param": "prompt"
  }
}
401 Unauthorized
Invalid API key
{
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key provided",
    "code": "invalid_api_key"
  }
}
429 Too Many Requests
Rate limit exceeded
{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Try again in 60 seconds",
    "code": "rate_limit_exceeded",
    "retry_after": 60
  }
}
503 Service Unavailable
All providers unavailable
{
  "error": {
    "type": "service_unavailable_error",
    "message": "All configured providers are currently unavailable",
    "code": "no_providers_available",
    "retry_after": 30
  }
}
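
Because both the 429 and 503 error bodies include a retry_after hint, clients can back off and retry. Below is a minimal sketch in Python using the requests library; the retry count, default wait, and function name are illustrative, not part of the API.

import time
import requests

def route_with_retry(payload: dict, api_key: str, max_attempts: int = 3) -> dict:
    """POST to /route, retrying on 429/503 using the server's retry_after hint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    for _ in range(max_attempts):
        response = requests.post(
            "https://api.tokenrouter.com/route", headers=headers, json=payload
        )
        if response.status_code in (429, 503):
            # Honor the retry_after field from the error body, defaulting to 30 seconds.
            wait = response.json().get("error", {}).get("retry_after", 30)
            time.sleep(wait)
            continue
        response.raise_for_status()  # raise on other 4xx/5xx errors
        return response.json()
    raise RuntimeError("Exhausted retries against /route")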