| name | openrouter-pricing-basics |
| description | Understand OpenRouter pricing and cost estimation. Use when budgeting or optimizing costs. Trigger with phrases like 'openrouter pricing', 'openrouter costs', 'openrouter budget', 'openrouter token pricing'. |
| allowed-tools | Read, Write, Edit, Grep |
| version | 1.0.0 |
| license | MIT |
| author | Jeremy Longshore <jeremy@intentsolutions.io> |
OpenRouter Pricing Basics
Overview
This skill explains the OpenRouter pricing model, how to estimate costs, and strategies for cost-effective model selection.
Prerequisites
- OpenRouter account
- Basic understanding of token-based pricing
Instructions
Follow these steps to implement this skill:
- Verify Prerequisites: Ensure all prerequisites listed above are met
- Review the Implementation: Study the code examples and patterns below
- Adapt to Your Environment: Modify configuration values for your setup
- Test the Integration: Run the verification steps to confirm functionality
- Monitor in Production: Set up appropriate logging and monitoring
Pricing Model
How Pricing Works
OpenRouter uses pay-per-token pricing:
- You pay for tokens used (prompt + completion)
- Different models have different prices
- No subscriptions or minimum commitments
- Pre-paid credits system
Token Basics
- 1 token ≈ 4 characters (English)
- 1 token ≈ 0.75 words

Examples:
- "Hello, how are you?" ≈ 6 tokens
- 1,000 words ≈ 1,333 tokens
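The rule of thumb above can be wrapped in a quick estimator for budgeting. This is a sketch: `estimate_tokens` is a hypothetical helper, and real counts depend on the model's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate from the ~4 characters/token heuristic.

    For budgeting only: real counts come from the model's tokenizer
    and vary between model families.
    """
    return max(1, round(len(text) / 4))

print(estimate_tokens("Hello, how are you?"))  # 5 by this heuristic; a real tokenizer says ~6
```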
Model Pricing Tiers
Budget Models (< $0.50/1M tokens)
- meta-llama/llama-3.1-8b-instruct: prompt $0.06/1M, completion $0.06/1M
- mistralai/mistral-7b-instruct: prompt $0.07/1M, completion $0.07/1M
- google/gemma-7b-it: prompt $0.07/1M, completion $0.07/1M
Standard Models ($0.50-3/1M tokens)
- anthropic/claude-3-haiku: prompt $0.25/1M, completion $1.25/1M
- openai/gpt-3.5-turbo: prompt $0.50/1M, completion $1.50/1M
- meta-llama/llama-3.1-70b-instruct: prompt $0.52/1M, completion $0.75/1M
Premium Models ($3-15/1M tokens)
- anthropic/claude-3.5-sonnet: prompt $3.00/1M, completion $15.00/1M
- openai/gpt-4-turbo: prompt $10.00/1M, completion $30.00/1M
- anthropic/claude-3-sonnet: prompt $3.00/1M, completion $15.00/1M
Enterprise Models ($15+/1M tokens)
- anthropic/claude-3-opus: prompt $15.00/1M, completion $75.00/1M
- openai/gpt-4: prompt $30.00/1M, completion $60.00/1M
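For programmatic cost work, the tier prices above can be collected into a lookup table. These are snapshot values from this document, not live rates; check openrouter.ai/models for current pricing.

```python
# Snapshot of the tier prices listed above (USD per 1M tokens).
# Check openrouter.ai/models for current rates before relying on these.
PRICING = {
    "meta-llama/llama-3.1-8b-instruct": {"prompt": 0.06, "completion": 0.06},
    "anthropic/claude-3-haiku": {"prompt": 0.25, "completion": 1.25},
    "openai/gpt-3.5-turbo": {"prompt": 0.50, "completion": 1.50},
    "anthropic/claude-3.5-sonnet": {"prompt": 3.00, "completion": 15.00},
    "openai/gpt-4-turbo": {"prompt": 10.00, "completion": 30.00},
    "anthropic/claude-3-opus": {"prompt": 15.00, "completion": 75.00},
}
```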
Cost Calculation
Formula
Total Cost = (Prompt Tokens × Prompt Price) + (Completion Tokens × Completion Price)
Example Calculations
```python
def calculate_cost(model_pricing, prompt_tokens, completion_tokens):
    prompt_cost = prompt_tokens * model_pricing["prompt"] / 1_000_000
    completion_cost = completion_tokens * model_pricing["completion"] / 1_000_000
    return prompt_cost + completion_cost

# GPT-4 Turbo example
gpt4_pricing = {"prompt": 10.00, "completion": 30.00}
cost = calculate_cost(gpt4_pricing, 1000, 500)
# = (1000 × $10/1M) + (500 × $30/1M)
# = $0.01 + $0.015
# = $0.025 per request

# Claude 3 Haiku example
haiku_pricing = {"prompt": 0.25, "completion": 1.25}
cost = calculate_cost(haiku_pricing, 1000, 500)
# = (1000 × $0.25/1M) + (500 × $1.25/1M)
# = $0.00025 + $0.000625
# = $0.000875 per request (~29x cheaper than GPT-4 Turbo)
```
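Per-request costs compound quickly at volume. A sketch of projecting monthly spend from a fixed per-request cost (`monthly_cost` is a hypothetical helper):

```python
def monthly_cost(cost_per_request: float, requests_per_day: int, days: int = 30) -> float:
    """Project a monthly bill from a fixed average per-request cost."""
    return cost_per_request * requests_per_day * days

# Claude 3 Haiku at the per-request cost computed above, 1,000 requests/day
print(f"${monthly_cost(0.000875, 1000):.2f}/month")  # $26.25/month
```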
Credit System
Adding Credits
1. Go to openrouter.ai/credits
2. Choose amount:
- $5 (starter)
- $20 (recommended)
- $100 (heavy usage)
- Custom amount
3. Pay with card
4. Credits available instantly
Credit Limits per Key
```
# Set per-key limits to control spending
# In dashboard: openrouter.ai/keys

Key: "development"   Limit: $10.00
Key: "production"    Limit: $100.00
Key: "testing"       Limit: $1.00
```
Checking Balance
```bash
curl https://openrouter.ai/api/v1/auth/key \
  -H "Authorization: Bearer $OPENROUTER_API_KEY"
```

Response:

```json
{
  "data": {
    "label": "my-key",
    "limit": 100.0,
    "usage": 23.45,
    "limit_remaining": 76.55
  }
}
```
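The same check can be scripted. A standard-library sketch; the `limit`/`usage` fields follow the sample response above, and treating a null `limit` as uncapped is an assumption:

```python
import json
import os
import urllib.request


def get_key_status() -> dict:
    """Fetch key limit/usage from GET /api/v1/auth/key."""
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/auth/key",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["data"]


def remaining_credit(data: dict) -> float:
    """Credits left on the key; assumes a null limit means no cap."""
    if data.get("limit") is None:
        return float("inf")
    return data["limit"] - data["usage"]
```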
Cost Optimization
Model Selection Strategy
Task type → recommended model → approximate cost:
- Simple Q&A: claude-3-haiku (~$0.001/request)
- Code completion: llama-3.1-70b (~$0.001/request)
- Complex analysis: gpt-4-turbo (~$0.03/request)
- Final review: claude-3-opus (~$0.10/request)
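One way to apply this mapping in code is a small router. A sketch; the task-type names and the cheap fallback are illustrative choices, not part of the OpenRouter API:

```python
# Route each task type to the cheapest model that handles it well
MODEL_BY_TASK = {
    "simple_qa": "anthropic/claude-3-haiku",
    "code_completion": "meta-llama/llama-3.1-70b-instruct",
    "complex_analysis": "openai/gpt-4-turbo",
    "final_review": "anthropic/claude-3-opus",
}


def pick_model(task_type: str) -> str:
    # Unknown task types fall back to the cheapest option
    return MODEL_BY_TASK.get(task_type, "anthropic/claude-3-haiku")
```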
Token Optimization
Reduce prompt tokens:
- Remove unnecessary context
- Use concise system prompts
- Summarize long conversations

Reduce completion tokens:
- Set an appropriate max_tokens
- Use stop sequences
- Request concise responses
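The "summarize long conversations" advice can be approximated more cheaply by trimming history to a character budget (about 4 characters per token). A sketch with a hypothetical `trim_history` helper:

```python
def trim_history(messages, max_chars=8000):
    """Keep the system prompt plus as many recent messages as fit.

    Uses characters as a cheap proxy for tokens (~4 chars/token);
    a summarizer could replace the dropped middle instead.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept, total = [], 0
    for m in reversed(rest):  # walk newest-to-oldest
        total += len(m["content"])
        if total > max_chars:
            break
        kept.append(m)
    return system + kept[::-1]  # restore chronological order
```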
Caching Strategy
```python
from functools import lru_cache

@lru_cache(maxsize=1000)
def cached_completion(prompt: str, model: str):
    # lru_cache keys on (prompt, model), so repeated identical
    # requests are served from memory instead of billed again
    return client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
```

Note: caching on a hash of the prompt alone would discard the text needed to make the API call, so the prompt itself serves as the cache key here.
Monitoring Costs
Per-Request Cost
```python
def get_request_cost(response, model_pricing):
    usage = response.usage
    return calculate_cost(
        model_pricing,
        usage.prompt_tokens,
        usage.completion_tokens,
    )

response = client.chat.completions.create(...)
cost = get_request_cost(response, {"prompt": 10, "completion": 30})
print(f"Request cost: ${cost:.6f}")
```
Dashboard Monitoring
openrouter.ai/activity shows:
- Per-request costs
- Daily/weekly/monthly totals
- Model breakdown
- Usage trends
Budget Alerts
Set up monitoring:
1. Per-key credit limits
2. Daily spend tracking
3. Alerts when approaching limits
4. Automatic key disable at limit
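Steps 2 and 3 can be sketched as a small in-process tracker (hypothetical helper; a real deployment would persist spend and reset it daily):

```python
class BudgetTracker:
    """Accumulate per-request costs and flag when a daily budget is near."""

    def __init__(self, daily_budget: float, warn_at: float = 0.8):
        self.daily_budget = daily_budget
        self.warn_at = warn_at  # warn at this fraction of the budget
        self.spent = 0.0

    def record(self, cost: float) -> str:
        self.spent += cost
        if self.spent >= self.daily_budget:
            return "over-budget"
        if self.spent >= self.warn_at * self.daily_budget:
            return "warning"
        return "ok"
```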
Cost Comparison Tool
```python
def compare_model_costs(prompt: str, models: list):
    """Compare costs across models for the same prompt."""
    results = []
    for model_id in models:
        response = client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=500,
        )
        # Get pricing from the models API
        model_info = get_model_info(model_id)
        cost = calculate_cost(
            model_info["pricing"],
            response.usage.prompt_tokens,
            response.usage.completion_tokens,
        )
        results.append({
            "model": model_id,
            "cost": cost,
            "tokens": response.usage.total_tokens,
        })
    return sorted(results, key=lambda x: x["cost"])

# Usage
models = [
    "anthropic/claude-3-haiku",
    "openai/gpt-3.5-turbo",
    "openai/gpt-4-turbo",
]
comparison = compare_model_costs("Explain recursion", models)
```
Billing FAQ
When are credits charged?
- Immediately after each successful request
- Failed requests are not charged
- Streaming requests are charged when the stream completes
What if I run out of credits?
- Requests return 402 Payment Required
- There is no advance warning before requests start failing
- Add credits to resume service
Refunds?
- Unused credits don't expire
- Used credits are not refundable
- Contact support for billing issues
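The 402 behavior above can be handled explicitly. A sketch that assumes the client SDK raises errors exposing a `status_code` attribute, as the OpenAI SDK's `APIStatusError` does:

```python
def safe_completion(client, **kwargs):
    """Wrap a completion call and surface 402 (out of credits) clearly.

    Assumes SDK errors carry a `status_code` attribute; adjust the
    check for whatever client library you actually use.
    """
    try:
        return client.chat.completions.create(**kwargs)
    except Exception as exc:
        if getattr(exc, "status_code", None) == 402:
            raise RuntimeError(
                "Out of OpenRouter credits: add more at openrouter.ai/credits"
            ) from exc
        raise
```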
Output
Successful execution produces:
- Working OpenRouter integration
- Verified API connectivity
- Example responses demonstrating functionality
Error Handling
Common errors and solutions:
- 401 Unauthorized: Check API key format (must start with sk-or-)
- 429 Rate Limited: Implement exponential backoff
- 500 Server Error: Retry with backoff; check the OpenRouter status page
- Model Not Found: Verify the model ID includes the provider prefix
Examples
See code examples in sections above for complete, runnable implementations.