# Claude API Development

Source: https://platform.claude.com/docs

This skill provides a quick reference for building with the Claude API, including the Messages API, tool use, the Agent SDK, and prompt engineering best practices.
## Best Practices

### API Basics

- Always use the official SDKs (`anthropic` for Python, `@anthropic-ai/sdk` for TypeScript)
- Set reasonable timeouts - Claude can take time on complex tasks, especially with extended thinking
- Handle rate limits gracefully - implement exponential backoff with jitter
- Use streaming for long responses - better UX, and it allows early termination
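The backoff-with-jitter advice above can be sketched as a small helper; the function name and parameters are illustrative, not part of the SDK:

```python
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: a random delay in
    [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would sleep for `backoff_delay(attempt)` after each rate-limit error (e.g. `anthropic.RateLimitError`), giving up after a fixed number of attempts.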
### Model Selection

| Model | Use Case | Notes |
|---|---|---|
| claude-sonnet-4-20250514 | Default for most tasks | Best balance of speed/quality |
| claude-haiku-3-5-20241022 | High-volume, simple tasks | Fastest, lowest cost |
| claude-opus-4-20250514 | Complex reasoning, coding | Highest quality, slower |
### Prompt Engineering

- Use XML tags to structure inputs: `<document>`, `<instructions>`, `<examples>`
- Put instructions at the end when working with long documents
- Use system prompts for persistent context and persona
- Prefill the assistant response to guide output format
- Multi-shot examples improve consistency for structured tasks
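A minimal sketch of the tagging and instructions-last conventions above (the helper name is hypothetical):

```python
def build_prompt(document: str, instructions: str) -> str:
    # Wrap each input in an XML tag so Claude can tell them apart;
    # for long documents, the instructions go after the document.
    return (
        f"<document>\n{document}\n</document>\n\n"
        f"<instructions>\n{instructions}\n</instructions>"
    )
```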
### Extended Thinking

```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000,  # Tokens for reasoning
    },
    messages=[{"role": "user", "content": prompt}],
)
```
- Enable for complex reasoning, math, and coding tasks
- Set `budget_tokens` based on task complexity
- Access thinking via `response.content` blocks with `type="thinking"`
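A sketch of separating the reasoning from the visible answer. Content blocks are modeled here as plain dicts for illustration; the real SDK returns typed block objects with the same `type` discriminator:

```python
def split_blocks(content):
    """Separate thinking blocks from text blocks in a response's content list."""
    thinking = [b["thinking"] for b in content if b["type"] == "thinking"]
    text = [b["text"] for b in content if b["type"] == "text"]
    return thinking, text
```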
### Tool Use

```python
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a location",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"}
        },
        "required": ["location"]
    }
}]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Boston?"}]
)
```
- Write clear tool descriptions - Claude uses these to decide when to call
- Use JSON Schema for input validation
- Handle `tool_use` blocks and return `tool_result`
- Consider parallel tool calls for independent operations
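The `tool_result` half of the loop can be sketched as a message builder (the helper name is illustrative; the dict shape follows the Messages API):

```python
def tool_result_message(tool_use_id: str, result: str) -> dict:
    # After executing the tool named in a tool_use block, send the result
    # back as a user turn referencing that block's id.
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result,
        }],
    }
```

Appending this message to the conversation and calling `messages.create` again lets Claude continue with the tool's output.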
### Structured Outputs

```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Analyze this text..."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "analysis",
            "schema": {
                "type": "object",
                "properties": {
                    "summary": {"type": "string"},
                    "sentiment": {"enum": ["positive", "negative", "neutral"]}
                },
                "required": ["summary", "sentiment"]
            }
        }
    }
)
```
### Prompt Caching

```python
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": large_system_prompt,
        "cache_control": {"type": "ephemeral"},  # Cache this block
    }],
    messages=[{"role": "user", "content": query}],
)
```
- Cache large, reusable context (system prompts, documents)
- Minimum 1024 tokens for caching to activate
- 90% cost reduction on cache hits
- 5-minute TTL (or 1-hour with extended caching)
### Cost Optimization

- Use prompt caching for repeated context
- Use the batch API for async workloads (50% discount)
- Choose the appropriate model - don't use Opus for simple tasks
- Set a reasonable `max_tokens` - don't over-allocate
- Stream and cancel early when you have enough output
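One way to sketch a batch submission: each request carries a `custom_id` plus ordinary Messages params. The builder below is illustrative; the resulting list would be passed to the Message Batches endpoint (e.g. `client.messages.batches.create(requests=[...])` in the Python SDK):

```python
def batch_request(custom_id: str, prompt: str,
                  model: str = "claude-sonnet-4-20250514",
                  max_tokens: int = 1024) -> dict:
    """Build one entry for a Message Batches request list."""
    return {
        "custom_id": custom_id,
        "params": {
            "model": model,
            "max_tokens": max_tokens,
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```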
## Quick Reference

### Python SDK

```python
from anthropic import Anthropic

client = Anthropic()  # Uses ANTHROPIC_API_KEY env var

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello, Claude"}
    ]
)
print(response.content[0].text)
```
### TypeScript SDK

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Hello, Claude" }
  ]
});
console.log(response.content[0].text);
```
### Streaming

```python
with client.messages.stream(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
## Documentation Index

Detailed official documentation is synced to `resources/`. Consult it for specifics beyond this quick reference.

- Getting Started
- Models
- Core Features
- Multimodal
- Cost Optimization
- Prompt Engineering
- Tool Use
- Agent Skills
- MCP Integration
- Agent SDK
- Testing & Guardrails
- API Reference
- Cloud Providers
## Syncing Documentation

Resources are synced from the official Claude API documentation. To update:

```shell
cd skills/claude-api
bun run scripts/sync-docs.ts
```