| name | testing-claude-plugins-with-python-sdk |
| description | Use when working with Claude Agent SDK (Python) - covers plugin loading, session management, streaming messages, monitoring long-running operations, and real-time progress tracking. Essential for testing Claude Code plugins, building autonomous agents, or programmatic Claude Code interaction. |
Testing Claude Plugins With Python SDK
Overview
Complete reference for the Claude Agent SDK (Python) covering plugin loading, session management, message streaming, and monitoring long-running operations.
Use this skill when:
- Loading Claude Code plugins programmatically
- Testing plugin functionality via SDK
- Building autonomous agents with SDK
- Monitoring long-running Claude operations
- Streaming real-time progress from Claude
- Managing sessions and resumption
Quick Start (5 Minutes)
1. Install SDK
pip install claude-agent-sdk
export ANTHROPIC_API_KEY="sk-ant-..."
2. Basic Query
import asyncio
from claude_agent_sdk import query

async def main():
    async for message in query(prompt="Hello Claude"):
        if message.type == 'assistant':
            print(message.content)

asyncio.run(main())
3. Load Plugin
from claude_agent_sdk import query, ClaudeAgentOptions
async for message in query(
prompt="/my-plugin:command",
options=ClaudeAgentOptions(
plugins=[{"type": "local", "path": "./my-plugin"}]
)
):
print(message)
Complete SDK Reference
1. Plugin Loading
Purpose: Load Claude Code plugins programmatically to test or extend functionality
Configuration:
from claude_agent_sdk import ClaudeAgentOptions
options = ClaudeAgentOptions(
plugins=[
{"type": "local", "path": "./my-plugin"}, # Relative path
{"type": "local", "path": "/absolute/path/plugin"} # Absolute path
]
)
Requirements:
- Path must point to the plugin root directory (the one containing .claude-plugin/plugin.json); a quick path check is sketched below
- Plugin must have a valid plugin.json manifest
- Can load multiple plugins simultaneously
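A quick pre-flight check catches a bad path before the SDK call; this is a minimal sketch, and the helper name is illustrative rather than part of the SDK:
from pathlib import Path

def looks_like_plugin(path: str) -> bool:
    """Sanity-check a path before passing it to ClaudeAgentOptions(plugins=...)."""
    manifest = Path(path) / ".claude-plugin" / "plugin.json"
    return manifest.is_file()

assert looks_like_plugin("./my-plugin"), "No .claude-plugin/plugin.json found"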
Verification:
async for message in query(prompt="test", options=options):
if message.type == 'system' and message.subtype == 'init':
print("Loaded plugins:", message.plugins)
print("Available commands:", message.slash_commands)
Example: Loading Shannon Framework plugin
options = ClaudeAgentOptions(
plugins=[{"type": "local", "path": "./shannon-plugin"}]
)
async for msg in query(prompt="/sh_spec 'Build React app'", options=options):
# Shannon plugin's /sh_spec command executes
pass
2. Session Management
Purpose: Resume conversations, maintain context across multiple queries
Starting Session:
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions
session_id = None
async with ClaudeSDKClient(options=options) as client:
await client.query("First question")
async for msg in client.receive_messages():
if msg.type == 'system' and msg.subtype == 'init':
session_id = msg.session_id # Capture session ID
print(f"Session: {session_id}")
Resuming Session:
# Continue same conversation
async with ClaudeSDKClient(options=options) as client:
await client.query(
"Follow-up question",
resume=session_id # Resume previous session
)
async for msg in client.receive_messages():
# Claude has full context from previous session
pass
Forking Session:
# Explore alternative without modifying original
await client.query(
"Try different approach",
resume=session_id,
fork_session=True # Create branch
)
Use Cases:
- Resume: Continue long-running work across restarts (persistence sketch below)
- Fork: Try alternatives without losing original path
- Testing: Create session, test behavior, resume to verify state
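For the Resume use case, the session ID has to outlive the process, so persist it between runs. A minimal sketch, assuming a local JSON file; the file name and helpers are illustrative, not SDK APIs:
import json
from pathlib import Path

SESSION_FILE = Path(".claude_session.json")  # illustrative location

def save_session(session_id: str) -> None:
    """Persist the captured session ID so a later run can resume it."""
    SESSION_FILE.write_text(json.dumps({"session_id": session_id}))

def load_session() -> str | None:
    """Return the last saved session ID, or None on a fresh start."""
    if SESSION_FILE.exists():
        return json.loads(SESSION_FILE.read_text()).get("session_id")
    return None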
3. Message Streaming and Types
Purpose: Monitor real-time progress, parse different message types
Message Types:
from claude_agent_sdk import (
SystemMessage, # Init, completion events
AssistantMessage, # Claude's responses
ToolCallMessage, # Tool execution requests
ToolResultMessage, # Tool execution results
ErrorMessage # Errors and failures
)
Streaming Pattern:
async for message in query(prompt="Long task", options=options):
if message.type == 'system':
# System events (init, completion)
if message.subtype == 'init':
print(f"Session: {message.session_id}")
print(f"Model: {message.model}")
elif message.subtype == 'completion':
print("Task complete")
elif message.type == 'assistant':
# Claude's text responses
print(f"Claude: {message.content}")
elif message.type == 'tool_call':
# Tool about to execute
print(f"Calling tool: {message.tool_name}")
print(f"Input: {message.input}")
elif message.type == 'tool_result':
# Tool completed
print(f"Tool {message.tool_name} completed")
print(f"Result: {message.result}")
elif message.type == 'error':
# Errors
print(f"Error: {message.error.message}")
Real-Time Monitoring:
import time
start_time = time.time()
tool_calls = []
async for message in query(prompt="Complex task", options=options):
elapsed = time.time() - start_time
if message.type == 'tool_call':
tool_calls.append({
'tool': message.tool_name,
'timestamp': elapsed,
'input': message.input
})
print(f"[{elapsed:.1f}s] Tool: {message.tool_name}")
elif message.type == 'assistant':
print(f"[{elapsed:.1f}s] Progress: {message.content[:80]}...")
print(f"Total time: {elapsed:.1f}s")
print(f"Tool calls: {len(tool_calls)}")
4. Long-Running Operations
Purpose: Monitor operations that take minutes/hours (builds, analyses, multi-step workflows)
Progress Tracking Pattern:
import asyncio
from datetime import datetime
async def monitor_long_operation(prompt: str, options: ClaudeAgentOptions):
"""
Monitor long-running operation with progress updates
"""
start = datetime.now()
messages_received = 0
tools_executed = 0
current_activity = "Starting..."
print(f"Started: {start.strftime('%H:%M:%S')}")
print("-" * 80)
async for message in query(prompt=prompt, options=options):
messages_received += 1
if message.type == 'tool_call':
tools_executed += 1
current_activity = f"Tool: {message.tool_name}"
elapsed = (datetime.now() - start).total_seconds()
print(f"[{elapsed:6.1f}s] {current_activity}")
elif message.type == 'assistant':
# Update current activity from message
preview = message.content[:60]
if preview:
current_activity = preview
elapsed = (datetime.now() - start).total_seconds()
print(f"[{elapsed:6.1f}s] {current_activity}...")
total_time = (datetime.now() - start).total_seconds()
print("-" * 80)
print(f"Completed: {datetime.now().strftime('%H:%M:%S')}")
print(f"Duration: {total_time/60:.1f} minutes")
print(f"Messages: {messages_received}")
print(f"Tools: {tools_executed}")
# Usage
await monitor_long_operation(
"/plugin:complex-command with long execution",
options
)
Timeout Handling:
async def with_timeout(prompt: str, timeout_seconds: int = 600):
"""Execute with timeout (default 10 minutes)"""
try:
async with asyncio.timeout(timeout_seconds):
async for msg in query(prompt=prompt, options=options):
# Process messages
pass
except asyncio.TimeoutError:
print(f"Operation timed out after {timeout_seconds}s")
# Handle timeout
5. Permissions and Safety
Purpose: Control what plugins can do, prevent dangerous operations
Permission Modes:
options = ClaudeAgentOptions(
    permission_mode="default"  # Prompt for confirmations
)
# Other modes:
#   permission_mode="acceptEdits"          # Auto-approve file edits
#   permission_mode="bypassPermissions"    # Skip all checks (dangerous)
Custom Permission Logic:
async def permission_callback(tool_name: str, tool_input: dict) -> dict:
"""
Custom permission logic for fine-grained control
Returns:
{"behavior": "allow"} - Allow operation
{"behavior": "deny", "message": "reason"} - Block operation
{"behavior": "ask", "message": "confirm?"} - Prompt user
"""
# Block dangerous bash commands
if tool_name == 'Bash':
command = tool_input.get('command', '')
if 'rm -rf' in command or 'dd if=' in command:
return {
"behavior": "deny",
"message": "Dangerous command blocked"
}
# Require confirmation for file writes to important files
if tool_name in ['Write', 'Edit']:
file_path = tool_input.get('file_path', '')
if 'config' in file_path.lower() or '.env' in file_path:
return {
"behavior": "ask",
"message": f"Modify {file_path}?"
}
# Allow everything else
return {"behavior": "allow"}
# Use custom permissions
options = ClaudeAgentOptions(
can_use_tool=permission_callback,
allowed_tools=["Read", "Write", "Bash"]
)
Restricting Tools:
# Whitelist approach (only these tools)
options = ClaudeAgentOptions(
allowed_tools=["Read", "Grep", "Glob"] # Read-only
)
# Blacklist approach (everything except these)
options = ClaudeAgentOptions(
disallowed_tools=["Bash"] # Everything except Bash
)
6. Parsing and Validating Output
Purpose: Extract structured data from Claude's text responses
Pattern Matching:
import re
def extract_value(text: str, pattern: str) -> str:
"""Extract value using regex pattern"""
match = re.search(pattern, text, re.IGNORECASE)
if match:
return match.group(1)
return None
# Example: Extract complexity score
output = "Complexity: 0.68 (COMPLEX)"
complexity = extract_value(output, r'Complexity:\s*([0-9.]+)')
# Returns: "0.68"
# Example: Extract all file paths
files = re.findall(r'Created:\s*(.+\.(?:py|js|md))', output)
# Returns: list of file paths
Structured Extraction:
def parse_plugin_output(messages: list) -> dict:
"""
Parse plugin command output into structured data
Args:
messages: List of assistant messages from SDK
Returns:
Extracted data as dict
"""
output = ''.join(msg.content for msg in messages if msg.type == 'assistant')
result = {}
# Extract key-value pairs
# Pattern: "Key: Value"
for line in output.split('\n'):
if ':' in line:
key, value = line.split(':', 1)
result[key.strip()] = value.strip()
# Extract lists
# Pattern: "- Item"
list_items = re.findall(r'^\s*-\s*(.+)$', output, re.MULTILINE)
if list_items:
result['items'] = list_items
return result
Validation Without Pytest:
def validate_output(output: str, expected: dict) -> bool:
"""
Validate output contains expected elements
Returns: True if all validations pass
"""
checks_passed = 0
checks_total = 0
# Check 1: Contains required text
for required in expected.get('contains', []):
checks_total += 1
if required in output:
print(f" ✅ Contains: '{required}'")
checks_passed += 1
else:
print(f" ❌ Missing: '{required}'")
# Check 2: Numeric ranges
for key, (min_val, max_val) in expected.get('ranges', {}).items():
checks_total += 1
value = extract_value(output, rf'{key}:\s*([0-9.]+)')
if value:
value = float(value)
if min_val <= value <= max_val:
print(f" ✅ {key}: {value} in [{min_val}, {max_val}]")
checks_passed += 1
else:
print(f" ❌ {key}: {value} outside [{min_val}, {max_val}]")
else:
print(f" ❌ {key}: not found in output")
print(f"\nValidation: {checks_passed}/{checks_total} checks passed")
return checks_passed == checks_total
# Usage
expected = {
'contains': ['Success', 'Complete'],
'ranges': {'Score': (0.5, 0.8), 'Count': (1, 10)}
}
valid = validate_output(output, expected)
7. Configuration Options
Complete ClaudeAgentOptions:
from claude_agent_sdk import ClaudeAgentOptions
from pathlib import Path
options = ClaudeAgentOptions(
# Model selection
model="claude-sonnet-4-5", # or "haiku", "opus", "inherit"
# Tools
allowed_tools=["Read", "Write", "Edit", "Bash"],
disallowed_tools=[], # Empty means allow all (unless allowed_tools set)
# Permissions
permission_mode="acceptEdits", # "default", "acceptEdits", "bypassPermissions"
can_use_tool=custom_permission_callback, # Optional callback
# Plugins
plugins=[
{"type": "local", "path": "./plugin1"},
{"type": "local", "path": "/abs/path/plugin2"}
],
# System prompt
system_prompt="You are a helpful assistant.",
# Or preset:
# system_prompt={"type": "preset", "preset": "claude_code", "append": "Extra instructions"},
# Working directory
cwd=str(Path("/path/to/project")),
# Budget limits
max_budget_usd=5.0, # Stop after spending $5
max_thinking_tokens=10000, # Limit extended thinking
max_turns=10, # Maximum conversation turns
# MCP servers
mcp_servers={
"custom": {
"command": "python",
"args": ["-m", "my_mcp_server"]
}
},
# Settings
setting_sources=["user", "project", "local"], # Load from filesystem
# Session
fork_session="previous_session_id", # Resume from session
# Streaming
include_partial_messages=True, # Get partial updates during streaming
)
8. Error Handling
Purpose: Handle SDK errors gracefully
Error Types:
from claude_agent_sdk import (
ClaudeSDKError, # Base exception
CLINotFoundError, # Claude Code CLI not installed
CLIConnectionError, # Connection failed
ProcessError, # Process execution failed
CLIJSONDecodeError, # JSON parsing failed
MessageParseError # Message format invalid
)
Error Handling Pattern:
import asyncio
async def safe_query(prompt: str, options: ClaudeAgentOptions):
"""
Execute query with comprehensive error handling
"""
try:
messages = []
async for msg in query(prompt=prompt, options=options):
messages.append(msg)
return messages
except CLINotFoundError:
print("❌ Claude Code CLI not installed")
print(" Install: npm install -g @anthropic-ai/claude-code")
return None
except CLIConnectionError as e:
print(f"❌ Connection failed: {e}")
print(" Check Claude Code is running")
return None
except ProcessError as e:
print(f"❌ Process failed (exit code: {e.exit_code})")
if e.stderr:
print(f" Error: {e.stderr}")
return None
except asyncio.TimeoutError:
print("❌ Operation timed out")
return None
except ClaudeSDKError as e:
print(f"❌ SDK error: {e}")
return None
# Usage with timeout
try:
async with asyncio.timeout(600): # 10 minute timeout
result = await safe_query(prompt, options)
except asyncio.TimeoutError:
print("Operation exceeded 10 minutes")
9. Monitoring and Progress Tracking
Purpose: Track progress of long-running operations in real-time
Progress Tracking Pattern:
class ProgressMonitor:
"""Monitor and report progress from SDK messages"""
def __init__(self):
self.start_time = time.time()
self.tool_count = 0
self.message_count = 0
self.current_activity = "Initializing..."
def update(self, message):
"""Update from SDK message"""
self.message_count += 1
elapsed = time.time() - self.start_time
if message.type == 'tool_call':
self.tool_count += 1
self.current_activity = f"Tool: {message.tool_name}"
self.report_progress(elapsed)
elif message.type == 'assistant' and message.content:
# Update from assistant message
preview = message.content[:60]
if preview and preview != self.current_activity:
self.current_activity = preview
self.report_progress(elapsed)
def report_progress(self, elapsed: float):
"""Print progress update"""
print(f"[{elapsed:6.1f}s] {self.current_activity}")
print(f" Tools: {self.tool_count} | Messages: {self.message_count}")
# Usage
monitor = ProgressMonitor()
async for message in query(prompt="Long analysis task", options=options):
monitor.update(message)
Checkpoint Pattern (for very long operations):
import json
import time

async def with_checkpoints(prompt: str, checkpoint_interval: int = 300):
"""
Monitor operation with periodic checkpoints
Args:
checkpoint_interval: Seconds between checkpoints (default 5 min)
"""
last_checkpoint = time.time()
checkpoint_data = []
async for message in query(prompt=prompt, options=options):
# Collect messages
checkpoint_data.append(message)
# Checkpoint every N seconds
if time.time() - last_checkpoint > checkpoint_interval:
# Save checkpoint
with open(f'checkpoint_{int(time.time())}.json', 'w') as f:
json.dump([m.dict() for m in checkpoint_data], f)
print(f"💾 Checkpoint saved ({len(checkpoint_data)} messages)")
last_checkpoint = time.time()
10. Testing Plugin Commands
Purpose: Validate plugin behavior programmatically
Basic Test Pattern:
async def test_plugin_command(
plugin_path: str,
command: str,
expected_patterns: list
) -> bool:
"""
Test plugin command execution
Args:
plugin_path: Path to plugin directory
command: Command to execute (e.g., "/plugin:cmd arg")
expected_patterns: List of patterns that should appear in output
Returns:
True if all patterns found
"""
print(f"Testing: {command}")
options = ClaudeAgentOptions(
plugins=[{"type": "local", "path": plugin_path}]
)
messages = []
async for msg in query(prompt=command, options=options):
if msg.type == 'assistant':
messages.append(msg.content)
output = ''.join(messages)
# Validate
checks_passed = 0
for pattern in expected_patterns:
if pattern in output:
print(f" ✅ Found: '{pattern}'")
checks_passed += 1
else:
print(f" ❌ Missing: '{pattern}'")
success = checks_passed == len(expected_patterns)
print(f" Result: {checks_passed}/{len(expected_patterns)} checks passed")
return success
# Usage
passed = await test_plugin_command(
plugin_path="./my-plugin",
command="/my-plugin:analyze",
expected_patterns=["Analysis complete", "Results:", "Success"]
)
Executable Test Script (no pytest):
#!/usr/bin/env python3
"""
Test plugin functionality
Run: python test_my_plugin.py
"""
import sys
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def main():
print("=" * 80)
print("PLUGIN TEST")
print("=" * 80)
# Test 1
print("\nTest 1: Basic command execution")
options = ClaudeAgentOptions(
plugins=[{"type": "local", "path": "./my-plugin"}]
)
messages = []
async for msg in query(prompt="/my-plugin:test", options=options):
if msg.type == 'assistant':
messages.append(msg.content)
output = ''.join(messages)
# Validate
if "expected text" in output:
print(" ✅ Test 1 passed")
test1_passed = True
else:
print(" ❌ Test 1 failed")
test1_passed = False
# Summary
print("\n" + "=" * 80)
if test1_passed:
print("✅ ALL TESTS PASSED")
return 0
else:
print("❌ TESTS FAILED")
return 1
if __name__ == '__main__':
exit_code = asyncio.run(main())
sys.exit(exit_code)
11. Capturing Tool Usage
Purpose: Monitor which tools plugins invoke, validate tool calls
Tool Tracking:
class ToolTracker:
"""Track all tool calls during operation"""
def __init__(self):
self.tools = []
def record(self, message):
"""Record tool call"""
if message.type == 'tool_call':
self.tools.append({
'tool': message.tool_name,
'input': message.input,
'timestamp': time.time()
})
elif message.type == 'tool_result':
# Match to previous tool_call and add result
if self.tools and self.tools[-1]['tool'] == message.tool_name:
self.tools[-1]['result'] = message.result
self.tools[-1]['success'] = True
def summary(self):
"""Print tool usage summary"""
print(f"Total tools called: {len(self.tools)}")
# Group by tool name
from collections import Counter
tool_counts = Counter(t['tool'] for t in self.tools)
print("Tool usage:")
for tool, count in tool_counts.most_common():
print(f" {tool}: {count}x")
return self.tools
# Usage
tracker = ToolTracker()
async for message in query(prompt="Complex task", options=options):
tracker.record(message)
tools_used = tracker.summary()
Validation: Verify expected tools were called
def validate_tools_used(tools: list, expected_tools: set) -> bool:
"""Validate expected tools were called"""
actual_tools = {t['tool'] for t in tools}
missing = expected_tools - actual_tools
unexpected = actual_tools - expected_tools
if not missing and not unexpected:
print(f"✅ Tools match expected: {expected_tools}")
return True
else:
if missing:
print(f"❌ Missing tools: {missing}")
if unexpected:
print(f"⚠️ Unexpected tools: {unexpected}")
return False
12. Working Directory and File Operations
Purpose: Control where plugin operates, validate file changes
Set Working Directory:
options = ClaudeAgentOptions(
cwd="/path/to/project", # All file operations relative to this
allowed_tools=["Read", "Write", "Edit"]
)
Validate File Changes:
import os
from pathlib import Path
def capture_file_state(directory: str) -> dict:
"""Capture file tree state"""
files = {}
for root, dirs, filenames in os.walk(directory):
for filename in filenames:
filepath = Path(root) / filename
files[str(filepath)] = filepath.stat().st_mtime
return files
# Before operation
before = capture_file_state("./project")
# Run plugin
async for msg in query(prompt="/plugin:create-files", options=options):
pass
# After operation
after = capture_file_state("./project")
# Compare
new_files = set(after.keys()) - set(before.keys())
modified = {f for f in before if f in after and after[f] != before[f]}
print(f"New files: {len(new_files)}")
for f in new_files:
print(f" + {f}")
print(f"Modified: {len(modified)}")
for f in modified:
print(f" ~ {f}")
13. Cost Tracking
Purpose: Monitor API costs during testing
Token Usage Tracking:
from claude_agent_sdk import ClaudeSDKClient, ClaudeAgentOptions
async with ClaudeSDKClient(options=options) as client:
await client.query("Analyze this project")
total_tokens = 0
async for msg in client.receive_messages():
# Token count in system completion message
if msg.type == 'system' and msg.subtype == 'completion':
if hasattr(msg, 'usage'):
total_tokens = msg.usage.get('total_tokens', 0)
print(f"Total tokens: {total_tokens:,}")
# Estimate cost (rough)
cost_per_million = 3.0 # $3 per million tokens for Sonnet
estimated_cost = (total_tokens / 1_000_000) * cost_per_million
print(f"Estimated cost: ${estimated_cost:.4f}")
Budget Control:
class BudgetController:
"""Track and limit spending during tests"""
def __init__(self, budget_usd: float = 10.0):
self.budget = budget_usd
self.spent = 0.0
def can_run(self, estimated_cost: float) -> bool:
"""Check if test can run within budget"""
if self.spent + estimated_cost > self.budget:
print(f"⊘ Budget exceeded: ${self.spent:.2f} / ${self.budget:.2f}")
return False
return True
def record(self, cost: float):
"""Record cost after test"""
self.spent += cost
print(f"💰 Cost: ${cost:.2f} | Total: ${self.spent:.2f} / ${self.budget:.2f}")
# Usage
budget = BudgetController(budget_usd=10.0)
if budget.can_run(estimated_cost=0.50):
# Run test
result = await run_test()
budget.record(0.45) # Actual cost
14. Subagents in SDK
Purpose: Define specialized agents for delegation
Basic Subagent Definition:
from claude_agent_sdk import AgentDefinition
options = ClaudeAgentOptions(
agents={
"analyzer": AgentDefinition(
description="Code analysis specialist",
prompt="You analyze code for issues. Be thorough.",
tools=["Read", "Grep"],
model="sonnet"
),
"builder": AgentDefinition(
description="Code implementation specialist",
prompt="You build features. Write clean code.",
tools=["Read", "Write", "Edit"],
model="sonnet"
)
}
)
# Main agent can delegate to subagents
async for msg in query(
prompt="Analyze then build feature X",
options=options
):
# Main agent may use Task tool to delegate to analyzer or builder
pass
Monitoring Subagent Activity:
async for message in query(prompt="Complex task", options=options):
if message.type == 'tool_call' and message.tool_name == 'Task':
# Subagent delegation happening
subagent = message.input.get('subagent_type')
print(f"🔄 Delegating to: {subagent}")
elif message.type == 'tool_result' and message.tool_name == 'Task':
# Subagent completed
print(f"✅ Subagent completed")
15. Common Patterns
Pattern 1: Test and Capture
async def test_and_capture(prompt: str, plugin_path: str) -> dict:
"""Execute plugin command and capture complete state"""
options = ClaudeAgentOptions(
plugins=[{"type": "local", "path": plugin_path}]
)
result = {
'messages': [],
'tools_called': [],
'duration_seconds': 0,
'success': False
}
start = time.time()
try:
async for msg in query(prompt=prompt, options=options):
if msg.type == 'assistant':
result['messages'].append(msg.content)
elif msg.type == 'tool_call':
result['tools_called'].append(msg.tool_name)
result['success'] = True
except Exception as e:
result['error'] = str(e)
result['duration_seconds'] = time.time() - start
return result
Pattern 2: Batch Testing
async def batch_test(test_cases: list) -> dict:
"""Run multiple tests and aggregate results"""
results = {
'total': len(test_cases),
'passed': 0,
'failed': 0,
'tests': []
}
for test in test_cases:
print(f"\nRunning: {test['name']}")
passed = await test_plugin_command(
plugin_path=test['plugin'],
command=test['command'],
expected_patterns=test['expected']
)
results['tests'].append({
'name': test['name'],
'passed': passed
})
if passed:
results['passed'] += 1
else:
results['failed'] += 1
print(f"\n{'='*80}")
print(f"Summary: {results['passed']}/{results['total']} passed")
print(f"{'='*80}")
return results
Pattern 3: Iterative Testing
def test_fix_retest_loop(test_fn, max_iterations: int = 5):
"""
Run test, identify failures, allow fixes, retest
Use for plugin development/debugging
"""
for iteration in range(1, max_iterations + 1):
print(f"\n{'='*80}")
print(f"Iteration {iteration}")
print(f"{'='*80}")
passed = asyncio.run(test_fn())
if passed:
print(f"\n✅ Tests passed on iteration {iteration}")
return True
else:
print(f"\n❌ Tests failed on iteration {iteration}")
if iteration < max_iterations:
input("\nFix the plugin, then press Enter to retest...")
else:
print(f"\nMax iterations ({max_iterations}) reached")
return False
16. Real-World Examples
Example 1: Test Plugin Loads Correctly
#!/usr/bin/env python3
import sys
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
async def test_plugin_loads():
"""Verify plugin loads without errors"""
print("Test: Plugin loads correctly")
options = ClaudeAgentOptions(
plugins=[{"type": "local", "path": "./my-plugin"}]
)
# Simple query to trigger plugin loading
async for msg in query(prompt="hello", options=options):
if msg.type == 'system' and msg.subtype == 'init':
if msg.plugins and len(msg.plugins) > 0:
print(f"✅ Plugin loaded: {msg.plugins}")
return 0
print("❌ Plugin did not load")
return 1
if __name__ == '__main__':
sys.exit(asyncio.run(test_plugin_loads()))
Example 2: Test Command Produces Expected Output
async def test_command_output():
"""Test plugin command produces expected results"""
options = ClaudeAgentOptions(
plugins=[{"type": "local", "path": "./my-plugin"}]
)
# Execute command
messages = []
async for msg in query(prompt="/my-plugin:analyze test.txt", options=options):
if msg.type == 'assistant':
messages.append(msg.content)
output = ''.join(messages)
# Validate output
required_elements = [
"Analysis complete",
"Lines:",
"Characters:"
]
checks_passed = sum(1 for elem in required_elements if elem in output)
if checks_passed == len(required_elements):
print(f"✅ All {len(required_elements)} checks passed")
return 0
else:
print(f"❌ Only {checks_passed}/{len(required_elements)} checks passed")
print(f"Output: {output[:200]}...")
return 1
Example 3: Monitor Long-Running Plugin Operation
async def test_long_operation():
"""Monitor plugin through long operation"""
import time
start = time.time()
tool_count = 0
print("Starting long operation...")
async for msg in query(
prompt="/plugin:build-project spec.md",
options=ClaudeAgentOptions(
plugins=[{"type": "local", "path": "./plugin"}]
)
):
elapsed = time.time() - start
if msg.type == 'tool_call':
tool_count += 1
print(f"[{elapsed:6.1f}s] Tool {tool_count}: {msg.tool_name}")
elif msg.type == 'assistant':
preview = msg.content[:60]
if preview:
print(f"[{elapsed:6.1f}s] {preview}...")
total_time = time.time() - start
print(f"\nCompleted in {total_time/60:.1f} minutes")
print(f"Tools used: {tool_count}")
Common Issues
Issue 1: Plugin Not Found
Error: Plugin not loaded
Solution: Verify path points to directory containing .claude-plugin/plugin.json
Issue 2: Command Not Recognized
Error: Unknown command /plugin:cmd
Solution: Use correct namespace format: plugin-name:command-name
Issue 3: Permission Denied
Error: Permission denied for tool
Solution: Check permission_mode or add tool to allowed_tools
Issue 4: Timeout on Long Operations
Error: asyncio.TimeoutError
Solution: Wrap in asyncio.timeout() with appropriate duration
Issue 5: Can't Extract Data from Output
Error: Pattern not found
Solution: Print full output, adjust regex patterns, handle format variations
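When output formats drift, trying several patterns in order is more robust than one regex. A minimal sketch building on the extraction helpers from section 6; the function name is illustrative:
import re

def extract_first(output: str, patterns: list[str]) -> str | None:
    """Return the first regex capture that matches, tolerating small format changes."""
    for pattern in patterns:
        match = re.search(pattern, output, re.IGNORECASE)
        if match:
            return match.group(1)
    return None

output = "Score = 0.68 (estimated)"
score = extract_first(output, [r'Score:\s*([0-9.]+)', r'Score\s*=\s*([0-9.]+)'])
# score == "0.68"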
SDK vs CLI Differences
SDK Advantages:
- ✅ Programmatic control (loops, conditions, automation)
- ✅ Capture and parse all messages
- ✅ Monitor real-time progress
- ✅ Automated testing possible
- ✅ Integration into pipelines
SDK Limitations:
- ❌ Some hooks may not fire (hooks are Claude Code-specific)
- ❌ Different session model than interactive Claude Code
- ❌ Text parsing required (no structured JSON API)
Quick Reference
| Task | Code Pattern |
|---|---|
| Load plugin | options=ClaudeAgentOptions(plugins=[{"type": "local", "path": "./plugin"}]) |
| Execute command | async for msg in query(prompt="/plugin:cmd", options=options) |
| Monitor progress | Track tool_call and assistant messages with timestamps |
| Validate output | Regex patterns, string contains checks |
| Handle errors | Try/except with SDK-specific exceptions |
| Track tools | Monitor tool_call and tool_result messages |
| Timeout | async with asyncio.timeout(seconds) |
| Resume session | await client.query("...", resume=session_id) |
Dependencies
Required:
- claude-agent-sdk (Python package)
- Claude Code CLI (required by the SDK)
- ANTHROPIC_API_KEY environment variable
Optional:
- asyncio (standard library, for timeout/gather)
- re (standard library, for pattern matching)
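A quick environment check before running tests confirms both dependencies are present. Minimal sketch; it assumes the CLI binary is named claude and is on PATH, which may vary by install:
import os
import shutil

print("ANTHROPIC_API_KEY set:", bool(os.environ.get("ANTHROPIC_API_KEY")))
print("Claude Code CLI on PATH:", shutil.which("claude") is not None)  # binary name is an assumption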
References
- Official SDK Docs: https://docs.claude.com/en/docs/agent-sdk/python
- Plugin Loading: https://docs.claude.com/en/docs/agent-sdk/plugins
- Session Management: https://docs.claude.com/en/docs/agent-sdk/sessions
- Python API Reference: https://docs.claude.com/en/docs/agent-sdk/python
This skill provides general SDK knowledge. For specific testing scenarios, combine with domain knowledge and validation requirements.