| name | agent-native-architecture |
| description | Build AI agents on Cloudflare's edge using prompt-native architecture where features are defined in prompts, not code. Use when creating autonomous agents with Durable Objects, designing Workers services, implementing self-modifying systems, or adopting the "trust the agent's intelligence" philosophy on the edge. |
Agent native engineering inverts traditional software architecture. Instead of writing code that the agent executes, you define outcomes in prompts and let the agent figure out HOW to achieve them using Cloudflare's edge primitives.
The Foundational Principle
Whatever the user can do, the agent can do. Many things the developer can do, the agent can do.
Don't artificially limit the agent. If a user could read state, write to Durable Objects, queue messages, call Workers—the agent should be able to do those things too. The agent figures out HOW to achieve an outcome; it doesn't just call your pre-written functions.
Features Are Prompts
Each feature is a prompt that defines an outcome and gives the agent the tools it needs. The agent then figures out how to accomplish it.
Traditional: Feature = Worker function that agent calls Prompt-native: Feature = prompt defining desired outcome + primitive Cloudflare tools
The agent doesn't execute your code. It uses Cloudflare primitives to achieve outcomes you describe.
Tools Provide Capability, Not Behavior
Tools should be primitives that enable capability. The prompt defines what to do with that capability.
Wrong: process_workflow(data, steps, handlers) — agent executes your workflow
Right: durable_object_rpc, queue_send, kv_put, service_binding_call — agent figures out the flow
Pure primitives are better, but domain primitives (like store_agent_state) are OK if they don't encode logic—just storage/retrieval.
The Development Lifecycle
- Start in the prompt - New features begin as natural language defining outcomes
- Iterate rapidly - Change behavior by editing prose, not refactoring Workers code
- Graduate when stable - Harden to Workers code when requirements stabilize AND speed/reliability matter
- Many features stay as prompts - Not everything needs to become a Worker
Cloudflare Edge Architecture for Agents
The edge is ideal for agent-native systems:
Durable Objects = Agent State
- Each agent instance gets a Durable Object
- State persists between events
- Built-in coordination and consistency
Workers = Event Handlers
- Lightweight, event-driven execution
- Agent responds to HTTP, Queue, Cron events
- Fast cold starts enable on-demand agent activation
Queues = Agent Messaging
- Agents communicate via message passing
- Async workflows without blocking
- Guaranteed delivery and retry
Service Bindings = Agent Collaboration
- Zero-latency RPC between agents
- Type-safe inter-agent communication
- Compose multi-agent systems
Self-Modification (Advanced)
The advanced tier: agents that can evolve their own code, prompts, and behavior. On Cloudflare, this means:
- Approval gates for Worker code changes
- Git commit before modifications (rollback capability)
- Health checks via Logpush/Analytics
- Deploy verification before cutover
- Version-controlled Durable Object migrations
Context Management Strategies
Long-running agents face a fundamental challenge: context grows unbounded. Two key patterns from production systems:
Agentic Search Over Semantic Search
When retrieving information, prioritize accuracy over speed. Let the agent search iteratively rather than relying on semantic similarity alone.
- Semantic search: Fast but can miss relevant context due to embedding limitations
- Agentic search: Agent formulates queries, evaluates results, refines search—slower but more accurate
- On the edge: Use Workers KV for semantic embeddings, but let agent validate and search DO storage directly
The agent decides WHAT is relevant, not your embedding model.
Context Compaction Strategies
As agent context grows, implement tiered memory:
- Hot context (in prompt): Current task, immediate state
- Warm context (Durable Object storage): Recent history, working memory
- Cold context (KV/R2): Long-term memory, archived decisions
The agent decides when to compact: "Summarize your last 100 interactions and store the summary. Keep only the summary and last 10 interactions in working memory."
Verification Pattern Hierarchy
When agents make changes—deploying Workers, modifying state, updating configurations—verification is critical. Use this hierarchy (from most to least preferred):
1. Rules-Based Verification (PREFERRED)
Deterministic checks with clear pass/fail criteria. Fast, cheap, reliable.
- TypeScript type checking
- ESLint/Biome linting
- Wrangler validation (
wrangler deploy --dry-run) - JSON schema validation
- API contract testing
2. Visual Verification
For UI changes or layout-sensitive modifications. The agent takes screenshots and compares.
- Screenshot comparison for Tanstack Start components
- Responsive design verification across breakpoints
- Visual regression testing
- Cloudflare Browser Rendering for headless verification
3. LLM-as-Judge (LAST RESORT)
Only when criteria are subjective or fuzzy. Expensive, high latency, non-deterministic.
- Code readability assessment
- Documentation quality
- Natural language output correctness
- Tone/style consistency
The hierarchy is deliberate: prefer determinism. Let rules catch what they can, visual verification for what must be seen, LLM-as-judge only when nothing else works.
On Cloudflare: Rules-based verification runs in Workers (milliseconds), visual verification uses Browser Rendering (seconds), LLM-as-judge uses external API calls (seconds + cost).
When to Use Subagents
Subagents reduce memory overhead and enable parallel execution. Use them when:
Parallel Execution Needed
Multiple independent tasks can run simultaneously.
// Orchestrator agent spawns specialized subagents
const [analysis, generation, review] = await Promise.all([
env.ANALYZER.analyze(data), // Subagent
env.GENERATOR.generate(prompt), // Subagent
env.REVIEWER.review(content), // Subagent
]);
Each subagent maintains only the context it needs—no shared memory bloat.
Memory Optimization
Main agent context is growing too large. Offload specialized tasks:
- Main agent: High-level orchestration, 10KB context
- Data subagent: Process large datasets, isolated context
- Writer subagent: Generate content, no access to orchestration history
On Cloudflare: Each subagent gets its own Durable Object. Service Bindings enable zero-latency RPC between them.
Isolation Boundaries
When tasks should NOT share context (security, separation of concerns):
- User-facing agent shouldn't see internal admin operations
- Different tenants get isolated subagents
- Sensitive operations run in dedicated subagents with restricted tools
Anti-pattern: Don't create subagents for sequential steps. That's just overhead. Use subagents when parallelism or isolation provides real value.
When NOT to Use This Approach
- High-frequency operations - millions of requests per second per agent
- Deterministic requirements - exact same output every time
- Cost-sensitive scenarios - when API costs would be prohibitive vs CPU
- Cold start sensitive - though Cloudflare Workers are very fast
Event-Driven Agent with Workers
// Worker receives events, agent processes them
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const event = await request.json();
// Get agent's Durable Object
const agentId = env.AGENT_STATE.idFromName(event.agentId);
const agent = env.AGENT_STATE.get(agentId);
// Agent processes event
return agent.fetch(request);
},
async queue(batch: MessageBatch, env: Env): Promise<void> {
for (const message of batch.messages) {
// Agent responds to queued events
await processAgentEvent(message.body, env);
}
}
};
Agent State with Durable Objects
// Agent state persists in Durable Object
export class AgentState extends DurableObject {
async fetch(request: Request): Promise<Response> {
const event = await request.json();
// Load state
const state = await this.ctx.storage.get('agentState') || {};
// Agent decides what to do via prompt
const result = await runAgent({
systemPrompt: `You are a stateful agent. Your current state: ${JSON.stringify(state)}
Process the event and decide what to store.`,
event,
tools: [
tool("update_state", { key: z.string(), value: z.any() }),
tool("get_state", { key: z.string() }),
]
});
// State changes committed atomically
await this.ctx.storage.put('agentState', result.newState);
return Response.json(result);
}
}
Multi-Agent Collaboration via Service Bindings
// Agents call each other via RPC
export class OrchestratorAgent extends DurableObject {
async processTask(task: Task): Promise<Result> {
// Agent decides which specialized agent to call
const agentDecision = await runAgent({
systemPrompt: `You coordinate specialized agents via service bindings.
Available: ANALYZER_AGENT, WRITER_AGENT, REVIEWER_AGENT.
Decide which to call and what to send them.`,
task,
});
// Call specialized agent via binding
const result = await this.env.ANALYZER_AGENT.analyze(
agentDecision.data
);
return result;
}
}
Queue-Based Agent Workflows
// Agents communicate asynchronously via Queues
async function agentSendMessage(env: Env, recipientId: string, message: any) {
await env.AGENT_QUEUE.send({
recipientId,
message,
timestamp: Date.now(),
});
}
// Queue consumer activates receiving agent
export default {
async queue(batch: MessageBatch<AgentMessage>, env: Env) {
for (const msg of batch.messages) {
const agentId = env.AGENT_STATE.idFromName(msg.body.recipientId);
const agent = env.AGENT_STATE.get(agentId);
await agent.handleMessage(msg.body.message);
}
}
};
- Design edge architecture - Plan a Workers + Durable Objects agent system
- Create primitive tools - Build Cloudflare-native tools following the philosophy
- Write system prompts - Define agent behavior for edge execution
- Multi-agent systems - Design agent collaboration via Queues/Bindings
- Self-modification - Enable agents to safely evolve Workers code
- Review/refactor - Make existing Cloudflare code more prompt-native
Wait for response before proceeding.
Step 1: Define Cloudflare primitive tools
const tools = [
tool("durable_object_get", "Read from Durable Object storage",
{ key: z.string() }),
tool("durable_object_put", "Write to Durable Object storage",
{ key: z.string(), value: z.any() }),
tool("queue_send", "Send message to Queue",
{ queue: z.string(), message: z.any() }),
tool("kv_get", "Read from KV",
{ key: z.string() }),
tool("service_call", "Call another Worker via binding",
{ service: z.string(), method: z.string(), data: z.any() }),
];
Step 2: Write behavior in the system prompt
## Your Responsibilities
You are an agent running on Cloudflare's edge. When processing events:
1. Load your state from Durable Object storage
2. Analyze what actions are needed
3. Update state, send messages, or call other agents
4. Use your judgment about coordination patterns
Available tools:
- Durable Object storage for your state
- Queues for async messaging to other agents
- Service bindings for RPC to specialized agents
- KV for shared read-heavy data
You decide the workflow. Make it efficient.
Step 3: Deploy as Durable Object + Worker
export class Agent extends DurableObject {
async fetch(request: Request): Promise<Response> {
const event = await request.json();
const result = await query({
prompt: JSON.stringify(event),
options: {
systemPrompt,
tools,
}
});
return Response.json(result);
}
}
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const agentId = env.AGENT.idFromName("agent-1");
const agent = env.AGENT.get(agentId);
return agent.fetch(request);
}
};
See examples/ directory for detailed implementations:
Durable Objects State Management: examples/durable-objects-state.md
- Event-driven agent state with atomic storage
- State transitions based on agent decisions
- Rollback and versioning patterns
Workers Event-Driven Architecture: examples/workers-event-driven.md
- HTTP, Queue, and Cron triggered agents
- Multi-event source coordination
- Cold start optimization strategies
Queue-Based Messaging: examples/queue-messaging.md
- Agent-to-agent communication patterns
- Workflow orchestration via messages
- Error handling and retry strategies
Verification Patterns: examples/verification-patterns.md
- Rules-based verification (linting, type checking, validation)
- Visual verification (screenshots, Browser Rendering)
- LLM-as-judge patterns (when and how to use)
- Cloudflare Workers-specific verification strategies
THE CARDINAL SIN: Agent executes your Worker code instead of figuring things out
// WRONG - You wrote the workflow, agent just executes it
tool("process_order", async ({ order }) => {
const validated = await validateOrder(order); // Your code
const inventory = await checkInventory(order); // Your code
const charge = await processPayment(order); // Your code
await env.ORDERS_DO.put(order.id, { validated, charge }); // Your code
if (charge.amount > 1000) {
await env.NOTIFICATIONS_QUEUE.send({ type: 'high_value' }); // Your code
}
});
// RIGHT - Agent figures out how to process orders
tool("durable_object_get", { key }, ...); // Primitive
tool("durable_object_put", { key, value }, ...); // Primitive
tool("queue_send", { message }, ...); // Primitive
tool("service_call", { service, method, data }, ...); // Primitive
// Prompt says: "Validate order, check inventory via INVENTORY service binding,
// charge payment, store result, notify if > $1000"
Don't fight Cloudflare's execution model
// WRONG - trying to maintain long-running connections in Workers
const connections = new Map(); // This won't work across requests
// RIGHT - use Durable Objects for stateful connections
export class WebSocketAgent extends DurableObject {
async fetch(request: Request) {
const pair = new WebSocketPair();
this.ctx.acceptWebSocket(pair[1]);
return new Response(null, { status: 101, webSocket: pair[0] });
}
}
Don't over-complicate agent coordination
// Wrong - complex state synchronization
tool("sync_all_agents", async () => {
// Complex merge logic across multiple Durable Objects
});
// Right - let agents message each other via Queues
tool("queue_send", { recipientId, message }, ...);
// Agents figure out coordination via prompts
Don't encode environment-specific logic in tools
// Wrong - tool decides based on environment
tool("store_data", { data }, async ({ data }) => {
if (env.ENVIRONMENT === 'production') {
await env.PROD_DO.put(key, data);
} else {
await env.DEV_DO.put(key, data);
}
});
// Right - agent chooses based on context in prompt
tool("durable_object_put", { objectName, key, value }, ...);
// System prompt tells agent which DO to use
- The agent figures out HOW to achieve outcomes using Cloudflare primitives
- Whatever a user could do with Workers/DO/Queues, the agent can do
- Features are prompts that define outcomes, not Workers code that defines workflows
- Tools are Cloudflare primitives (DO storage, Queue send, Service RPC)
- Changing behavior means editing prose, not refactoring TypeScript
- The agent can surprise you with clever edge patterns you didn't anticipate
- You could add a feature by writing a prompt section, not new Workers code
- The system leverages Cloudflare's distributed nature (edge execution, DO consistency)
Durable Objects for Agent State
- One Durable Object per agent instance
- Use atomic storage transactions for state consistency
- Leverage SQLite storage API for complex queries
- Implement state versioning for rollback capability
Workers for Event Handling
- Keep Workers stateless, state goes in Durable Objects
- Use Service Bindings for zero-latency agent RPC
- Implement Circuit Breaker pattern for external calls
- Cache frequently used data in Workers KV
Queues for Async Coordination
- Use Queues for agent-to-agent messaging
- Implement idempotency for queue consumers
- Set appropriate retry policies per workflow
- Use Dead Letter Queues for error handling
Multi-Region Considerations
- Durable Objects provide global consistency
- Use KV replication for read-heavy shared data
- Consider location hints for latency-sensitive agents
- Design for eventual consistency across regions
Cost Optimization
- Batch operations in Durable Objects to reduce billable time
- Use KV for static/shared data to reduce DO reads
- Implement smart caching to minimize external API calls
- Monitor and optimize cold start patterns
Observability
- Use Logpush for agent decision logs
- Implement structured logging for agent actions
- Set up Analytics Engine for agent metrics
- Use Tail Workers for real-time debugging