| name | ai-dev-orchestration |
| version | 2.1.0 |
| description | Meta-orchestrator for AI-assisted app development with behavioral guardrails and prompt templates. 5-phase SOP - Product Framing (planner) -> Setup & Foundations (system-architect) -> Feature Development Loop (coder + tester + reviewer) -> Testing & Refactors (tester + coder) -> Deployment (cicd-engineer). Wraps three-loop system with AI-specific safety patterns. |
| category | orchestration |
| agents | planner, system-architect, coder, tester, reviewer, cicd-engineer |
| mcp_tools | memory-mcp |
| artifacts | Feature-Prompt-Framework.md, Foundation-Prompt-Template.md, Compliance-Checklist.md, Do-Not-Touch-Template.md |
AI-Assisted App Development Orchestration
Pre-Phase: Expertise Loading
Before orchestrating AI development:
- Detect Domain: Identify app type and tech stack
- Check Expertise: Look for .claude/expertise/ai-dev-${stack}.yaml
- Load Context: If the file exists, load guardrail patterns and successful prompts
- Apply Configuration: Use expertise for development orchestration
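This pre-check can be sketched as a small pre-flight helper. The expertise file path pattern comes from this SOP; the `loadExpertise` helper, the `yaml` package, and the example stack name are illustrative assumptions, not a fixed API of this skill.

```javascript
// Pre-flight expertise check; path pattern from this SOP, helper names are illustrative.
import { existsSync, readFileSync } from "node:fs";
import { parse } from "yaml"; // assumes the 'yaml' package is available

function loadExpertise(stack) {
  const path = `.claude/expertise/ai-dev-${stack}.yaml`;
  if (!existsSync(path)) return null; // no expertise file: fall back to default orchestration
  return parse(readFileSync(path, "utf8")); // guardrail patterns + successful prompts
}

const expertise = loadExpertise("nextjs-supabase"); // hypothetical stack identifier
if (expertise) {
  // Apply configuration: feed guardrails and prompt patterns into the phases below
}
```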
Purpose: Ship reliable apps with AI coding tools while minimizing bugs, rework, and chaos
Core Principle: Treat AI as a junior dev with superpowers inside a structured pipeline. Real leverage comes from tight specs, clean context, and strong foundations.
Timeline: Varies by complexity (Product Framing 30-60min, Foundations 1-2hrs, Per-Feature 1-3hrs, Testing 30-90min, Deployment 15-30min)
Integration: Wraps feature-dev-complete, sparc-methodology, cicd-intelligent-recovery with AI-specific guardrails
System Architecture
[User Product Idea]
↓
[Phase 0: Product Framing] (planner) - OPTIONAL
↓ (App One-Pager, Persona, Validation)
↓
[Phase 1: Setup & Foundations] (system-architect)
↓ (Stack selection, MVP definition, Foundation implementation)
↓
[Phase 2: Feature Development Loop] (coder + tester + reviewer) - ITERATIVE
↓ (Per-feature: Plan → Implement → Test → Accept/Rollback)
↓ (Fresh context per feature, Do Not Touch lists, Manual testing)
↓
[Phase 3: Testing & Refactors] (tester + coder)
↓ (Bug fixes, Refactoring with scope limits)
↓
[Phase 4: Deployment] (cicd-engineer)
↓ (Staging/Production, Monitoring, Health checks)
↓
[Memory-MCP Storage] (with WHO/WHEN/PROJECT/WHY tags)
When to Use This Skill
Activate this skill when:
- Building apps with AI coding tools (Cursor, Claude Code, Lovable, Bolt)
- Need to prevent AI coding chaos and "theater implementations"
- Want systematic approach to AI-assisted development
- Building greenfield projects (web, mobile, internal tools)
- Small teams or solo builders using AI agents
- Need AI-safe guardrails (scope limiting, context management, testing gates)
DO NOT use this skill for:
- Traditional development without AI assistance (use feature-dev-complete)
- Quick scripts or throwaway code (too much process)
- Well-understood repetitive tasks (use existing patterns)
- Emergency hotfixes (skip structured workflow)
Guiding Principles (from second-order insights)
- Spec > Code: Quality depends more on how well you specify than how fast you type
- Foundations First: Mini-waterfall for architecture, agile for features
- Ephemeral Context: Fresh chat per feature; persistent knowledge in code/docs, not chat history
- Guardrails Over Brute Force: Constrain what AI can touch; use "do not change X" aggressively
- Small, Tested Steps: One feature at a time, each fully tested before moving on
- Human Product Judgment: AI can simulate validation, but real users validate markets
- AI is Factory, You are Orchestrator: Your job is design specs, run pipeline, decide pass/fail
Input Contract
input:
product:
name: string (required)
description: string (required) # 1-2 sentence "who + what outcome"
target_user: string (optional) # Narrow customer segment
differentiators: array[string] (optional) # How you differ from competitors
requirements:
must_have_features: array[string] (1-2 core features for MVP)
nice_to_have: array[string]
constraints:
technical: array[string] # Required stack, frameworks
timeline: string
budget: string
options:
run_product_framing: boolean (default: true) # Skip if already validated
auto_test: boolean (default: true) # Run tests after each feature
manual_review: boolean (default: true) # Human approval gates
deployment_target: enum[staging, production] (default: staging)
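A minimal example input conforming to the contract above, with values borrowed from the TaskMaster example used elsewhere in this document (timeline and budget are illustrative placeholders):

```javascript
const input = {
  product: {
    name: "TaskMaster",
    description: "A mobile to-do app to reduce mental load for busy parents of toddlers",
    target_user: "Working parents with toddlers ages 1-4",
    differentiators: ["Voice capture", "Collaborative lists for co-parents"],
  },
  requirements: {
    must_have_features: ["Voice task capture", "Shared family lists"],
    nice_to_have: ["AI-suggested tasks"],
  },
  constraints: {
    technical: ["Next.js", "Supabase"],
    timeline: "2 weeks",       // illustrative
    budget: "minimal",         // illustrative
  },
  options: {
    run_product_framing: true,
    auto_test: true,
    manual_review: true,
    deployment_target: "staging",
  },
};
```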
Output Contract
output:
product_frame: # Phase 0 output
app_one_pager: path # docs/app-one-pager.md
persona_snapshot: object
validation_summary: string
monetization_hypothesis: string
foundations: # Phase 1 output
tech_stack: object
frontend: string
backend: string
database: string
auth: string
payments: string (if applicable)
deployment: string
mvp_checklist: array[object]
task: string
status: enum[pending, in_progress, completed]
repo_url: string
base_app_running: boolean
features: # Phase 2 output (per feature)
feature_specs: array[path] # docs/features/*.md
implementations: array[object]
feature_id: string
files_changed: array[string]
test_coverage: number (percentage)
status: enum[accepted, rejected, rolled_back]
total_features_completed: number
quality: # Phase 3 output
bugs_fixed: number
refactorings_applied: number
test_coverage_overall: number
deployment: # Phase 4 output
deployment_url: string
monitoring_dashboard: string
health_check_url: string
integration:
memory_namespace: string # ai-dev/{project-name}
compliance_rate: number # % of features following framework
SOP Phase 0: Product Framing & Validation (30-60 min) - OPTIONAL
Objective: Define problem, user journey, constraints, and success criteria BEFORE coding
Agent: planner
Prompt:
await Task("Product Frame Architect", `
Create comprehensive product frame for AI-assisted app development.
Product Idea: <product-description>
Generate App One-Pager with structure:
## App One-Pager
### Overview
- **Name**: <product-name>
- **Target User**: <narrow customer segment>
Example: "Working parents with toddlers ages 1-4"
NOT: "Busy people" (too broad)
- **Core Problem + Outcome**: <what + for whom + result>
Example: "A mobile to-do app to reduce mental load for busy parents of toddlers"
- **Magic Elements**: <1-2 differentiators>
Example: "Voice capture, Collaborative lists for co-parents"
### Persona Snapshot
- **Demographics**: <age, location, job, family>
- **Daily Routine**: <typical day>
- **Main Goals**: <what they're trying to achieve>
- **Pain Points (3-5)**:
1. <recurring problem 1>
2. <recurring problem 2>
...
### Lightweight Validation
- **Communities**: <2-3 subreddits/forums/discords where problem is discussed>
Examples: r/Parenting, r/toddlers, working-parents Discord
- **Competitors (3-5)**:
1. <competitor-name>: <their approach>
**How we differ**: <key difference>
2. ...
- **Differentiation Summary**: <1-2 sentences on unique value>
### Initial Pricing Hypothesis
- **Free Tier**: <what's included>
- **Pro Tier**: <$X/month> - <additional features>
- **Annual**: <$Y/year> - <discount %>
### Success Criteria (Measurable)
1. <outcome metric 1>
2. <outcome metric 2>
3. <outcome metric 3>
Save to: docs/app-one-pager.md
Store in Memory-MCP with WHO/WHEN/PROJECT/WHY tagging.
`, "planner");
MCP Requirements
This skill requires the following MCP servers for optimal functionality:
memory-mcp (6.0k tokens)
Purpose: Store product frames, feature specs, implementation decisions, and compliance metrics across all 5 phases for cross-session persistence.
Tools Used:
mcp__memory-mcp__memory_store: Store product frames, feature specifications, bug logs, deployment configs
mcp__memory-mcp__vector_search: Retrieve similar feature patterns, past bug fixes, architectural decisions
Activation (PowerShell):
# Check if already active
claude mcp list
# Add if not present
claude mcp add memory-mcp node C:\Users\17175\memory-mcp\build\index.js
Usage Example:
// Phase 0: Store product frame
await mcp__memory-mcp__memory_store({
text: `Product frame complete for ${product_name}. Target: ${target_user}. Core problem: ${problem}. Differentiators: ${differentiators}. Validation: ${communities} communities, ${competitors_count} competitors analyzed. Pricing: ${pricing_hypothesis}`,
metadata: {
key: `ai-dev-orchestration/${project_name}/product-frame`,
namespace: "ai-dev/product-framing",
layer: "long-term",
category: "product-planning",
tags: {
WHO: "ai-dev-orchestration",
WHEN: new Date().toISOString(),
PROJECT: project_name,
WHY: "product-framing"
}
}
});
// Phase 2: Store feature completion
await mcp__memory-mcp__memory_store({
text: `Feature ${feature_name} completed. User journey: ${journey_summary}. Implementation: ${files_changed}. Tests: ${test_results}. Bugs: ${bugs_found}. Status: ACCEPTED`,
metadata: {
key: `ai-dev-orchestration/${project_name}/features/${feature_id}`,
namespace: "ai-dev/features",
layer: "mid-term",
category: "feature-implementation",
tags: {
WHO: "ai-dev-orchestration",
WHEN: new Date().toISOString(),
PROJECT: project_name,
WHY: "feature-completion"
}
}
});
// Retrieve similar bug fix patterns
const similarBugs = await mcp__memory-mcp__vector_search({
query: `Bug fix for ${error_type} in ${component}`,
limit: 5
});
Token Cost: 6.0k tokens (3.0% of 200k context)
When to Load: Always - required for all 5 phases (product framing, foundations, features, testing, deployment)
MCP Tools Used:
mcp__memory-mcp__memory_store
Success Criteria:
- App One-Pager saved to docs/
- Target user narrowly defined
- 3-5 pain points identified
- 3-5 competitors researched
- Pricing hypothesis documented
- Success metrics are measurable
SOP Phase 1: Setup & Foundations (1-2 hrs)
Objective: Choose stack, create repo, define MVP, implement foundations BEFORE feature chaos
Agent: system-architect
Prompt for Tech Stack Selection:
await Task("Tech Stack Architect", `
Choose technology stack for AI-assisted app development.
Product Context: <app-one-pager-summary>
Requirements: <functional + non-functional>
Constraints: <technical-constraints>
Decision Criteria:
1. **AI Tool Compatibility**: Works well with Cursor/Claude Code (export capability, ownership)
2. **Stack you can own**: Prefer stacks you can export (Lovable/Bolt with GitHub export)
3. **Proven patterns**: Battle-tested combinations
4. **Team expertise**: Match team skills or learning goals
Stack Decisions (with justification):
### Frontend
- **Framework**: <Next.js/React, Expo React Native, Vue, etc.>
- **Rationale**: <why this choice>
### Backend
- **Platform**: <Next API routes, Supabase, Convex, Express, FastAPI, etc.>
- **Rationale**: <why this choice>
### Database
- **Choice**: <Supabase Postgres, Convex KV, MongoDB, etc.>
- **Rationale**: <why this choice>
### Auth
- **Provider**: <Supabase Auth, Clerk, Auth0, platform-built, etc.>
- **Rationale**: <why this choice>
### Payments (if applicable)
- **Provider**: <Stripe, RevenueCat, etc.>
- **Rationale**: <why this choice>
### Deployment
- **Platform**: <Vercel, Netlify, Render, app stores, etc.>
- **Rationale**: <why this choice>
Save to: docs/tech-stack-decision.md
Generate MVP Checklist:
1. Setup app + basic layout
2. DB + auth (only if needed for MVP)
3. Core feature A (must-have)
4. Core feature B (must-have, if applicable)
5. Payments / subscription (only if revenue-critical)
6. Basic deployment
Save to: docs/mvp-checklist.md
`, "system-architect");
Prompt for Repository & Base App Setup:
await Task("Repository Scaffolder", `
Initialize repository and create base app.
Tech Stack: <from tech-stack-decision.md>
Steps:
1. Initialize git repo
git init
git remote add origin <repo-url>
2. Scaffold base app (using CLI or AI coding agent with SINGLE, CLEAR prompt):
Prompt: "Create a barebones [framework] app with a simple home screen and a health check route."
Example for Next.js:
npx create-next-app@latest <app-name> --typescript --tailwind --app --no-src-dir
3. Verify can run locally:
- npm install
- npm run dev
- Open http://localhost:3000
- See simple landing page
- Hit test API route (/api/health)
4. Create .gitignore with:
- node_modules/
- .env*
- .next/
- build/
- dist/
5. Create initial commit:
git add .
git commit -m "Initial commit: Barebones [framework] app
- Simple home screen
- Health check route /api/health
- Basic project structure
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"
Report:
- [ ] Repo initialized
- [ ] Base app created
- [ ] Runs locally
- [ ] Landing page visible
- [ ] Health check route works
- [ ] Initial commit created
`, "system-architect");
Prompt for Foundations Implementation:
await Task("Foundations Implementer", `
Implement core infrastructure BEFORE adding features.
Use Foundation Prompt Template (.claude/skills/ai-dev-orchestration/.artifacts/Foundation-Prompt-Template.md)
Foundations to implement:
1. Frontend shell (basic layout, navigation)
2. Backend scaffold (routes, handlers)
3. Database schema (only tables needed for MVP)
4. Auth (signup/login/logout) - ONLY if needed for MVP
5. Payments (if revenue-critical for MVP)
6. External APIs (if needed)
For EACH foundation item:
1. Open a FRESH AI chat (Cursor/Claude Code)
2. Use Foundation Prompt Template:
---
**Foundation Prompt Template**
You are helping me implement the [DATABASE/AUTH/PAYMENT/API] foundation for my app.
**Context**
- App: [1-2 sentence description from app-one-pager]
- Tech stack: [frameworks/services from tech-stack-decision]
- Current state: [what already exists]
**Goal**
- Implement [DB/auth/payments/API] capabilities needed for MVP only
**Requirements**
- [List of tables/fields / flows / API endpoints]
- [Constraints: multi-tenant, regional, etc.]
**Constraints (DO NOT TOUCH)**
- Don't change existing routes/components not mentioned
- Don't add more tables than necessary without asking
- [Other constraints specific to your app]
Propose a plan, then implement step by step. After each change, summarize what you changed.
---
3. Apply code changes via tool's review mechanism (diff/preview)
4. MANUALLY TEST:
- DB migrations and basic CRUD
- Auth flows (signup, login, logout)
- Payment sandbox flow (if integrated)
Report per foundation:
- [ ] Plan reviewed and approved
- [ ] Implementation complete
- [ ] Manual tests passing
- [ ] Committed with clear message
Store decisions in Memory-MCP:
- Database schema rationale
- Auth flow design
- Payment integration approach
`, "system-architect");
MCP Tools Used:
mcp__memory-mcp__memory_store
Success Criteria:
- Tech stack chosen with justification
- MVP checklist defined (1-2 must-have features)
- Repo created and base app running
- Foundations implemented (DB, auth, payments if needed)
- All foundations manually tested
- Decisions stored in Memory-MCP
SOP Phase 2: Feature Development Loop (1-3 hrs per feature) - ITERATIVE
Objective: Implement features one at a time with AI safety guardrails
Core Rule: ONE FEATURE PER LOOP, ONE FEATURE PER PROMPT THREAD
Agents: coder + tester + reviewer
Per-Feature Workflow:
Step 1: Plan the Feature (Human)
Prompt for Feature Planning:
await Task("Feature Spec Writer", `
Write feature specification for AI implementation.
Feature Name: <short-name>
Purpose (1 sentence): <what + why>
Generate Feature Spec:
## Feature: <name>
### Purpose
<One-sentence description of what this feature does and why>
### User Journey (step-by-step narrative)
1. User opens <screen/page>
2. User clicks <button/link>
3. User sees <what appears>
4. User enters <data>
5. User clicks <submit>
6. System <what happens>
7. User sees <result>
### Data Changes
- **New tables**: <table-name> with columns <col1, col2, col3>
- **New columns**: <existing-table>.<new-column>
- **New relationships**: <table1> → <table2> (foreign key)
### Technology & APIs
- **Stack**: <Next.js + Supabase + Stripe>
- **External APIs**: <OpenAI for suggested tasks> (if relevant)
### Design Direction
- **Style**: Clean, minimal, mobile-first
- **Reuse**: Use existing button styles from components/Button.tsx
- **Layout**: Same as task list view (grid, 2-column)
### Negative Constraints (DO NOT TOUCH)
- Do NOT change auth flow
- Do NOT rename existing DB columns
- Do NOT modify payment logic
- Do NOT alter core navigation components
Save to: docs/features/<feature-name>.md
`, "planner");
Step 2: Implement with Feature Prompt Framework
Feature Prompt Framework (used in Cursor/Claude Code fresh chat):
# Feature Prompt Framework
## 1. Context
I'm building a [web/mobile] app for [target user] to [primary outcome].
This feature is: [short name]
## 2. User Journey
- User opens X
- User clicks Y
- User sees Z
- User enters A
- User clicks B
- System does C
- User sees D
## 3. Technology & Data
- **Stack**: [Next.js + Supabase + Stripe]
- **DB changes**: [Add `shared_list_id` to tasks, create `shared_lists` table with columns: id, name, owner_id, created_at]
- **APIs/LLMs**: [Use OpenAI to generate suggested tasks] (if relevant)
## 4. Design Direction
- Style notes: "Clean, minimal, mobile-first, reusing existing button styles. Use the same layout as the task list view."
## 5. Negative Constraints (DO NOT TOUCH)
- Do NOT:
- change auth flow
- rename existing DB columns
- modify payment logic
- alter core navigation components
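The framework can also be assembled programmatically from a feature spec object. This `buildFeaturePrompt` helper and its field names are hypothetical, shown only to make the mapping from spec to prompt explicit.

```javascript
// Hypothetical helper mirroring the five framework sections above.
function buildFeaturePrompt(spec) {
  return [
    "## 1. Context",
    `I'm building a ${spec.appType} app for ${spec.targetUser} to ${spec.outcome}.`,
    `This feature is: ${spec.featureName}`,
    "## 2. User Journey",
    ...spec.journey.map((step) => `- ${step}`),
    "## 3. Technology & Data",
    `- Stack: ${spec.stack}`,
    `- DB changes: ${spec.dbChanges}`,
    "## 4. Design Direction",
    `- ${spec.designNotes}`,
    "## 5. Negative Constraints (DO NOT TOUCH)",
    ...spec.doNotTouch.map((item) => `- Do NOT ${item}`),
  ].join("\n");
}

// Usage: paste the returned string into a FRESH AI chat for this one feature.
const prompt = buildFeaturePrompt({
  appType: "mobile",
  targetUser: "busy parents of toddlers",
  outcome: "reduce mental load",
  featureName: "Shared family lists",
  journey: ["User opens the lists screen", "User taps 'New shared list'", "User sees the list shared with their co-parent"],
  stack: "Next.js + Supabase",
  dbChanges: "create shared_lists table (id, name, owner_id, created_at)",
  designNotes: "Clean, minimal, mobile-first, reuse existing button styles",
  doNotTouch: ["change auth flow", "rename existing DB columns", "modify payment logic"],
});
```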
Agent Execution:
await Task("Feature Implementer", `
Implement feature using AI coding tool (Cursor/Claude Code) in FRESH chat.
Feature Spec: docs/features/<feature-name>.md
Instructions:
1. Open NEW chat in AI coding tool
2. Paste Feature Prompt Framework from docs/features/<feature-name>.md
3. Ask AI for brief plan first: "Outline the changes you'll make"
4. Review plan for scope creep (reject if touches "do not touch" items)
5. Let AI edit files
6. Review diffs carefully
7. Reject any scope creep beyond feature spec
CRITICAL GUARDRAILS:
- ONE feature per chat (no mixing features)
- Fresh context (don't reuse previous chat)
- Do not touch list is MANDATORY
- Reject changes outside feature scope
Implementation checklist:
- [ ] Fresh AI chat opened
- [ ] Feature Prompt Framework used
- [ ] Plan reviewed (no scope creep)
- [ ] Files edited (within scope)
- [ ] Diffs reviewed (approved)
- [ ] Ready for testing
`, "coder");
Step 3: Manual Testing
Agent Execution:
await Task("Feature Tester", `
Manually test feature following exact user journey.
Feature Spec: docs/features/<feature-name>.md
Implementation: <files-changed>
Test Steps (from user journey):
1. <step 1 from spec>
Expected: <expected result>
Actual: <what happened>
Status: PASS / FAIL
2. <step 2 from spec>
Expected: <expected result>
Actual: <what happened>
Status: PASS / FAIL
...
Edge Cases to Test:
- Empty inputs
- Invalid data
- Error states
- Failure conditions
- Mobile responsiveness (if applicable)
Log issues in: docs/bugs/<feature-name>-bugs.md
Report:
- [ ] User journey tested end-to-end
- [ ] All steps PASS
- [ ] Edge cases tested
- [ ] Bugs logged (if any)
- [ ] Decision: ACCEPT / ROLLBACK
`, "tester");
Step 4: Accept or Roll Back
Agent Execution:
await Task("Feature Acceptance Reviewer", `
Review feature test results and decide acceptance.
Test Results: <from tester>
Bugs Found: <count>
Decision Criteria:
- ACCEPT if:
- All user journey steps PASS
- No critical bugs
- Edge cases handled
- "Do not touch" constraints respected
- ROLLBACK if:
- Critical bugs found
- User journey broken
- Scope creep into "do not touch" areas
- Implementation doesn't match spec
If ACCEPT:
1. Commit with meaningful message:
git add .
git commit -m "feat: <feature-name>
- <what it does>
- <key implementation detail>
- <tables/routes/components touched>
Tested: <user-journey-summary>
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>"
2. Document in docs/features/<feature-name>.md:
- What it does
- What tables/routes/components it touches
- Any TODOs or known limitations
If ROLLBACK:
1. git reset --hard HEAD # Discard changes
2. Re-scope or clarify spec
3. Restart feature loop
Report:
- Decision: ACCEPT / ROLLBACK
- Reasoning: <why>
- Next steps: <commit OR re-scope>
`, "reviewer");
Step 5: Document & Reset Context
Agent Execution:
// Store feature completion in Memory-MCP
mcp__memory-mcp__memory_store({
text: `Feature ${feature_name} completed. User journey: ${journey_summary}. Implementation: ${files_changed}. Tests: ${test_results}. Bugs: ${bugs_found}. Status: ACCEPTED`,
metadata: {
key: `ai-dev-orchestration/${project_name}/features/${feature_id}`,
namespace: "ai-dev/features",
layer: "mid-term",
category: "feature-implementation",
tags: {
WHO: "ai-dev-orchestration",
WHEN: new Date().toISOString(),
PROJECT: project_name,
WHY: "feature-completion"
}
}
});
// Close AI chat / start new one for next feature
// This prevents context pollution
Compliance Checklist (per feature):
- [ ] Feature has brief written spec (name, purpose, user journey)
- [ ] Feature Prompt Framework used
- [ ] Work in fresh AI chat (not reused)
- [ ] "Do not touch" list included in prompt
- [ ] Manual tests executed against user journey
- [ ] Bugs fixed or logged with owner & priority
- [ ] Changes documented (3-5 bullets)
- [ ] Committed with clear message
Repeat this loop for every feature in the MVP checklist
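A sequential sketch of how that loop could be driven across agents, assuming the MVP checklist from Phase 1 (prompts abbreviated; the full prompts appear in Steps 1-4 above):

```javascript
// Sequential per-feature loop, one fresh AI chat per feature (Rule 4).
// mvpChecklist comes from docs/mvp-checklist.md; prompt text is abbreviated here.
for (const feature of mvpChecklist) {
  await Task("Feature Spec Writer", `Write the feature spec for ${feature} ...`, "planner");
  await Task("Feature Implementer", `Implement ${feature} in a FRESH chat, honoring the do-not-touch list ...`, "coder");
  await Task("Feature Tester", `Manually test ${feature} against its user journey ...`, "tester");
  await Task("Feature Acceptance Reviewer", `Decide ACCEPT or ROLLBACK for ${feature} ...`, "reviewer");
  // On ACCEPT: commit, document, store in Memory-MCP (Steps 4-5).
  // On ROLLBACK: re-scope the spec and repeat this iteration before moving on.
}
```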
Success Criteria (per feature):
- Feature spec written with user journey
- Fresh AI chat used (context reset)
- "Do not touch" list respected
- Manual tests PASS
- Feature ACCEPTED or ROLLED BACK (never left broken)
- Documented and committed
SOP Phase 3: Testing, Bugs, & Refactors (30-90 min)
Objective: Keep code healthy while moving fast
Agents: tester + coder
Bug Hygiene Rule: Never let more than 5 bugs linger per feature
Prompt for Bug Fixes:
await Task("Bug Fixer", `
Fix bug with AI coding tool in FRESH chat.
Bug Description: <error-messages / failing-behavior>
Relevant Code: <snippets>
Instructions:
1. Open FRESH AI chat
2. Provide:
- Error messages / failing behavior
- Relevant code snippets
- Clear constraint: "Fix only what is necessary to resolve this bug. Do not touch <X, Y, Z>."
3. Apply fix
4. Test again
If bug cascades into new bugs:
- Consider rolling back
- Re-implement more cleanly
Report:
- [ ] Bug fixed
- [ ] Tests passing
- [ ] No new bugs introduced
- [ ] Committed with message: "fix: <bug-description>"
`, "coder");
Prompt for Refactors:
await Task("Refactorer", `
Apply refactoring with AI coding tool.
Refactor Type: <renaming / extraction / simplification>
Scope: <files/components affected>
Use AI for:
- Renaming components/functions across codebase
- Extracting shared components/hooks
- Simplifying complex logic
ALWAYS:
- Scope the refactor clearly
- Prevent database schema or auth changes unless explicitly requested
- Test after refactoring
Report:
- [ ] Refactor applied
- [ ] Tests passing
- [ ] No unintended changes
- [ ] Committed with message: "refactor: <what-changed>"
`, "coder");
Success Criteria:
- Bugs fixed promptly (within 1-2 feature cycles)
- Refactorings scoped and tested
- No accumulation of technical debt
- Code quality maintained
SOP Phase 4: Deployment & Monitoring (15-30 min)
Objective: Deploy to staging/production with monitoring
Agent: cicd-engineer
Prompt for Deployment:
await Task("Deployment Engineer", `
Deploy app to <staging/production>.
Hosting: <Vercel/Netlify/Render>
Environment Variables: <API keys, DB URLs, secrets>
Steps:
1. Setup environment variables in hosting platform:
- DATABASE_URL
- AUTH_SECRET
- STRIPE_KEY (if applicable)
- etc.
2. Deploy:
- Connect GitHub repo to hosting platform
- Configure build settings
- Deploy
3. Verify in <staging/production>:
- App loads
- Health check works (/api/health)
- Basic functionality works
4. Setup monitoring:
- Error tracking: Sentry / Logtail
- Health check endpoint: /api/health
- Usage metrics: DAU, key actions
Report:
- [ ] Environment variables configured
- [ ] Deployed successfully
- [ ] App verified in environment
- [ ] Monitoring configured
- [ ] Deployment URL: <url>
`, "cicd-engineer");
Post-Launch Loop:
await Task("Product Feedback Collector", `
Collect user feedback and feed into Feature Development Loop.
Actions:
1. Monitor user feedback channels
2. Prioritize improvements
3. Create feature specs for high-priority items
4. Feed back into Phase 2 (Feature Development Loop)
Report:
- Top user requests: <list>
- Prioritized roadmap: <next 3 features>
`, "planner");
Success Criteria:
- App deployed to staging/production
- Environment variables configured
- Monitoring operational (errors, health, usage)
- Deployment URL accessible
- Feedback loop established
Operational Rules of Thumb (Behavioral)
These encode "unintuitive" insights into simple rules:
Rule 1: Never ask AI to "build my whole app"
Always scope to: one foundation piece OR one feature
Rule 2: No feature without a user journey
If you can't write: "User does A → B → C → sees D", you're not ready
Rule 3: Every major feature prompt must include a "do not touch" list
Protect by default: DB schema, auth, payments
Rule 4: One chat = one feature/foundation
When done: reset context, start fresh for next task
Rule 5: Architecture choices are sticky
Don't pick casually: DB/auth/stack decisions are long-term
Rule 6: Manual testing is non-optional, especially early
You can add automated tests later: first understand how things fail
Rule 7: AI is a factory, you are the orchestrator
Your main job: design good specs, run the pipeline, decide pass/fail
Compliance Checklist (per feature)
Before calling a feature "done", check:
- Feature has brief written spec (name, purpose, user journey, data touchpoints)
- Feature Prompt Framework used (context, journey, tech, design, constraints)
- Work in fresh AI chat (not messy prior thread)
- "Do not touch" list included (DB/auth/payments & other sensitive areas)
- Manual tests executed against user journey
- Bugs fixed or logged with owner & priority
- Changes documented (3-5 bullets) and committed with clear message
Integration with Three-Loop System
AI Dev Orchestration wraps the three-loop system:
[AI Dev Orchestration]
│
├─ Phase 0: Product Framing → planner
│
├─ Phase 1: Foundations → system-architect
│ └─ Uses: sparc-methodology (architecture phase)
│
├─ Phase 2: Feature Loop → coder + tester + reviewer
│ └─ Uses: feature-dev-complete (wrapped with AI guardrails)
│ └─ Uses: parallel-swarm-implementation (Loop 2)
│
├─ Phase 3: Testing & Refactors → tester + coder
│ └─ Uses: smart-bug-fix (intelligent debugging)
│ └─ Uses: functionality-audit (sandbox validation)
│
└─ Phase 4: Deployment → cicd-engineer
└─ Uses: cicd-intelligent-recovery (Loop 3)
Unique Value of AI Dev Orchestration:
- Prompt Templates: Feature Prompt Framework, Foundation Prompt Template
- Behavioral Rules: Fresh context, do not touch lists, manual testing gates
- Context Management: Ephemeral chat history, persistent memory in code/docs
- Scope Limiting: One feature per chat, guardrails against feature creep
Performance Targets
| Phase | Target Time | Success Metric |
|---|---|---|
| Phase 0: Product Framing | 30-60 min | App One-Pager complete |
| Phase 1: Foundations | 1-2 hrs | Base app running + foundations tested |
| Phase 2: Per-Feature | 1-3 hrs | Feature ACCEPTED with tests passing |
| Phase 3: Testing | 30-90 min | Bugs fixed, refactors scoped |
| Phase 4: Deployment | 15-30 min | Deployed + monitoring active |
Quality Targets:
- Bug accumulation: ≤5 bugs per feature
- Test coverage: ≥80% (manual initially, automated later)
- Rollback rate: ≤20% (80% features accepted first try)
- Deployment success: 100% (no broken deploys)
Example Usage
# Full workflow with product framing
npx claude-flow@alpha skills run ai-dev-orchestration \
--product "TaskMaster for busy parents" \
--must-have-features "Voice task capture, Shared family lists" \
--run-product-framing true
# Skip product framing (already validated)
npx claude-flow@alpha skills run ai-dev-orchestration \
--product "TaskMaster" \
--must-have-features "Voice task capture, Shared family lists" \
--run-product-framing false
# Production deployment
npx claude-flow@alpha skills run ai-dev-orchestration \
--deployment-target production
Artifacts Reference
- Feature-Prompt-Framework.md: Reusable template for feature specs
- Foundation-Prompt-Template.md: Reusable template for foundations (DB, auth, etc.)
- Compliance-Checklist.md: Per-feature validation checklist
- Do-Not-Touch-Template.md: Template for constraint lists
Success Metrics
Track in Memory-MCP:
- Features shipped per week
- Rollback rate (target: ≤20%)
- Bug accumulation trend (target: ≤5 per feature)
- Test coverage (target: ≥80%)
- Deployment frequency (target: daily for staging, weekly for production)
- Compliance rate (% features following framework, target: 100%)
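A sketch of how these metrics could be computed from stored feature records; the record shape is hypothetical, not a fixed Memory-MCP schema.

```javascript
// Illustrative metric computation over stored feature records.
function computeMetrics(featureRecords) {
  const total = featureRecords.length;
  const rolledBack = featureRecords.filter((f) => f.status === "rolled_back").length;
  const compliant = featureRecords.filter((f) => f.followedFramework).length; // hypothetical flag
  return {
    rollback_rate: total ? rolledBack / total : 0,   // target: <= 0.20
    compliance_rate: total ? compliant / total : 0,  // target: 1.0
    avg_bugs_per_feature: total
      ? featureRecords.reduce((sum, f) => sum + f.bugsFound, 0) / total
      : 0,                                           // target: <= 5
  };
}
```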
END OF AI-DEV-ORCHESTRATION SKILL SOP
Recursive Improvement Integration (v2.1)
Input/Output Contracts
input_contract:
required:
- app_requirements: string
optional:
- tech_stack: string
- expertise_file: path
output_contract:
required:
- app_artifacts: object
- deployment_status: string
- status: string
Eval Harness Integration
benchmark: ai-dev-benchmark-v1
tests:
- ad-001: Feature Completion Rate
- ad-002: Bug Prevention Rate
minimum_scores:
feature_completion: 0.90
bug_prevention: 0.85
Memory Namespace
namespaces:
- ai-dev-orchestration/projects/{id}: Project state
- ai-dev-orchestration/patterns: Successful patterns
Cross-Skill Coordination
Works with: feature-dev-complete, cicd-intelligent-recovery, skill-forge
!! SKILL COMPLETION VERIFICATION (MANDATORY) !!
After invoking this skill, you MUST complete ALL items below before proceeding:
Completion Checklist
Agent Spawning: Did you spawn at least 1 agent via Task()?
- Example:
Task("Agent Name", "Task description", "agent-type-from-registry")
Agent Registry Validation: Is your agent from the registry?
- Registry location:
claude-code-plugins/ruv-sparc-three-loop-system/agents/
- Valid categories: delivery, foundry, operations, orchestration, platforms, quality, research, security, specialists, tooling
- NOT valid: Made-up agent names
TodoWrite Called: Did you call TodoWrite with 5+ todos?
- Example:
TodoWrite({ todos: [8-10 items covering all work] })
Work Delegation: Did you delegate to agents (not do work yourself)?
- CORRECT: Agents do the implementation via Task()
- WRONG: You write the code directly after reading skill
Correct Pattern After Skill Invocation
// After Skill("<skill-name>") is invoked:
[Single Message - ALL in parallel]:
Task("Agent 1", "Description of task 1...", "agent-type-1")
Task("Agent 2", "Description of task 2...", "agent-type-2")
Task("Agent 3", "Description of task 3...", "agent-type-3")
TodoWrite({ todos: [
{content: "Task 1 description", status: "in_progress", activeForm: "Working on task 1"},
{content: "Task 2 description", status: "pending", activeForm: "Working on task 2"},
{content: "Task 3 description", status: "pending", activeForm: "Working on task 3"},
]})
Wrong Pattern (DO NOT DO THIS)
// WRONG - Reading skill and then doing work yourself:
Skill("<skill-name>")
// Then you write all the code yourself without Task() calls
// This defeats the purpose of the skill system!
The skill is NOT complete until all checklist items are checked.
Remember the pattern: Skill() -> Task() -> TodoWrite() - ALWAYS
Core Principles
AI-Assisted App Development Orchestration operates on 3 fundamental principles:
Principle 1: AI as Constrained Factory, Human as Strategic Orchestrator
Treat AI coding tools as powerful but undisciplined junior developers that require explicit boundaries, not as autonomous decision-makers.
In practice:
- Write detailed "do not touch" lists for every feature (DB schema, auth flows, payment logic)
- Use Feature Prompt Framework to constrain AI scope (context + journey + negative constraints)
- Review AI-generated plans BEFORE code execution, reject scope creep immediately
- Manual testing gates prevent theater implementations from reaching production
Principle 2: Ephemeral Context, Persistent Knowledge
Fresh chat context per feature prevents context pollution while maintaining system intelligence through structured artifacts.
In practice:
- Open NEW AI chat for every feature or foundation component
- Store architectural decisions in Memory-MCP with WHO/WHEN/PROJECT/WHY tags
- Document patterns in code/docs, not in chat history (chat history is throwaway)
- Reset context aggressively - when a feature is complete, close the chat and move on
Principle 3: Foundations Before Features, Mini-Waterfall for Architecture
Invest heavily in technical foundations (DB, auth, payments) using mini-waterfall before entering agile feature loops.
In practice:
- Phase 1 is 1-2 hours of pure foundation work (no features)
- Foundations are manually tested exhaustively (migrations, auth flows, payment sandbox)
- Once foundations are solid, Phase 2 becomes fast agile iterations (1-3 hours per feature)
- Never skip foundations - fixing broken auth mid-feature development causes 10x rework
Common Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| "Build my whole app" prompts | AI attempts to implement everything in one massive change, resulting in scope creep, missed requirements, and untested code. | Break into Phase 1 (foundations) and Phase 2 (per-feature loops). One feature per chat, 1-3 hours max per iteration. |
| Reusing AI chat context across features | Context pollution causes AI to make assumptions from previous work, touching files outside current scope. | Open fresh AI chat for every feature or foundation. No exceptions. Reset context to reset AI assumptions. |
| Skipping manual testing gates | Theater implementations (code that looks correct but doesn't work) reach production because automated tests pass but functionality fails. | Manual test EVERY feature against user journey spec before accepting. No "Accept" without human verification. |
| Vague "do not touch" lists | AI breaks auth, modifies DB schema, or changes payment logic because constraints weren't explicit. | Include exhaustive "DO NOT TOUCH" list in every prompt: specific tables, routes, components, flows. Default to restrictive. |
| Feature creep in single chat | AI starts with "implement login" and ends with "redesigned entire auth system + password reset + 2FA". | Feature Prompt Framework defines EXACT user journey (7-10 steps). Reject ANY changes beyond that scope. |
Conclusion
AI-Assisted App Development Orchestration bridges the gap between AI coding tools' raw power and production-quality software delivery. By treating AI as a constrained factory inside a structured 5-phase pipeline (Product Framing, Foundations, Feature Loop, Testing, Deployment), teams ship reliable apps 3-5x faster than traditional development while maintaining quality through behavioral guardrails.
The skill's unique value lies not in the technical stack decisions or coding speed, but in the orchestration patterns: fresh context per feature, "do not touch" lists, manual testing gates, and ephemeral chat history combined with persistent architectural memory. These patterns prevent the chaos typical of AI-assisted development (scope creep, theater implementations, context pollution) while preserving the leverage (rapid prototyping, automation of repetitive tasks).
Use this skill when building greenfield applications with AI coding tools (Cursor, Claude Code, Lovable), especially for small teams or solo builders who need systematic quality controls without heavyweight processes. The 5-phase structure scales from MVP (1-2 must-have features) to production (iterative feature additions with deployment monitoring), making it ideal for both product validation and sustained development. The key takeaway: AI accelerates execution, but humans must orchestrate strategy, enforce boundaries, and validate outcomes.