Claude Code Plugins

Community-maintained marketplace


foundations-validation

@BellaBe/lean-os

Hypothesis validation and experiment design. Use when mapping assumptions, designing experiments, or testing MVPs.

Install Skill

1. Download skill
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reviewing its instructions before using it.

SKILL.md

name: foundations-validation
description: Hypothesis validation and experiment design. Use when mapping assumptions, designing experiments, or testing MVPs.

Validation Agent

Overview

The Validation Agent systematically tests assumptions and hypotheses before significant investment, reducing risk and accelerating learning. This agent merges Assumption Verification, Solution Testing, and Feedback Synthesis to drive evidence-based decision-making.

Primary Use Cases: Assumption mapping, experiment design, MVP validation, pivot decisions, hypothesis testing.

Lifecycle Phases: Discovery (primary) and Validation, with continuous use throughout all phases.

Core Functions

1. Assumption Mapping

Identify, prioritize, and sequence critical assumptions for testing.

Workflow:

  1. Identify Leap-of-Faith Assumptions

    • Customer Assumptions: Do target customers exist? Do they have this problem?
    • Problem Assumptions: Is the problem severe enough that customers will pay for a solution?
    • Solution Assumptions: Will the proposed solution actually solve the problem?
    • Business Model Assumptions: Will customers pay enough? Can we acquire them profitably?
    • Execution Assumptions: Can we build this? Do we have capabilities?
  2. Prioritize by Impact & Uncertainty

    • High Impact + High Uncertainty: Test first (riskiest assumptions)
    • High Impact + Low Uncertainty: Validate quickly
    • Low Impact + High Uncertainty: Defer or ignore
    • Low Impact + Low Uncertainty: Don't test
  3. Assess Reversibility

    • Irreversible Decisions: Test thoroughly before committing (hiring, infrastructure)
    • Reversible Decisions: Move fast, iterate based on feedback
  4. Sequence Testing

    • Test foundational assumptions first: Customer existence → Problem severity → Solution fit → Business model
    • Dependencies: Can't test pricing until value is proven
    • Parallel vs. Sequential: Test independent assumptions in parallel
  5. Define Kill Criteria

    • When to pivot: If assumption X is false, we pivot to Y
    • When to persevere: If assumption validated, continue to next test
    • When to kill: If multiple critical assumptions fail, shut down

Output Template:

Assumption Register

Critical Assumptions (Prioritized by Risk):

1. [Assumption Statement]
   ├── Category: [Customer/Problem/Solution/Business Model/Execution]
   ├── Impact: [High/Medium/Low] - If false, [consequence]
   ├── Uncertainty: [High/Medium/Low] - Current evidence: [weak/moderate/strong]
   ├── Risk Score: [Impact × Uncertainty] - [X/9]
   ├── Reversibility: [Reversible/Irreversible]
   ├── Test Priority: 1 (highest risk)
   ├── Dependencies: [Requires assumption X to be true first]
   ├── Test Method: [Interview/Survey/MVP/Experiment]
   ├── Success Criteria: [What validates this assumption]
   ├── Kill Criteria: [What invalidates and triggers pivot]
   └── Timeline: Test in week X

2. [Assumption Statement]...
3. [Assumption Statement]...

Testing Sequence:

Week 1-2: Test Assumptions 1-3 (Customer & Problem validation)
├── Assumption 1: [Customer segment exists and is accessible]
├── Assumption 2: [Problem frequency is daily/weekly]
└── Assumption 3: [Problem severity is 4-5/5]

Week 3-4: Test Assumptions 4-5 (Solution validation) - IF weeks 1-2 pass
├── Assumption 4: [Proposed solution solves problem]
└── Assumption 5: [Solution is 10x better than alternatives]

Week 5-8: Test Assumptions 6-7 (Business Model validation) - IF weeks 3-4 pass
├── Assumption 6: [Willingness to pay $X/month]
└── Assumption 7: [CAC <$X via channel Y]

Decision Points:

Week 2 Decision:
├── IF customer/problem assumptions fail: Pivot to different segment or problem
└── IF customer/problem assumptions pass: Proceed to solution testing

Week 4 Decision:
├── IF solution assumptions fail: Redesign solution or pivot problem
└── IF solution assumptions pass: Proceed to business model testing

Week 8 Decision:
├── IF business model assumptions fail: Adjust pricing, model, or channels
└── IF business model assumptions pass: Build MVP, launch

Kill Criteria (Stop & Pivot):
- <40% of target customers confirm problem exists
- <30% rate problem severity as 4-5/5
- <20% would switch from current solution
- CAC >3x initial estimate
- <10% conversion rate after 100 tests
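
The register above scores each assumption as Impact × Uncertainty (the X/9 figure). A minimal sketch of that prioritization, assuming 1-3 numeric scales and illustrative field names; it is not part of the skill itself:

```python
from dataclasses import dataclass

# Assumed mapping of the template's Low/Medium/High levels to a 1-3 scale (max score 3 x 3 = 9).
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

@dataclass
class Assumption:
    statement: str
    category: str      # Customer / Problem / Solution / Business Model / Execution
    impact: str        # High / Medium / Low
    uncertainty: str   # High / Medium / Low
    reversible: bool

    @property
    def risk_score(self) -> int:
        """Risk Score = Impact x Uncertainty, out of 9."""
        return LEVEL[self.impact] * LEVEL[self.uncertainty]

def prioritize(register: list[Assumption]) -> list[Assumption]:
    """Highest risk first; irreversible decisions break ties so they get tested sooner."""
    return sorted(register, key=lambda a: (a.risk_score, not a.reversible), reverse=True)

if __name__ == "__main__":
    register = [
        Assumption("SMB customers will pay $100/month", "Business Model", "High", "High", True),
        Assumption("Paid ads CAC < $200", "Business Model", "Medium", "High", True),
        Assumption("We can build the core integration", "Execution", "Medium", "Low", False),
    ]
    for rank, a in enumerate(prioritize(register), 1):
        print(f"{rank}. [{a.risk_score}/9] {a.statement}")
```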

2. Experiment Design

Design rigorous experiments to test specific hypotheses with clear success criteria.

Workflow:

  1. Formulate Hypothesis

    • Format: "We believe that [customers] have [problem] and will [behavior] if we [solution]"
    • Testable: Can be proven true or false with data
    • Specific: Quantified success criteria
  2. Design Experiment

    • Minimum Viable Test: Cheapest, fastest way to get learning
    • Method Selection: Interview, survey, landing page, concierge MVP, Wizard of Oz
    • Sample Size: How many tests needed for statistical significance?
    • Duration: How long to run experiment?
  3. Define Success Criteria

    • Quantitative: X% conversion, X responses, $X revenue
    • Qualitative: Specific feedback themes, user behavior observations
    • Confidence Level: 80%, 90%, 95% statistical confidence?
  4. Plan Data Collection

    • What to measure: Specific metrics, user actions, feedback
    • How to measure: Tools (analytics, survey, observation)
    • Control Variables: What to keep constant across tests
  5. Set Go/No-Go Thresholds

    • Success: Result >X% → validate assumption, proceed
    • Failure: Result <Y% → invalidate assumption, pivot
    • Inconclusive: Result between Y% and X% → redesign test or get more data

Output Template:

Experiment Portfolio

Experiment 1: [Experiment Name]

Hypothesis:
"We believe that [target customer segment] experience [problem] with [frequency] and [severity], causing them [consequence]. We will test this by [method]."

Success Criteria:
├── Primary Metric: [X% of interviewed customers confirm problem]
├── Secondary Metric: [X% rate severity as 4-5/5]
├── Qualitative Signal: [Specific quotes or behaviors indicating pain]
└── Confidence Target: 80% (n=15 interviews minimum)

Experiment Design:
├── Method: Customer interviews (problem validation)
├── Sample: 15-20 customers from target segment [Segment Name]
├── Script: [Interview guide with open-ended questions]
├── Duration: 2 weeks
├── Budget: $0 (time only)
└── Owner: [Founder/PM]

Data Collection:
├── Quantitative: % confirming problem, severity ratings (1-5)
├── Qualitative: Quotes, current workarounds, willingness to pay signals
├── Tool: Notes, recording (with permission), synthesis doc
└── Analysis: Daily review, pattern identification, quote mining

Success Thresholds:
├── Strong Validation: >60% confirm, >40% rate 4-5/5 → Proceed to solution testing
├── Moderate Validation: 40-60% confirm → Refine problem statement, retest
├── Invalidation: <40% confirm → Pivot to different problem or segment
└── Decision Point: End of week 2

Go/No-Go Decision:
├── IF validated: Design solution hypothesis experiment
├── IF moderate: Iterate on problem definition, test 10 more
└── IF invalidated: Pivot to alternative problem from problem-solution-fit agent

Experiment 2: [Experiment Name]

Hypothesis:
"We believe that customers will pay $X/month for solution because it saves them [Y value]. We will test this with a fake door landing page."

Success Criteria:
├── Primary Metric: X% click 'Buy Now' (intent to purchase)
├── Secondary Metric: X email signups for early access
├── Qualitative: Exit survey on price perception
└── Sample Size: 500 visitors minimum

Experiment Design:
├── Method: Fake door test (landing page with pricing, no actual product)
├── Traffic Source: [Paid ads / Content / Email]
├── Landing Page: Value prop + pricing + "Buy Now" button (goes to "Coming Soon")
├── Duration: 2 weeks or 500 visitors
├── Budget: $500 (ads) + $0 (landing page using existing tools)
└── Owner: [Marketing/Growth]

Success Thresholds:
├── Strong Validation: >10% click 'Buy Now' → Pricing validated, build product
├── Moderate Validation: 5-10% click → Pricing slightly high, test lower price
├── Invalidation: <5% click → Value prop weak or price too high, revisit both
└── Decision Point: 500 visitors or end of week 2

Experiment 3: [Concierge MVP Test]...

Experiment Queue (Prioritized):

| Exp # | Hypothesis | Method | Duration | Budget | Status |
|-------|------------|--------|----------|--------|--------|
| 1 | Problem exists | Interviews | 2 wks | $0 | In Progress |
| 2 | Willingness to pay | Fake door | 2 wks | $500 | Queued |
| 3 | Solution solves problem | Concierge MVP | 4 wks | $0 | Queued |
| 4 | Channel CAC <$200 | Paid ads test | 2 wks | $1000 | Queued |
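
The sample sizes above (15 interviews, 500 visitors) are placeholders. For conversion-style tests such as the fake door experiment, here is a rough sizing sketch using the normal-approximation margin-of-error formula; the z values are hard-coded to avoid a stats dependency, and the example rates are illustrative:

```python
import math

# Two-sided z values for the confidence levels mentioned above.
Z = {0.80: 1.2816, 0.90: 1.6449, 0.95: 1.9600}

def required_visitors(expected_rate: float, margin: float, confidence: float = 0.80) -> int:
    """Visitors needed so the observed conversion rate sits within +/- margin
    of the true rate at the given confidence (normal approximation)."""
    z = Z[confidence]
    return math.ceil(z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2)

# Expecting ~10% 'Buy Now' clicks and wanting +/- 3 percentage points:
print(required_visitors(0.10, 0.03, 0.80))  # ~165 visitors at 80% confidence
print(required_visitors(0.10, 0.03, 0.95))  # ~385 visitors at 95% confidence
```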

3. MVP Orchestration

Define MVP types, build strategy, and measure validation metrics.

Workflow:

  1. Select MVP Type

    • Concierge MVP: Manually deliver service (high-touch, unscalable)
    • Wizard of Oz: Automated facade, manual backend (test UX before building)
    • Prototype: Low-fidelity mockup (test concept, not code)
    • Single-Feature MVP: One core feature only (test primary value prop)
    • Piecemeal MVP: Stitch existing tools together (no custom code)
  2. Build Roadmap

    • Week 1-2: Design prototype (mockups, wireframes)
    • Week 3-4: Build concierge/wizard process (manual workflow)
    • Week 5-8: Test with 10-20 users (gather feedback)
    • Week 9-12: Iterate or automate (if validated, build real product)
  3. Define Measurement Strategy

    • Activation: % users complete first key action
    • Retention: % users return after 1 day, 7 days, 30 days
    • Satisfaction: NPS, qualitative feedback, would they pay?
    • Referral: % users recommend to others
  4. Set Pivot Triggers

    • If activation <20%: Value unclear, rethink onboarding
    • If retention <30% (day 7): Not solving problem, rethink solution
    • If NPS <30: Product doesn't delight, rethink experience
    • If referrals <10%: Not remarkable, rethink differentiation

Output Template:

MVP Roadmap

MVP Type: [Concierge/Wizard of Oz/Prototype/Single-Feature/Piecemeal]
Rationale: [Why this type tests our riskiest assumptions with least effort]

MVP Specification:

Core Features (Must-Have):
1. [Feature]: [Solves problem X]
   - Implementation: [Manual/Automated/Mockup]
   - Effort: X days
   - Risk: [High/Medium/Low]

2. [Feature]: [Solves problem Y]...

Non-Features (Explicitly Excluded):
- [Feature]: [Why not in MVP - will add if validated]
- [Feature]: [Why not in MVP - not core to value prop]

Build Timeline:

Week 1-2: Design & Setup
├── Wireframes/mockups
├── Manual workflow documentation
├── Measurement infrastructure (analytics, survey tools)
└── Recruit 10 beta users

Week 3-4: Build & Test (Cohort 1)
├── Build concierge process / wizard facade
├── Onboard 10 users
├── Measure: Activation, retention (day 1), satisfaction
└── Daily feedback calls

Week 5-6: Iterate (Cohort 2)
├── Fix top 3 issues from cohort 1
├── Onboard 10 more users
├── Measure: Improvements vs. cohort 1
└── Weekly synthesis

Week 7-8: Validate (Cohort 3)
├── Final iteration
├── Onboard 10 more users
├── Measure: Consistent metrics across cohorts
└── Go/no-go decision

Success Metrics (MVP Validation):

Activation:
├── Definition: User completes [key action]
├── Target: >40% of signups activate
├── Current: X% (update weekly)
└── Pivot Trigger: If <20% after 30 users

Engagement:
├── Definition: User completes [key action] X times in first week
├── Target: >3 times per week
├── Current: X times (update weekly)
└── Pivot Trigger: If <1 time per week after 30 users

Retention:
├── Day 1: X% return (target: >40%)
├── Day 7: X% return (target: >30%)
├── Day 30: X% return (target: >20%)
└── Pivot Trigger: If day 7 <15% after 30 users

Satisfaction:
├── NPS: X (target: >30)
├── Would Pay: X% say yes (target: >50%)
├── Referrals: X% referred friend (target: >10%)
└── Pivot Trigger: If NPS <20 or would-pay <30%

Pivot Triggers (Clear Go/No-Go):

STOP & Pivot IF:
- Activation <20% after 30 users
- Day 7 retention <15% after 30 users
- NPS <20 or Would-Pay <30%
- >50% of users request the same missing feature (wrong MVP)

ITERATE (Not Full Pivot) IF:
- Metrics between pivot triggers and targets
- Positive qualitative feedback but metrics lag
- Clear issues identified with clear fixes

PROCEED to Build IF:
- All metrics hit targets
- Users paying or strong intent to pay
- Referrals happening organically
- Clear understanding of what to build next
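
A minimal sketch of computing the validation metrics above from per-user beta records and checking the pivot triggers; the thresholds come from the roadmap, while the record fields and data shape are assumptions:

```python
from dataclasses import dataclass

@dataclass
class BetaUser:
    activated: bool       # completed the key action
    returned_day7: bool
    nps_score: int        # 0-10 survey answer
    would_pay: bool

def rate(flags: list[bool]) -> float:
    """Share of users with the flag set, as a percentage."""
    return 100 * sum(flags) / len(flags) if flags else 0.0

def nps(scores: list[int]) -> float:
    """Standard NPS: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100 * (promoters - detractors) / len(scores)

def check_pivot_triggers(users: list[BetaUser]) -> list[str]:
    """Apply the STOP & Pivot rules above; only meaningful after ~30 users."""
    if len(users) < 30:
        return ["Not enough users yet (need ~30 before reading these metrics)"]
    triggers = []
    if rate([u.activated for u in users]) < 20:
        triggers.append("Activation <20% -> value unclear, rethink onboarding")
    if rate([u.returned_day7 for u in users]) < 15:
        triggers.append("Day 7 retention <15% -> not solving the problem, rethink solution")
    if nps([u.nps_score for u in users]) < 20 or rate([u.would_pay for u in users]) < 30:
        triggers.append("NPS <20 or would-pay <30% -> rethink experience or pricing")
    return triggers
```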

4. Data Collection

Instrument tracking, collect insights, and synthesize learnings.

Workflow:

  1. Instrument Analytics Events (see the payload sketch after this workflow)

    • Key Actions: Signup, activation, core feature use, upgrade
    • User Properties: Segment, acquisition source, company size
    • Event Properties: Feature used, time spent, success/failure
  2. Collect Quantitative Data

    • Product Analytics: Mixpanel, Amplitude, Heap (event tracking)
    • Business Metrics: Stripe, revenue data, cohort analysis
    • Channel Metrics: Google Analytics, ad platforms (CAC, conversion)
  3. Collect Qualitative Insights

    • User Interviews: Why do you use this? What's missing? What delights?
    • Surveys: NPS, CSAT, exit surveys
    • Observation: Watch users use product, note struggles
    • Support Tickets: Common issues, feature requests
  4. Synthesize Patterns

    • Quantitative Patterns: Cohorts with high retention, feature correlation with retention
    • Qualitative Patterns: Recurring themes, unexpected use cases, Jobs-to-be-Done
    • Outliers: Edge cases that reveal insights
    • Correlations: What predicts success vs. churn?
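
A minimal sketch of the event payloads described in step 1; the log file, event names, and property keys are illustrative assumptions rather than any specific vendor's API, and the local log stands in for an analytics SDK:

```python
import json
import time
from typing import Any

def build_event(name: str, user_id: str, properties: dict[str, Any]) -> dict[str, Any]:
    """Assemble one analytics event: key action plus user/event properties."""
    return {
        "event": name,             # e.g. "signup", "activation", "core_action", "upgrade"
        "user_id": user_id,
        "timestamp": time.time(),
        "properties": properties,  # feature used, success/failure, segment, source, ...
    }

def track(event: dict[str, Any], log_path: str = "events.ndjson") -> None:
    """Append to a local newline-delimited JSON log; swap in Mixpanel/Amplitude/etc. for real use."""
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")

# Example: record an activation event with a few assumed properties.
track(build_event("activation", "user_42",
                  {"source": "referral", "segment": "SMB", "completed_onboarding": True}))
```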

Output Template:

Learning Repository

Analytics Instrumentation:

Events Tracked:
├── Signup: [user_id, source, segment]
├── Activation: [completed onboarding, reached aha moment]
├── Core Action: [feature used, frequency, success]
├── Upgrade: [plan selected, MRR]
└── Churn: [reason, cohort, LTV]

Tools:
├── Product Analytics: [Mixpanel/Amplitude]
├── Session Recording: [FullStory/LogRocket]
├── Surveys: [Typeform/Delighted]
└── CRM: [HubSpot/Salesforce] (sales data)

Quantitative Insights (This Week):

Cohort Metrics:
├── Week 1 Cohort (n=20): 40% activation, 25% day 7 retention
├── Week 2 Cohort (n=25): 48% activation (+8pp), 32% day 7 retention (+7pp)
└── Trend: Improving with each iteration

Feature Adoption:
├── Feature A: 80% of activated users use it (core value driver)
├── Feature B: 20% of activated users use it (not critical)
└── Insight: Feature A predicts retention (r=0.7)

Channel Performance:
├── Paid Ads: $250 CAC, 35% activation rate
├── Referral: $0 CAC, 60% activation rate
└── Insight: Referred users higher quality, prioritize referral program

Qualitative Insights (This Week):

Interview Themes (n=10 this week):

Theme 1: "Saves me 5 hours per week" (8/10 mentioned)
├── Quote: "[Direct quote from customer]"
├── Implication: Time savings is primary value driver
└── Action: Lead with time savings in messaging

Theme 2: "Wish it integrated with [Tool X]" (6/10 mentioned)
├── Quote: "[Direct quote]"
├── Implication: Integration is table stakes
└── Action: Prioritize [Tool X] integration in roadmap

Theme 3: "Learning curve is steep" (4/10 mentioned)
├── Quote: "[Direct quote]"
├── Implication: Onboarding needs improvement
└── Action: Add onboarding checklist, video tutorials

Observation Notes:
- Users struggle with [X feature]: All 5 observed users clicked wrong button
- Users delight when [Y happens]: Visible excitement in 3/5 demos
- Users abandon at [Z step]: 60% drop-off, need to simplify

Synthesis & Patterns:

Key Learnings:
1. High-intent referrals activate 2x more than paid: Invest in referral program
2. Feature A drives retention: Double down on Feature A, deprioritize Feature B
3. Integration with Tool X is blocker: Build integration before scaling

Segments Analysis:
├── Segment 1 (Power Users): 90% retention, use daily, pay premium
├── Segment 2 (Casual Users): 20% retention, use monthly, churn quickly
└── Insight: Target Segment 1; they match the ideal customer profile

Correlation Insights:
- Time-to-activation <24hrs → 2x retention vs. >24hrs
- Users who invite teammate → 3x retention vs. solo users
- Users who complete tutorial → 80% retention vs. 20% who skip

Next Experiments Based on Learnings:
1. Test referral incentive program (capitalize on referral quality)
2. Improve time-to-activation (reduce friction in first 24hrs)
3. Add teammate invite prompts (drive network effects)
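
Correlation figures like "Feature A predicts retention (r=0.7)" can be reproduced from per-user 0/1 flags; a minimal sketch using the standard library (statistics.correlation requires Python 3.10+), with purely illustrative data:

```python
from statistics import correlation

# 1 = used Feature A / retained at day 7, 0 = did not (illustrative beta-cohort data).
used_feature_a = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
retained_day7  = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]

# Pearson r between feature adoption and retention (point-biserial on binary flags).
r = correlation(used_feature_a, retained_day7)
print(f"Feature A adoption vs. day 7 retention: r = {r:.2f}")
```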

5. Pivot Decision

Evaluate validation results and recommend persevere, pivot, or perish.

Workflow:

  1. Evaluate Validation Results

    • Hypothesis Validation: Which assumptions validated? Which failed?
    • Metrics vs. Targets: Are we hitting targets set in experiments?
    • User Feedback: Qualitative sentiment (delighted vs. frustrated)
    • Market Signals: Competitive moves, macro trends
  2. Assess Pivot Options

    • Customer Segment Pivot: Different target customer
    • Problem Pivot: Different problem for same customer
    • Solution Pivot: Different solution for same problem
    • Business Model Pivot: Different monetization or pricing
    • Channel Pivot: Different acquisition channel
    • Platform Pivot: From feature to platform or vice versa
  3. Evaluate Resource Constraints

    • Runway: Months of cash remaining
    • Team Morale: Burnout risk, belief in vision
    • Opportunity Cost: What else could team build?
  4. Make Recommendation

    • Persevere: Metrics improving, hypotheses validating → Keep building
    • Pivot: Core assumptions failed, but learning suggests alternative → Change direction
    • Perish: Multiple pivots failed, runway low, no clear path → Shut down
    • Pause: Market not ready, revisit in X months

Output Template:

Pivot Decision Framework

Validation Results Summary:

Assumptions Tested (Last 8 Weeks):

| Assumption | Test Method | Result | Status |
|------------|-------------|--------|--------|
| Customer segment exists | Interviews | 70% confirm | ✅ Validated |
| Problem severity 4-5/5 | Interviews | 55% rate high | ✅ Validated |
| Solution solves problem | Concierge MVP | 35% activation | ❌ Failed |
| Willingness to pay $50 | Fake door test | 3% click buy | ❌ Failed |
| CAC <$200 via paid ads | Ad campaign | $350 CAC | ❌ Failed |

Metrics vs. Targets:

| Metric | Target | Actual | Status |
|--------|--------|--------|--------|
| Activation Rate | >40% | 35% | ❌ Below |
| Day 7 Retention | >30% | 18% | ❌ Below |
| NPS | >30 | 25 | ❌ Below |
| CAC | <$200 | $350 | ❌ Above |
| Would Pay | >50% | 28% | ❌ Below |

Qualitative Signals:

Positive:
- Customers confirm problem is real and severe
- Users like the concept, excited about vision
- 3 users asked "when can I pay for this?"

Negative:
- "Current solution is good enough for now"
- "Too complicated to set up"
- "Price seems high for the value I'm getting"

Pivot Options Analysis:

Option 1: Solution Pivot (Simplify to Single Feature)
├── Hypothesis: Complexity preventing activation, focus on core value
├── Pros: Keep validated customer/problem, faster to build
├── Cons: May not differentiate enough, smaller market
├── Effort: 4 weeks to rebuild simplified MVP
├── Confidence: Medium (addresses activation issue)
└── Recommendation: Test with 10 users before committing

Option 2: Customer Segment Pivot (Target Enterprise vs. SMB)
├── Hypothesis: Enterprise has higher willingness to pay, solves CAC issue
├── Pros: Higher ARPU, better economics, can afford sales-led
├── Cons: Longer sales cycle, different product requirements
├── Effort: 8 weeks to build enterprise features + sales process
├── Confidence: Low (unvalidated assumption)
└── Recommendation: Interview 10 enterprise buyers first

Option 3: Business Model Pivot (Freemium to Paid)
├── Hypothesis: Free tier preventing users from valuing product
├── Pros: Attracts paying customers only, improves unit economics
├── Cons: Reduces top-of-funnel volume, harder to test
├── Effort: 2 weeks to remove free tier
├── Confidence: Medium (current users not valuing free version)
└── Recommendation: Test with new cohort

Option 4: Persevere (Iterate on Current Path)
├── Hypothesis: Need more time, metrics trending positively
├── Pros: No thrash, team morale stays high, learning compounding
├── Cons: Burn runway without significant progress
├── Assessment: Metrics NOT trending positively, activation declining
└── Recommendation: Do NOT persevere without changes

Resource Constraints:

Runway:
├── Current Cash: $X
├── Monthly Burn: $X
├── Runway: X months
└── Pivot Budget: Can afford 1 major pivot (8 weeks) or 2 minor pivots (4 weeks each)

Team:
├── Morale: Medium (frustrated by metrics but believe in problem)
├── Capacity: Full-time founders, $130/mo AI tools
└── Burnout Risk: Low (early stage, high energy)

Opportunity Cost:
- Alternative idea from problem-solution-fit agent scored 75/100 vs. current 45/100
- Market growing faster in alternative space
- Could restart discovery with 4 months of runway

Decision Matrix:

| Option | Probability of Success | Effort (weeks) | Risk | Recommendation |
|--------|------------------------|----------------|------|----------------|
| Solution Pivot | 50% | 4 | Medium | **Recommended** |
| Segment Pivot | 30% | 8 | High | Test first |
| Model Pivot | 40% | 2 | Medium | Quick test |
| Persevere | 10% | 0 | High | Not recommended |
| Perish | 0% | 0 | N/A | Too early |

RECOMMENDATION: **Solution Pivot**

Rationale:
1. Customer and problem are validated (70% and 55% confirmation)
2. Current solution too complex (35% activation, complaints about setup)
3. Simplifying to a single core feature addresses the activation issue
4. 4 weeks is acceptable runway burn for validation
5. If this fails, can still attempt segment or model pivot

Next Steps:
1. **Week 1**: Design simplified single-feature MVP (mockups, workflow)
2. **Week 2**: Build concierge version, recruit 10 beta users
3. **Week 3-4**: Test with users, measure activation (target: >50%)
4. **Week 5**: Decision point
   - If activation >50%: Build automated version, scale
   - If activation 30-50%: Iterate one more cycle
   - If activation <30%: Pivot to segment or perish

Revised Success Criteria Post-Pivot:
├── Activation: >50% (vs. current 35%)
├── Day 7 Retention: >35% (vs. current 18%)
├── Would Pay: >40% (vs. current 28%)
└── Timeline: 4 weeks to validate

Go/No-Go Decision: End of Week 5
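
A minimal sketch of the metrics-vs-targets check that feeds this decision; the numbers are copied from the summary table above, and the pass/fail direction per metric is an assumption:

```python
# (metric, target, actual, direction) - "higher" means actual should meet or beat the target.
checks = [
    ("Activation Rate (%)",  40.0,  35.0, "higher"),
    ("Day 7 Retention (%)",  30.0,  18.0, "higher"),
    ("NPS",                  30.0,  25.0, "higher"),
    ("CAC ($)",             200.0, 350.0, "lower"),
    ("Would Pay (%)",        50.0,  28.0, "higher"),
]

def hit(target: float, actual: float, direction: str) -> bool:
    return actual >= target if direction == "higher" else actual <= target

misses = [name for name, target, actual, direction in checks if not hit(target, actual, direction)]
print(f"{len(misses)}/{len(checks)} metrics missing target: {', '.join(misses)}")
```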

Input Requirements

Required:

  • business_model_output: Unit economics, key assumptions from business-model agent
  • assumptions_to_test: Hypotheses from other agents requiring validation
  • available_resources: Budget, time, team capacity
  • timeline: How many weeks to validate?

Optional:

  • existing_data: Prior experiment results, user feedback
  • pivot_history: Previous pivots attempted
  • competitive_intel: Competitor validation approaches

Output Structure

{
  "critical_assumptions": [
    {
      "assumption": "SMB customers will pay $100/month",
      "risk": "H",
      "test": "Fake door pricing test with 500 visitors"
    },
    {
      "assumption": "Paid ads CAC <$200",
      "risk": "M",
      "test": "$1000 ad spend test on Meta"
    }
  ],
  "experiment_plan": [
    {
      "week": 1,
      "experiment": "Customer problem interviews (n=15)",
      "success_criteria": ">60% confirm problem severity 4-5/5"
    },
    {
      "week": 3,
      "experiment": "Concierge MVP with 10 users",
      "success_criteria": ">40% activation, >30% day 7 retention"
    }
  ],
  "mvp_spec": {
    "features": ["{Core feature 1}", "{Core feature 2}", "{Core feature 3}"],
    "non_features": ["{Future feature 1}", "{Future feature 2}", "{Future feature 3}"],
    "success_metrics": {
      "activation": 40,
      "day_7_retention": 30,
      "nps": 30
    }
  },
  "pivot_triggers": [
    {"condition": "If activation <20%", "action": "Pivot to simpler solution"},
    {"condition": "If would-pay <30%", "action": "Pivot to different customer segment"}
  ],
  "data_collection": {
    "tools": ["Mixpanel", "Typeform", "Calendly"],
    "metrics": ["Activation rate", "Retention cohorts", "NPS", "Would-pay %"]
  }
}
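
For downstream agents consuming this output, the structure above could be typed roughly as follows; the field names follow the JSON, while the type choices are assumptions:

```python
from typing import TypedDict

class CriticalAssumption(TypedDict):
    assumption: str
    risk: str              # "H" / "M" / "L"
    test: str

class ExperimentStep(TypedDict):
    week: int
    experiment: str
    success_criteria: str

class MvpSpec(TypedDict):
    features: list[str]
    non_features: list[str]
    success_metrics: dict[str, int]   # e.g. {"activation": 40, "day_7_retention": 30, "nps": 30}

class PivotTrigger(TypedDict):
    condition: str
    action: str

class ValidationOutput(TypedDict):
    critical_assumptions: list[CriticalAssumption]
    experiment_plan: list[ExperimentStep]
    mvp_spec: MvpSpec
    pivot_triggers: list[PivotTrigger]
    data_collection: dict[str, list[str]]
```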

Integration with Other Agents

Receives Input From:

  • problem-solution-fit: MVP features, assumptions to test
  • business-model: Unit economics assumptions, metrics to validate
  • go-to-market: Channel assumptions, CAC targets to test

Provides Input To:

All agents: Validated assumptions update agent outputs

  • market-intelligence: Refined ICP based on validation data
  • business-model: Updated unit economics from real CAC/LTV data
  • execution: MVP spec becomes development backlog

Best Practices

  1. Test Riskiest Assumptions First: Don't waste time validating easy stuff
  2. Minimum Viable Tests: Cheapest, fastest way to get learning
  3. Quantify Success Criteria: "Customers love it" is not a criterion
  4. Be Honest About Failure: Don't move goalposts or rationalize bad data
  5. Document Everything: Learnings compound only if captured

Common Pitfalls

  • ❌ Testing too many things at once (can't isolate learnings)
  • ❌ Confirmation bias (seeking data that validates, ignoring contradicting data)
  • ❌ Analysis paralysis (testing forever, never building)
  • ❌ Vanity metrics (signups without activation, NPS without would-pay)
  • ✅ Focused tests, honest assessment, bias for action, actionable metrics

Usage Examples

Example 1: Pre-MVP Validation

User Request: "Before building, help me validate these assumptions"

Agent Output: Assumption register, experiment plan for 8 weeks, pivot criteria

Example 2: MVP Not Converting

User Request: "Our MVP has 15% activation, way below target. What's wrong?"

Agent Output: Data analysis, user feedback synthesis, pivot recommendation

Example 3: Pivot Decision

User Request: "We've tested for 3 months, metrics aren't improving. Pivot or persevere?"

Agent Output: Full pivot analysis with options, recommendations, next steps


This agent de-risks startup building through systematic hypothesis testing, enabling fast learning and evidence-based pivots.