---
name: Methodology Bootstrapping
description: Apply Bootstrapped AI Methodology Engineering (BAIME) to develop project-specific methodologies through systematic Observe-Codify-Automate cycles with dual-layer value functions (instance quality + methodology quality). Use when creating testing strategies, CI/CD pipelines, error handling patterns, observability systems, or any reusable development methodology. Provides structured framework with convergence criteria, agent coordination, and empirical validation. Validated in 8 experiments with 100% success rate, 4.9 avg iterations, 10-50x speedup vs ad-hoc. Works for testing, CI/CD, error recovery, dependency management, documentation systems, knowledge transfer, technical debt, cross-cutting concerns.
allowed-tools: Read, Grep, Glob, Edit, Write, Bash
---
Methodology Bootstrapping
Apply Bootstrapped AI Methodology Engineering (BAIME) to systematically develop and validate software engineering methodologies through observation, codification, and automation.
The best methodologies are not designed but evolved through systematic observation, codification, and automation of successful practices.
What is BAIME?
BAIME (Bootstrapped AI Methodology Engineering) is a unified framework that integrates three complementary methodologies optimized for LLM-based development:
- OCA Cycle (Observe-Codify-Automate) - Core iterative framework
- Empirical Validation - Scientific method and data-driven decisions
- Value Optimization - Dual-layer value functions for quantitative evaluation
This skill provides the complete BAIME framework for systematic methodology development. The methodology is especially powerful when combined with AI agents (like Claude Code) that can execute the OCA cycle, coordinate specialized agents, and calculate value functions automatically.
Key Innovation: BAIME treats methodology development like software development, with empirical observation, automated testing, continuous iteration, and quantitative metrics.
When to Use This Skill
Use this skill when you need to:
- Create systematic methodologies for testing, CI/CD, error handling, observability, etc.
- Validate methodologies empirically with data-driven evidence
- Evolve practices iteratively using the OCA (Observe-Codify-Automate) cycle
- Measure methodology quality with dual-layer value functions
- Achieve rapid convergence (typically 3-7 iterations, 6-15 hours)
- Create transferable methodologies (70-95% reusable across projects)
Don't use this skill for:
- ❌ One-time ad-hoc tasks without reusability goals
- ❌ Trivial processes (<100 lines of code/docs)
- ❌ Problems that established industry standards already fully solve
Quick Start with BAIME (10 minutes)
1. Define Your Domain
Choose what methodology you want to develop using BAIME:
- Testing strategy (15x speedup example)
- CI/CD pipeline (2.5-3.5x speedup example)
- Error recovery patterns (80% error reduction example)
- Observability system (23-46x speedup example)
- Dependency management (6x speedup example)
- Documentation system (47% token cost reduction example)
- Knowledge transfer (3-8x speedup example)
- Technical debt management
- Cross-cutting concerns
2. Establish Baseline
Measure current state:
```
# Example: Testing domain
- Current coverage: 65%
- Test quality: ad-hoc
- No systematic approach
- Bug rate: baseline

# Example: CI/CD domain
- Build time: 5 minutes
- No quality gates
- Manual releases
```
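A lightweight way to snapshot such a baseline for later comparison (the field names and the `capture_baseline` helper are illustrative, not part of BAIME):

```python
import datetime
import json

def capture_baseline(coverage_pct, notes):
    """Record the pre-iteration state so later improvement claims have a reference point."""
    return {
        "date": datetime.date.today().isoformat(),
        "coverage_pct": coverage_pct,
        **notes,  # qualitative observations, e.g. test quality, gates
    }

# Hypothetical testing-domain baseline:
baseline = capture_baseline(65.0, {"test_quality": "ad-hoc",
                                   "systematic_approach": False})
print(json.dumps(baseline, indent=2))
```

Storing this alongside `iterations/iteration-0.md` keeps every later ΔV claim traceable to measured data.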
3. Set Dual Goals
Define both layers:
- Instance goal (domain-specific): "Reach 80% test coverage"
- Meta goal (methodology): "Create reusable testing strategy with 85%+ transferability"
4. Start Iteration 0
Follow the OCA cycle (see reference/observe-codify-automate.md)
Specialized Subagents
BAIME provides two specialized Claude Code subagents to streamline experiment execution:
iteration-prompt-designer
When to use: At experiment start, to create comprehensive ITERATION-PROMPTS.md
What it does:
- Designs iteration templates tailored to your domain
- Incorporates modular Meta-Agent architecture
- Provides domain-specific guidance for each iteration
- Creates structured prompts for baseline and subsequent iterations
How to invoke:
Use the Task tool with subagent_type="iteration-prompt-designer"
Example:
"Design ITERATION-PROMPTS.md for refactoring methodology experiment"
Benefits:
- ✅ Comprehensive iteration prompts (saves 2-3 hours of setup time)
- ✅ Domain-specific value function design
- ✅ Proper baseline iteration structure
- ✅ Evidence-driven evolution guidance
iteration-executor
When to use: For each iteration execution (Iteration 0, 1, 2, ...)
What it does:
- Executes the iteration through lifecycle phases (Observe → Codify → Automate → Evaluate)
- Coordinates Meta-Agent capabilities and agent invocations
- Tracks state transitions (M_{n-1} → M_n, A_{n-1} → A_n, s_{n-1} → s_n)
- Calculates dual-layer value functions (V_instance, V_meta) systematically
- Evaluates convergence criteria rigorously
- Generates complete iteration documentation
How to invoke:
Use the Task tool with subagent_type="iteration-executor"
Example:
"Execute Iteration 2 of testing methodology experiment using iteration-executor"
Benefits:
- ✅ Consistent iteration structure across experiments
- ✅ Systematic value calculation (reduces bias, improves honesty)
- ✅ Proper convergence evaluation (prevents premature convergence)
- ✅ Complete artifact generation (data, knowledge, reflections)
- ✅ Reduced iteration time (structured execution vs ad-hoc)
Important: iteration-executor reads capability files fresh each iteration (no caching) to ensure latest guidance is applied.
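To picture the state the executor tracks, a minimal record type might look like this (a sketch with illustrative field names; the real subagent manages this bookkeeping internally):

```python
from dataclasses import dataclass

@dataclass
class IterationState:
    """Snapshot after iteration n: Meta-Agent M_n, agent set A_n, and V(s_n)."""
    n: int
    meta_capabilities: list   # M_n
    agent_set: list           # A_n
    v_instance: float         # V_instance(s_n)
    v_meta: float             # V_meta(s_n)

def system_stable(prev, curr):
    """M_n == M_{n-1} and A_n == A_{n-1}: one input to the convergence check."""
    return (curr.meta_capabilities == prev.meta_capabilities
            and curr.agent_set == prev.agent_set)

it1 = IterationState(1, ["observe", "codify"], ["coder", "tester"], 0.62, 0.45)
it2 = IterationState(2, ["observe", "codify"], ["coder", "tester"], 0.74, 0.61)
print(system_stable(it1, it2))  # True: no agent evolution this iteration
```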
knowledge-extractor
When to use: After experiment converges, to extract and transform knowledge into reusable artifacts
What it does:
- Extracts patterns, principles, templates from converged BAIME experiment
- Transforms experiment artifacts into production-ready Claude Code skills
- Creates knowledge base entries (`patterns/*.md`, `principles/*.md`)
- Validates output quality with structured criteria (V_instance ≥ 0.85)
- Achieves 195x speedup (2 min vs 390 min manual extraction)
- Produces distributable, reusable artifacts for the community
How to invoke:
Use the Task tool with subagent_type="knowledge-extractor"
Example:
"Extract knowledge from Bootstrap-004 refactoring experiment and create code-refactoring skill using knowledge-extractor"
Benefits:
- ✅ Systematic knowledge preservation (vs ad-hoc documentation)
- ✅ Reusable Claude Code skills (ready for distribution)
- ✅ Quality validation (95% content equivalence to hand-crafted)
- ✅ Fast extraction (2-5 min, 195x speedup)
- ✅ Knowledge base population (patterns, principles, templates)
- ✅ Automated artifact generation (43% workflow automation with 4 tools)
Lifecycle position: Post-Convergence phase
```
Experiment Design → iteration-prompt-designer → ITERATION-PROMPTS.md
        ↓
Iterate → iteration-executor (x N) → iteration-0..N.md
        ↓
Converge → Create results.md
        ↓
Extract → knowledge-extractor → .claude/skills/ + knowledge/
        ↓
Distribute → Claude Code users
```
Validated performance (Bootstrap-005):
- Speedup: 195x (390 min → 2 min)
- Quality: V_instance = 0.87, 95% content equivalence
- Reliability: 100% success across 3 experiments
- Automation: 43% of workflow (6/14 steps)
Core Framework
The OCA Cycle
```
Observe → Codify → Automate
   ↑                   │
   └────── Evolve ─────┘
```
Observe: Collect empirical data about current practices
- Use meta-cc MCP tools to analyze session history
- Git analysis for commit patterns
- Code metrics (coverage, complexity)
- Access pattern tracking
- Error rate monitoring
Codify: Extract patterns and document methodologies
- Pattern recognition from data
- Hypothesis formation
- Documentation as markdown
- Validation with real scenarios
Automate: Convert methodologies to automated checks
- Detection: Identify when pattern applies
- Validation: Check compliance
- Enforcement: CI/CD gates
- Suggestion: Automated fix recommendations
Evolve: Apply methodology to itself for continuous improvement
- Use tools on development process
- Discover meta-patterns
- Optimize methodology
Detailed guide: reference/observe-codify-automate.md
Dual-Layer Value Functions
Every iteration calculates two scores:
V_instance(s): Domain-specific task quality
- Example (testing): coverage × quality × stability × performance
- Example (CI/CD): speed × reliability × automation × observability
- Target: ≥ 0.80
V_meta(s): Methodology transferability quality
- Components: completeness × effectiveness × reusability × validation
- Completeness: Is the methodology fully documented?
- Effectiveness: What speedup does it provide?
- Reusability: What % is transferable across projects?
- Validation: Is it empirically validated?
- Target: ≥ 0.80
Detailed guide: reference/dual-value-functions.md
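As an illustrative sketch of how these two scores might be computed (the component names, equal weighting, and averaging scheme are assumptions for this example, not prescribed by BAIME):

```python
# Illustrative dual-layer value functions. Equal weighting is an
# assumption here; real experiments define domain-specific rubrics.

def v_instance(components):
    """Domain-specific quality in [0, 1], e.g. for a testing domain:
    {"coverage": ..., "quality": ..., "stability": ..., "performance": ...}"""
    return sum(components.values()) / len(components)

def v_meta(completeness, effectiveness, reusability, validation):
    """Methodology transferability quality in [0, 1]."""
    return (completeness + effectiveness + reusability + validation) / 4

# Hypothetical mid-experiment scores (both still below the 0.80 target):
vi = v_instance({"coverage": 0.72, "quality": 0.65,
                 "stability": 0.80, "performance": 0.70})
vm = v_meta(completeness=0.60, effectiveness=0.55,
            reusability=0.70, validation=0.50)
print(f"V_instance = {vi:.3f}, V_meta = {vm:.3f}")
```

Each input score should be backed by concrete evidence (coverage reports, speedup measurements), per the documentation quality standards below.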
Convergence Criteria
Methodology complete when:
- ✅ System stable: Agent set unchanged for 2+ iterations
- ✅ Dual threshold: V_instance ≥ 0.80 AND V_meta ≥ 0.80
- ✅ Objectives complete: All planned work finished
- ✅ Diminishing returns: ΔV < 0.02 for 2+ iterations
Alternative patterns:
- Meta-Focused Convergence: V_meta ≥ 0.80, V_instance ≥ 0.55 (when methodology is the primary goal)
- Practical Convergence: Combined quality exceeds metrics, justified partial criteria
Detailed guide: reference/convergence-criteria.md
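The criteria above can be sketched as a simple check (the data structures are illustrative, and for brevity only V_instance deltas are tested for diminishing returns):

```python
# Sketch of the convergence check. History lists hold one V score per
# completed iteration, oldest first.

def converged(v_instance_hist, v_meta_hist,
              agents_stable_iters, objectives_done,
              threshold=0.80, delta=0.02):
    if len(v_instance_hist) < 3:
        return False  # need two deltas to judge diminishing returns
    dual_threshold = (v_instance_hist[-1] >= threshold
                      and v_meta_hist[-1] >= threshold)
    system_stable = agents_stable_iters >= 2
    deltas = [abs(b - a) for a, b in
              zip(v_instance_hist[-3:-1], v_instance_hist[-2:])]
    diminishing = all(d < delta for d in deltas)
    return (dual_threshold and system_stable
            and objectives_done and diminishing)

# Hypothetical four-iteration history that satisfies all criteria:
print(converged([0.78, 0.80, 0.805, 0.81], [0.70, 0.80, 0.82, 0.83],
                agents_stable_iters=2, objectives_done=True))  # True
```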
Iteration Documentation Structure
Every BAIME iteration must produce a comprehensive iteration report following a standardized 10-section structure. This ensures consistent quality, complete knowledge capture, and reproducible methodology development.
Required Sections
See complete example: examples/iteration-documentation-example.md
Use blank template: examples/iteration-structure-template.md
Executive Summary (2-3 paragraphs)
- Iteration focus and objectives
- Key achievements
- Key learnings
- Value scores (V_instance, V_meta)
Pre-Execution Context
- Previous state: M_{n-1}, A_{n-1}, s_{n-1}
- Previous values: V_instance(s_{n-1}), V_meta(s_{n-1}) with component breakdowns
- Primary objectives for this iteration
Work Executed (organized by BAIME phases)
- Phase 1: OBSERVE - Data collection, measurements, gap identification
- Phase 2: CODIFY - Pattern extraction, documentation, knowledge creation
- Phase 3: AUTOMATE - Tool creation, script development, enforcement
- Phase 4: EVALUATE - Metric calculation, value assessment
Value Calculations (detailed, evidence-based)
- V_instance(s_n) with component breakdowns
  - Each component score with concrete evidence
  - Formula application with arithmetic
  - Final score calculation
  - Change from previous iteration (ΔV)
- V_meta(s_n) with rubric assessments
  - Completeness score (checklist-based, with evidence)
  - Effectiveness score (speedup, quality gains, with evidence)
  - Reusability score (transferability estimate, with evidence)
  - Final score calculation
  - Change from previous iteration (ΔV)
Gap Analysis
- Instance layer gaps (what's needed to reach V_instance ≥ 0.80)
  - Prioritized list with estimated effort
- Meta layer gaps (what's needed to reach V_meta ≥ 0.80)
  - Prioritized list with estimated effort
- Estimated work remaining
Convergence Check (systematic criteria evaluation)
- Dual threshold: V_instance ≥ 0.80 AND V_meta ≥ 0.80
- System stability: M_n == M_{n-1} AND A_n == A_{n-1}
- Objectives completeness: All planned work finished
- Diminishing returns: ΔV < 0.02 for 2+ iterations
- Convergence decision: YES/NO with detailed rationale
Evolution Decisions (evidence-driven)
- Agent sufficiency analysis (A_n vs A_{n-1})
  - Each agent's performance assessment
  - Decision: evolution needed or not
  - Rationale with evidence
- Meta-Agent sufficiency analysis (M_n vs M_{n-1})
  - Each capability's effectiveness assessment
  - Decision: evolution needed or not
  - Rationale with evidence
Artifacts Created
- Data files (coverage reports, metrics, measurements)
- Knowledge files (patterns, principles, methodology documents)
- Code changes (implementation, tests, tools)
- Other deliverables
Reflections
- What worked well (successes to repeat)
- What didn't work (failures to avoid)
- Learnings (insights from this iteration)
- Insights for methodology (meta-level learnings)
Conclusion
- Iteration summary
- Key metrics and improvements
- Critical decisions made
- Next steps
- Confidence assessment
File Naming Convention
iterations/iteration-N.md
Where N = 0, 1, 2, 3, ... (starting from 0 for baseline)
Documentation Quality Standards
Evidence-based scores:
- Every value component score must have concrete evidence
- Avoid vague assessments ("seems good" ❌; prefer "72.3% coverage, +5% from baseline" ✅)
- Show arithmetic for all calculations
Honest assessment:
- Low scores early are expected and acceptable (baseline V_meta often 0.15-0.25)
- Don't inflate scores to meet targets
- Document gaps explicitly
- Acknowledge when objectives are not met
Complete coverage:
- All 10 sections must be present
- Don't skip reflections (valuable for meta-learning)
- Don't skip gap analysis (critical for planning)
- Don't skip convergence check (prevents premature convergence)
Tools for Iteration Documentation
Recommended workflow:
1. Copy `examples/iteration-structure-template.md` to `iterations/iteration-N.md`
2. Invoke the `iteration-executor` subagent to execute the iteration with structured documentation
3. Review `examples/iteration-documentation-example.md` for a quality reference
Automated generation: Use iteration-executor subagent to ensure consistent structure and systematic value calculation.
Three-Layer Architecture
BAIME integrates three complementary methodologies into a unified framework:
Layer 1: Core Framework (OCA Cycle)
- Observe → Codify → Automate → Evolve
- Three-tuple output: (O, A_n, M_n)
- Self-referential feedback loop
- Agent coordination
Layer 2: Scientific Foundation (Empirical Methodology)
- Empirical observation tools
- Data-driven pattern extraction
- Hypothesis testing
- Scientific validation
Layer 3: Quantitative Evaluation (Value Optimization)
- Dual-layer value functions (V_instance + V_meta)
- Convergence mathematics
- Agent as gradient, Meta-Agent as Hessian
- Optimization perspective
Why "BAIME"? The framework bootstraps itself: methodologies developed using BAIME can be applied to improve BAIME itself. This self-referential property, combined with AI-agent coordination, makes it uniquely suited for LLM-based development tools.
Detailed guide: reference/three-layer-architecture.md
Proven Results
Validated in 8 experiments:
- ✅ 100% success rate (8/8 converged)
- Average: 4.9 iterations, 9.1 hours
- V_instance average: 0.784 (range: 0.585-0.92)
- V_meta average: 0.840 (range: 0.83-0.877)
- Transferability: 70-95%+
- Speedup: 3-46x vs ad-hoc
Example applications:
- Testing strategy: 15x speedup, 75% → 86% coverage (examples/testing-methodology.md)
- CI/CD pipeline: 2.5-3.5x speedup, 91.7% pattern validation (examples/ci-cd-optimization.md)
- Error recovery: 80% error reduction, 85% transferability
- Observability: 23-46x speedup, 90-95% transferability
- Dependency health: 6x speedup (9h → 1.5h), 88% transferability
- Knowledge transfer: 3-8x onboarding speedup, 95%+ transferability
- Documentation: 47% token cost reduction, 85% transferability
- Technical debt: SQALE quantification, 85% transferability
Usage Templates
Experiment Template
Use templates/experiment-template.md to structure your methodology development:
- README.md structure
- Iteration prompts
- Knowledge extraction format
- Results documentation
Iteration Prompt Template
Use templates/iteration-prompts-template.md to guide each iteration:
- Iteration N objectives
- OCA cycle execution steps
- Value calculation rubrics
- Convergence checks
Automated generation: Use iteration-prompt-designer subagent to create domain-specific iteration prompts.
Iteration Documentation Template
Structure template: examples/iteration-structure-template.md
- 10-section standardized structure
- Blank template ready to copy and fill
- Includes all required components
Complete example: examples/iteration-documentation-example.md
- Real iteration from test strategy experiment
- Shows proper value calculations with evidence
- Demonstrates honest assessment and gap analysis
- Illustrates quality reflections and insights
Automated execution: Use iteration-executor subagent to ensure consistent structure and systematic value calculation.
Quality standards:
- Evidence-based scoring (concrete data, not vague assessments)
- Honest evaluation (low scores acceptable, inflation harmful)
- Complete coverage (all 10 sections required)
- Arithmetic shown (all value calculations with steps)
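To make "arithmetic shown" concrete, a worked value calculation might look like this (the weights and component scores are hypothetical, not BAIME-prescribed):

```
V_instance = 0.4·coverage + 0.3·quality + 0.3·stability
           = 0.4·0.72 + 0.3·0.65 + 0.3·0.80
           = 0.288 + 0.195 + 0.240
           = 0.723
```

Each input (0.72 coverage, 0.65 quality, ...) should be traceable to evidence such as a coverage report or metrics file.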
Common Pitfalls
❌ Don't:
- Use only one methodology layer in isolation (except quick prototyping)
- Predetermine agent evolution path (let specialization emerge from data)
- Force convergence at target iteration count (trust the criteria)
- Inflate value metrics to meet targets (honest assessment critical)
- Skip empirical validation (data-driven decisions only)
✅ Do:
- Start with OCA cycle, add evaluation and validation
- Let agent specialization emerge from domain needs
- Trust the convergence criteria (system knows when done)
- Calculate V(s) honestly based on actual state
- Complete all analysis thoroughly before codifying
Iteration Documentation Pitfalls
❌ Don't:
- Skip iteration documentation (every iteration needs iteration-N.md)
- Calculate V-scores without component breakdowns and evidence
- Use vague assessments ("seems good", "probably 0.7")
- Omit gap analysis or convergence checks
- Document only successes (failures provide valuable learnings)
- Assume convergence without systematic criteria evaluation
- Inflate scores to meet targets (honesty is critical)
- Skip reflections section (meta-learning opportunity)
✅ Do:
- Use the `iteration-executor` subagent for consistent structure
- Provide concrete evidence for each value component
- Show arithmetic for all calculations
- Document both instance and meta layer gaps explicitly
- Include reflections (what worked, didn't work, learnings, insights)
- Be honest about scores (baseline V_meta of 0.20 is normal and acceptable)
- Follow the 10-section structure for every iteration
- Reference iteration documentation example for quality standards
Related Skills
Acceleration techniques (achieve 3-4 iteration convergence):
- rapid-convergence - Fast convergence patterns
- retrospective-validation - Historical data validation
- baseline-quality-assessment - Strong iteration 0
Supporting skills:
- agent-prompt-evolution - Track agent specialization
Domain applications (ready-to-use methodologies):
- testing-strategy - TDD, coverage-driven, fixtures
- error-recovery - Error taxonomy, recovery patterns
- ci-cd-optimization - Quality gates, automation
- observability-instrumentation - Logging, metrics, tracing
- dependency-health - Security, freshness, compliance
- knowledge-transfer - Onboarding, learning paths
- technical-debt-management - SQALE, prioritization
- cross-cutting-concerns - Pattern extraction, enforcement
References
Core documentation:
- Overview - Architecture and philosophy
- OCA Cycle - Detailed process
- Value Functions - Evaluation framework
- Convergence Criteria - When to stop
- Three-Layer Architecture - Framework layers
Quick start:
- Quick Start Guide - Step-by-step tutorial
Examples:
- Testing Methodology - Complete walkthrough
- CI/CD Optimization - Pipeline example
- Error Recovery - Error handling example
Templates:
- Experiment Template - Structure your experiment
- Iteration Prompts - Guide each iteration
Status: ✅ Production-ready | BAIME Framework | 8 experiments | 100% success rate | 95% transferable
Terminology: This skill implements the Bootstrapped AI Methodology Engineering (BAIME) framework. Use "BAIME" when referring to this methodology in documentation, research, or when asking Claude Code for assistance with methodology development.