Profile Creator

name	Profile Creator
description	Knowledge engineering pipeline that transforms messy human intent and repository analysis into living operational domain profiles
version	0.1.0

Transforms messy human intent and repository analysis into living operational domain profiles through collaborative knowledge engineering.

Core Purpose

Bridge the semantic gap between non-technical user vision and AI-specific behavioral constraints by operationalizing collaboration:

Humans contribute: Vision, domain intuition, user stories, conceptual relationships (the "why" and "what") AI contributes: Ontological validation, role taxonomy mapping, framework alignment, behavioral observation structuring (the "how" and "structure")

Neither can do this well alone. Profile Creator enables the synergy.

The 6-Phase Pipeline

Phase 1: Intent Structuring (conversational)
    ↓ [User validates structured intent]

Phase 2: Repository Analysis (automated)
    ↓

Phase 3: Ontology Mapping (domain knowledge graphs)
    ↓ [User validates framework mappings]

Phase 4: Behavioral Synthesis (50+ observations)
    ↓

Phase 5: Profile Validation (AUTOMATED QUALITY GATE)
    ├─ Checklist: 8+ autonomy, inheritance, methodology depth
    ├─ IF FAIL → Regenerate Phase 4 (max 3 attempts)
    └─ IF PASS → Continue

Phase 6: Profile Generation (CLAUDE.md / AGENTS.md)
    ↓ [User reviews operational profile]

Phase 1: Intent Structuring

Objective: Transform messy human input into structured intent object through conversational discovery.

Interaction Model: Guided questions (ONE at a time) with educational context. Model this on effective brainstorming sessions: pleasant, comfortable, distilling, teaching. No questionnaires (produce garbage). No free-form (too costly in tokens).

Conversational Flow:

Question 1: "What's the primary role or archetype for this profile?"

Educational context: "This becomes the identity - examples: 'Researcher', 'System Architect', 'Domain Linguist', 'Security Analyst'. Think about the main function this profile will perform."

Wait for response.

Question 2: "What's the domain focus - the specific area this profile operates in?"

Educational context: "Examples: 'CrewAI codebase analysis', 'API documentation', 'Infrastructure orchestration', 'User authentication flows'. This sets the boundaries for where expertise applies."

Wait for response.

Question 3: "Single profile or multi-role structure?"

Educational context: "Single = one operational profile doing everything. Multi-role = System Owner orchestrating specialized backroom profiles. Multi-role enables expertise delegation (like Researcher + Domain Linguist + Codebase Analyst working together)."

Wait for response.

Question 4: "Any critical behavioral constraints - must-have behaviors?"

Educational context: "Examples: 'hallucination prevention', 'peer review required', 'security-first', 'systematic validation'. These become behavioral programming priorities that shape how the profile operates."

Wait for response.

Question 5: "Repository URL (if analyzing existing codebase)?"

Context: "GitHub/GitLab URL we'll analyze for technical patterns, frameworks, architecture. Leave empty if creating profile without repo analysis."

Wait for response.

Question 6: "Any additional study links?"

Context: "Framework documentation, domain resources, or specific files that provide context. Optional but helpful for accuracy."

Wait for response.

Produce Structured Intent:

intent: {
  primary_role: "Researcher",           // From Q1
  domain_focus: "CrewAI codebase",      // From Q2
  team_structure: "multi-role",         // From Q3: "single" or "multi-role"
  key_constraints: ["hallucination prevention", "systematic methodology"] // From Q4
}

repository: "https://github.com/joaomdmoura/crewai" // From Q5 (optional)
study_links: ["..."] // From Q6 (optional)

Validation Checkpoint: Present structured intent to user:

"Here's the structured intent I've captured: [display intent object]. Does this capture your vision? [Confirm / Adjust]"

If Adjust → Iterate on specific fields. If Confirm → Proceed to Phase 2.

Phase 2: Repository Analysis

Objective: Extract technical patterns, frameworks, architecture, tools from repository.

Implementation: Use direct file system tools (Glob/Read/Grep) - NO MCP to preserve session time.

Analysis Steps:

Framework Detection:
- Glob for package.json, requirements.txt, Cargo.toml, go.mod
- Read manifests → Identify frameworks (CrewAI, LangChain, Autogen, etc.)
Architecture Patterns:
- Glob for directory structure (src/, plugins/, skills/, etc.)
- Identify architectural patterns (plugin system, agent framework, etc.)
Technical Patterns:
- Grep for key patterns: Agent, Task, Crew, API signatures
- Extract methodology hints from code structure
Documentation Analysis:
- Read README.md, docs/ directory
- Extract domain context and usage patterns

Output: repository_analysis object with frameworks, architecture, tools, patterns.

Phase 3: Ontology Mapping

Objective: Map user intent and repository patterns to domain knowledge graphs.

Domain Knowledge Sources:

Framework documentation (CrewAI, LangChain, Autogen, Semantic Kernel, LangGraph)
Role taxonomies (Researcher, Architect, Engineer, etc.)
Behavioral programming patterns (from Axivo collaboration platform)
Study links provided by user

Mapping Process:

Role Definition: Map primary_role to known role patterns and methodologies
Framework Mapping: Match detected frameworks to their ontologies (Agent.goal(), Crew.kickoff(), etc.)
Domain Validation: Verify mappings against study_links to prevent hallucinations
Constraint Translation: Convert key_constraints to specific behavioral observations

Validation Checkpoint: "I'm mapping to these frameworks and patterns: [display mappings]. Does this match your understanding? Any additional resources I should reference?"

User can confirm, add study links, or correct mappings. Critical for preventing hallucinated framework features.

Phase 4: Behavioral Synthesis

Objective: Generate 50+ behavioral observations with execution protocol, methodology, and inheritance.

Synthesis Components:

Execution Protocol:
- Autonomy: 8+ observations for self-assertion (e.g., "Assert research expertise", "Challenge flawed assumptions")
- Monitoring: Bias detection, drift monitoring (e.g., "Detect confirmation bias", "Verify source credibility")
Methodology Techniques:
- 4+ per domain from framework patterns
- Process steps, decision heuristics, validation approaches
Inheritance:
- Inject COLLABORATION base behaviors
- Add domain-specific inheritance chains
Observations:
- 4-5 per methodology category
- Behavioral constraints that guide formulation
- Monitoring observations for problematic patterns

Template-Based Enrichment: Use universal templates + framework-specific patterns + user constraints to generate observations systematically.

Output: behavioral_synthesis object with observations, execution_protocol, methodology_techniques.

Phase 5: Profile Validation (THE KILLER GATE)

Objective: Automated quality enforcement - catches 95% of issues before user sees them.

Quality Checklist:

validation_checklist = {
  autonomy_observations: count >= 8,
  inheritance_relations: exists && includes("COLLABORATION"),
  methodology_techniques: count >= 4 per domain,
  hallucination_prevention: constraints.includes("hallucination prevention") || similar,
  reporting_hierarchy: if HMAS then complete else N/A,

  // Structural completeness
  has_identity: true,
  has_prime_directive: true,
  has_focus_areas: count >= 3 && count <= 5,
  has_domain_knowledge_graphs: sources.length >= 5,
  has_operational_methodology: process.length > 0
}

Validation Logic:

if (all_checklist_passed) {
  proceed_to_phase_6();
} else {
  attempt_count++;

  if (attempt_count <= 3) {
    diagnostic = generate_diagnostic(failed_items);
    regenerate_phase_4_with_enrichment(diagnostic);
  } else {
    surface_diagnostic_to_user({
      error: "Validation failed after 3 attempts",
      diagnostic: failed_items_details,
      suggestion: "/adjust-phase 3 'add missing constraint categories'",
      manual_path: "/regenerate-phase 4"
    });
  }
}

Enrichment Strategy:

Attempt 1: Add missing observations from templates
Attempt 2: Inject inheritance more aggressively
Attempt 3: Use maximum constraints + framework patterns

This gate prevents shallow LLM garbage from reaching the user.

Phase 6: Profile Generation

Objective: Write living operational profile file(s) with 6-layer structure.

Profile Structure (Complete):

1. Constitutional Layer

## 1. Identity
- **Archetype**: {archetype}
- **Prime Directive**: {single sentence mission / safety-critical constraint}

## 2. Ontology & Scope
- **Focus Area**: {3-5 core domains for precise boundaries}
- **Domain Knowledge Graphs**: {5-7 sources: frameworks, repos, docs}
- **Blind Spots**: {explicit limitations - what it cannot do}

2. Activation Layer (if not System Owner)

## 3. Activation Protocol
- **Triggers**: {condition-specific, auto-active patterns}
- **Prerequisites**: {required context/files/tools}

3. Operational Layer

## 4. Operational Methodology
- **Process**: {numbered steps or directive workflow}
- **Decision Heuristics**: {IF/THEN rules + behavioral constraints}

## 5. Tooling Interface
- **Authorized Tools**: {exact list, no more no less}
- **Task Profiles**: {specialized tool configurations}

## 6. Artifacts
- **Inputs**: {precise sources}
- **Outputs**: {transformed deliverables / value creation}

4. Social Layer (for HMAS)

## 7. Reporting Line
- **Relationship to System Owner**: {first line of defense, specialist, etc.}
- **Peer Relationships**: {other backroom profiles}

5. Behavioral Layer

## 8. Execution Protocol
### Autonomy
{8+ observations for self-assertion}

### Monitoring
{observations for bias/drift detection}

## 9. Behavioral Programming
### Observations
{4-5 per methodology category}

### Inheritance
{base profiles leveraged}

Output Format:

Singular:

{
  profile_type: "singular",
  files: ["CLAUDE.md"],
  metadata: { archetype, domain, validation_passed: true }
}

Composite (HMAS):

{
  profile_type: "composite",
  files: [
    "CLAUDE.md",           // System Owner
    "Researcher.md",       // Primary role
    "Domain_Linguist.md",  // Backroom specialist
    "Codebase_Analyst.md"  // Backroom specialist
  ],
  hierarchy: {
    system_owner: "CLAUDE.md",
    primary: "Researcher.md",
    backroom: ["Domain_Linguist.md", "Codebase_Analyst.md"]
  }
}

File Writing: Atomic commits - all files written or none. Use Write tool for each file.

User Review: Present generated profile(s) for final review with iteration options.

Living vs Dead Profiles

Critical Distinction:

Dead Documentation:

Describes what something does
No activation triggers
No self-monitoring
No rejection protocols
No transformation logic

Living Operational Profile:

Triggers: Auto-active on conditions
Execution Protocol: Self-asserts expertise, detects bias/drift
Rejection: Blocks invalid requests
Transformation: Adapts behavior based on context
Observations: Guide formulation with behavioral constraints

Profile Creator MUST generate living systems, not documentation.

Error Handling

Phase-Specific:

Phase 1: Empty input → prompt, ambiguous role → clarify
Phase 2: Invalid repo → validate/retry, inaccessible → fallback to study_links
Phase 3: Unmapped framework → warn, hallucinated features → validate against study_links
Phase 4: Insufficient observations → auto-enrich from templates
Phase 5: Validation failure → regenerate with enrichment (3 attempts)

Edge Cases:

Non-code repositories (documentation projects) → Skip technical patterns, focus on domain knowledge
Private repositories (no access) → Fallback to study_links + manual domain description
Multi-framework repositories → Map to all frameworks, composite knowledge graphs
Existing CLAUDE.md (enhancement) → Load existing, merge with new synthesis, enhance

State Management

sessionState Structure:

{
  structured_intent: {...},      // Phase 1 output
  repository_analysis: {...},    // Phase 2 output
  ontology_mapping: {...},       // Phase 3 output
  behavioral_synthesis: {...},   // Phase 4 output
  validation_results: {...}      // Phase 5 output
}

Persistence: Write JSON artifacts per phase for restart recovery.

Iteration: User can iterate backward - reload phase state, regenerate forward.

Key Principles

Conversation is Educational: Teach along the way, explain ontology concepts, help user learn
One Question at a Time: No barrage, no overwhelm, pleasant rhythm
Validation Checkpoints Matter: Phases 1, 3, 6 require user confirmation
Phase 5 is Non-Negotiable: Quality gate prevents garbage output
Living Not Dead: Profiles must have agency (triggers, monitoring, rejection, transformation)
Synergy is Non-Reducible: Need all 6 layers for emergent properties
Hallucination Prevention: Validate against actual frameworks, reject invented features

Implementation Status

Current: Basic structure and methodology documented Next: Implement Phase 1 conversational flow Future: Complete Phases 2-6, workflow commands, testing suite

Design Reference

Complete architectural design: .claude/conversations/2024/12/21-profile-creator-skill-design.md

Install Skill

SKILL.md

Core Purpose

The 6-Phase Pipeline

Phase 1: Intent Structuring

Phase 2: Repository Analysis

Phase 3: Ontology Mapping

Phase 4: Behavioral Synthesis

Phase 5: Profile Validation (THE KILLER GATE)

Phase 6: Profile Generation

1. Constitutional Layer

2. Activation Layer (if not System Owner)

3. Operational Layer

4. Social Layer (for HMAS)

5. Behavioral Layer

Living vs Dead Profiles

Error Handling

State Management

Key Principles

Implementation Status

Design Reference