| name | ontology-phase-2-analyze |
| description | Phase 2 of Ontology Builder Pipeline. AI acts as domain SME to analyze raw inputs, extract entities/workflows/rules, fill knowledge gaps using market expertise. Use after Phase 1 ingestion is complete. |
Phase 2: Analyze as SME
Act as a domain Subject Matter Expert (SME) to analyze inputs and extract structured knowledge.
Trigger
Execute when:
- Phase 1 ingestion-report.md is available
- Analysis refresh requested
SME Mindset
You are a Senior Business Analyst + Domain Expert. Your expertise adapts to the domain specified in project-context.md.
Your job: Transform raw, messy inputs into clean, structured domain knowledge.
Domain Expertise Activation
Step 0: Identify and Immerse in Domain
Read project-context.md to identify:
- Domain type (HR, Finance, Sales, Operations, Healthcare, etc.)
- Industry vertical (if specified)
- Region (for regulatory context)
- Reference systems mentioned
Activate SME Mode for the identified domain:
I am now acting as a Senior SME in [DOMAIN] with:
- 10+ years hands-on experience
- Deep knowledge of industry-leading systems
- Understanding of best practices and standards
- Familiarity with regional regulations
If domain is unfamiliar or specialized:
- Use web search to research current industry standards
- Look up major enterprise systems in that domain
- Research regional compliance requirements
- Find common entity patterns and workflows
SME Knowledge Areas (Dynamic)
For ANY domain, you should be able to reason about:
| Area | What to Consider |
|---|---|
| Standard Entities | What objects typically exist in this domain? |
| Common Workflows | What are the standard business processes? |
| Industry Systems | What software systems are leaders in this space? |
| Regulatory | What compliance requirements apply (by region)? |
| Best Practices | What patterns are considered best practice? |
| Anti-Patterns | What mistakes should be avoided? |
Example Domain Activation
# If project-context says: Domain = HR, Region = Vietnam
SME Activation:
  domain: "Human Capital Management (HCM)"
  reference_systems:
    - "Workday HCM"
    - "SAP SuccessFactors"
    - "Oracle HCM Cloud"
  standards:
    - "SHRM guidelines"
    - "ISO 30414 (HR metrics)"
  regulations:
    - "Vietnam Labor Code 2019"
    - "Social Insurance Law"
    - "Decree 145/2020/ND-CP (Labor Code guidance on working conditions and labor relations)"
  key_patterns:
    - "Position-based vs Job-based staffing"
    - "Accrual-based leave management"
    - "Multi-level approval workflows"
# If project-context says: Domain = Finance, Region = Vietnam
SME Activation:
  domain: "Financial Management"
  reference_systems:
    - "SAP S/4HANA FI/CO"
    - "Oracle Financials Cloud"
    - "MISA (Vietnam local)"
  standards:
    - "Vietnam Accounting Standards (VAS)"
    - "Circular 200/2014"
    - "IFRS alignment"
  regulations:
    - "E-invoice (Decree 123/2020)"
    - "VAT regulations"
    - "Transfer pricing rules"
  key_patterns:
    - "Procure-to-Pay (P2P)"
    - "Order-to-Cash (O2C)"
    - "Month-end close process"
When to Use Web Search
Use web search when you need to:
- Verify current regulations or standards
- Research unfamiliar domain terminology
- Find latest best practices
- Understand specific regional requirements
- Research competitor/reference systems
Search Strategy:
1. "[Domain] enterprise software best practices"
2. "[Domain] [Region] compliance requirements [current year]"
3. "[Reference System] [Entity] data model"
4. "[Industry] standard workflows"
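The query templates above can be filled mechanically once the domain and region are known. The following is a minimal Python sketch, not part of the pipeline itself: the function name `build_search_queries` and its parameters are hypothetical, and any template whose placeholders are missing is simply skipped.

```python
def build_search_queries(domain, region=None, reference_system=None,
                         entity=None, industry=None, year=None):
    """Fill the search-strategy templates; skip templates with missing placeholders."""
    queries = [f"{domain} enterprise software best practices"]
    if region:
        query = f"{domain} {region} compliance requirements"
        # The year suffix is optional; pass the current year to favor recent results.
        queries.append(f"{query} {year}" if year else query)
    if reference_system and entity:
        queries.append(f"{reference_system} {entity} data model")
    if industry:
        queries.append(f"{industry} standard workflows")
    return queries


if __name__ == "__main__":
    for q in build_search_queries("HR", region="Vietnam",
                                  reference_system="Workday HCM", entity="Position"):
        print(q)
```

Skipping incomplete templates keeps the search list short when project-context.md only specifies a domain.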
Process
Step 1: Load Context
Read:
- _output/_logs/ingestion-report.md
- _input/project-context.md
- _input/domain-hints.md (if exists)
Establish:
- Domain type (HR, Finance, Sales, etc.)
- Regional context (for regulations)
- Existing system landscape
- Key constraints
Step 2: Deep Read All Sources
For each file listed in the ingestion report, perform a deep analysis:
analysis_per_file:
  file: [path]
  entities_extracted:
    - name: [EntityName]
      evidence: "[quote from source]"
      confidence: [HIGH|MEDIUM|LOW]
      attributes_mentioned: [list]
  workflows_extracted:
    - name: [WorkflowName]
      evidence: "[quote from source]"
      confidence: [HIGH|MEDIUM|LOW]
      actors_mentioned: [list]
      steps_mentioned: [list]
  rules_extracted:
    - description: [rule description]
      evidence: "[quote from source]"
      confidence: [HIGH|MEDIUM|LOW]
      applies_to: [entity or workflow]
  questions_raised:
    - question: [what's unclear]
      context: [why it matters]
Step 3: Consolidate Findings
Merge extractions from all files:
3.1 Entity Consolidation
For each potential entity:
- Merge mentions from different sources
- Resolve naming conflicts (use most common or clearest name)
- Combine attributes from all sources
- Assign confidence score
consolidated_entity:
  name: [EntityName]
  aliases: [other names used in sources]
  sources: [list of files mentioning this]
  confidence: [HIGH|MEDIUM|LOW]
  classification: [CORE|VALUE_OBJECT|REFERENCE|TRANSACTION]
  description: [synthesized from sources]
  attributes:
    - name: [attr]
      type: [inferred type]
      source: [which file mentioned it]
  relationships:
    - target: [OtherEntity]
      type: [relationship type]
      evidence: [quote]
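The merge described in 3.1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline's implementation: the `consolidate_entities` function, the `alias_map` argument (a hand-built mapping from alternate names to a grouping key), and the choice to keep the strongest confidence among merged mentions are all hypothetical. The canonical name is the most common one, which is one of the two options the step allows.

```python
from collections import Counter, defaultdict

# Ordering used to pick the strongest confidence among merged mentions (assumption).
CONFIDENCE_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def consolidate_entities(mentions, alias_map=None):
    """Merge per-file entity mentions into one record per entity."""
    alias_map = alias_map or {}
    groups = defaultdict(list)
    for mention in mentions:
        # Mentions whose names map to the same key are treated as one entity.
        key = alias_map.get(mention["name"], mention["name"])
        groups[key].append(mention)

    consolidated = []
    for group in groups.values():
        name_counts = Counter(m["name"] for m in group)
        canonical = name_counts.most_common(1)[0][0]  # most common name wins
        attributes = {}
        for m in group:
            for attr in m.get("attributes", []):
                attributes.setdefault(attr, m["file"])  # first file to mention it
        consolidated.append({
            "name": canonical,
            "aliases": sorted(n for n in name_counts if n != canonical),
            "sources": sorted({m["file"] for m in group}),
            "confidence": max((m["confidence"] for m in group),
                              key=CONFIDENCE_RANK.__getitem__),
            "attributes": [{"name": a, "source": f} for a, f in attributes.items()],
        })
    return consolidated
```

Classification, description, and relationships are left out of the sketch because they require judgment rather than mechanical merging.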
3.2 Workflow Consolidation
For each potential workflow:
- Merge related actions into coherent workflows
- Identify actors and triggers
- Map to related entities
- Assign confidence score
3.3 Business Rule Consolidation
For each rule:
- Link to entity or workflow
- Determine rule type (validation, constraint, calculation)
- Assign ID and confidence
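As a rough sketch of the ID assignment, the snippet below numbers rules sequentially in the `BR-001` style used by the report's Business Rules Catalog. The function name `build_rule_catalog`, the `type` key, and the decision to reject unknown rule types are assumptions made for illustration.

```python
# Rule types named in the consolidation step.
RULE_TYPES = ("validation", "constraint", "calculation")


def build_rule_catalog(rules, prefix="BR"):
    """Assign sequential IDs (BR-001, BR-002, ...) to consolidated rules."""
    catalog = []
    for number, rule in enumerate(rules, start=1):
        if rule["type"] not in RULE_TYPES:
            raise ValueError(f"unknown rule type: {rule['type']!r}")
        # Put the ID first so it leads each catalog entry.
        catalog.append({"id": f"{prefix}-{number:03d}", **rule})
    return catalog


if __name__ == "__main__":
    rules = [
        {"description": "Available balance >= Requested days",
         "type": "validation", "applies_to": "LeaveRequest", "confidence": "ASSUMED"},
    ]
    print(build_rule_catalog(rules))
```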
Step 4: Gap Analysis
Identify what's missing based on domain expertise:
4.1 Entity Gaps
As SME, I expect these entities in [domain] but don't see them:
- [Expected entity 1] - [why expected]
- [Expected entity 2] - [why expected]
4.2 Workflow Gaps
Standard workflows I expect but aren't documented:
- [Expected workflow 1] - [why expected]
- [Expected workflow 2] - [why expected]
4.3 Attribute Gaps
Common attributes missing from entities:
- [Entity] missing [attribute] - standard in [reference system]
4.4 Rule Gaps
Standard business rules not mentioned:
- [Rule description] - common in [domain]
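Entity gap detection amounts to comparing an expected checklist against what was actually extracted. The sketch below assumes a hypothetical `expected` mapping (entity name to the reason it is expected) prepared by the SME; the function name and the case-insensitive matching are illustrative choices only. The same comparison could be applied to workflows, attributes, and rules.

```python
def find_entity_gaps(expected, found_names):
    """Return expected entities that do not appear among the extracted names.

    expected: dict mapping entity name -> why the SME expects it.
    found_names: names and aliases found during consolidation.
    """
    found = {name.casefold() for name in found_names}
    return [
        {"entity": name, "why_expected": reason}
        for name, reason in expected.items()
        if name.casefold() not in found
    ]


if __name__ == "__main__":
    expected = {
        "Employee": "core record in any HR system",
        "LeaveBalance": "needed for accrual-based leave management",
    }
    print(find_entity_gaps(expected, ["employee", "LeaveRequest"]))
```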
Step 5: Fill Gaps with Domain Expertise
For each gap identified, apply your SME expertise:
gap_resolution:
  gap: [what's missing]
  resolution: [how I'm filling it]
  reasoning: "As SME in [domain], this is standard because [reason]"
  reference: "[Industry system] pattern" or "[Standard/Regulation]"
  confidence: ASSUMED
  assumption: [explicit assumption being made]
  needs_validation: [true|false]
SME Gap-Filling Strategies
Strategy 1: Industry Pattern Recognition
"In [domain], the standard approach for [problem] is [solution].
This is how [Reference System 1] and [Reference System 2] handle it."
Strategy 2: Regulatory Inference
"Based on [Region] [Regulation], this entity must include [attribute]
to comply with [requirement]."
Strategy 3: Best Practice Application
"Industry best practice for [scenario] is [pattern].
This avoids common issues like [anti-pattern]."
Strategy 4: Web Research (when needed)
If unfamiliar with domain-specific pattern:
1. Search for "[domain] [entity/workflow] best practices"
2. Search for "[reference system] [feature] documentation"
3. Synthesize findings into gap resolution
Gap-Filling Examples
Example 1: Missing Entity Attributes
gap: "LeaveRequest entity missing approval tracking"
resolution: "Add approver_id, approved_at, approval_comments"
reasoning: "Standard workflow pattern - all approval-based entities
need to track who approved, when, and why"
reference: "Workday Absence Management pattern"
Example 2: Missing Business Rule
gap: "No rule for leave balance validation"
resolution: "Add rule: Available balance >= Requested days"
reasoning: "Fundamental constraint in all leave management systems
to prevent negative balances"
reference: "Universal pattern across SAP, Workday, Oracle"
Example 3: Missing Workflow Step
gap: "Approval workflow missing escalation"
resolution: "Add auto-escalation after [configurable] days"
reasoning: "Prevents requests from being stuck indefinitely.
Standard in enterprise approval workflows."
reference: "ServiceNow, Workday approval patterns"
Example 4: Regional Compliance
gap: "Missing mandatory fields for Vietnam labor compliance"
resolution: "Add social_insurance_number, tax_code to Employee"
reasoning: "Required by Vietnam Labor Code for employment records"
reference: "Vietnam Labor Code 2019, Decree 145/2020"
# If uncertain, verify with web search
Step 6: Resolve Conflicts
When sources conflict:
- Date conflict: Use most recent source
- Stakeholder conflict: Flag for human review
- Detail conflict: Use most detailed version
- Terminology conflict: Create alias mapping
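The four conflict rules can be expressed as a small dispatcher. This is a sketch under assumptions: the candidate dictionaries with `value`, `source`, and `dated` keys are hypothetical, text length is used as a crude stand-in for "most detailed", and the first candidate is arbitrarily treated as the canonical term.

```python
from datetime import date


def resolve_conflict(kind, candidates):
    """Apply the conflict rule for `kind` to a list of candidate statements."""
    if kind == "date":
        # Use the most recent source.
        winner = max(candidates, key=lambda c: c["dated"])
        return {"action": "use", "value": winner["value"], "source": winner["source"]}
    if kind == "stakeholder":
        # Never pick a side automatically; hand all options to a human.
        return {"action": "flag_for_review", "options": [c["value"] for c in candidates]}
    if kind == "detail":
        # Assumption: longer text is treated as the more detailed version.
        winner = max(candidates, key=lambda c: len(c["value"]))
        return {"action": "use", "value": winner["value"], "source": winner["source"]}
    if kind == "terminology":
        # Assumption: the first candidate is canonical, the rest become aliases.
        return {"action": "alias", "canonical": candidates[0]["value"],
                "aliases": [c["value"] for c in candidates[1:]]}
    raise ValueError(f"unknown conflict kind: {kind!r}")


if __name__ == "__main__":
    print(resolve_conflict("date", [
        {"value": "14 days", "source": "old.md", "dated": date(2022, 1, 1)},
        {"value": "12 days", "source": "new.md", "dated": date(2023, 6, 1)},
    ]))
```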
Step 7: Ask Clarifying Questions (If Needed)
If critical information is genuinely missing and cannot be assumed:
## Questions for Stakeholder
Before proceeding, please clarify:
1. **[Question]**
   - Context: [why this matters]
   - My assumption if no answer: [what I'll assume]
2. **[Question]**
   - Context: [why this matters]
   - My assumption if no answer: [what I'll assume]
Important: Only ask questions whose answers:
- Are critical for correctness
- Cannot be reasonably assumed
- Would significantly change the output
Prefer making documented assumptions over blocking progress.
Step 8: Generate Analysis Report
Output: _output/_logs/analysis-report.md
# Analysis Report
**Generated**: [timestamp]
**Analyst**: AI SME Agent
**Domain Expertise Applied**: [domain type]
## Executive Summary
- **Entities identified**: [N] ([confidence breakdown])
- **Workflows identified**: [N] ([confidence breakdown])
- **Business rules identified**: [N]
- **Gaps filled with market knowledge**: [N]
- **Assumptions made**: [N]
## Consolidated Entities
### [EntityName] [CONFIDENCE]
**Classification**: [type]
**Sources**: [list]
**Description**: [description]
**Attributes**:
| Attribute | Type | Required | Source |
|-----------|------|----------|--------|
| [attr] | [type] | [Y/N] | [source] |
**Relationships**:
| Target | Cardinality | Description |
|--------|-------------|-------------|
| [entity] | [card] | [desc] |
**Business Rules**:
- BR-XXX: [rule]
---
[Repeat for each entity]
## Consolidated Workflows
### [WorkflowName] [CONFIDENCE]
**Classification**: [CORE|SUPPORT|INTEGRATION]
**Trigger**: [trigger]
**Actors**: [list]
**Related Entities**: [list]
**High-Level Steps**:
1. [step]
2. [step]
3. [step]
**Business Rules Applied**: [list]
---
[Repeat for each workflow]
## Business Rules Catalog
| ID | Description | Applies To | Source | Confidence |
|----|-------------|------------|--------|------------|
| BR-001 | [desc] | [entity/workflow] | [source] | [conf] |
## Gap Analysis
### Gaps Filled with Market Knowledge
| Gap | Resolution | Source | Assumption |
|-----|------------|--------|------------|
| [gap] | [resolution] | [market ref] | [assumption] |
### Remaining Gaps (Need Human Input)
| Gap | Impact | Suggested Resolution |
|-----|--------|---------------------|
| [gap] | [impact] | [suggestion] |
## Assumptions Made
| # | Assumption | Rationale | Risk if Wrong |
|---|------------|-----------|---------------|
| 1 | [assumption] | [rationale] | [risk] |
## Questions for Stakeholder (Optional)
[Only if critical questions exist]
## Confidence Summary
| Category | HIGH | MEDIUM | LOW | ASSUMED |
|----------|------|--------|-----|---------|
| Entities | [N] | [N] | [N] | [N] |
| Workflows | [N] | [N] | [N] | [N] |
| Rules | [N] | [N] | [N] | [N] |
## Ready for Phase 3
Analysis complete. Proceed to DRD synthesis.
Output
- _output/_logs/analysis-report.md
- _output/_logs/gate-2-manifest.yaml (verification manifest)
- Ready for Phase 3: Synthesize
Gate 2: Self-Verification
Before completing Phase 2, generate the verification manifest:
# _output/_logs/gate-2-manifest.yaml
gate: 2
name: "Post-Analysis Verification"
timestamp: "[ISO timestamp]"
structural_checks:
  - check: "analysis-report.md exists"
    status: PASS
  - check: "Entities section present"
    status: PASS | FAIL
    entity_count: [N]
  - check: "Workflows section present"
    status: PASS | FAIL
    workflow_count: [N]
  - check: "Business rules section present"
    status: PASS | FAIL
    rule_count: [N]
consistency_checks:
  - check: "All entities have classification"
    status: PASS | FAIL
    unclassified: []
  - check: "All entities have confidence level"
    status: PASS | FAIL
  - check: "Confidence values valid"
    status: PASS | FAIL
    invalid_values: []
traceability_checks:
  - check: "Every entity has source reference"
    status: PASS | FAIL
    entities_without_source: []
  - check: "Every workflow has source reference"
    status: PASS | FAIL
    workflows_without_source: []
  - check: "Assumed items documented in assumptions section"
    status: PASS | FAIL
    undocumented_assumptions: []
coverage_metrics:
  input_files_processed: "[N]/[Total]"
  entities_from_input: [N]
  entities_assumed: [N]
  assumption_ratio: "[%]"
result:
  status: PASS | FAIL | WARN
  blocking_failures: []
  warnings: []
  proceed_to_next_phase: true | false
Verification Rules:
- FAIL if any entity has no source AND is not marked ASSUMED
- WARN if assumption_ratio > 30%
- Only proceed if status: PASS or status: WARN
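The verification rules can be checked mechanically. The sketch below is one possible reading, not the pipeline's actual gate logic: the function name `evaluate_gate` and the entity dictionaries are hypothetical, and `assumption_ratio` is assumed to mean assumed entities divided by all entities.

```python
def evaluate_gate(entities, warn_threshold=0.30):
    """Apply the Gate 2 verification rules to a list of consolidated entities."""
    # FAIL rule: an entity with no source must be explicitly marked ASSUMED.
    blocking = [
        f"{e['name']}: no source and not marked ASSUMED"
        for e in entities
        if not e.get("sources") and e.get("confidence") != "ASSUMED"
    ]
    assumed = sum(1 for e in entities if e.get("confidence") == "ASSUMED")
    # Assumption: ratio = assumed entities / all entities (0 when there are none).
    ratio = assumed / len(entities) if entities else 0.0
    warnings = []
    if ratio > warn_threshold:
        warnings.append(f"assumption_ratio {ratio:.0%} exceeds {warn_threshold:.0%}")
    status = "FAIL" if blocking else "WARN" if warnings else "PASS"
    return {
        "status": status,
        "blocking_failures": blocking,
        "warnings": warnings,
        "assumption_ratio": ratio,
        # Phase 3 may start on PASS or WARN, never on FAIL.
        "proceed_to_next_phase": status in ("PASS", "WARN"),
    }
```

The returned keys mirror the `result` section of the manifest, so the dictionary could be written into gate-2-manifest.yaml with little extra mapping.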
Key Behaviors
DO:
- Make reasonable assumptions based on domain expertise
- Document every assumption explicitly
- Use market knowledge to fill gaps
- Prefer progress over perfection
- Be specific about confidence levels
DON'T:
- Block on minor missing details
- Invent arbitrary numbers without flagging them
- Ignore conflicting information
- Make assumptions without documenting them
Next Phase
After completing analysis:
→ Load phase-3-synthesize/SKILL.md
→ Pass analysis-report.md as input