---
name: analyzing-unknown-codebases
description: Analyze unfamiliar codebases systematically to produce subsystem catalog entries - emphasizes strict contract compliance and confidence marking
---
# Analyzing Unknown Codebases
## Purpose
Systematically analyze unfamiliar code to identify subsystems, components, dependencies, and architectural patterns. Produce catalog entries that follow EXACT output contracts.
## When to Use
- Coordinator delegates subsystem analysis task
- Task specifies reading from workspace and appending to `02-subsystem-catalog.md`
- You need to analyze code you haven't seen before
- Output must integrate with downstream tooling (validation, diagram generation)
## Critical Principle: Contract Compliance
Your analysis quality doesn't matter if you violate the output contract.
Common rationalization: "I'll add helpful extra sections to improve clarity"
Reality: Extra sections break downstream tools. The coordinator expects EXACT format for parsing and validation. Your job is to follow the specification, not improve it.
## Output Contract (MANDATORY)
When writing to `02-subsystem-catalog.md`, append EXACTLY this format:
## [Subsystem Name]
**Location:** `path/to/subsystem/`
**Responsibility:** [One sentence describing what this subsystem does]
**Key Components:**
- `file1.ext` - [Brief description]
- `file2.ext` - [Brief description]
- `file3.ext` - [Brief description]
**Dependencies:**
- Inbound: [Subsystems that depend on this one]
- Outbound: [Subsystems this one depends on]
**Patterns Observed:**
- [Pattern 1]
- [Pattern 2]
**Concerns:**
- [Any issues, gaps, or technical debt observed]
**Confidence:** [High/Medium/Low] - [Brief reasoning]
---
If no concerns exist, write:
**Concerns:**
- None observed
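To make the required ordering concrete, here is a minimal Python sketch (illustrative only, not part of the contract) that renders an entry with every section in contract order; all parameter names are hypothetical:

```python
def render_entry(name: str, location: str, responsibility: str,
                 components: list[str], inbound: str, outbound: str,
                 patterns: list[str], concerns: list[str],
                 confidence: str, reasoning: str) -> str:
    """Render one catalog entry with all sections in exact contract order."""
    concerns = concerns or ["None observed"]  # never omit the Concerns section
    return "\n".join([
        f"## {name}",
        f"**Location:** `{location}`",
        f"**Responsibility:** {responsibility}",
        "**Key Components:**",
        *[f"- {c}" for c in components],
        "**Dependencies:**",
        f"- Inbound: {inbound}",
        f"- Outbound: {outbound}",
        "**Patterns Observed:**",
        *[f"- {p}" for p in patterns],
        "**Concerns:**",
        *[f"- {c}" for c in concerns],
        f"**Confidence:** {confidence} - {reasoning}",
        "---",
    ]) + "\n"
```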
**CRITICAL COMPLIANCE RULES:**
- ❌ Add extra sections ("Integration Points", "Recommendations", "Files", etc.)
- ❌ Change section names or reorder them
- ❌ Write to a separate file (must append to `02-subsystem-catalog.md`)
- ❌ Skip sections (include ALL sections - use "None observed" if empty)
- ✅ Copy the template structure EXACTLY
- ✅ Keep section order: Location → Responsibility → Key Components → Dependencies → Patterns → Concerns → Confidence
The contract is a specification, not a minimum. Extra sections break downstream validation.
## Example: Complete Compliant Entry
Here's what a correctly formatted entry looks like:
## Authentication Service
**Location:** `/src/services/auth/`
**Responsibility:** Handles user authentication, session management, and JWT token generation for API access.
**Key Components:**
- `auth_handler.py` - Main authentication logic with login/logout endpoints (342 lines)
- `token_manager.py` - JWT token generation and validation (156 lines)
- `session_store.py` - Redis-backed session storage (98 lines)
**Dependencies:**
- Inbound: API Gateway, User Service
- Outbound: Database Layer, Cache Service, Logging Service
**Patterns Observed:**
- Dependency injection for testability (all external services injected)
- Token refresh pattern with sliding expiration
- Audit logging for all authentication events
**Concerns:**
- None observed
**Confidence:** High - Clear entry points, documented API, test coverage validates behavior
---
This is EXACTLY what your output should look like. No more, no less.
## Systematic Analysis Approach
### Step 1: Read Task Specification
Your task file (`temp/task-[name].md`) specifies:
- What to analyze (scope: directories, plugins, services)
- Where to read context (`01-discovery-findings.md`)
- Where to write output (`02-subsystem-catalog.md` - append)
- Expected format (the contract above)
Read these files FIRST before analyzing code.
### Step 2: Layered Exploration
Use this proven approach from baseline testing:
1. Metadata layer - Read `plugin.json`, `package.json`, `setup.py`
2. Structure layer - Examine directory organization
3. Router layer - Find and read router/index files (often named "using-X")
4. Sampling layer - Read 3-5 representative files
5. Quantitative layer - Use line counts as depth indicators
Why this order works:
- Metadata gives overview without code diving
- Structure reveals organization philosophy
- Routers often catalog all components
- Sampling verifies patterns
- Quantitative data supports claims
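A minimal sketch of the metadata, structure, and quantitative layers, assuming a Python environment; the subsystem path is hypothetical and comes from your task spec:

```python
import json
from pathlib import Path

root = Path("src/services/auth")  # hypothetical scope from the task spec

# Metadata layer: read manifests first, before any code diving.
for manifest in ("plugin.json", "package.json"):
    path = root / manifest
    if path.exists():
        meta = json.loads(path.read_text())
        print(manifest, "->", meta.get("name"), "-", meta.get("description"))

# Structure + quantitative layers: file listing with line counts as depth indicators.
for f in sorted(root.rglob("*.py")):
    line_count = len(f.read_text(errors="replace").splitlines())
    print(f"{f.relative_to(root)} - {line_count} lines")
```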
### Step 3: Mark Confidence Explicitly
Every output MUST include a confidence level with reasoning.
**High confidence** - Router skill provided a catalog, verified with sampling:
**Confidence:** High - Router skill listed all 10 components; sampling 4 of them confirmed the patterns
**Medium confidence** - No router, but clear structure plus sampling:
**Confidence:** Medium - No router catalog; inferred from directory structure + 5 file samples
**Low confidence** - Incomplete, placeholders, or unclear organization:
**Confidence:** Low - Several SKILL.md files missing; test artifacts suggest work-in-progress
### Step 4: Distinguish States Clearly
When analyzing codebases with mixed completion:
**Complete** - Skill file exists, has content, passes a basic read test
- `skill-name/SKILL.md` - Complete skill (1,234 lines)
**Placeholder** - Skill file exists but is a stub/template
- `skill-name/SKILL.md` - Placeholder (12 lines, template only)
**Planned** - Referenced in the router but no file exists
- `skill-name` - Planned (referenced in router, not implemented)
**TDD artifacts** - Test scenarios, baseline results (these ARE documentation)
- `test-scenarios.md` - TDD test scenarios (RED phase)
- `baseline-results.md` - Baseline behavior documentation
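A rough sketch of how these states could be distinguished mechanically; the 30-line placeholder cutoff and the function shape are assumptions, not rules from the contract:

```python
from pathlib import Path

PLACEHOLDER_MAX_LINES = 30  # hypothetical cutoff; tune per codebase

def classify_skill(name: str, root: Path, router_refs: set[str]) -> str:
    """Classify a skill as Complete, Placeholder, Planned, or Absent."""
    skill_file = root / name / "SKILL.md"
    if not skill_file.exists():
        # Referenced in the router but not implemented yet -> Planned.
        return "Planned" if name in router_refs else "Absent"
    lines = len(skill_file.read_text(errors="replace").splitlines())
    return "Placeholder" if lines <= PLACEHOLDER_MAX_LINES else "Complete"
```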
### Step 5: Write Output (Contract Compliance)
Before writing:
- Prepare your entry in EXACT contract format from the template above
- Copy the structure - don't paraphrase or reorganize
- Triple-check you have ALL sections in correct order
When writing:
- Target file: `02-subsystem-catalog.md` in the workspace directory
- Operation: Append your entry (create the file if yours is the first entry, append if the file exists)
- Method:
  - If the file exists: Read the current content, then Write with the original + your entry
  - If the file doesn't exist: Write your entry directly
- Format: Follow contract sections in exact order
- Completeness: Include ALL sections - use "None observed" for empty Concerns
DO NOT create separate files (e.g., `subsystem-X-analysis.md`). The coordinator expects all entries in `02-subsystem-catalog.md`.
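A minimal sketch of the read-then-write append, with a hypothetical workspace path and a stubbed entry:

```python
from pathlib import Path

catalog = Path("workspace/02-subsystem-catalog.md")  # hypothetical workspace path
entry = "## Example Subsystem\n...entry in full contract format...\n---\n"

# Read-then-write append: keeps earlier entries intact, creates the file if absent.
existing = catalog.read_text() if catalog.exists() else ""
separator = "\n" if existing and not existing.endswith("\n") else ""
catalog.write_text(existing + separator + entry)
```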
After writing:
- Re-read `02-subsystem-catalog.md` to verify your entry was added correctly
- Validate that the format matches the contract exactly using this checklist:
**Self-Validation Checklist:**
- [ ] Section 1: Subsystem name as H2 heading (## Name)
- [ ] Section 2: Location with backticks and absolute path
- [ ] Section 3: Responsibility as a single sentence
- [ ] Section 4: Key Components as a bulleted list with descriptions
- [ ] Section 5: Dependencies with "Inbound:" and "Outbound:" labels
- [ ] Section 6: Patterns Observed as a bulleted list
- [ ] Section 7: Concerns present (with issues OR "None observed")
- [ ] Section 8: Confidence level (High/Medium/Low) with reasoning
- [ ] Separator: "---" line after Confidence
- [ ] NO extra sections added
- [ ] Sections in correct order
- [ ] Entry in file: `02-subsystem-catalog.md` (not a separate file)
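Part of this checklist can be automated. A rough validator sketch, assuming a single entry string in the exact contract format above:

```python
import re

# Required section headers, in contract order.
SECTIONS = ["**Location:**", "**Responsibility:**", "**Key Components:**",
            "**Dependencies:**", "**Patterns Observed:**", "**Concerns:**",
            "**Confidence:**"]

def validate_entry(entry: str) -> list[str]:
    """Return a list of contract violations for a single catalog entry."""
    problems = []
    if not entry.lstrip().startswith("## "):
        problems.append("Missing H2 subsystem heading")
    positions = [entry.find(s) for s in SECTIONS]
    if -1 in positions:
        missing = [s for s, p in zip(SECTIONS, positions) if p == -1]
        problems.append("Missing section(s): " + ", ".join(missing))
    elif positions != sorted(positions):
        problems.append("Sections out of contract order")
    if not re.search(r"\*\*Confidence:\*\* (High|Medium|Low) - ", entry):
        problems.append("Confidence missing level or reasoning")
    return problems
```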
## Handling Uncertainty
When architecture is unclear:
State what you observe - Don't guess at intent
**Patterns Observed:**
- 3 files with similar structure (analysis.py, parsing.py, validation.py)
- Unclear if this is a deliberate pattern or a coincidence
Mark confidence appropriately - Low confidence is valid
**Confidence:** Low - Directory structure suggests microservices, but no service definitions found
Use the "Concerns" section - Document gaps
**Concerns:**
- No clear entry point identified
- Dependencies inferred from imports, not an explicit manifest
DO NOT:
- Invent relationships you didn't verify
- Assume "obvious" architecture without evidence
- Skip confidence marking because you're uncertain
## Positive Behaviors to Maintain
From baseline testing, these approaches WORK:
- ✅ Read actual files - Don't infer from names alone
- ✅ Use router skills - They often provide complete catalogs
- ✅ Sample strategically - 3-5 files verifies patterns without exhaustive reading
- ✅ Cross-reference - Verify claims (imports match listed dependencies)
- ✅ Document assumptions - Make reasoning explicit
- ✅ Line counts indicate depth - A 1,500-line skill vs a 50-line stub matters
## Common Rationalizations (STOP SIGNALS)
If you catch yourself thinking these, STOP:
| Rationalization | Reality |
|---|---|
| "I'll add Integration Points section for clarity" | Extra sections break downstream parsing |
| "I'll write to separate file for organization" | Coordinator expects append to specified file |
| "I'll improve the contract format" | Contract is specification from coordinator |
| "More information is always helpful" | Your job: follow spec. Coordinator's job: decide what's included |
| "This comprehensive format is better" | "Better" violates contract. Compliance is mandatory. |
## Validation Criteria
Your output will be validated against:
- Contract compliance - All sections present, no extras
- File operation - Appended to `02-subsystem-catalog.md`, not a separate file
- Confidence marking - High/Medium/Low with reasoning
- Evidence-based claims - Components you actually read
- Bidirectional dependencies - If A→B, then B must show A as inbound
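The bidirectional-dependency rule can be spot-checked mechanically. A sketch, assuming entries have already been parsed into inbound/outbound sets:

```python
def check_bidirectional(catalog: dict[str, dict[str, set[str]]]) -> list[str]:
    """catalog maps subsystem name -> {"inbound": set, "outbound": set}.
    If A lists B as outbound, B must list A as inbound."""
    errors = []
    for a, deps in catalog.items():
        for b in deps["outbound"]:
            if b in catalog and a not in catalog[b]["inbound"]:
                errors.append(f"{a} -> {b} declared, but {b} does not list {a} as inbound")
    return errors
```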
If validation returns NEEDS_REVISION:
- Read the validation report
- Fix specific issues identified
- Re-submit following contract
## Success Criteria
You succeeded when:
- Entry appended to `02-subsystem-catalog.md` in exact contract format
- All sections included (none skipped, none added)
- Confidence level marked with reasoning
- Claims supported by files you read
- Validation returns APPROVED
You failed when:
- Added "helpful" extra sections
- Wrote to separate file
- Changed contract format
- Skipped sections
- No confidence marking
- Validation returns BLOCK status
## Anti-Patterns
❌ Add extra sections: "I'll add a Recommendations section" → Violates the contract
❌ Write to a new file: "I'll create subsystem-X-analysis.md" → Should append to `02-subsystem-catalog.md`
❌ Skip required sections: "No concerns, so I'll omit that section" → Include the section with "None observed"
❌ Change the format: "I'll use numbered lists instead of bullet points" → Follow the contract exactly
❌ Work without reading the task spec: "I know what to do" → Read `temp/task-*.md` first
## Integration with Workflow
This skill is typically invoked as:
1. Coordinator creates workspace and holistic assessment
2. Coordinator writes task specification in `temp/task-[yourname].md`
3. YOU read the task spec + `01-discovery-findings.md`
4. YOU analyze the assigned subsystem systematically
5. YOU append your entry to `02-subsystem-catalog.md` following the contract
6. Validator checks your output against the contract
7. Coordinator proceeds to the next phase if validation passes
Your role: Analyze systematically, follow contract exactly, mark confidence explicitly.