---
name: analyzing-unknown-codebases
description: Analyze unfamiliar codebases systematically to produce subsystem catalog entries - emphasizes strict contract compliance and confidence marking
---
# Analyzing Unknown Codebases
## Purpose
Systematically analyze unfamiliar code to identify subsystems, components, dependencies, and architectural patterns. Produce catalog entries that follow EXACT output contracts.
## When to Use
- Coordinator delegates subsystem analysis task
- Task specifies reading from workspace and appending to `02-subsystem-catalog.md`
- You need to analyze code you haven't seen before
- Output must integrate with downstream tooling (validation, diagram generation)
## Critical Principle: Contract Compliance
Your analysis quality doesn't matter if you violate the output contract.
Common rationalization: "I'll add helpful extra sections to improve clarity"
Reality: Extra sections break downstream tools. The coordinator expects EXACT format for parsing and validation. Your job is to follow the specification, not improve it.
## Output Contract (MANDATORY)
When writing to `02-subsystem-catalog.md`, append EXACTLY this format:
## [Subsystem Name]
**Location:** `path/to/subsystem/`
**Responsibility:** [One sentence describing what this subsystem does]
**Key Components:**
- `file1.ext` - [Brief description]
- `file2.ext` - [Brief description]
- `file3.ext` - [Brief description]
**Dependencies:**
- Inbound: [Subsystems that depend on this one]
- Outbound: [Subsystems this one depends on]
**Patterns Observed:**
- [Pattern 1]
- [Pattern 2]
**Concerns:**
- [Any issues, gaps, or technical debt observed]
**Confidence:** [High/Medium/Low] - [Brief reasoning]
---
If no concerns exist, write:
**Concerns:**
- None observed
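To make the required ordering concrete, here is a minimal Python sketch (illustrative only, not part of the contract) that renders an entry with every section in contract order; all parameter names are hypothetical:

```python
def render_entry(name: str, location: str, responsibility: str,
                 components: list[str], inbound: str, outbound: str,
                 patterns: list[str], concerns: list[str],
                 confidence: str, reasoning: str) -> str:
    """Render one catalog entry with all sections in exact contract order."""
    concerns = concerns or ["None observed"]  # never omit the Concerns section
    return "\n".join([
        f"## {name}",
        f"**Location:** `{location}`",
        f"**Responsibility:** {responsibility}",
        "**Key Components:**",
        *[f"- {c}" for c in components],
        "**Dependencies:**",
        f"- Inbound: {inbound}",
        f"- Outbound: {outbound}",
        "**Patterns Observed:**",
        *[f"- {p}" for p in patterns],
        "**Concerns:**",
        *[f"- {c}" for c in concerns],
        f"**Confidence:** {confidence} - {reasoning}",
        "---",
    ]) + "\n"
```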
**CRITICAL COMPLIANCE RULES:**
- ❌ Add extra sections ("Integration Points", "Recommendations", "Files", etc.)
- ❌ Change section names or reorder them
- ❌ Write to a separate file (must append to `02-subsystem-catalog.md`)
- ❌ Skip sections (include ALL sections - use "None observed" if empty)
- ✅ Copy the template structure EXACTLY
- ✅ Keep section order: Location → Responsibility → Key Components → Dependencies → Patterns → Concerns → Confidence
The contract is a specification, not a minimum. Extra sections break downstream validation.
## Example: Complete Compliant Entry
Here's what a correctly formatted entry looks like:
## Authentication Service
**Location:** `/src/services/auth/`
**Responsibility:** Handles user authentication, session management, and JWT token generation for API access.
**Key Components:**
- `auth_handler.py` - Main authentication logic with login/logout endpoints (342 lines)
- `token_manager.py` - JWT token generation and validation (156 lines)
- `session_store.py` - Redis-backed session storage (98 lines)
**Dependencies:**
- Inbound: API Gateway, User Service
- Outbound: Database Layer, Cache Service, Logging Service
**Patterns Observed:**
- Dependency injection for testability (all external services injected)
- Token refresh pattern with sliding expiration
- Audit logging for all authentication events
**Concerns:**
- None observed
**Confidence:** High - Clear entry points, documented API, test coverage validates behavior
---
This is EXACTLY what your output should look like. No more, no less.
## Systematic Analysis Approach
### Step 1: Read Task Specification
Your task file (`temp/task-[name].md`) specifies:
- What to analyze (scope: directories, plugins, services)
- Where to read context (`01-discovery-findings.md`)
- Where to write output (`02-subsystem-catalog.md` - append)
- Expected format (the contract above)
Read these files FIRST before analyzing code.
### Step 2: Layered Exploration
Use this proven approach from baseline testing:
1. Metadata layer - Read `plugin.json`, `package.json`, `setup.py`
2. Structure layer - Examine directory organization
3. Router layer - Find and read router/index files (often named "using-X")
4. Sampling layer - Read 3-5 representative files
5. Quantitative layer - Use line counts as depth indicators
Why this order works:
- Metadata gives overview without code diving
- Structure reveals organization philosophy
- Routers often catalog all components
- Sampling verifies patterns
- Quantitative data supports claims
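A minimal sketch of the metadata, structure, and quantitative layers, assuming a Python environment; the subsystem path is hypothetical and comes from your task spec:

```python
import json
from pathlib import Path

root = Path("src/services/auth")  # hypothetical scope from the task spec

# Metadata layer: read manifests first, before any code diving.
for manifest in ("plugin.json", "package.json"):
    path = root / manifest
    if path.exists():
        meta = json.loads(path.read_text())
        print(manifest, "->", meta.get("name"), "-", meta.get("description"))

# Structure + quantitative layers: file listing with line counts as depth indicators.
for f in sorted(root.rglob("*.py")):
    line_count = len(f.read_text(errors="replace").splitlines())
    print(f"{f.relative_to(root)} - {line_count} lines")
```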
### Step 3: Mark Confidence Explicitly
Every output MUST include a confidence level with reasoning.
**High confidence** - Router skill provided a catalog, verified with sampling:
**Confidence:** High - Router skill listed all 10 components; sampling 4 of them confirmed the patterns
**Medium confidence** - No router, but clear structure plus sampling:
**Confidence:** Medium - No router catalog; inferred from directory structure + 5 file samples
**Low confidence** - Incomplete, placeholders, or unclear organization:
**Confidence:** Low - Several SKILL.md files missing; test artifacts suggest work-in-progress
### Step 4: Distinguish States Clearly
When analyzing codebases with mixed completion:
**Complete** - Skill file exists, has content, passes a basic read test
- `skill-name/SKILL.md` - Complete skill (1,234 lines)
**Placeholder** - Skill file exists but is a stub/template
- `skill-name/SKILL.md` - Placeholder (12 lines, template only)
**Planned** - Referenced in the router but no file exists
- `skill-name` - Planned (referenced in router, not implemented)
**TDD artifacts** - Test scenarios, baseline results (these ARE documentation)
- `test-scenarios.md` - TDD test scenarios (RED phase)
- `baseline-results.md` - Baseline behavior documentation
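A rough sketch of how these states could be distinguished mechanically; the 30-line placeholder cutoff and the function shape are assumptions, not rules from the contract:

```python
from pathlib import Path

PLACEHOLDER_MAX_LINES = 30  # hypothetical cutoff; tune per codebase

def classify_skill(name: str, root: Path, router_refs: set[str]) -> str:
    """Classify a skill as Complete, Placeholder, Planned, or Absent."""
    skill_file = root / name / "SKILL.md"
    if not skill_file.exists():
        # Referenced in the router but not implemented yet -> Planned.
        return "Planned" if name in router_refs else "Absent"
    lines = len(skill_file.read_text(errors="replace").splitlines())
    return "Placeholder" if lines <= PLACEHOLDER_MAX_LINES else "Complete"
```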
### Step 5: Write Output (Contract Compliance)
Before writing:
- Prepare your entry in EXACT contract format from the template above
- Copy the structure - don't paraphrase or reorganize
- Triple-check you have ALL sections in correct order
When writing:
- Target file: `02-subsystem-catalog.md` in the workspace directory
- Operation: Append your entry (create the file if yours is the first entry, append if the file exists)
- Method:
  - If the file exists: Read the current content, then Write with the original + your entry
  - If the file doesn't exist: Write your entry directly
- Format: Follow contract sections in exact order
- Completeness: Include ALL sections - use "None observed" for empty Concerns
DO NOT create separate files (e.g., `subsystem-X-analysis.md`). The coordinator expects all entries in `02-subsystem-catalog.md`.
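A minimal sketch of the read-then-write append, with a hypothetical workspace path and a stubbed entry:

```python
from pathlib import Path

catalog = Path("workspace/02-subsystem-catalog.md")  # hypothetical workspace path
entry = "## Example Subsystem\n...entry in full contract format...\n---\n"

# Read-then-write append: keeps earlier entries intact, creates the file if absent.
existing = catalog.read_text() if catalog.exists() else ""
separator = "\n" if existing and not existing.endswith("\n") else ""
catalog.write_text(existing + separator + entry)
```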
After writing:
- Re-read `02-subsystem-catalog.md` to verify your entry was added correctly
- Validate that the format matches the contract exactly using this checklist:
**Self-Validation Checklist:**
- [ ] Section 1: Subsystem name as H2 heading (## Name)
- [ ] Section 2: Location with backticks and absolute path
- [ ] Section 3: Responsibility as a single sentence
- [ ] Section 4: Key Components as a bulleted list with descriptions
- [ ] Section 5: Dependencies with "Inbound:" and "Outbound:" labels
- [ ] Section 6: Patterns Observed as a bulleted list
- [ ] Section 7: Concerns present (with issues OR "None observed")
- [ ] Section 8: Confidence level (High/Medium/Low) with reasoning
- [ ] Separator: "---" line after Confidence
- [ ] NO extra sections added
- [ ] Sections in correct order
- [ ] Entry in file: `02-subsystem-catalog.md` (not a separate file)
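Part of this checklist can be automated. A rough validator sketch, assuming a single entry string in the exact contract format above:

```python
import re

# Required section headers, in contract order.
SECTIONS = ["**Location:**", "**Responsibility:**", "**Key Components:**",
            "**Dependencies:**", "**Patterns Observed:**", "**Concerns:**",
            "**Confidence:**"]

def validate_entry(entry: str) -> list[str]:
    """Return a list of contract violations for a single catalog entry."""
    problems = []
    if not entry.lstrip().startswith("## "):
        problems.append("Missing H2 subsystem heading")
    positions = [entry.find(s) for s in SECTIONS]
    if -1 in positions:
        missing = [s for s, p in zip(SECTIONS, positions) if p == -1]
        problems.append("Missing section(s): " + ", ".join(missing))
    elif positions != sorted(positions):
        problems.append("Sections out of contract order")
    if not re.search(r"\*\*Confidence:\*\* (High|Medium|Low) - ", entry):
        problems.append("Confidence missing level or reasoning")
    return problems
```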
## Handling Uncertainty
When architecture is unclear:
State what you observe - Don't guess at intent
**Patterns Observed:**
- 3 files with similar structure (analysis.py, parsing.py, validation.py)
- Unclear if this is a deliberate pattern or a coincidence
Mark confidence appropriately - Low confidence is valid
**Confidence:** Low - Directory structure suggests microservices, but no service definitions found
Use the "Concerns" section - Document gaps
**Concerns:**
- No clear entry point identified
- Dependencies inferred from imports, not an explicit manifest
DO NOT:
- Invent relationships you didn't verify
- Assume "obvious" architecture without evidence
- Skip confidence marking because you're uncertain
## Positive Behaviors to Maintain
From baseline testing, these approaches WORK:
- ✅ Read actual files - Don't infer from names alone
- ✅ Use router skills - They often provide complete catalogs
- ✅ Sample strategically - 3-5 files verifies patterns without exhaustive reading
- ✅ Cross-reference - Verify claims (imports match listed dependencies)
- ✅ Document assumptions - Make reasoning explicit
- ✅ Line counts indicate depth - A 1,500-line skill vs a 50-line stub matters
## Common Rationalizations (STOP SIGNALS)
If you catch yourself thinking these, STOP:
| Rationalization | Reality |
|---|---|
| "I'll add Integration Points section for clarity" | Extra sections break downstream parsing |
| "I'll write to separate file for organization" | Coordinator expects append to specified file |
| "I'll improve the contract format" | Contract is specification from coordinator |
| "More information is always helpful" | Your job: follow spec. Coordinator's job: decide what's included |
| "This comprehensive format is better" | "Better" violates contract. Compliance is mandatory. |
## Validation Criteria
Your output will be validated against:
- Contract compliance - All sections present, no extras
- File operation - Appended to `02-subsystem-catalog.md`, not a separate file
- Confidence marking - High/Medium/Low with reasoning
- Evidence-based claims - Components you actually read
- Bidirectional dependencies - If A→B, then B must show A as inbound
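The bidirectional-dependency rule can be spot-checked mechanically. A sketch, assuming entries have already been parsed into inbound/outbound sets:

```python
def check_bidirectional(catalog: dict[str, dict[str, set[str]]]) -> list[str]:
    """catalog maps subsystem name -> {"inbound": set, "outbound": set}.
    If A lists B as outbound, B must list A as inbound."""
    errors = []
    for a, deps in catalog.items():
        for b in deps["outbound"]:
            if b in catalog and a not in catalog[b]["inbound"]:
                errors.append(f"{a} -> {b} declared, but {b} does not list {a} as inbound")
    return errors
```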
If validation returns NEEDS_REVISION:
- Read the validation report
- Fix specific issues identified
- Re-submit following contract
## Success Criteria
You succeeded when:
- Entry appended to `02-subsystem-catalog.md` in exact contract format
- All sections included (none skipped, none added)
- Confidence level marked with reasoning
- Claims supported by files you read
- Validation returns APPROVED
You failed when:
- Added "helpful" extra sections
- Wrote to separate file
- Changed contract format
- Skipped sections
- No confidence marking
- Validation returns BLOCK status
## Anti-Patterns
❌ Add extra sections: "I'll add a Recommendations section" → Violates the contract
❌ Write to a new file: "I'll create subsystem-X-analysis.md" → Should append to `02-subsystem-catalog.md`
❌ Skip required sections: "No concerns, so I'll omit that section" → Include the section with "None observed"
❌ Change the format: "I'll use numbered lists instead of bullet points" → Follow the contract exactly
❌ Work without reading the task spec: "I know what to do" → Read `temp/task-*.md` first
## Integration with Workflow
This skill is typically invoked as:
1. Coordinator creates workspace and holistic assessment
2. Coordinator writes task specification in `temp/task-[yourname].md`
3. YOU read the task spec + `01-discovery-findings.md`
4. YOU analyze the assigned subsystem systematically
5. YOU append your entry to `02-subsystem-catalog.md` following the contract
6. Validator checks your output against the contract
7. Coordinator proceeds to the next phase if validation passes
Your role: Analyze systematically, follow contract exactly, mark confidence explicitly.