| name | context-extractor |
| description | Use when parsing "All Needed Context" sections from PRD files. Extracts code files, docs, examples, gotchas, and external systems into structured JSON format. Invoked by /flow:implement, /flow:generate-prp, and /flow:validate. |
Context Extractor Skill
You are an expert parser specializing in extracting structured context from Product Requirements Documents (PRDs). You excel at parsing markdown tables and converting them into machine-readable JSON format.
When to Use This Skill
- Extracting context from PRD files for implementation
- Parsing "All Needed Context" sections
- Converting PRD context into structured data
- Preparing context bundles for /flow:generate-prp
- Providing context to /flow:implement and /flow:validate
Input Format
This skill accepts a file path to a PRD markdown file as input. The PRD must contain an "All Needed Context" section with the following subsections:
- Code Files - Source code files relevant to the feature
- Docs / Specs - Related documentation and specifications
- Examples - Example files demonstrating patterns
- Gotchas / Prior Failures - Known pitfalls and lessons learned
- External Systems / APIs - External dependencies and integrations
Parsing Instructions
1. Locate the "All Needed Context" Section
Search for the markdown heading ## All Needed Context in the PRD file. All content between this heading and the next H2 heading (##), or the end of the file if no further H2 heading follows, is part of the context section.
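A minimal Python sketch of this step, assuming the PRD text has already been read into a string. The function name `extract_context_section` is hypothetical, and the sketch does not treat headings inside fenced code blocks specially.

```python
import re
from typing import Optional

def extract_context_section(prd_text: str) -> Optional[str]:
    """Return the body of the '## All Needed Context' section, or None if absent."""
    lines = prd_text.splitlines()
    start = None
    for i, line in enumerate(lines):
        if re.match(r"^##\s+All Needed Context\s*$", line):
            start = i + 1  # body begins on the line after the heading
            break
    if start is None:
        return None
    end = len(lines)  # default: section runs to the end of the file
    for j in range(start, len(lines)):
        # An H2 is exactly two '#' followed by whitespace, so '###' does not match.
        if re.match(r"^##\s", lines[j]):
            end = j
            break
    return "\n".join(lines[start:end]).strip()
```

Returning `None` rather than an empty string lets the caller distinguish a missing section from a section that is present but empty.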
2. Parse Each Subsection
For each subsection (H3 heading ###), parse the markdown table that follows:
Code Files Table Format
| File Path | Purpose | Read Priority |
|-----------|---------|---------------|
| `path/to/file` | Description | High/Medium/Low |
Extract into:
{
"path": "path/to/file",
"purpose": "Description",
"priority": "High|Medium|Low"
}
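The row-to-object mapping above could be sketched in Python as follows. The helper names (`split_row`, `is_separator`, `parse_table`) are hypothetical, and the sketch assumes cells contain no escaped pipe characters. Backtick removal belongs to the cleanup step and is left out here.

```python
CODE_FILE_KEYS = ["path", "purpose", "priority"]  # JSON field names for the Code Files table

def split_row(line):
    """Split one markdown table row into trimmed cell strings."""
    row = line.strip()
    if row.startswith("|"):
        row = row[1:]
    if row.endswith("|"):
        row = row[:-1]
    return [cell.strip() for cell in row.split("|")]

def is_separator(cells):
    """True for the |---|---| row that follows the header."""
    return all(cell and set(cell) <= set("-:") for cell in cells)

def parse_table(table_text, keys):
    """Map each data row of a markdown table onto the given JSON keys."""
    rows = [split_row(l) for l in table_text.splitlines() if l.strip().startswith("|")]
    # rows[0] is the header; skip the separator row when present.
    data = rows[2:] if len(rows) > 1 and is_separator(rows[1]) else rows[1:]
    for row in data:
        if len(row) != len(keys):
            raise ValueError(f"expected {len(keys)} cells, got {len(row)}: {row}")
    return [dict(zip(keys, row)) for row in data]
```

The same `parse_table` call can serve the other four tables by passing their key lists, for example `["title", "link", "key_sections"]` for Docs / Specs.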
Docs / Specs Table Format
| Document | Link | Key Sections |
|----------|------|--------------|
| Doc Name | `docs/path` or URL | Sections |
Extract into:
{
"title": "Doc Name",
"link": "docs/path or URL",
"key_sections": "Sections"
}
Examples Table Format
| Example | Location | Relevance to This Feature |
|---------|----------|---------------------------|
| Example Name | `examples/path` | Description |
Extract into:
{
"name": "Example Name",
"location": "examples/path",
"relevance": "Description"
}
Gotchas / Prior Failures Table Format
| Gotcha | Impact | Mitigation | Source |
|--------|--------|------------|--------|
| Issue | What happens | How to fix | Reference |
Extract into:
{
"issue": "Issue",
"impact": "What happens",
"mitigation": "How to fix",
"source": "Reference"
}
External Systems / APIs Table Format
| System / API | Type | Documentation | Notes |
|--------------|------|---------------|-------|
| System Name | REST/GraphQL/etc | Link | Details |
Extract into:
{
"name": "System Name",
"type": "REST|GraphQL|gRPC|Database|etc",
"documentation": "Link",
"notes": "Details"
}
3. Handle Empty Sections
If a subsection table has only headers (no data rows), or if the subsection is missing entirely, return an empty array [] for that section.
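One way to guarantee this rule is to normalize the result after parsing, as in the sketch below. `SECTION_KEYS` and `with_defaults` are hypothetical names; the heading-to-key mapping follows the output field names used in this document.

```python
import json

# PRD subsection heading -> key in the output JSON
SECTION_KEYS = {
    "Code Files": "code_files",
    "Docs / Specs": "docs_specs",
    "Examples": "examples",
    "Gotchas / Prior Failures": "gotchas",
    "External Systems / APIs": "external_systems",
}

def with_defaults(parsed):
    """Return a dict holding all five keys, using [] for missing or None sections."""
    return {key: list(parsed.get(key) or []) for key in SECTION_KEYS.values()}

result = with_defaults({"code_files": [{"path": "a.py"}], "examples": None})
print(json.dumps(result["examples"]))  # an empty JSON array, never null
```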
4. Clean Up Markdown Formatting
- Remove backticks from file paths and code references
- Trim whitespace from all fields
- Convert inline code markers to plain text
- Preserve newlines in multi-line fields as \n
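The cleanup rules above might look like this in Python. `clean_cell` is a hypothetical helper. Treating `<br>` tags as the line-break marker inside a table cell is an assumption, since the source does not say how multi-line fields are written.

```python
import json
import re

def clean_cell(text):
    """Strip inline-code backticks and surrounding whitespace from one table cell."""
    # Assumption: multi-line cells use <br> tags, which become real newlines.
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    text = text.replace("`", "")
    # Trim each line separately so no padding survives around the newline.
    return "\n".join(part.strip() for part in text.split("\n")).strip()

cleaned = clean_cell("Use `git rebase` first <br> then retry")
print(json.dumps(cleaned))  # json.dumps writes the newline as the two characters \n
```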
Output Format
Return a JSON object with the following structure:
{
"code_files": [
{
"path": "src/flowspec_cli/commands/specify.py",
"purpose": "Main implementation of /flow:specify command",
"priority": "High"
}
],
"docs_specs": [
{
"title": "Spec-Driven Development Guide",
"link": "docs/guides/sdd-guide.md",
"key_sections": "Section 3: Context Management"
}
],
"examples": [
{
"name": "User Authentication Flow",
"location": "examples/auth/login.py",
"relevance": "Shows proper session handling pattern"
}
],
"gotchas": [
{
"issue": "Race condition in concurrent writes",
"impact": "Data corruption under high load",
"mitigation": "Use database transactions with proper isolation",
"source": "task-123"
}
],
"external_systems": [
{
"name": "GitHub API",
"type": "REST",
"documentation": "https://docs.github.com/rest",
"notes": "Rate limit: 5000 req/hour, requires PAT"
}
]
}
Error Handling
If the PRD file cannot be read or parsed:
- Return an error object: {"error": "Description of error"}
- Include the file path in the error message
- Suggest remediation steps if applicable
Common Error Cases
- File not found: {"error": "PRD file not found: {path}. Verify the file exists."}
- No context section: {"error": "PRD missing 'All Needed Context' section. Add section to PRD."}
- Malformed table: {"error": "Malformed table in section '{section_name}'. Check markdown syntax."}
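A hedged sketch of how the first two error cases could be produced in Python. `read_prd` is a hypothetical function, and the `{"text": ...}` success shape is an assumption used only for illustration. The messages follow the ones listed above, except that the sketch appends the file path to the missing-section message, in line with the rule that errors include the path.

```python
from pathlib import Path

def read_prd(path):
    """Return {"text": ...} on success or {"error": ...} describing the failure."""
    prd = Path(path)
    if not prd.is_file():
        return {"error": f"PRD file not found: {path}. Verify the file exists."}
    text = prd.read_text(encoding="utf-8")
    has_section = any(
        line.strip() == "## All Needed Context" for line in text.splitlines()
    )
    if not has_section:
        # Path appended here (an addition to the listed message).
        return {
            "error": "PRD missing 'All Needed Context' section. "
                     f"Add section to PRD. File: {path}"
        }
    return {"text": text}
```

Returning an error object instead of raising keeps the output JSON-serializable in both the success and failure paths.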
Usage Example
Input PRD Excerpt
## All Needed Context
### Code Files
| File Path | Purpose | Read Priority |
|-----------|---------|---------------|
| `src/flowspec_cli/commands/specify.py` | Main /flow:specify implementation | High |
| `templates/prd-template.md` | PRD template structure | Medium |
### Docs / Specs
| Document | Link | Key Sections |
|----------|------|--------------|
| SDD Guide | `docs/guides/sdd-guide.md` | Context Management |
### Examples
| Example | Location | Relevance to This Feature |
|---------|----------|---------------------------|
| Login Flow | `examples/auth/login.py` | Session handling pattern |
### Gotchas / Prior Failures
| Gotcha | Impact | Mitigation | Source |
|--------|--------|------------|--------|
| Race condition | Data corruption | Use transactions | task-123 |
### External Systems / APIs
| System / API | Type | Documentation | Notes |
|--------------|------|---------------|-------|
| GitHub API | REST | https://docs.github.com/rest | 5000 req/hour limit |
Output JSON
{
"code_files": [
{
"path": "src/flowspec_cli/commands/specify.py",
"purpose": "Main /flow:specify implementation",
"priority": "High"
},
{
"path": "templates/prd-template.md",
"purpose": "PRD template structure",
"priority": "Medium"
}
],
"docs_specs": [
{
"title": "SDD Guide",
"link": "docs/guides/sdd-guide.md",
"key_sections": "Context Management"
}
],
"examples": [
{
"name": "Login Flow",
"location": "examples/auth/login.py",
"relevance": "Session handling pattern"
}
],
"gotchas": [
{
"issue": "Race condition",
"impact": "Data corruption",
"mitigation": "Use transactions",
"source": "task-123"
}
],
"external_systems": [
{
"name": "GitHub API",
"type": "REST",
"documentation": "https://docs.github.com/rest",
"notes": "5000 req/hour limit"
}
]
}
Integration Points
/flow:implement
Uses extracted context to:
- Identify files to read before implementation
- Prioritize reading order (High → Medium → Low)
- Discover related documentation
- Warn about gotchas early
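The reading order could be derived from the `priority` field as sketched below. `PRIORITY_ORDER` and `reading_order` are hypothetical names, and placing unknown priorities last is an assumption.

```python
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}

def reading_order(code_files):
    """Sort code file entries High -> Medium -> Low, keeping table order within a level."""
    # sorted() is stable, so entries with equal priority keep their PRD order.
    return sorted(
        code_files,
        key=lambda entry: PRIORITY_ORDER.get(entry["priority"], len(PRIORITY_ORDER)),
    )

files = [
    {"path": "templates/prd-template.md", "priority": "Medium"},
    {"path": "src/flowspec_cli/commands/specify.py", "priority": "High"},
]
for entry in reading_order(files):
    print(entry["priority"], entry["path"])
```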
/flow:generate-prp
Uses extracted context to:
- Build comprehensive context bundles
- Include all relevant files and docs
- Attach examples for reference
- Warn about known failure modes
/flow:validate
Uses extracted context to:
- Verify all referenced files exist
- Check that documentation is up-to-date
- Validate against known gotchas
- Test external system integrations
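The first check in this list, verifying that referenced files exist, might be sketched as follows. `find_missing_files` and the `repo_root` parameter are hypothetical, and the sketch assumes that paths in `code_files` are relative to the repository root.

```python
from pathlib import Path

def find_missing_files(context, repo_root="."):
    """Return the code_files paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [
        entry["path"]
        for entry in context.get("code_files", [])
        if not (root / entry["path"]).is_file()
    ]
```

An empty return value means every referenced code file was found.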
Validation Checklist
After parsing, verify:
- All five sections present in output (even if empty)
- File paths are clean (no backticks or extra quotes)
- Priorities are valid (High/Medium/Low only)
- JSON is valid and properly formatted
- No markdown artifacts in extracted text
- Empty sections return [] not null
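Several of these checks can be automated. The sketch below is one possible shape, with `validate_output` as a hypothetical name. It covers section presence, list types, path cleanliness, priority values, and JSON serializability, but does not scan every field for markdown artifacts.

```python
import json

EXPECTED_KEYS = ("code_files", "docs_specs", "examples", "gotchas", "external_systems")
VALID_PRIORITIES = {"High", "Medium", "Low"}

def validate_output(result):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for key in EXPECTED_KEYS:
        if key not in result:
            problems.append(f"missing section: {key}")
        elif not isinstance(result[key], list):
            problems.append(f"{key} must be a list, use [] for empty sections")
    code_files = result.get("code_files")
    for entry in code_files if isinstance(code_files, list) else []:
        path = entry.get("path", "")
        # Flag leftover backticks, quotes, or padding around the path.
        if "`" in path or path != path.strip("\"' "):
            problems.append(f"unclean path: {path!r}")
        if entry.get("priority") not in VALID_PRIORITIES:
            problems.append(f"invalid priority: {entry.get('priority')!r}")
    try:
        json.dumps(result)
    except (TypeError, ValueError) as exc:
        problems.append(f"not JSON-serializable: {exc}")
    return problems
```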
Quality Standards
- Accuracy: Preserve exact meanings from PRD
- Completeness: Extract all rows from all tables
- Cleanliness: Remove markdown formatting artifacts
- Consistency: Use consistent field names and structure
- Robustness: Handle missing sections gracefully