| name | file-categorization |
| description | Reusable logic for categorizing files as Command, Agent, Skill, or Documentation based on structure and content analysis |
File Categorization Skill
When to Use This Skill
- Processing files in integration pipelines
- Scanning directories for file organization
- Auto-routing files to appropriate locations
- Generating file inventory reports
- Validating repository structure
What This Skill Does
Analyzes file structure and content to accurately categorize files into:
- Commands - Slash command definitions
- Agents - Agent configuration files
- Skills - Reusable workflow automation
- Documentation - General markdown documentation
- Other - Uncategorized files requiring manual review
Categorization Logic
Step 1: Filename Pattern Matching
Commands:
- Filename matches `*-command.md` or `*command.md`
- Located in `.claude/commands/` directory
- Filename uses verb-noun pattern (e.g., `integration-scan.md`)
Agents:
- Filename matches `*-agent.md` or `*agent.md`
- Located in `agents-templates/` directory
- Contains role-based names (architect, builder, validator, etc.)
Skills:
- Filename is `SKILL.md` or `*-SKILL.md` or `*-skill.md`
- Located in `skills/*/` directories
- Contains workflow automation content
Documentation:
- Standard `.md` files
- Located in `docs/` directory
- Contains reference or tutorial content
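The filename and location checks above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `guess_from_path` is hypothetical, matching is case-insensitive by assumption, and the check order (Command, Skill, Agent, Documentation) is borrowed from the categorization algorithm later in this document.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional


def guess_from_path(file_path: str) -> Optional[str]:
    """Guess a category from filename and directory alone; None if nothing matches."""
    path = PurePosixPath(file_path)
    name = path.name.lower()  # assumption: patterns are matched case-insensitively
    # Wrap the parent directory in slashes so "/docs/" only matches whole components.
    where = "/" + path.parent.as_posix() + "/"

    # "*command.md" also covers "*-command.md"
    if fnmatchcase(name, "*command.md") or "/.claude/commands/" in where:
        return "Command"
    # SKILL.md, *-SKILL.md, *-skill.md, or any file under skills/<something>/
    if (name == "skill.md" or name.endswith("-skill.md")
            or fnmatchcase(where, "*/skills/?*/*")):
        return "Skill"
    if fnmatchcase(name, "*agent.md") or "/agents-templates/" in where:
        return "Agent"
    if name.endswith(".md") and "/docs/" in where:
        return "Documentation"
    return None
```

The verb-noun and role-based name heuristics are left out of this sketch because they need a word list that the document does not specify.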
Step 2: Frontmatter Analysis
Read the YAML frontmatter (if present) to identify:
Command Indicators:
---
description: "..."
allowed-tools: [...]
author: "..."
version: "X.Y"
---
Skill Indicators:
---
name: skill-name
description: "..."
---
Agent Indicators (less structured, more prose):
## Agent Identity
**Role**: [Agent Role]
**Version**: X.Y.Z
**Purpose**: [Purpose description]
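As a rough sketch of this step, the snippet below collects the top-level keys of a frontmatter block and applies the indicators listed above. It deliberately avoids a real YAML parser: the regular expressions only recognize `key:` at the start of a line, which is an assumption that holds for the flat frontmatter shown here but not for YAML in general. The names `frontmatter_keys` and `guess_from_frontmatter` are hypothetical.

```python
import re

# Frontmatter is the block between a leading "---" line and the next "---" line.
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
# Top-level keys only: an identifier at the start of a line followed by a colon.
KEY_RE = re.compile(r"^([A-Za-z][\w-]*):", re.MULTILINE)


def frontmatter_keys(content):
    """Return the set of top-level frontmatter keys, or an empty set if none."""
    match = FRONTMATTER_RE.match(content)
    if match is None:
        return set()
    return set(KEY_RE.findall(match.group(1)))


def guess_from_frontmatter(content):
    """Apply the Command, Skill, and Agent indicators; None if nothing matches."""
    keys = frontmatter_keys(content)
    if {"allowed-tools", "version"} <= keys:
        return "Command"
    if "name" in keys and "allowed-tools" not in keys:
        return "Skill"
    # Agents are identified by a prose heading rather than by YAML keys.
    if "## Agent Identity" in content:
        return "Agent"
    return None
```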
Step 3: Content Structure Analysis
Commands have:
- Workflow sections with numbered steps
- Bash command examples (prefixed with `!`)
- `allowed-tools` restrictions
- Usage examples
Agents have:
- Core Responsibilities section
- Allowed Tools and Permissions section
- Workflow Patterns section
- Context Management section
Skills have:
- "When to Use" section
- "What This Skill Does" section
- Step-by-step process descriptions
- Examples with real data
Documentation has:
- Standard markdown structure
- Tutorial or reference content
- No executable workflows
- Educational purpose
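One way to approximate the structural checks above is to count how many of a category's expected section headings appear in the file. The sketch below assumes ATX-style (`#`) markdown headings and prefix matching on heading text; the names `section_hits` and `bang_command_lines` are hypothetical.

```python
import re

# Section titles taken from the lists above, lowercased for comparison.
EXPECTED_SECTIONS = {
    "Agent": ("core responsibilities", "allowed tools",
              "workflow patterns", "context management"),
    "Skill": ("when to use", "what this skill does"),
}

HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+)$", re.MULTILINE)
# A line starting with "!" followed by a word character, e.g. "!git status".
# Markdown images ("![alt](url)") are not counted.
BANG_LINE_RE = re.compile(r"^\s*!\w", re.MULTILINE)


def section_hits(content):
    """Count, per category, how many expected sections occur as headings."""
    titles = [title.strip().lower() for title in HEADING_RE.findall(content)]
    return {
        category: sum(any(t.startswith(s) for t in titles) for s in sections)
        for category, sections in EXPECTED_SECTIONS.items()
    }


def bang_command_lines(content):
    """Count lines that look like '!'-prefixed shell command examples."""
    return len(BANG_LINE_RE.findall(content))
```

A file with no section hits and no `!`-prefixed lines is then a candidate for Documentation or Other, depending on the remaining checks.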
Step 4: Keyword Detection
Scan content for category-specific keywords:
Command Keywords:
- `!bash`, `!git`, `!npm`, etc. (shell commands)
- "allowed-tools"
- "Usage:", "Workflow:", "Steps:"
- Command-line patterns
Agent Keywords:
- "Core Responsibilities"
- "Workflow Patterns"
- "Context Management"
- "Orchestrator", "Sub-Agent"
- "Handoff", "Delegation"
Skill Keywords:
- "When to Use"
- "What This Skill Does"
- "Skill" in self-references
- Reusable workflow language
Documentation Keywords:
- "Introduction", "Overview", "Guide"
- "Tutorial", "Reference", "Best Practices"
- Educational/explanatory language
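Keyword detection can be sketched as a simple per-category count. The keyword lists below are copied from this section; the case-insensitive substring matching and the tie-breaking rule (return nothing on a tie or a zero score) are assumptions made for illustration, as are the function names.

```python
KEYWORDS = {
    "Command": ("!bash", "!git", "!npm", "allowed-tools",
                "Usage:", "Workflow:", "Steps:"),
    "Agent": ("Core Responsibilities", "Workflow Patterns", "Context Management",
              "Orchestrator", "Sub-Agent", "Handoff", "Delegation"),
    "Skill": ("When to Use", "What This Skill Does", "Skill"),
    "Documentation": ("Introduction", "Overview", "Guide",
                      "Tutorial", "Reference", "Best Practices"),
}


def keyword_scores(content):
    """Count how many distinct keywords of each category occur in the content."""
    lowered = content.lower()
    return {
        category: sum(1 for keyword in keywords if keyword.lower() in lowered)
        for category, keywords in KEYWORDS.items()
    }


def top_keyword_category(content):
    """Return the single best-scoring category, or None on a tie or zero score."""
    scores = keyword_scores(content)
    ranked = sorted(scores.values(), reverse=True)
    if ranked[0] == 0 or ranked[0] == ranked[1]:
        return None
    return max(scores, key=scores.get)
```

Substring counts like these are a weak signal on their own, which fits their position as the last step after filename, frontmatter, and structure checks.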
Categorization Algorithm
function categorizeFile(filePath, content):
    category = null
    confidence = "Low"
    reasoning = ""
    // Phase 1: Filename and location
    if filename matches command patterns OR in .claude/commands/:
        category = "Command"
        confidence = "High"
        reasoning = "Filename or location matches command pattern"
    else if filename matches skill patterns OR in skills/*/:
        category = "Skill"
        confidence = "High"
        reasoning = "Filename or location matches skill pattern"
    else if filename matches agent patterns OR in agents-templates/:
        category = "Agent"
        confidence = "High"
        reasoning = "Filename or location matches agent pattern"
    else if in docs/:
        category = "Documentation"
        confidence = "Medium"
        reasoning = "Located in docs/"
    // Phase 2: Frontmatter analysis (refine)
    frontmatter = extractYAML(content)
    if frontmatter contains "allowed-tools" AND "version":
        category = "Command"
        confidence = "High"
        reasoning = "Frontmatter has 'allowed-tools' and 'version'"
    else if frontmatter contains "name" AND NOT "allowed-tools":
        category = "Skill"
        confidence = "High"
        reasoning = "Frontmatter has 'name' field"
    // Phase 3: Content structure (if still uncertain)
    if confidence != "High":
        if content contains "## Agent Identity":
            category = "Agent"
            confidence = "High"
            reasoning = "Content has 'Agent Identity' section"
        else if content contains "## When to Use":
            category = "Skill"
            confidence = "Medium"
            reasoning = "Content has 'When to Use' section"
        else if content contains "!bash" OR "!git":
            category = "Command"
            confidence = "Medium"
            reasoning = "Content has shell command examples"
    // Phase 4: Fallback
    if category == null:
        category = "Other"
        confidence = "Low"
        reasoning = "Unable to determine category, manual review needed"
    return {category, confidence, reasoning}
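For readers who want something executable, here is one possible Python rendering of the four phases. It is a sketch under stated assumptions rather than a reference implementation: frontmatter keys are found with a line-based regular expression instead of a YAML parser, filename patterns are taken from Step 1 and matched case-insensitively, and the name `categorize_file` and the reasoning strings are illustrative.

```python
import re
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)
KEY_RE = re.compile(r"^([A-Za-z][\w-]*):", re.MULTILINE)


def categorize_file(file_path, content):
    path = PurePosixPath(file_path)
    name = path.name.lower()
    where = "/" + path.parent.as_posix() + "/"  # e.g. "/skills/foo/"
    category, confidence, reasons = None, "Low", []

    def assign(new_category, new_confidence, why):
        nonlocal category, confidence
        category, confidence = new_category, new_confidence
        reasons.append(why)

    # Phase 1: filename and location
    if fnmatchcase(name, "*command.md") or "/.claude/commands/" in where:
        assign("Command", "High", "filename or location matches command pattern")
    elif (name == "skill.md" or name.endswith("-skill.md")
          or fnmatchcase(where, "*/skills/?*/*")):
        assign("Skill", "High", "filename or location matches skill pattern")
    elif fnmatchcase(name, "*agent.md") or "/agents-templates/" in where:
        assign("Agent", "High", "filename or location matches agent pattern")
    elif "/docs/" in where:
        assign("Documentation", "Medium", "located in docs/")

    # Phase 2: frontmatter analysis (refines or overrides Phase 1)
    match = FRONTMATTER_RE.match(content)
    keys = set(KEY_RE.findall(match.group(1))) if match else set()
    if {"allowed-tools", "version"} <= keys:
        assign("Command", "High", "frontmatter has 'allowed-tools' and 'version'")
    elif "name" in keys and "allowed-tools" not in keys:
        assign("Skill", "High", "frontmatter has 'name' field")

    # Phase 3: content structure, only if still uncertain
    if confidence != "High":
        if "## Agent Identity" in content:
            assign("Agent", "High", "content has 'Agent Identity' section")
        elif "## When to Use" in content:
            assign("Skill", "Medium", "content has 'When to Use' section")
        elif "!bash" in content or "!git" in content:
            assign("Command", "Medium", "content has shell command examples")

    # Phase 4: fallback
    if category is None:
        category, confidence = "Other", "Low"
        reasons = ["Unable to determine category, manual review needed"]

    return {"category": category, "confidence": confidence,
            "reasoning": "; ".join(reasons)}
```

Collecting reasons in a list, instead of overwriting a single string, keeps the evidence from every phase that fired, which is what the Reasoning field in the output format appears to expect.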
Output Format
For each categorized file, return:
### [Filename]
- **Category**: [Command|Agent|Skill|Documentation|Other]
- **Confidence**: [High|Medium|Low]
- **Reasoning**: [Why this category was assigned]
- **Frontmatter**: [✅ Valid | ⚠️ Malformed | ❌ Missing]
- **Required Fields**: [List of found/missing fields]
- **Recommended Location**: [Target directory path]
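A small helper can render one report entry in this format. The sketch below is illustrative only: `render_entry` and its parameters are hypothetical names, and marking absent fields with ❌ is an assumption, since the document only shows fields that are present.

```python
def render_entry(filename, category, confidence, reasoning,
                 frontmatter_status, required_fields, location):
    """Render one categorized file as a markdown report entry.

    required_fields maps a field name to True (found) or False (missing);
    an empty mapping renders as "N/A".
    """
    if required_fields:
        fields = ", ".join(
            f"{field} {'✅' if present else '❌'}"
            for field, present in required_fields.items()
        )
    else:
        fields = "N/A"
    return "\n".join([
        f"### {filename}",
        f"- **Category**: {category}",
        f"- **Confidence**: {confidence}",
        f"- **Reasoning**: {reasoning}",
        f"- **Frontmatter**: {frontmatter_status}",
        f"- **Required Fields**: {fields}",
        f"- **Recommended Location**: {location}",
    ])
```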
Example Usage
Example 1: Categorizing Integration File
Input:
File: USING-GIT-WORKTREES-SKILL.md
Content:
---
name: using-git-worktrees
description: Creates isolated git worktrees...
---
# Using Git Worktrees
## When to Use
...
Output:
### USING-GIT-WORKTREES-SKILL.md
- **Category**: Skill
- **Confidence**: High
- **Reasoning**: Filename matches skill pattern, frontmatter has 'name' field, content has "When to Use" section
- **Frontmatter**: ✅ Valid
- **Required Fields**: name ✅, description ✅
- **Recommended Location**: skills/using-git-worktrees/SKILL.md
Example 2: Categorizing Command File
Input:
File: integration-scan.md
Content:
---
description: "Scan and categorize incoming files"
allowed-tools: ["Read", "Bash(find)"]
author: "Claude Command and Control"
version: "1.0"
---
# Integration Scan
## Purpose
...
Output:
### integration-scan.md
- **Category**: Command
- **Confidence**: High
- **Reasoning**: Filename uses verb-noun pattern, frontmatter has 'allowed-tools' and 'version'
- **Frontmatter**: ✅ Valid
- **Required Fields**: description ✅, allowed-tools ✅, author ✅, version ✅
- **Recommended Location**: .claude/commands/integration-scan.md
Example 3: Uncategorizable File
Input:
File: notes.md
Content:
# Random Notes
Some thoughts about the project...
Output:
### notes.md
- **Category**: Other
- **Confidence**: Low
- **Reasoning**: No frontmatter, no structural indicators, generic content
- **Frontmatter**: ❌ Missing
- **Required Fields**: N/A
- **Recommended Location**: Manual review required
Integration with Commands
Used By
- `/integration-scan` - Primary categorization logic
- `/integration-process` - Determines target directory
- `/integration-validate` - Validates category-specific structure
Usage Pattern
# In integration-scan command
For each file in /INTEGRATION/incoming:
1. Read file content
2. Use file-categorization skill
3. Extract category and confidence
4. Include in scan report
5. Mark for processing if High confidence
6. Flag for review if Medium/Low confidence
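The loop above might look roughly like this in Python. The function `scan_incoming` is a hypothetical name, and the categorizer is passed in as a callable (assumed to return a dict with `category` and `confidence` keys) so the sketch does not depend on any particular implementation of the skill.

```python
from pathlib import Path


def scan_incoming(incoming_dir, categorize):
    """Categorize every file in incoming_dir and split results by confidence.

    categorize is any callable taking (file_path, content) and returning a
    dict with at least "category" and "confidence".
    """
    report = {"process": [], "review": []}
    for path in sorted(Path(incoming_dir).iterdir()):
        if not path.is_file():
            continue
        content = path.read_text(encoding="utf-8")      # 1. read file content
        result = categorize(str(path), content)         # 2-3. categorize
        # 5-6. High confidence is processed; Medium/Low is flagged for review.
        bucket = "process" if result["confidence"] == "High" else "review"
        report[bucket].append((path.name, result["category"], result["confidence"]))
    return report                                       # 4. scan report data
```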
Category-Specific Validation Rules
Commands
- ✅ MUST have: description, allowed-tools, author, version
- ✅ SHOULD have: workflow steps, usage examples
- ⚠️ Check: Tool permissions not overly broad
Agents
- ✅ MUST have: Agent Identity, Core Responsibilities, Allowed Tools
- ✅ SHOULD have: Workflow Patterns, Context Management
- ⚠️ Check: Role clearly defined
Skills
- ✅ MUST have: name, description, "When to Use"
- ✅ SHOULD have: Examples, step-by-step process
- ⚠️ Check: Examples use real data (not placeholders)
Documentation
- ✅ MUST have: Clear title, structured content
- ✅ SHOULD have: Table of contents, cross-references
- ⚠️ Check: No executable workflows (should be in Command/Skill)
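The MUST-have rules lend themselves to a table-driven check. The sketch below covers only the requirements that can be tested mechanically (frontmatter keys and named sections); judgments such as "role clearly defined" or "clear title" are out of scope. The names and the case-insensitive substring test for sections are assumptions.

```python
# MUST-have frontmatter keys per category, from the rules above.
REQUIRED_FRONTMATTER = {
    "Command": ("description", "allowed-tools", "author", "version"),
    "Skill": ("name", "description"),
}
# MUST-have sections per category, from the rules above.
REQUIRED_SECTIONS = {
    "Agent": ("Agent Identity", "Core Responsibilities", "Allowed Tools"),
    "Skill": ("When to Use",),
}


def missing_requirements(category, frontmatter_keys, content):
    """List the MUST-have keys and sections that a file of this category lacks."""
    missing = [key for key in REQUIRED_FRONTMATTER.get(category, ())
               if key not in frontmatter_keys]
    lowered = content.lower()
    missing += [section for section in REQUIRED_SECTIONS.get(category, ())
                if section.lower() not in lowered]
    return missing
```

An empty list means the file passes the mechanical part of validation; the SHOULD-have and Check items still need a reviewer or a more specific heuristic.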
Error Handling
Malformed Frontmatter
Issue: YAML syntax error
Action: Note in categorization output
Category: "Other" with reason "Invalid frontmatter"
Recommendation: Fix YAML before processing
Conflicting Indicators
Issue: Filename says "command" but structure says "skill"
Action: Confidence = "Medium"
Reasoning: "Filename and content indicators conflict"
Recommendation: Manual review
Missing Content
Issue: File is empty or too short (<100 chars)
Action: Category = "Other"
Confidence: "Low"
Reasoning: "Insufficient content for categorization"
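These cases can be handled by a pre-check that runs before the main algorithm, plus a small conflict rule. The sketch below rests on several assumptions: "malformed" is approximated as a frontmatter block that opens with `---` but never closes (a real YAML parser would catch more), malformed frontmatter is given Low confidence because the document does not specify one, and on a conflict the content-based category is kept. All names are hypothetical.

```python
MIN_CONTENT_CHARS = 100  # threshold from "Missing Content" above


def precheck(content):
    """Return an early 'Other' result for unusable files, else None."""
    if len(content.strip()) < MIN_CONTENT_CHARS:
        return {"category": "Other", "confidence": "Low",
                "reasoning": "Insufficient content for categorization"}
    lines = content.splitlines()
    if lines[0].strip() == "---":
        # Assumption: an opening fence with no closing fence counts as malformed.
        if not any(line.strip() == "---" for line in lines[1:]):
            return {"category": "Other", "confidence": "Low",
                    "reasoning": "Invalid frontmatter"}
    return None


def resolve_conflict(filename_category, content_category):
    """Downgrade to Medium when filename and content disagree, else None."""
    if filename_category and content_category and filename_category != content_category:
        # Assumption: the content-based category wins; the source only fixes
        # the confidence and the reasoning text.
        return {"category": content_category, "confidence": "Medium",
                "reasoning": "Filename and content indicators conflict"}
    return None
```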
Testing Recommendations
Test with:
- Typical files - Standard commands, agents, skills
- Edge cases - Mixed indicators, missing frontmatter
- Malformed files - Syntax errors, incomplete content
- Ambiguous files - Could fit multiple categories
Expected accuracy:
- High confidence: >95% correct
- Medium confidence: >80% correct
- Low confidence: Requires manual review
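To compare a test run against these targets, accuracy can be computed per confidence level from a hand-labeled sample. The helper below is a sketch; its name and the tuple layout of the input are assumptions.

```python
from collections import defaultdict


def accuracy_by_confidence(results):
    """Compute the fraction of correct categorizations per confidence level.

    results is an iterable of (confidence, predicted_category, expected_category).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for confidence, predicted, expected in results:
        totals[confidence] += 1
        if predicted == expected:
            correct[confidence] += 1
    return {level: correct[level] / totals[level] for level in totals}
```

For example, a result of `{"High": 0.97, "Medium": 0.85}` on a labeled sample would meet both targets listed above.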
Version History
1.0 (2025-11-23)
- Initial file categorization skill
- Four-phase categorization algorithm
- Integration with scan/process commands
- Comprehensive validation rules
Skill Status: Production Ready
Accuracy Target: >95% for High confidence categorizations
Dependencies: None (standalone logic)