---
name: file-categorization
description: Reusable logic for categorizing files as Command, Agent, Skill, or Documentation based on structure and content analysis
---

File Categorization Skill

When to Use This Skill

Processing files in integration pipelines
Scanning directories for file organization
Auto-routing files to appropriate locations
Generating file inventory reports
Validating repository structure

What This Skill Does

Analyzes file structure and content to assign each file to one of five categories:

Commands - Slash command definitions
Agents - Agent configuration files
Skills - Reusable workflow automation
Documentation - General markdown documentation
Other - Uncategorized files requiring manual review

Categorization Logic

Step 1: Filename Pattern Matching

Commands:

Filename matches *-command.md or *command.md
Located in .claude/commands/ directory
Filename uses a hyphenated action pattern, noun-verb or verb-noun (e.g., integration-scan.md)

Agents:

Filename matches *-agent.md or *agent.md
Located in agents-templates/ directory
Filename contains a role-based name (architect, builder, validator, etc.)

Skills:

Filename is SKILL.md or *-SKILL.md or *-skill.md
Located in skills/*/ directories
Contains workflow automation content

Documentation:

Standard .md files
Located in docs/ directory
Contains reference or tutorial content
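
The filename and location checks above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the `RULES` table and the `match_by_path` name are hypothetical, filenames are lowercased before matching (an assumption that folds `SKILL.md` and `skill.md` together), and any file under `skills/` counts as a location match.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# (category, filename patterns, directory) taken from the Step 1 lists.
# The table layout itself is an assumption of this sketch.
RULES = [
    ("Command", ["*command.md"], ".claude/commands"),
    ("Agent", ["*agent.md"], "agents-templates"),
    ("Skill", ["skill.md", "*-skill.md"], "skills"),
    ("Documentation", [], "docs"),
]


def match_by_path(file_path):
    """Return every category whose filename pattern or directory matches."""
    path = PurePosixPath(file_path)
    name = path.name.lower()  # case-insensitive comparison (assumption)
    parent = "/" + path.parent.as_posix() + "/"
    hits = []
    for category, patterns, directory in RULES:
        name_hit = any(fnmatchcase(name, pattern) for pattern in patterns)
        dir_hit = "/" + directory + "/" in parent
        if name_hit or dir_hit:
            hits.append(category)
    return hits
```

A path can match more than one category (for example `docs/build-command.md`); returning all hits leaves the tie-break to the later steps.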

Step 2: Frontmatter Analysis

Read the YAML frontmatter (if present) to identify:

Command Indicators:

---
description: "..."
allowed-tools: [...]
author: "..."
version: "X.Y"
---

Skill Indicators:

---
name: skill-name
description: "..."
---

Agent Indicators (less structured, more prose; typically a markdown identity block rather than YAML):

## Agent Identity
**Role**: [Agent Role]
**Version**: X.Y.Z
**Purpose**: [Purpose description]
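
As a rough sketch of this step, the snippet below collects top-level frontmatter keys and maps them to a category hint. It deliberately avoids a YAML library and only scans for `key:` lines, so it will not detect malformed YAML; the function names `extract_frontmatter_keys` and `frontmatter_hint` are hypothetical.

```python
import re


def extract_frontmatter_keys(content):
    """Return the set of top-level keys in a leading '---' block, or None."""
    lines = content.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    # Find the closing delimiter; without one there is no usable block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return None
    keys = set()
    for line in lines[1:end]:
        match = re.match(r"([A-Za-z0-9_-]+)\s*:", line)  # unindented "key:" only
        if match:
            keys.add(match.group(1))
    return keys


def frontmatter_hint(content):
    """Map the indicators listed above to a category hint, or None."""
    keys = extract_frontmatter_keys(content)
    if keys:
        if {"allowed-tools", "version"} <= keys:
            return "Command"
        if "name" in keys and "allowed-tools" not in keys:
            return "Skill"
    if "## Agent Identity" in content:
        return "Agent"
    return None
```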

Step 3: Content Structure Analysis

Commands have:

Workflow sections with numbered steps
Bash command examples (prefixed with !)
allowed-tools restrictions
Usage examples

Agents have:

Core Responsibilities section
Allowed Tools and Permissions section
Workflow Patterns section
Context Management section

Skills have:

"When to Use" section
"What This Skill Does" section
Step-by-step process descriptions
Examples with real data

Documentation has:

Standard markdown structure
Tutorial or reference content
No executable workflows
Educational purpose
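
One way to approximate the structure check is to collect markdown headings and count how many of the expected section names appear. The sketch below covers only Agents and Skills, whose section names are listed above; `SECTION_SIGNALS`, `headings`, and `structure_matches` are hypothetical names, and substring matching on lowercased headings is an assumption.

```python
import re

# Section names copied from the Agent and Skill lists above.
SECTION_SIGNALS = {
    "Agent": ["core responsibilities", "allowed tools", "workflow patterns", "context management"],
    "Skill": ["when to use", "what this skill does"],
}


def headings(content):
    """Return all ATX headings ('#' through '######'), lowercased."""
    found = re.finditer(r"^#{1,6}[ \t]+(.+)$", content, re.MULTILINE)
    return [match.group(1).strip().lower() for match in found]


def structure_matches(content):
    """Count, per category, how many expected sections appear as headings."""
    found = headings(content)
    return {
        category: sum(any(signal in heading for heading in found) for signal in signals)
        for category, signals in SECTION_SIGNALS.items()
    }
```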

Step 4: Keyword Detection

Scan content for category-specific keywords:

Command Keywords:

!bash, !git, !npm, etc. (shell commands)
"allowed-tools"
"Usage:", "Workflow:", "Steps:"
Command-line patterns

Agent Keywords:

"Core Responsibilities"
"Workflow Patterns"
"Context Management"
"Orchestrator", "Sub-Agent"
"Handoff", "Delegation"

Skill Keywords:

"When to Use"
"What This Skill Does"
"Skill" in self-references
Reusable workflow language

Documentation Keywords:

"Introduction", "Overview", "Guide"
"Tutorial", "Reference", "Best Practices"
Educational/explanatory language
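
Keyword detection could be sketched as a simple per-category hit count. The regular expressions below are one possible encoding of the keyword lists above, not a definitive one; `KEYWORDS` and `keyword_scores` are hypothetical names, and the looser items ("Command-line patterns", "Reusable workflow language", "Educational/explanatory language") are left out because they have no obvious literal form.

```python
import re

KEYWORDS = {
    # "!" followed by a lowercase word at the start of a token, e.g. "!git"
    "Command": [r"(?<!\S)![a-z]+", r"allowed-tools", r"\b(?:Usage|Workflow|Steps):"],
    "Agent": [
        r"Core Responsibilities", r"Workflow Patterns", r"Context Management",
        r"Orchestrator", r"Sub-Agent", r"Handoff", r"Delegation",
    ],
    "Skill": [r"When to Use", r"What This Skill Does"],
    "Documentation": [r"\b(?:Introduction|Overview|Guide|Tutorial|Reference|Best Practices)\b"],
}


def keyword_scores(content):
    """Return the number of keyword hits per category."""
    return {
        category: sum(len(re.findall(pattern, content)) for pattern in patterns)
        for category, patterns in KEYWORDS.items()
    }
```

The raw counts are only a tie-breaking signal; a long document will naturally score higher than a short one.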

Categorization Algorithm

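function categorizeFile(filePath, content):
  category = null
  confidence = null
  reasoning = []

  // Phase 1: Filename and location
  if filename matches command patterns OR in .claude/commands/:
    category = "Command"
    confidence = "High"
    reasoning.add("Filename or location matches Command")

  else if filename matches skill patterns OR in skills/*/:
    category = "Skill"
    confidence = "High"
    reasoning.add("Filename or location matches Skill")

  else if filename matches agent patterns OR in agents-templates/:
    category = "Agent"
    confidence = "High"
    reasoning.add("Filename or location matches Agent")

  else if in docs/:
    category = "Documentation"
    confidence = "Medium"
    reasoning.add("Located in docs/")

  // Phase 2: Frontmatter analysis (refine; overrides Phase 1)
  frontmatter = extractYAML(content)  // null if missing
  if frontmatter != null AND frontmatter contains "allowed-tools" AND "version":
    category = "Command"
    confidence = "High"
    reasoning.add("Frontmatter has allowed-tools and version")

  else if frontmatter != null AND frontmatter contains "name" (no allowed-tools):
    category = "Skill"
    confidence = "High"
    reasoning.add("Frontmatter has name")

  // Phase 3: Content structure (if still uncertain)
  if confidence != "High":
    if content contains "## Agent Identity":
      category = "Agent"
      confidence = "High"
      reasoning.add("Content has Agent Identity section")

    else if content contains "## When to Use":
      category = "Skill"
      confidence = "Medium"
      reasoning.add("Content has When to Use section")

    else if content contains "!bash" OR "!git":
      category = "Command"
      confidence = "Medium"
      reasoning.add("Content has shell command markers")

  // Phase 4: Fallback
  if category == null:
    category = "Other"
    confidence = "Low"
    reasoning.add("Unable to determine category, manual review needed")

  return {category, confidence, reasoning}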
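
For readers who want something executable, here is a minimal Python sketch of the four phases. It is an illustration under stated assumptions rather than the skill's reference implementation: Phase 1 uses the Step 1 filename patterns with lowercased names, frontmatter keys are found by a simple `key:` scan instead of a YAML parser, and the helper names (`_in_dir`, `_frontmatter_keys`, `categorize_file`) are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def _in_dir(path, directory):
    """True if `directory` appears in the file's parent path."""
    return "/" + directory + "/" in "/" + path.parent.as_posix() + "/"


def _frontmatter_keys(content):
    """Top-level keys of a leading '---' block (empty set if none)."""
    block = re.match(r"---\s*\n(.*?)\n---\s*(?:\n|$)", content, re.DOTALL)
    if not block:
        return set()
    return set(re.findall(r"^([\w-]+)\s*:", block.group(1), re.MULTILINE))


def categorize_file(file_path, content):
    path = PurePosixPath(file_path)
    name = path.name.lower()  # case-insensitive match (assumption)
    category, confidence, reasoning = None, None, []

    # Phase 1: filename and location
    if fnmatchcase(name, "*command.md") or _in_dir(path, ".claude/commands"):
        category, confidence = "Command", "High"
        reasoning.append("Filename or location matches Command")
    elif name == "skill.md" or name.endswith("-skill.md") or _in_dir(path, "skills"):
        category, confidence = "Skill", "High"
        reasoning.append("Filename or location matches Skill")
    elif fnmatchcase(name, "*agent.md") or _in_dir(path, "agents-templates"):
        category, confidence = "Agent", "High"
        reasoning.append("Filename or location matches Agent")
    elif _in_dir(path, "docs"):
        category, confidence = "Documentation", "Medium"
        reasoning.append("Located in docs/")

    # Phase 2: frontmatter analysis (refines and may override Phase 1)
    keys = _frontmatter_keys(content)
    if {"allowed-tools", "version"} <= keys:
        category, confidence = "Command", "High"
        reasoning.append("Frontmatter has allowed-tools and version")
    elif "name" in keys and "allowed-tools" not in keys:
        category, confidence = "Skill", "High"
        reasoning.append("Frontmatter has name")

    # Phase 3: content structure, only if still uncertain
    if confidence != "High":
        if "## Agent Identity" in content:
            category, confidence = "Agent", "High"
            reasoning.append("Content has Agent Identity section")
        elif "## When to Use" in content:
            category, confidence = "Skill", "Medium"
            reasoning.append("Content has When to Use section")
        elif "!bash" in content or "!git" in content:
            category, confidence = "Command", "Medium"
            reasoning.append("Content has shell command markers")

    # Phase 4: fallback
    if category is None:
        category, confidence = "Other", "Low"
        reasoning.append("Unable to determine category, manual review needed")

    return {"category": category, "confidence": confidence, "reasoning": reasoning}
```

Calling `categorize_file("notes.md", "# Random Notes\n")` falls through every phase and returns category "Other" with confidence "Low".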

Output Format

For each categorized file, return:

### [Filename]
- **Category**: [Command|Agent|Skill|Documentation|Other]
- **Confidence**: [High|Medium|Low]
- **Reasoning**: [Why this category was assigned]
- **Frontmatter**: [✅ Valid | ⚠️ Malformed | ❌ Missing]
- **Required Fields**: [List of found/missing fields]
- **Recommended Location**: [Target directory path]
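
A small formatter for this report entry might look like the sketch below. The `result` dictionary keys (`frontmatter`, `required_fields`, `location`) and the `render_entry` name are hypothetical, and marking missing fields with ❌ is an assumption based on the frontmatter status symbols.

```python
FRONTMATTER_LABELS = {
    "valid": "✅ Valid",
    "malformed": "⚠️ Malformed",
    "missing": "❌ Missing",
}


def render_entry(filename, result):
    """Render one categorization result in the output format above."""
    fields = result.get("required_fields") or {}
    field_text = ", ".join(
        f"{name} {'✅' if found else '❌'}" for name, found in fields.items()
    ) or "N/A"
    return "\n".join([
        f"### {filename}",
        f"- **Category**: {result['category']}",
        f"- **Confidence**: {result['confidence']}",
        f"- **Reasoning**: {result['reasoning']}",
        f"- **Frontmatter**: {FRONTMATTER_LABELS[result['frontmatter']]}",
        f"- **Required Fields**: {field_text}",
        f"- **Recommended Location**: {result['location']}",
    ])
```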

Example Usage

Example 1: Categorizing Integration File

Input:

File: USING-GIT-WORKTREES-SKILL.md
Content:
---
name: using-git-worktrees
description: Creates isolated git worktrees...
---

# Using Git Worktrees

## When to Use
...

Output:

### USING-GIT-WORKTREES-SKILL.md
- **Category**: Skill
- **Confidence**: High
- **Reasoning**: Filename matches skill pattern, frontmatter has 'name' field, content has "When to Use" section
- **Frontmatter**: ✅ Valid
- **Required Fields**: name ✅, description ✅
- **Recommended Location**: skills/using-git-worktrees/SKILL.md

Example 2: Categorizing Command File

Input:

File: integration-scan.md
Content:
---
description: "Scan and categorize incoming files"
allowed-tools: ["Read", "Bash(find)"]
author: "Claude Command and Control"
version: "1.0"
---

# Integration Scan

## Purpose
...

Output:

### integration-scan.md
- **Category**: Command
- **Confidence**: High
- **Reasoning**: Filename follows the hyphenated command naming pattern, frontmatter has 'allowed-tools' and 'version'
- **Frontmatter**: ✅ Valid
- **Required Fields**: description ✅, allowed-tools ✅, author ✅, version ✅
- **Recommended Location**: .claude/commands/integration-scan.md

Example 3: Uncategorizable File

Input:

File: notes.md
Content:
# Random Notes

Some thoughts about the project...

Output:

### notes.md
- **Category**: Other
- **Confidence**: Low
- **Reasoning**: No frontmatter, no structural indicators, generic content
- **Frontmatter**: ❌ Missing
- **Required Fields**: N/A
- **Recommended Location**: Manual review required

Integration with Commands

Used By

/integration-scan - Primary categorization logic
/integration-process - Determines target directory
/integration-validate - Validates category-specific structure

Usage Pattern

# In integration-scan command

For each file in /INTEGRATION/incoming:
  1. Read file content
  2. Use file-categorization skill
  3. Extract category and confidence
  4. Include in scan report
  5. Mark for processing if High confidence
  6. Flag for review if Medium/Low confidence
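
The loop above could be sketched as follows. The `scan_incoming` name and the report shape are hypothetical, and the categorizer is passed in as a callable taking `(filename, content)` so the sketch stays independent of any particular implementation.

```python
from pathlib import Path


def scan_incoming(incoming_dir, categorize):
    """Categorize every file in a directory and split by confidence."""
    report = {"process": [], "review": []}
    files = sorted(p for p in Path(incoming_dir).iterdir() if p.is_file())
    for path in files:
        # Replace undecodable bytes so one bad file does not abort the scan.
        content = path.read_text(encoding="utf-8", errors="replace")
        result = categorize(path.name, content)
        # High confidence is marked for processing; Medium/Low goes to review.
        bucket = "process" if result["confidence"] == "High" else "review"
        report[bucket].append((path.name, result["category"], result["confidence"]))
    return report
```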

Category-Specific Validation Rules

Commands

✅ MUST have: description, allowed-tools, author, version
✅ SHOULD have: workflow steps, usage examples
⚠️ Check: Tool permissions not overly broad

Agents

✅ MUST have: Agent Identity, Core Responsibilities, Allowed Tools
✅ SHOULD have: Workflow Patterns, Context Management
⚠️ Check: Role clearly defined

Skills

✅ MUST have: name, description, "When to Use"
✅ SHOULD have: Examples, step-by-step process
⚠️ Check: Examples use real data (not placeholders)

Documentation

✅ MUST have: Clear title, structured content
✅ SHOULD have: Table of contents, cross-references
⚠️ Check: No executable workflows (should be in Command/Skill)
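
The MUST-have rules lend themselves to a mechanical check. The sketch below covers Commands, Agents, and Skills only; the Documentation rules ("clear title, structured content") need human judgment and are omitted. `REQUIRED` and `missing_requirements` are hypothetical names, and section presence is tested by case-insensitive substring match, which is an assumption.

```python
# MUST-have items from the rules above, split into frontmatter keys and
# section names that should appear somewhere in the content.
REQUIRED = {
    "Command": {"frontmatter": ["description", "allowed-tools", "author", "version"], "sections": []},
    "Agent": {"frontmatter": [], "sections": ["Agent Identity", "Core Responsibilities", "Allowed Tools"]},
    "Skill": {"frontmatter": ["name", "description"], "sections": ["When to Use"]},
}


def missing_requirements(category, frontmatter_keys, content):
    """Return the MUST-have items that a file of this category lacks."""
    rules = REQUIRED.get(category, {"frontmatter": [], "sections": []})
    lowered = content.lower()
    missing = [key for key in rules["frontmatter"] if key not in frontmatter_keys]
    missing += [name for name in rules["sections"] if name.lower() not in lowered]
    return missing
```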

Error Handling

Malformed Frontmatter

Issue: YAML syntax error
Action: Note in categorization output
Category: "Other" with reason "Invalid frontmatter"
Recommendation: Fix YAML before processing

Conflicting Indicators

Issue: Filename says "command" but structure says "skill"
Action: Confidence = "Medium"
Reasoning: "Filename and content indicators conflict"
Recommendation: Manual review

Missing Content

Issue: File is empty or too short (fewer than 100 characters)
Action: Category = "Other"
Confidence: "Low"
Reasoning: "Insufficient content for categorization"
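
These three cases can be sketched as a guard that runs alongside the main algorithm. Several details are assumptions because the rules above leave them open: the order of the checks, the "Low" confidence for malformed frontmatter, ignoring surrounding whitespace when measuring length, and keeping the content-based category when indicators conflict. The `apply_error_rules` name is hypothetical.

```python
MIN_CONTENT_CHARS = 100  # threshold from "Missing Content" above


def apply_error_rules(content, filename_category, content_category, frontmatter_ok=True):
    """Return an override result if an error rule applies, else None."""
    # Missing content: empty or too short (whitespace ignored, an assumption)
    if len(content.strip()) < MIN_CONTENT_CHARS:
        return {"category": "Other", "confidence": "Low",
                "reasoning": "Insufficient content for categorization"}
    # Malformed frontmatter: confidence "Low" is an assumption
    if not frontmatter_ok:
        return {"category": "Other", "confidence": "Low",
                "reasoning": "Invalid frontmatter"}
    # Conflicting indicators: keep the content-based category (assumption)
    if filename_category and content_category and filename_category != content_category:
        return {"category": content_category, "confidence": "Medium",
                "reasoning": "Filename and content indicators conflict"}
    return None
```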

Testing Recommendations

Test with:

Typical files - Standard commands, agents, skills
Edge cases - Mixed indicators, missing frontmatter
Malformed files - Syntax errors, incomplete content
Ambiguous files - Could fit multiple categories

Target accuracy:

High confidence: >95% correct
Medium confidence: >80% correct
Low confidence: Requires manual review
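
To compare a test run against these thresholds, accuracy can be computed per confidence level from a hand-labeled sample. This is a minimal sketch; the `accuracy_by_confidence` name and the `(confidence, predicted, expected)` tuple layout are hypothetical.

```python
from collections import defaultdict


def accuracy_by_confidence(results):
    """results: iterable of (confidence, predicted_category, expected_category)."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for confidence, predicted, expected in results:
        totals[confidence] += 1
        if predicted == expected:
            correct[confidence] += 1
    # Fraction of correct predictions within each confidence level
    return {level: correct[level] / totals[level] for level in totals}
```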

Version History

1.0 (2025-11-23)

Initial file categorization skill
Four-phase categorization algorithm
Integration with scan/process commands
Comprehensive validation rules

Skill Status: Production Ready
Accuracy Target: >95% for High confidence categorizations
Dependencies: None (standalone logic)
