| name | codebase-classification |
| description | Classify codebases before modification to choose appropriate development approach |
Codebase Classification Skill
Analyze and classify codebases before making changes to ensure appropriate development approach.
Overview
Before modifying any codebase, classify it to determine whether to:
- Follow existing patterns (Disciplined)
- Gradually improve while following conventions (Transitional)
- Propose improvements to legacy patterns (Legacy)
- Establish best practices from scratch (Greenfield)
Classification Types
1. Disciplined Codebase
Signals:
- Consistent code style (formatting, naming conventions)
- Comprehensive test coverage (>70%)
- Clear module boundaries and interfaces
- Type hints/annotations throughout
- Up-to-date dependencies
- Active CI/CD pipeline
- Good documentation (README, docstrings)
Approach: Follow existing patterns strictly. Don't introduce new conventions.
2. Transitional Codebase
Signals:
- Mixed code quality (some areas good, others not)
- Partial test coverage (30-70%)
- Some type hints, inconsistent usage
- Active development with modernization efforts
- Dependencies somewhat current
Approach: Follow existing conventions in touched areas. Propose improvements for new code.
3. Legacy Codebase
Signals:
- Inconsistent patterns across the codebase
- Minimal or no tests (<30% coverage)
- No type hints
- Outdated dependencies
- Complex, undocumented logic
- Possibly unmaintained
Approach: Be careful with changes. Add tests before modifying. Propose gradual improvements.
4. Greenfield Codebase
Signals:
- New project (<6 months old)
- Few files (<20 source files)
- No established patterns yet
- Minimal or no tests (but not legacy)
- Active initial development
Approach: Establish best practices from the start. Set up proper structure, testing, CI.
Quick Classification Checklist
Run this analysis before making significant changes:
1. Check test coverage: Is there a test/ or tests/ directory? How comprehensive?
2. Check type hints: Are functions annotated? Is there py.typed or mypy config?
3. Check CI/CD: Is there .github/workflows/, .gitlab-ci.yml, or similar?
4. Check code style: Is there .pre-commit-config.yaml, ruff.toml, or similar?
5. Check dependencies: When was requirements.txt/pyproject.toml last updated?
6. Check documentation: Is there a comprehensive README? API docs?
Decision Matrix
| Signal | Disciplined | Transitional | Legacy | Greenfield |
|---|---|---|---|---|
| Test coverage | >70% | 30-70% | <30% | Varies (new) |
| Type hints | Comprehensive | Partial | None/minimal | Varies |
| CI/CD | Active | Present | None/broken | May be new |
| Code style | Consistent | Mixed | Inconsistent | Establishing |
| Dependencies | Current | Somewhat current | Outdated | Latest |
| Age | Any | Any | Usually old | <6 months |
Behavior Guidelines
When Disciplined
- Study existing patterns before writing new code
- Match naming conventions exactly
- Follow established module structure
- Add tests matching existing test style
- Don't propose architectural changes without strong justification
When Transitional
- Follow patterns in the specific area you're modifying
- Match quality of surrounding code or slightly better
- Add tests for new functionality
- Document rationale for any pattern deviations
When Legacy
- Add tests BEFORE modifying code
- Make minimal changes to achieve goal
- Document assumptions and findings
- Propose improvements as separate follow-up work
- Be extra careful with untested code paths
When Greenfield
- Establish best practices immediately
- Set up proper project structure
- Configure linting, formatting, type checking
- Write tests for new functionality
- Create comprehensive documentation
Examples
Identifying Disciplined Codebase
$ ls -la
pyproject.toml # Modern packaging
.pre-commit-config.yaml # Style enforcement
mypy.ini # Type checking
.github/workflows/ # CI/CD
$ wc -l tests/**/*.py
2500 total # Substantial tests
→ Classification: DISCIPLINED
→ Approach: Follow existing patterns strictly
Identifying Legacy Codebase
$ ls -la
setup.py # Old-style packaging
requirements.txt # Pinned 3 years ago
# No tests directory
# No CI configuration
$ grep -r "def " src/ | head -5
def process_data(x): # No type hints
def handle_input(data): # No docstrings
→ Classification: LEGACY
→ Approach: Careful changes, add tests first
Integration
This skill helps agents:
- Avoid imposing new patterns on well-structured codebases
- Avoid perpetuating bad patterns in legacy codebases
- Make appropriate improvement suggestions
- Set up proper structure for new projects
Related
- Tool: shell (for running analysis commands)
- Tool: read (for examining codebase structure)