| name | legacy-codebase-analyzer |
| title | Legacy Codebase Analyzer |
| description | Comprehensive legacy codebase analysis skill for technical debt assessment, security vulnerability scanning, performance bottleneck detection, and modernization roadmap generation. Includes 7 Python tools for automated codebase inventory, architecture health analysis, and strategic modernization planning. |
| domain | engineering |
| subdomain | legacy-modernization |
| difficulty | advanced |
| time-saved | 60%+ reduction in analysis time |
| frequency | Per-project or quarterly assessment |
| use-cases | Comprehensive technical debt assessment for legacy systems, Security vulnerability scanning and risk prioritization, Performance bottleneck identification and optimization planning, Architecture pattern detection and anti-pattern remediation, Modernization roadmap generation with phased migration strategies |
| related-agents | cs-legacy-codebase-analyzer |
| related-skills | senior-architect, code-reviewer, senior-secops, cto-advisor |
| related-commands | /plan.refactor, /audit.security, /review.code |
| orchestrated-by | cs-legacy-codebase-analyzer |
| tech-stack | Python 3.8+, AST analysis, Regex pattern matching, Git integration, JSON/CSV output |
| version | v1.0.0 |
| author | Claude Skills Team |
| created | 2025-12-13 |
| updated | 2025-12-13 |
| license | MIT |
| tags | legacy, technical-debt, modernization, refactoring, architecture, security, performance, codebase-analysis |
| featured | false |
| verified | true |
Legacy Codebase Analyzer
Comprehensive legacy codebase analysis skill with automated tools for technical debt assessment, security vulnerability detection, performance analysis, and strategic modernization planning. Transforms complex legacy systems into actionable modernization roadmaps through seven production-ready Python analysis tools.
Overview
This skill delivers enterprise-grade legacy codebase analysis capabilities through seven specialized Python automation tools and extensive reference documentation. Whether assessing technical debt, scanning for security vulnerabilities, identifying performance bottlenecks, or planning modernization strategies, this skill provides the analytical depth and strategic guidance needed to transform legacy systems into modern, maintainable architectures.
Target Users:
- Engineering leaders managing legacy system modernization
- CTOs planning technical debt reduction strategies
- Architects designing migration paths to modern stacks
- Security teams auditing legacy application vulnerabilities
- Technical debt managers prioritizing remediation efforts
- Platform teams planning infrastructure modernization
Quantified Benefits:
- 60% time savings on legacy codebase analysis through automation
- 40% improvement in technical debt prioritization accuracy
- 75% reduction in security vulnerability detection time
- 50% faster modernization roadmap creation
- 80% increase in stakeholder confidence with data-driven insights
Use this skill when:
- Inheriting a legacy codebase and need comprehensive assessment
- Planning major system modernization or replatforming initiatives
- Conducting technical due diligence for acquisitions
- Establishing technical debt baseline metrics
- Prioritizing security remediation efforts
- Creating multi-year modernization roadmaps
- Justifying modernization investments to stakeholders
Core Capabilities
- Comprehensive Codebase Inventory - Automated discovery of languages, frameworks, dependencies, file structure, LOC metrics, and technology stack identification
- Security Vulnerability Scanning - Pattern-based vulnerability detection with OWASP Top 10 coverage, CVE mapping, and risk-based prioritization
- Performance Bottleneck Detection - Algorithm complexity analysis, database query optimization identification, memory leak detection, and scalability assessment
- Code Quality Analysis - Cyclomatic complexity measurement, code duplication detection, maintainability scoring, and best practice validation
- Architecture Health Assessment - Dependency graph analysis, coupling/cohesion metrics, anti-pattern detection, and architectural drift identification
- Technical Debt Scoring - Multi-dimensional technical debt quantification with remediation cost estimation and priority ranking
- Modernization Roadmap Generation - Strategic migration planning with phased approaches, risk mitigation, and ROI analysis
Quick Start
Complete Codebase Inventory
# Basic codebase scan
python scripts/codebase_inventory.py /path/to/legacy-project
# Detailed inventory with dependency analysis
python scripts/codebase_inventory.py /path/to/legacy-project --detailed --dependencies
# Generate JSON output for automation
python scripts/codebase_inventory.py /path/to/legacy-project --format json --output inventory.json
Scan for Security Vulnerabilities
# Basic security scan
python scripts/security_vulnerability_scanner.py /path/to/legacy-project
# Scan with severity filtering (high and critical only)
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --min-severity high
# Generate detailed security report
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --detailed --output security-report.json
Detect Performance Bottlenecks
# Identify performance issues
python scripts/performance_bottleneck_detector.py /path/to/legacy-project
# Focus on specific file types
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --file-types .py,.js
# Generate performance optimization report
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --detailed --output perf-report.json
Generate Modernization Roadmap
# Create comprehensive roadmap
python scripts/modernization_roadmap_generator.py /path/to/legacy-project
# Roadmap with custom timeline (12 months)
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --timeline 12
# Generate executive summary
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --executive-summary --output roadmap.md
Access Documentation
- Analysis Framework: references/analysis_framework.md
- Modernization Patterns: references/modernization_patterns.md
- Deliverable Templates: references/deliverable_templates.md
Key Workflows
Workflow 1: Initial Legacy System Assessment (2-4 hours)
Scenario: First-time analysis of inherited legacy codebase requiring comprehensive baseline assessment
# 1. Run comprehensive codebase inventory
python scripts/codebase_inventory.py /path/to/legacy-project --detailed --dependencies
# Output provides:
# - Total lines of code by language
# - Technology stack identification
# - Dependency versions and outdated packages
# - File structure and organization
# - Entry points and critical modules
# - Third-party library usage
# 2. Analyze code quality metrics
python scripts/code_quality_analyzer.py /path/to/legacy-project --verbose
# Metrics include:
# - Cyclomatic complexity scores
# - Code duplication percentage
# - Maintainability index
# - Comment density
# - Test coverage estimation
# - Best practice violations
# 3. Assess architecture health
python scripts/architecture_health_analyzer.py /path/to/legacy-project --detailed
# Analysis covers:
# - Dependency graphs and circular dependencies
# - Layer violations and architectural drift
# - Coupling and cohesion metrics
# - Anti-pattern detection
# - Module organization assessment
# - Design principle violations
# 4. Calculate technical debt score
python scripts/technical_debt_scorer.py /path/to/legacy-project --output debt-baseline.json
# Scoring includes:
# - Overall technical debt score (0-100)
# - Debt by category (architecture, code quality, security, performance)
# - Estimated remediation effort (person-days)
# - Priority recommendations
# - Quick wins vs long-term investments
# 5. Generate executive summary
# Use assets/executive_summary_template.md
# Populate with metrics from all tools
# Include key findings and recommendations
# Reference: references/deliverable_templates.md
# 6. Create baseline documentation
mkdir -p reports/baseline-$(date +%Y-%m-%d)
cp *.json reports/baseline-$(date +%Y-%m-%d)/
echo "Baseline assessment complete: $(date)" > reports/baseline-$(date +%Y-%m-%d)/README.md
Time Estimate: 2-4 hours (vs 8-12 hours manual assessment)
Success Criteria:
- Complete inventory of all technologies and dependencies
- Technical debt baseline score established
- Architecture health assessment documented
- Executive summary delivered to stakeholders
- All metrics captured in machine-readable format
- Priority issues identified for immediate attention
Workflow 2: Security Vulnerability Assessment (1-2 hours)
Scenario: Comprehensive security audit of legacy application prior to production deployment or acquisition
# 1. Run comprehensive security scan
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --detailed --output security-scan.json
# Scanner detects:
# - SQL injection vulnerabilities
# - XSS (cross-site scripting) risks
# - CSRF vulnerabilities
# - Hardcoded credentials
# - Insecure cryptography
# - Path traversal issues
# - Authentication/authorization flaws
# - Dependency vulnerabilities (CVEs)
# 2. Filter high-severity issues
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --min-severity high --format json | jq '.vulnerabilities | length'
# Priority triage:
# - CRITICAL: Immediate remediation required
# - HIGH: Fix before deployment
# - MEDIUM: Plan for next sprint
# - LOW: Technical debt backlog
# 3. Generate OWASP compliance report
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --owasp-mapping --output owasp-report.html
# Report includes:
# - OWASP Top 10 coverage
# - Vulnerability distribution
# - Compliance gaps
# - Remediation recommendations
# 4. Cross-reference with code quality issues
python scripts/code_quality_analyzer.py /path/to/legacy-project --security-focused
# Identifies:
# - Input validation gaps
# - Error handling weaknesses
# - Logging and monitoring deficiencies
# - Security best practice violations
# 5. Create security remediation plan
# Use references/modernization_patterns.md
# Prioritize by risk score and remediation effort
# Group related vulnerabilities
# Estimate remediation timelines
# 6. Generate stakeholder report
# Use assets/technical_debt_report_template.md
# Focus on security section
# Include risk matrix
# Provide remediation cost estimates
# Document compliance gaps
Time Estimate: 1-2 hours (vs 6-8 hours manual security audit)
Success Criteria:
- All high and critical vulnerabilities identified
- OWASP Top 10 compliance assessed
- Dependency vulnerabilities mapped to CVEs
- Risk-based prioritization complete
- Remediation estimates provided
- Stakeholder-ready security report delivered
Workflow 3: Performance Optimization Planning (2-3 hours)
Scenario: System experiencing performance issues requiring systematic bottleneck identification and optimization planning
# 1. Detect performance bottlenecks
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --detailed --output perf-analysis.json
# Detection includes:
# - Algorithm complexity analysis (O(n²) or worse)
# - Inefficient database queries (N+1 problems)
# - Memory leaks and resource leaks
# - Synchronous blocking operations
# - Inefficient data structures
# - Large object allocations
# - Missing caching opportunities
# 2. Analyze database query patterns
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --focus database --verbose
# Identifies:
# - Missing indexes
# - Full table scans
# - N+1 query patterns
# - Inefficient joins
# - Missing query optimization
# - Connection pool issues
# 3. Identify algorithmic inefficiencies
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --focus algorithms
# Finds:
# - Nested loops causing O(n²) or worse
# - Inefficient sorting/searching
# - Unnecessary data copying
# - Repeated calculations
# - Lack of memoization
# 4. Cross-reference with architecture analysis
python scripts/architecture_health_analyzer.py /path/to/legacy-project --performance-focus
# Architectural bottlenecks:
# - Tight coupling causing cascade effects
# - Missing caching layers
# - Synchronous processing where async appropriate
# - Monolithic architecture scaling limits
# - Missing service boundaries
# 5. Estimate optimization impact
# For each bottleneck, estimate:
# - Current performance impact
# - Optimization effort (person-days)
# - Expected performance improvement
# - Risk of optimization
# Reference: references/analysis_framework.md
# 6. Create optimization roadmap
# Prioritize by impact/effort ratio
# Group related optimizations
# Identify quick wins (high impact, low effort)
# Plan phased optimization approach
# Use assets/roadmap_template.md
Time Estimate: 2-3 hours (vs 8-12 hours manual analysis)
Success Criteria:
- All major performance bottlenecks identified
- Algorithm complexity issues documented
- Database optimization opportunities listed
- Impact vs effort analysis complete
- Prioritized optimization roadmap created
- Quick wins identified for immediate implementation
Workflow 4: Technical Debt Remediation Strategy (3-5 hours)
Scenario: Creating comprehensive technical debt reduction plan for quarterly or annual planning
# 1. Calculate comprehensive technical debt score
python scripts/technical_debt_scorer.py /path/to/legacy-project --detailed --output debt-analysis.json
# Scoring dimensions:
# - Code quality debt (complexity, duplication, maintainability)
# - Architectural debt (coupling, anti-patterns, layer violations)
# - Security debt (vulnerabilities, outdated dependencies)
# - Performance debt (bottlenecks, inefficient algorithms)
# - Testing debt (coverage gaps, missing tests)
# - Documentation debt (outdated or missing docs)
# 2. Prioritize debt by impact and effort
python scripts/technical_debt_scorer.py /path/to/legacy-project --prioritize --min-score 70
# Output shows:
# - High-priority debt items (quick wins)
# - Long-term strategic debt (requires planning)
# - Risk level for each debt category
# - Estimated remediation effort
# - Dependencies between debt items
# 3. Analyze code quality specifics
python scripts/code_quality_analyzer.py /path/to/legacy-project --detailed
# Detailed quality metrics:
# - Files with highest complexity
# - Most duplicated code sections
# - Lowest maintainability scores
# - Missing test coverage areas
# - Best practice violations by category
# 4. Cross-reference with security and performance
# Combine outputs from:
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --min-severity medium
python scripts/performance_bottleneck_detector.py /path/to/legacy-project
# Identify overlapping issues:
# - Security vulnerabilities in complex code
# - Performance issues in high-debt modules
# - Architecture problems enabling vulnerabilities
# 5. Create phased remediation plan
# Reference: references/modernization_patterns.md
# Phase 1 (Sprint 1-2): Quick wins
# - Fix critical security issues
# - Address high-complexity functions
# - Eliminate duplicate code
# Phase 2 (Quarter 1): Medium-term improvements
# - Reduce architectural coupling
# - Improve test coverage
# - Optimize performance bottlenecks
# Phase 3 (Year 1): Strategic investments
# - Refactor architectural anti-patterns
# - Modernize framework/library versions
# - Implement missing design patterns
# 6. Generate technical debt report
# Use assets/technical_debt_report_template.md
# Include:
# - Executive summary with total debt score
# - Debt breakdown by category
# - Prioritized remediation roadmap
# - Cost-benefit analysis
# - Risk assessment for deferring remediation
# - Success metrics and tracking plan
# 7. Create tracking mechanism
mkdir -p reports/debt-tracking
echo "Baseline: $(jq '.overall_score' debt-analysis.json)" > reports/debt-tracking/baseline.txt
# Schedule quarterly re-assessment
# Track debt score trend over time
Time Estimate: 3-5 hours (vs 16-24 hours manual debt analysis)
Success Criteria:
- Complete technical debt inventory
- Multi-dimensional debt scoring complete
- Prioritized remediation roadmap created
- Cost-benefit analysis provided
- Phased implementation plan documented
- Tracking mechanism established
- Stakeholder report delivered
Workflow 5: Modernization Roadmap Creation (4-8 hours)
Scenario: Planning comprehensive legacy system modernization with multi-year timeline and stakeholder buy-in
# 1. Generate comprehensive modernization roadmap
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --detailed --timeline 24 --output roadmap-full.json
# Roadmap includes:
# - Current state assessment
# - Target state definition
# - Migration strategies by component
# - Phased implementation plan
# - Risk mitigation strategies
# - Resource requirements
# - Success metrics
# 2. Run all prerequisite analyses
python scripts/codebase_inventory.py /path/to/legacy-project --detailed
python scripts/architecture_health_analyzer.py /path/to/legacy-project --detailed
python scripts/technical_debt_scorer.py /path/to/legacy-project --detailed
python scripts/security_vulnerability_scanner.py /path/to/legacy-project
python scripts/performance_bottleneck_detector.py /path/to/legacy-project
# Consolidate findings:
# - Technology stack gaps (outdated frameworks)
# - Architectural modernization needs (monolith to microservices)
# - Security remediation requirements
# - Performance optimization opportunities
# - Infrastructure modernization needs
# 3. Define target architecture
# Reference: references/modernization_patterns.md
# Consider:
# - Strangler Fig pattern (incremental replacement)
# - Branch by Abstraction (parallel development)
# - Anti-Corruption Layer (protect new from legacy)
# - Big Bang replacement (full rewrite)
# - Hybrid approaches
# 4. Create phased migration strategy
# Phase 1 (Months 1-3): Foundation
# - Establish CI/CD pipeline
# - Implement comprehensive test suite
# - Document current architecture
# - Fix critical security issues
# - Set up monitoring and observability
# Phase 2 (Months 4-9): Incremental modernization
# - Extract bounded contexts
# - Implement API gateway
# - Migrate high-value modules first
# - Modernize data layer incrementally
# - Deploy side-by-side with legacy
# Phase 3 (Months 10-18): Major migration
# - Migrate core business logic
# - Implement event-driven architecture
# - Refactor data models
# - Migrate users gradually
# - Maintain legacy fallback
# Phase 4 (Months 19-24): Completion and optimization
# - Complete migration of remaining modules
# - Optimize new architecture
# - Decommission legacy systems
# - Knowledge transfer and documentation
# - Post-migration review
# 5. Assess risks and mitigation strategies
# Technical risks:
# - Data migration complexity
# - Integration challenges
# - Performance regressions
# - Security gaps during transition
# Business risks:
# - User disruption
# - Feature delivery slowdown
# - Budget overruns
# - Timeline extensions
# Mitigation strategies:
# - Comprehensive testing at each phase
# - Blue-green deployment
# - Feature flags for gradual rollout
# - Rollback procedures
# - Regular stakeholder updates
# 6. Estimate costs and resources
# Development effort:
# - Engineering team size and composition
# - External consultants/contractors
# - Training and onboarding
# - Tools and infrastructure
# Timeline considerations:
# - Parallel work vs sequential
# - Dependencies between phases
# - Buffer for unknowns (20-30%)
# - Stakeholder availability
# 7. Generate executive roadmap document
# Use assets/roadmap_template.md
# Include:
# - Executive summary (1-page)
# - Current state assessment
# - Target architecture vision
# - Phased implementation plan
# - Resource requirements
# - Cost-benefit analysis
# - Risk assessment and mitigation
# - Success metrics and KPIs
# - Timeline with milestones
# - Appendices with technical details
# 8. Create presentation for stakeholders
# Prepare:
# - Before/after architecture diagrams
# - Phased timeline visualization
# - Cost breakdown and ROI projections
# - Risk matrix
# - Success stories from similar migrations
# - Demo of modernization POC (if available)
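For step 6, the buffer arithmetic is simple to make explicit. A sketch with placeholder figures (team size, monthly cost, and duration are assumptions, not estimates for any real project):
# Placeholder phase-cost estimate with a contingency buffer.
team_size = 8                        # engineers assigned to the phase
monthly_cost_per_engineer = 15_000   # fully loaded cost assumption
phase_months = 6
buffer = 0.25                        # within the 20-30% range recommended above

base_cost = team_size * monthly_cost_per_engineer * phase_months
buffered_cost = base_cost * (1 + buffer)
print(f"Base: ${base_cost:,.0f}  With buffer: ${buffered_cost:,.0f}")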
Time Estimate: 4-8 hours (vs 3-5 days manual roadmap creation)
Success Criteria:
- Comprehensive current state assessment documented
- Target architecture clearly defined
- Phased migration strategy with 3+ phases
- Risk assessment and mitigation plans complete
- Resource and cost estimates provided
- Executive roadmap document delivered
- Stakeholder presentation prepared
- Success metrics and tracking plan defined
Workflow 6: Quarterly Technical Health Review (1-2 hours)
Scenario: Regular quarterly assessment to track technical debt trends and modernization progress
# 1. Run all assessment tools with baseline comparison
QUARTER="Q4-2025"
mkdir -p reports/quarterly/$QUARTER
python scripts/technical_debt_scorer.py /path/to/legacy-project --compare reports/baseline/debt-baseline.json --output reports/quarterly/$QUARTER/debt-score.json
python scripts/code_quality_analyzer.py /path/to/legacy-project --output reports/quarterly/$QUARTER/quality.json
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --output reports/quarterly/$QUARTER/security.json
python scripts/architecture_health_analyzer.py /path/to/legacy-project --output reports/quarterly/$QUARTER/architecture.json
# 2. Generate trend analysis
# Compare with previous quarters:
# - Technical debt score trend (improving or worsening?)
# - New vulnerabilities vs remediated
# - Architecture health trajectory
# - Code quality metrics over time
# 3. Assess modernization progress
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --progress-check --roadmap reports/roadmap-full.json
# Progress metrics:
# - Milestones completed vs planned
# - Modules migrated vs remaining
# - Timeline adherence
# - Budget vs actual spend
# 4. Identify emerging issues
# New technical debt introduced?
# Unexpected architectural drift?
# New security vulnerabilities?
# Performance regressions?
# 5. Create quarterly report
# Use assets/technical_debt_report_template.md
# Include:
# - Quarter-over-quarter comparison
# - Progress on modernization roadmap
# - Newly identified issues
# - Resolved issues
# - Updated priorities
# - Recommendations for next quarter
# 6. Update roadmap if needed
# Adjust timeline based on progress
# Re-prioritize based on new findings
# Update resource estimates
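The trend analysis in step 2 can be automated over the reports/quarterly/ layout created in step 1. A minimal sketch, assuming each quarter's debt-score.json follows the technical_debt_scorer output format shown later in this document:
import json
from pathlib import Path

scores = {}
# Assumes quarterly directory names sort chronologically.
for report in sorted(Path("reports/quarterly").glob("*/debt-score.json")):
    data = json.loads(report.read_text())
    scores[report.parent.name] = data["summary"]["overall_score"]

quarters = list(scores)
for prev, curr in zip(quarters, quarters[1:]):
    delta = scores[curr] - scores[prev]
    print(f"{prev} -> {curr}: {scores[prev]} -> {scores[curr]} ({delta:+})")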
Time Estimate: 1-2 hours (vs 4-6 hours manual review)
Success Criteria:
- All metrics updated with current state
- Trend analysis showing progress or regressions
- Modernization progress tracked against plan
- Quarterly report delivered to leadership
- Roadmap adjusted based on actuals
- Next quarter priorities established
Python Tools
1. Codebase Inventory
Comprehensive automated discovery and cataloging of legacy codebase composition.
Key Features:
- Language detection and LOC (lines of code) counting
- Framework and library identification
- Dependency analysis with version detection
- Technology stack mapping
- Entry point discovery
- File structure analysis
- Dead code detection
- Configuration file discovery
- Build system identification
- Multi-format output (JSON, CSV, HTML)
Common Usage:
# Basic inventory scan
python scripts/codebase_inventory.py /path/to/legacy-project
# Detailed inventory with dependencies
python scripts/codebase_inventory.py /path/to/legacy-project --detailed --dependencies
# Focus on specific languages
python scripts/codebase_inventory.py /path/to/legacy-project --languages python,javascript
# JSON output for automation
python scripts/codebase_inventory.py /path/to/legacy-project --format json --output inventory.json
# Include file-level details
python scripts/codebase_inventory.py /path/to/legacy-project --file-details
Use Cases:
- Initial assessment of inherited codebases
- Technology stack documentation
- Dependency upgrade planning
- Licensing compliance audits
- LOC metrics for project estimation
- Dead code identification for cleanup
Output Example:
{
"summary": {
"total_files": 1247,
"total_loc": 145830,
"primary_languages": ["Python", "JavaScript", "SQL"],
"frameworks": ["Django 2.2", "React 16.8", "PostgreSQL 11"],
"outdated_dependencies": 18
},
"languages": {
"Python": {"files": 342, "loc": 67430, "percentage": 46.2},
"JavaScript": {"files": 521, "loc": 58920, "percentage": 40.4},
"SQL": {"files": 89, "loc": 12480, "percentage": 8.6}
},
"dependencies": {
"outdated": [
{"name": "Django", "current": "2.2.0", "latest": "4.2.0", "risk": "high"},
{"name": "React", "current": "16.8.0", "latest": "18.2.0", "risk": "medium"}
]
}
}
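Conceptually, the inventory is a recursive walk with per-extension counters. A simplified sketch of that idea (the extension map and exclusion list are illustrative, not the tool's actual configuration):
import json
from collections import defaultdict
from pathlib import Path

EXT_TO_LANG = {".py": "Python", ".js": "JavaScript", ".sql": "SQL"}   # illustrative subset
EXCLUDE_DIRS = {"node_modules", "venv", ".git"}

def count_loc(root):
    stats = defaultdict(lambda: {"files": 0, "loc": 0})
    for path in Path(root).rglob("*"):
        if not path.is_file() or set(path.parts) & EXCLUDE_DIRS:
            continue
        lang = EXT_TO_LANG.get(path.suffix)
        if lang is None:
            continue
        # Count non-blank lines as LOC.
        loc = sum(1 for line in path.read_text(errors="ignore").splitlines() if line.strip())
        stats[lang]["files"] += 1
        stats[lang]["loc"] += loc
    return dict(stats)

print(json.dumps(count_loc("/path/to/legacy-project"), indent=2))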
2. Security Vulnerability Scanner
Pattern-based security vulnerability detection with OWASP coverage.
Key Features:
- SQL injection detection (parameterization checks)
- XSS vulnerability identification
- CSRF token validation
- Hardcoded credential detection (passwords, API keys)
- Insecure cryptography identification
- Path traversal vulnerability detection
- Authentication/authorization flaw patterns
- Dependency vulnerability scanning (CVE mapping)
- OWASP Top 10 compliance checking
- Risk-based severity scoring
- Remediation guidance
Common Usage:
# Basic security scan
python scripts/security_vulnerability_scanner.py /path/to/legacy-project
# Filter by severity
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --min-severity high
# OWASP compliance report
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --owasp-mapping --output owasp.html
# Scan specific file types
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --file-types .py,.js
# Generate detailed report
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --detailed --output security-report.json
Use Cases:
- Pre-deployment security audits
- Acquisition due diligence
- Compliance assessments (SOC 2, PCI DSS)
- Vulnerability prioritization
- Security debt quantification
- Penetration test preparation
Output Example:
{
"summary": {
"total_vulnerabilities": 47,
"critical": 3,
"high": 12,
"medium": 23,
"low": 9,
"owasp_coverage": {
"A01:2021-Broken Access Control": 5,
"A02:2021-Cryptographic Failures": 3,
"A03:2021-Injection": 8
}
},
"vulnerabilities": [
{
"id": "VUL-001",
"type": "SQL Injection",
"severity": "critical",
"file": "app/models/user.py",
"line": 145,
"description": "Unsanitized user input in SQL query",
"code_snippet": "query = f\"SELECT * FROM users WHERE id = {user_id}\"",
"remediation": "Use parameterized queries: cursor.execute('SELECT * FROM users WHERE id = %s', (user_id,))",
"owasp": "A03:2021-Injection",
"cwe": "CWE-89"
}
]
}
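The pattern-based approach boils down to running a rule set of regexes over source lines and recording file and line hits. A deliberately simplified sketch (these two patterns are examples only, not the scanner's real rules, and they will produce false positives):
import re
from pathlib import Path

PATTERNS = {
    "SQL built with f-string": re.compile(r"""f["'].*\b(select|insert|update|delete)\b.*\{""", re.IGNORECASE),
    "Hardcoded credential": re.compile(r"""(password|api_key|secret)\s*=\s*["'][^"']+["']""", re.IGNORECASE),
}

def scan_file(path):
    findings = []
    for lineno, line in enumerate(Path(path).read_text(errors="ignore").splitlines(), start=1):
        for vuln_type, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"type": vuln_type, "file": str(path), "line": lineno, "code": line.strip()})
    return findings

for finding in scan_file("app/models/user.py"):   # file name taken from the example above
    print(finding)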
3. Performance Bottleneck Detector
Algorithm complexity analysis and performance optimization identification.
Key Features:
- Algorithm complexity detection (O(n²) or worse)
- Database query efficiency analysis (N+1 problems)
- Memory leak pattern detection
- Inefficient loop identification
- Missing index opportunities
- Synchronous blocking operation detection
- Large object allocation analysis
- Cache optimization opportunities
- Resource leak detection
- Scalability bottleneck identification
Common Usage:
# Detect performance bottlenecks
python scripts/performance_bottleneck_detector.py /path/to/legacy-project
# Focus on database issues
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --focus database
# Algorithmic complexity analysis
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --focus algorithms --verbose
# Specific file types
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --file-types .py,.js
# Detailed performance report
python scripts/performance_bottleneck_detector.py /path/to/legacy-project --detailed --output perf-report.json
Use Cases:
- Performance troubleshooting
- Scalability planning
- Pre-launch performance audits
- Optimization prioritization
- Database query optimization
- Algorithm refactoring planning
Output Example:
{
"summary": {
"total_bottlenecks": 34,
"critical": 5,
"high": 14,
"medium": 15,
"categories": {
"algorithm": 12,
"database": 15,
"memory": 4,
"io": 3
}
},
"bottlenecks": [
{
"id": "PERF-001",
"type": "Nested Loop (O(n²))",
"severity": "critical",
"file": "app/services/report_generator.py",
"line": 78,
"description": "Nested loops causing quadratic complexity",
"code_snippet": "for user in users:\n for order in orders:\n if order.user_id == user.id:",
"impact": "Processing time increases exponentially with data size",
"recommendation": "Use dictionary lookup: orders_by_user = {o.user_id: o for o in orders}",
"estimated_improvement": "95% reduction in processing time"
}
]
}
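Detection of the nested-loop case above can be done with a small ast visitor that tracks loop depth. A sketch of the heuristic (depth of two or more flags an O(n²)-or-worse candidate; the real tool applies more context than this):
import ast
import sys

class NestedLoopFinder(ast.NodeVisitor):
    def __init__(self):
        self.depth = 0
        self.findings = []

    def _visit_loop(self, node):
        self.depth += 1
        if self.depth >= 2:
            self.findings.append((node.lineno, f"loop nested at depth {self.depth}"))
        self.generic_visit(node)
        self.depth -= 1

    visit_For = _visit_loop
    visit_While = _visit_loop

source_path = sys.argv[1]
finder = NestedLoopFinder()
finder.visit(ast.parse(open(source_path).read()))
for lineno, message in finder.findings:
    print(f"{source_path}:{lineno}: {message}")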
4. Code Quality Analyzer
Comprehensive code quality metrics and maintainability assessment.
Key Features:
- Cyclomatic complexity calculation
- Code duplication detection
- Maintainability index scoring
- Comment density analysis
- Function/method length analysis
- Test coverage estimation
- Naming convention validation
- Best practice compliance checking
- Code smell detection
- Technical debt estimation per file
Common Usage:
# Analyze code quality
python scripts/code_quality_analyzer.py /path/to/legacy-project
# Detailed quality metrics
python scripts/code_quality_analyzer.py /path/to/legacy-project --detailed --verbose
# Focus on high-complexity files
python scripts/code_quality_analyzer.py /path/to/legacy-project --min-complexity 15
# Security-focused analysis
python scripts/code_quality_analyzer.py /path/to/legacy-project --security-focused
# Generate quality report
python scripts/code_quality_analyzer.py /path/to/legacy-project --output quality-report.json
Use Cases:
- Code review automation
- Refactoring prioritization
- Technical debt quantification
- Maintainability assessment
- Onboarding complexity estimation
- Quality trend tracking
Output Example:
{
"summary": {
"overall_score": 62,
"maintainability": "medium",
"complexity": {
"average": 8.4,
"high_complexity_files": 23
},
"duplication": {
"percentage": 12.3,
"duplicate_lines": 17890
},
"test_coverage": {
"estimated": "45%",
"files_without_tests": 145
}
},
"files": [
{
"path": "app/services/order_processor.py",
"complexity": 28,
"maintainability_index": 42,
"loc": 450,
"comment_ratio": 0.08,
"issues": [
"Cyclomatic complexity too high (threshold: 15)",
"Function too long: process_order (180 lines)",
"Low comment density (8%)"
],
"recommendations": [
"Split process_order into smaller functions",
"Reduce conditional nesting",
"Add docstrings and inline comments"
]
}
]
}
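Cyclomatic complexity can be approximated as one plus the number of branch points in a function. A simplified per-function calculation using the standard library ast module (the node set counted here is an approximation, not the exact formula of any particular linter):
import ast

BRANCH_NODES = (ast.If, ast.IfExp, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp)

def cyclomatic_complexity(func_node):
    # 1 + number of branch points inside the function (simplified).
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(func_node))

tree = ast.parse(open("app/services/order_processor.py").read())   # file from the example above
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        score = cyclomatic_complexity(node)
        if score >= 15:   # threshold used in the report above
            print(f"{node.name}: complexity {score}")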
5. Architecture Health Analyzer
Dependency graph analysis and architectural quality assessment.
Key Features:
- Dependency graph generation
- Circular dependency detection
- Coupling and cohesion metrics
- Layer violation identification
- Architectural anti-pattern detection
- Module organization assessment
- Design principle violations (SOLID)
- Component boundary analysis
- Architectural drift measurement
- Modularization recommendations
Common Usage:
# Analyze architecture health
python scripts/architecture_health_analyzer.py /path/to/legacy-project
# Detailed dependency analysis
python scripts/architecture_health_analyzer.py /path/to/legacy-project --detailed --dependencies
# Focus on specific modules
python scripts/architecture_health_analyzer.py /path/to/legacy-project --modules app,services,models
# Generate architecture diagram
python scripts/architecture_health_analyzer.py /path/to/legacy-project --diagram --output arch-diagram.svg
# Performance-focused analysis
python scripts/architecture_health_analyzer.py /path/to/legacy-project --performance-focus
Use Cases:
- Architectural refactoring planning
- Microservices extraction planning
- Circular dependency resolution
- Module boundary definition
- Anti-pattern remediation
- Architecture documentation
Output Example:
{
"summary": {
"health_score": 58,
"circular_dependencies": 7,
"layer_violations": 12,
"anti_patterns": ["God Object", "Spaghetti Code", "Tight Coupling"],
"coupling_score": "high",
"cohesion_score": "low"
},
"modules": [
{
"name": "app.services.order_service",
"dependencies_in": 18,
"dependencies_out": 24,
"coupling": "very_high",
"issues": [
"God Object anti-pattern detected",
"Depends on 24 other modules",
"Violates Single Responsibility Principle"
],
"recommendations": [
"Split into domain-specific services",
"Introduce dependency inversion",
"Define clear interfaces"
]
}
],
"circular_dependencies": [
{
"cycle": ["app.models.user", "app.services.auth", "app.models.session", "app.models.user"],
"severity": "high",
"impact": "Difficult to test and maintain"
}
]
}
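Circular dependency detection is a depth-first search for back edges over the import graph. A minimal sketch using the cycle from the example above (the graph here is hand-written; the analyzer builds it from import statements):
import_graph = {
    "app.models.user": ["app.services.auth"],
    "app.services.auth": ["app.models.session"],
    "app.models.session": ["app.models.user"],
}

def find_cycles(graph):
    cycles, visiting, visited = [], set(), set()

    def dfs(node, path):
        visiting.add(node)
        for dep in graph.get(node, []):
            if dep in visiting:                          # back edge -> cycle found
                cycles.append(path[path.index(dep):] + [dep])
            elif dep not in visited:
                dfs(dep, path + [dep])
        visiting.discard(node)
        visited.add(node)

    for node in graph:
        if node not in visited:
            dfs(node, [node])
    return cycles

print(find_cycles(import_graph))
# [['app.models.user', 'app.services.auth', 'app.models.session', 'app.models.user']]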
6. Technical Debt Scorer
Multi-dimensional technical debt quantification and prioritization.
Key Features:
- Overall technical debt scoring (0-100)
- Debt categorization (code quality, architecture, security, performance)
- Remediation effort estimation (person-days)
- Priority ranking (quick wins vs strategic investments)
- Trend analysis (debt over time)
- Cost-benefit analysis
- Risk assessment for deferring remediation
- Dependency identification between debt items
- ROI calculation for remediation
Common Usage:
# Calculate technical debt score
python scripts/technical_debt_scorer.py /path/to/legacy-project
# Detailed debt analysis
python scripts/technical_debt_scorer.py /path/to/legacy-project --detailed --output debt-analysis.json
# Prioritize by impact
python scripts/technical_debt_scorer.py /path/to/legacy-project --prioritize --min-score 70
# Compare with baseline
python scripts/technical_debt_scorer.py /path/to/legacy-project --compare reports/baseline/debt-baseline.json
# Generate remediation plan
python scripts/technical_debt_scorer.py /path/to/legacy-project --remediation-plan --output remediation.md
Use Cases:
- Quarterly technical debt reviews
- Budget justification for refactoring
- Prioritization of remediation efforts
- Progress tracking over time
- Cost estimation for debt reduction
- Risk communication to stakeholders
Output Example:
{
"summary": {
"overall_score": 68,
"rating": "medium_debt",
"total_remediation_days": 127,
"categories": {
"code_quality": {"score": 62, "debt_days": 35},
"architecture": {"score": 55, "debt_days": 48},
"security": {"score": 71, "debt_days": 22},
"performance": {"score": 78, "debt_days": 12},
"testing": {"score": 45, "debt_days": 10}
}
},
"prioritized_items": [
{
"id": "DEBT-001",
"category": "architecture",
"title": "Resolve circular dependencies in core modules",
"impact": "high",
"effort": "medium",
"priority": "quick_win",
"remediation_days": 8,
"risk_if_deferred": "Increasingly difficult to add features without breaking changes",
"roi": "high"
}
],
"trend": {
"current_score": 68,
"previous_score": 65,
"change": -3,
"direction": "worsening"
}
}
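The roll-up from category scores to a single figure is a weighted average. A sketch using the category scores from the example above (the weights are an assumption; the tool's --weights flag, shown in Troubleshooting, lets you override whatever defaults it uses):
category_scores = {"code_quality": 62, "architecture": 55, "security": 71, "performance": 78, "testing": 45}
weights = {"code_quality": 0.25, "architecture": 0.25, "security": 0.20, "performance": 0.15, "testing": 0.15}

# Weighted average; weights sum to 1.0 so the result stays on the 0-100 scale.
overall = sum(category_scores[category] * weights[category] for category in category_scores)
print(round(overall, 1))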
7. Modernization Roadmap Generator
Strategic migration planning with phased approaches and ROI analysis.
Key Features:
- Current state to target state gap analysis
- Migration strategy recommendations (Strangler Fig, Big Bang, etc.)
- Phased implementation planning
- Resource requirement estimation
- Timeline generation with milestones
- Risk identification and mitigation strategies
- Cost-benefit analysis
- Success metrics definition
- Dependency sequencing
- Executive summary generation
Common Usage:
# Generate modernization roadmap
python scripts/modernization_roadmap_generator.py /path/to/legacy-project
# Roadmap with custom timeline (24 months)
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --timeline 24
# Executive summary format
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --executive-summary --output roadmap.md
# Include cost analysis
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --cost-analysis --team-size 8
# Progress check against existing roadmap
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --progress-check --roadmap roadmap-full.json
Use Cases:
- Legacy system modernization planning
- Stakeholder communication
- Budget and resource planning
- Timeline estimation
- Risk assessment
- Progress tracking
- Multi-year strategic planning
Output Example:
# Legacy System Modernization Roadmap
## Executive Summary
**Current State:** Monolithic Django 2.2 application with 145K LOC
**Target State:** Microservices architecture with modern Python 3.11+ and containerization
**Timeline:** 18 months
**Estimated Cost:** $1.2M (8-person team)
**Expected ROI:** 3x over 3 years
## Phase 1: Foundation (Months 1-3)
**Objectives:**
- Establish modern CI/CD pipeline
- Implement comprehensive test suite
- Containerize existing application
- Set up monitoring and observability
**Deliverables:**
- CI/CD pipeline (GitHub Actions)
- Test coverage increased to 70%
- Docker containerization complete
- APM and logging infrastructure
**Resources:** 2 DevOps, 3 Backend Engineers
**Cost:** $150K
## Phase 2: API Gateway and Bounded Contexts (Months 4-9)
**Objectives:**
- Implement API Gateway
- Extract first bounded contexts
- Migrate authentication service
- Deploy side-by-side with monolith
**Deliverables:**
- API Gateway operational
- Auth microservice extracted
- User management microservice
- 20% of traffic on new services
**Resources:** 1 Architect, 4 Backend Engineers, 1 DevOps
**Cost:** $450K
## Risk Assessment
**High Risks:**
- Data migration complexity
- Integration challenges between old and new
- User disruption during cutover
**Mitigation:**
- Comprehensive testing at each phase
- Blue-green deployment strategy
- Feature flags for gradual rollout
- Rollback procedures documented
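The phasing itself can be thought of as proportional allocation of the requested timeline. A sketch that reproduces the four-phase split used in Workflow 5 (the proportions are an assumption, not the generator's actual algorithm):
PHASE_SHARES = [
    ("Foundation", 0.125),
    ("Incremental modernization", 0.25),
    ("Major migration", 0.375),
    ("Completion and optimization", 0.25),
]

def build_phases(total_months):
    phases, start = [], 1
    for name, share in PHASE_SHARES:
        duration = max(1, round(total_months * share))
        end = min(start + duration - 1, total_months)
        phases.append((name, f"Months {start}-{end}"))
        start = end + 1
    return phases

for name, window in build_phases(24):   # --timeline 24
    print(f"{name}: {window}")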
Reference Documentation
Detailed guides available in the references/ directory:
Analysis Framework
analysis_framework.md - Comprehensive analysis methodology including:
- Multi-phase assessment process
- Analysis dimensions (technical, business, operational)
- Scoring methodologies and metrics
- Data collection techniques
- Stakeholder interview guidelines
- Deliverable templates and formats
- Quality assurance for analysis outputs
- Common pitfalls and how to avoid them
Modernization Patterns
modernization_patterns.md - Proven migration strategies covering:
- Strangler Fig pattern (incremental replacement)
- Branch by Abstraction (parallel development)
- Anti-Corruption Layer (protect new from legacy)
- Big Bang replacement (full rewrite)
- Hybrid approaches
- Technology-specific patterns (monolith to microservices, framework upgrades)
- Data migration strategies
- Rollback and safety mechanisms
- Case studies and lessons learned
Deliverable Templates
deliverable_templates.md - Stakeholder-ready templates including:
- Executive summary template
- Technical debt assessment report
- Security vulnerability report
- Performance analysis report
- Architecture health assessment
- Modernization roadmap document
- Cost-benefit analysis template
- Risk register and mitigation plans
- Progress tracking dashboards
- Presentation deck templates
Best Practices Summary
Analysis Best Practices
Comprehensive Coverage:
- Run all 7 tools for complete picture
- Cross-reference findings between tools
- Validate automated findings manually
- Document assumptions and limitations
- Include both quantitative and qualitative analysis
Stakeholder Communication:
- Tailor reports to audience (technical vs executive)
- Use visualizations for complex data
- Provide actionable recommendations
- Include cost-benefit analysis
- Set clear success criteria
Continuous Assessment:
- Establish baseline metrics
- Schedule regular reassessments (quarterly)
- Track trends over time
- Adjust priorities based on progress
- Celebrate improvements
Technical Debt Management
Prioritization:
- Quick wins first (high impact, low effort)
- Address critical security issues immediately
- Balance short-term fixes with long-term strategy
- Consider dependencies between debt items
- Align with business priorities
Remediation:
- Fix root causes, not just symptoms
- Refactor incrementally, not big bang
- Maintain backward compatibility during migration
- Test thoroughly at each step
- Document architectural decisions
Modernization Strategy
Planning:
- Start with comprehensive assessment
- Define clear target architecture
- Plan in phases with clear milestones
- Identify and mitigate risks early
- Secure stakeholder buy-in upfront
Execution:
- Implement CI/CD before major changes
- Increase test coverage progressively
- Deploy incrementally with rollback plans
- Monitor closely during migration
- Communicate progress regularly
Risk Mitigation:
- Maintain legacy system in parallel initially
- Use feature flags for gradual rollout
- Implement blue-green deployment
- Document rollback procedures
- Plan for the unexpected (20-30% buffer)
Integration Points
Integration with cs-legacy-codebase-analyzer Agent
This skill is orchestrated by the cs-legacy-codebase-analyzer agent for guided analysis workflows:
# The agent guides you through:
# 1. Initial assessment planning
# 2. Running appropriate analysis tools
# 3. Interpreting results
# 4. Creating stakeholder deliverables
# 5. Developing remediation strategies
Integration with Code Review Workflow
Works with code-reviewer skill for ongoing quality maintenance:
# 1. Legacy analysis identifies problem areas
python scripts/code_quality_analyzer.py /path/to/legacy-project --min-complexity 15
# 2. Code reviewer focuses on those areas
python skills/engineering-team/code-reviewer/scripts/pr_analyzer.py 123 --focus high-complexity
# 3. Track improvement over time
python scripts/technical_debt_scorer.py /path/to/legacy-project --compare baseline.json
Integration with Security Workflow
Complements security audits from secops skill:
# 1. Automated vulnerability scanning
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --detailed
# 2. Deep security analysis (manual)
# Use cs-secops-engineer agent for comprehensive audit
# 3. Track remediation progress
python scripts/security_vulnerability_scanner.py /path/to/legacy-project --compare baseline-security.json
Integration with Architecture Planning
Feeds into architectural decisions via architect skill:
# 1. Analyze current architecture health
python scripts/architecture_health_analyzer.py /path/to/legacy-project --detailed
# 2. Design target architecture
# Use cs-architect agent for target state design
# 3. Generate migration roadmap
python scripts/modernization_roadmap_generator.py /path/to/legacy-project --timeline 18
Troubleshooting
Common Issues
Issue: Tool runs slowly on large codebases
# Solution: Use file type filtering
python scripts/codebase_inventory.py /path/to/project --file-types .py,.js --exclude node_modules,venv
# Or exclude specific directories
python scripts/code_quality_analyzer.py /path/to/project --exclude tests,migrations,fixtures
Issue: Too many false positives in security scan
# Solution: Adjust severity threshold and use suppressions
python scripts/security_vulnerability_scanner.py /path/to/project --min-severity high --suppress-file .security-suppressions.json
Issue: Performance detector missing known bottlenecks
# Solution: Use verbose mode and manual review
python scripts/performance_bottleneck_detector.py /path/to/project --verbose --detailed
# Combine with profiling data from actual runtime
# Reference: references/analysis_framework.md section on profiling integration
Issue: Technical debt score doesn't match intuition
# Solution: Review individual component scores
python scripts/technical_debt_scorer.py /path/to/project --detailed --breakdown
# Adjust category weights if needed
python scripts/technical_debt_scorer.py /path/to/project --weights architecture:0.4,security:0.3,quality:0.2,performance:0.1
Issue: Roadmap generation fails on complex projects
# Solution: Run prerequisite analyses first
python scripts/codebase_inventory.py /path/to/project --output inventory.json
python scripts/technical_debt_scorer.py /path/to/project --output debt.json
python scripts/architecture_health_analyzer.py /path/to/project --output arch.json
# Then generate roadmap with inputs
python scripts/modernization_roadmap_generator.py /path/to/project --inventory inventory.json --debt debt.json --architecture arch.json
Getting Help
- Tool-specific issues: Run any script with the --help flag for detailed options
- Analysis methodology: Review analysis_framework.md
- Modernization strategy: Consult modernization_patterns.md
- Report templates: See deliverable_templates.md
- Integration questions: Refer to cs-legacy-codebase-analyzer agent documentation
Additional Resources
- Analysis Framework: references/analysis_framework.md
- Modernization Patterns: references/modernization_patterns.md
- Deliverable Templates: references/deliverable_templates.md
- Python Tools: scripts/ directory (all tools support --help)
- Agent Documentation: See the cs-legacy-codebase-analyzer agent
- Related Skills: senior-architect, code-reviewer, senior-secops, cto-advisor
Version: 1.0.0 | Last Updated: 2025-12-13 | Tool Count: 7 Python analysis tools | Documentation: Progressive disclosure with references/ | Integration: Orchestrated by the cs-legacy-codebase-analyzer agent