| name | code-review |
| description | Use when reviewing code for production readiness. Identifies critical issues causing production failures: data loss, security breaches, race conditions, performance degradation. Focuses on measurable impact, not style preferences. Triggers: reviewing code changes, verifying code safety/correctness, pre-commit reviews, after implementing significant changes. |
Code Review Practices
Core Principles
- Measurable Impact: Focus on issues causing actual failures: data loss, security breaches, race conditions, performance degradation. Ignore theoretical problems without real impact.
- Production Perspective: Consider how code behaves under production load and error conditions.
- Actionable Feedback: Provide specific locations and clear reasoning for issues found.
- Balanced Scrutiny: Find critical issues without being pedantic about minor concerns.
Project Context
Check project documentation for: quality standards/conventions, error handling patterns, performance requirements, architecture decisions, domain concepts/terminology
Critical Issue Categories
Production Failures (highest priority)
Data Loss Risks:
- Missing error handling that drops messages or data
- Incorrect acknowledgment before successful write
- Race conditions in concurrent writes
- Data corruption scenarios
Security Vulnerabilities:
- Credentials in code or logs
- Unvalidated external input (SQL injection, command injection, path traversal)
- Missing authentication or authorization checks
- Sensitive data exposure
Note: Only suggest high-performance validation checks. Avoid expensive validation in hot paths without justification.
Performance Killers:
- Unbounded memory growth
- Missing backpressure handling
- Blocking operations in hot paths
- O(n²) algorithms where O(n) is straightforward
- Unnecessary allocations in tight loops
Concurrency Bugs:
- Shared mutable state without synchronization
- Thread or task leaks
- Deadlock conditions
- Race conditions
Degraded Operation (medium priority)
Correctness Issues:
- Logic errors affecting business requirements
- Incomplete error propagation
- Missing error recovery mechanisms
- State management issues
Resource Management:
- Resource leaks (connections, file handles, memory)
- Missing cleanup on error paths
- Background tasks that cannot be terminated
Maintainability Issues:
- Unnecessary complexity that obscures intent
- Code duplication that should be unified
- New functions duplicating existing patterns without justification
- Mixing abstraction levels inappropriately
Error Boundaries:
- Implementation details leaking through error types across abstraction boundaries
- Database errors exposed as API errors (should be transformed to domain errors)
- File system errors bubbling up unchanged (should be contextualized)
Domain Alignment:
- Using generic technical terms instead of domain concepts
- Code structure not reflecting domain boundaries
- Missing domain terminology in critical areas
Comments:
- Incorrect or outdated comments
- Obvious comments that restate what code does
- Missing comments for non-obvious decisions
Low Priority or Non-Issues
Avoid flagging:
- Style preferences without functional impact
- Theoretical edge cases with no realistic risk
- Minor optimizations with negligible benefit
- Alternative implementations that are equivalent
- Cosmetic concerns
Review Checks
Verify Error Handling: Check errors handled appropriately at each level. Errors should be transformed when crossing abstraction boundaries, not just propagated unchanged.
Example issue:
# Problem: Database error leaks to API layer
def get_user(user_id):
return db.query("SELECT * FROM users WHERE id = ?", user_id)
# If query fails, DatabaseError bubbles up to API
# Better: Transform to domain error
def get_user(user_id):
try:
return db.query("SELECT * FROM users WHERE id = ?", user_id)
except DatabaseError as e:
raise UserNotFoundError(f"User {user_id} not found") from e
Check Concurrency Safety: Verify shared mutable state is properly synchronized. Look for race conditions in concurrent operations.
Example issue:
# Problem: Unsynchronized shared state
class Worker:
count = 0 # Shared mutable state
def process(self):
self.count += 1 # Race condition
Validate Resource Management: All resources should be properly closed or released. Cleanup must happen even on error paths. Background tasks should be terminable.
Check Domain Alignment: Code should use appropriate domain terminology. Does it reflect system's purpose and concepts? Are abstractions at right boundaries?
Review Complexity: Is complexity justified by requirements? Can code be simplified without losing functionality? Does control flow jump around unnecessarily? Are functions extracted appropriately (not too shallow)?
Verify Architecture Alignment: If architecture documentation or decisions exist, verify code aligns with them. Check for consistency with established patterns.
Consider Boy Scout Opportunities: Can nearby code be improved opportunistically alongside main change? Or does it require separate effort that should be deferred?
Guiding Principles
Prefer real issues over exhaustive coverage:
- Miss minor issues rather than flag non-issues
- Focus on actual production failures
- Consider cost-benefit of changes
Avoid absolutism:
- Not every error needs handling (sometimes failing fast is correct)
- Not every optimization is worth the complexity
- Not every duplication needs elimination
Context matters:
- Hot paths have different standards than setup code
- Prototype code has different standards than production code
- Core infrastructure has different standards than application code