| name | schema-design |
| description | Universal database schema design principles: normalization strategy, data types, primary keys, constraints, anti-patterns, and red flags. Apply when designing schemas, reviewing database architecture, or refactoring data models. |
Database Schema Design Principles
"Schema design debt compounds faster than code debt. Fix it now or pay 10x later."
Foundation Principles
Primary Keys
- Every table MUST have a primary key
- Prefer surrogate keys (auto-increment, UUIDv7) over composite natural keys
- UUIDv7 for distributed systems, auto-increment BIGINT for monoliths
Foreign Keys
- Use FK constraints unless specific reason not to (high-volume logging, sharded DBs)
- ON DELETE: RESTRICT (safest), CASCADE (use sparingly), SET NULL (breaks audit)
Data Types
- Choose smallest sufficient type (BIGINT vs INT = 4 bytes * rows)
- Money: DECIMAL (never FLOAT/DOUBLE)
- Dates without time: DATE not DATETIME
- Small sets: ENUM not VARCHAR
Constraints
- NOT NULL on required columns
- UNIQUE on natural keys
- CHECK for business rules
- DEFAULT where appropriate
Quality Checklist
Structural Integrity
- Every table has primary key
- Foreign key constraints defined
- Appropriate data types (smallest sufficient)
- NOT NULL, UNIQUE, CHECK constraints
Anti-Pattern Scan
- No EAV (entity-attribute-value) patterns
- No god tables (> 50 columns)
- No multi-valued fields (CSV in columns)
- No DATETIME for date-only data
Performance
- Indexes match query patterns
- Foreign keys indexed
- Composite index column order optimized
Decision Trees
"Should I denormalize this?"
Have evidence of query performance problem?
├─ NO → DON'T denormalize (premature optimization)
└─ YES → Tried indexes, query optimization, caching?
├─ NO → Try those first
└─ YES → Read-heavy (> 100:1)?
├─ NO → Normalize, optimize queries
└─ YES → Denormalize specific fields
"UUID or auto-increment?"
Distributed system (multiple write nodes)?
├─ YES → UUIDv7 (time-ordered, better than v4)
└─ NO → Exposed to users (issue-123)?
├─ YES → Auto-increment (better UX)
└─ NO → Auto-increment (better performance)
"Soft or hard delete?"
GDPR "right to erasure" applies?
├─ YES → Hard delete or audit table
└─ NO → Need audit trail?
├─ YES → Audit table pattern (recommended)
└─ NO → High deletion rate (> 20%)?
├─ YES → Hard delete
└─ NO → Soft delete acceptable
References
Detailed patterns and examples:
references/anti-patterns.md— EAV, god tables, multi-valued fields, red flagsreferences/normalization.md— 1NF/2NF/3NF, when to denormalize, OLTP vs OLAPreferences/advanced-patterns.md— Soft delete, temporal data, JSON columnsreferences/naming-conventions.md— Tables, columns, indexes, constraintsreferences/performance-patterns.md— Indexing strategy, partitioning, data types
Remember
"The best schema is one you can understand in 6 months and modify with confidence."
Design schemas that:
- Enforce integrity — Constraints, foreign keys, data types
- Optimize for common patterns — Indexes, denormalization where proven
- Enable evolution — Proper normalization, migration strategy
- Prevent known anti-patterns — No EAV, god tables, multi-valued fields