| name | database-migration-safe |
| description | Use when creating database migrations. Prevents data loss, downtime, and performance issues. Supports PostgreSQL, MySQL, SQLite. Python 3.8+ |
| author | Claude Code Learning Flywheel Team |
| allowed-tools | Read, Write, Edit, Bash, Grep, Glob |
| version | 1.0.0 |
| last_verified | 2026-01-01 |
| tags | database, migrations, safety, operations |
| related-skills |
Skill: Safe Database Migrations
Purpose
Database changes are high-risk operations. This skill acts as a safety harness to prevent data loss, downtime, and performance issues during schema migrations.
1. Negative Knowledge (Critical Blockers)
🛑 STOP: Do not proceed if your plan involves any of these patterns.
| Dangerous Operation | Why It Fails | Safe Alternative |
|---|---|---|
| Renaming columns | Causes downtime, breaks running app | Add new column → backfill → switch code → drop old |
| Renaming tables | Breaks all running queries | Create view → migrate code → rename later |
| Adding NOT NULL without default | Fails on large tables | Add as nullable → backfill → add constraint |
| Default values on large tables | Locks entire table | Add default in application layer first |
| Dropping columns immediately | Breaks running app instances | Deprecate → remove from code → wait → drop |
| Changing column types | Can lose data, slow migration | Add new column → migrate data → drop old |
| Adding indexes on large tables | Locks table for minutes/hours | Use CONCURRENTLY (PostgreSQL) or equivalent |
| Foreign key constraints without index | Slow queries, lock contention | Create index first, then constraint |
2. Verified Migration Patterns
Pattern 1: Renaming a Column (Safe)
❌ Dangerous Approach:
ALTER TABLE users RENAME COLUMN name TO full_name;
Problem: All running app instances crash immediately.
✅ Safe Approach (Multi-Step):
Step 1: Add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
Step 2: Backfill data
UPDATE users SET full_name = name WHERE full_name IS NULL;
-- For large tables, batch this:
-- UPDATE users SET full_name = name WHERE id >= X AND id < Y AND full_name IS NULL;
Step 3: Update application code
// Old code: user.name
// New code: user.full_name || user.name (supports both)
Deploy this version.
Step 4: Make new column NOT NULL (after backfill complete)
ALTER TABLE users ALTER COLUMN full_name SET NOT NULL;
Step 5: Update code to only use new column
// New code: user.full_name
Deploy this version.
Step 6: Drop old column (in separate migration, days/weeks later)
ALTER TABLE users DROP COLUMN name;
Pattern 2: Adding NOT NULL Column (Safe)
❌ Dangerous:
ALTER TABLE products ADD COLUMN category_id INT NOT NULL;
Problem: Fails if table has existing rows.
✅ Safe:
Step 1: Add as nullable with default
ALTER TABLE products ADD COLUMN category_id INT;
Step 2: Backfill data
UPDATE products SET category_id = 1 WHERE category_id IS NULL;
-- Or more complex logic based on business rules
Step 3: Add NOT NULL constraint (after verification)
ALTER TABLE products ALTER COLUMN category_id SET NOT NULL;
Pattern 3: Adding Index Without Locking (PostgreSQL)
❌ Dangerous:
CREATE INDEX idx_users_email ON users(email);
Problem: Locks table for duration of index creation.
✅ Safe:
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
Benefit: Allows reads/writes during index creation.
Note: For MySQL, use ALGORITHM=INPLACE, LOCK=NONE (5.6+)
Pattern 4: Dropping a Column (Safe)
✅ Safe Multi-Step Process:
Step 1: Stop writing to column
// Remove all code that sets this field
// Keep reads for backwards compatibility
Deploy.
Step 2: Wait for all instances to deploy (1-7 days)
Step 3: Remove reads from code Deploy.
Step 4: Drop column in migration
ALTER TABLE users DROP COLUMN deprecated_field;
3. Verified Procedure for All Migrations
Pre-Migration Checklist
- Backup: Ensure automated backups are working
- Test locally: Run migration against production-like dataset
- Validate: Run migration validator script:
python .claude/skills/database-migration-safe/scripts/validate_migration.py <migration_file> - Size check: If table > 1M rows, verify CONCURRENTLY or batching strategy
- Rollback plan: Document how to revert (within transaction if possible)
- Timing: Schedule during low-traffic window if risky
Migration Execution
Step 1: Generate migration file
# TypeORM
npm run migration:generate -- -n AddUserEmailIndex
# Sequelize
npx sequelize-cli migration:generate --name add-user-email-index
# Prisma
npx prisma migrate dev --name add-user-email-index
Step 2: Audit migration
python .claude/skills/database-migration-safe/scripts/validate_migration.py \
migrations/20260101_add_user_email_index.sql
Step 3: Apply locally
npm run migration:up
# Verify schema with: npm run migration:show
Step 4: Test application
npm test
npm run dev # Manual verification
Step 5: Commit migration
git add migrations/
git commit -m "feat(db): add index on users.email for faster lookups"
Step 6: Apply to staging
# On staging environment
npm run migration:up
# Verify no errors, check performance
Step 7: Apply to production
# On production environment
npm run migration:up
# Monitor logs, database metrics
Post-Migration Verification
- Check application logs for errors
- Verify query performance (EXPLAIN ANALYZE)
- Monitor database metrics (CPU, I/O, locks)
- Test critical user flows
- Confirm no data loss (row counts, checksums)
4. Zero-Context Scripts
validate_migration.py
Located at: .claude/skills/database-migration-safe/scripts/validate_migration.py
Purpose: Automated detection of dangerous migration patterns.
Usage:
python .claude/skills/database-migration-safe/scripts/validate_migration.py <migration_file>
Returns:
- Exit code 0: Safe migration
- Exit code 1: Dangerous patterns detected
- JSON report of findings
5. Failed Attempts (Negative Knowledge Evolution)
❌ Attempt: Rename column in single migration
Context: Renamed user.email to user.email_address in one step
Failure: All running app instances crashed, 15min downtime
Learning: Always use multi-step rename pattern with dual-read period
❌ Attempt: Add NOT NULL column with default on 10M row table
Context: ALTER TABLE orders ADD COLUMN status VARCHAR(20) NOT NULL DEFAULT 'pending'
Failure: Table locked for 45 minutes, application unavailable
Learning: Add as nullable, backfill in batches, then add constraint
❌ Attempt: Create index during peak traffic
Context: Added index on production during business hours Failure: Query timeouts, cascade failures across services Learning: Use CONCURRENTLY and schedule during low-traffic windows
❌ Attempt: Drop column referenced by running code
Context: Removed column that was still read by some services Failure: Services crashed until rollback Learning: Multi-step deprecation: stop writes → deploy → stop reads → deploy → drop column
6. Database-Specific Guidance
PostgreSQL
- ✅ Use
CONCURRENTLYfor index creation - ✅ Use
ADD COLUMN IF NOT EXISTSfor idempotency - ✅ Wrap DDL in transactions (except CONCURRENTLY operations)
- ✅ Use
pg_stat_activityto check for locks before migration
MySQL
- ✅ Use
ALGORITHM=INPLACE, LOCK=NONEfor online DDL (5.6+) - ✅ Use
pt-online-schema-changefor large tables (Percona Toolkit) - ⚠️ Be aware of metadata locks and long-running transactions
SQLite
- ⚠️ Limited ALTER TABLE support (can't drop columns in older versions)
- ✅ Use table recreation pattern: create new → copy data → rename
- ✅ Always use transactions
7. Migration Naming Conventions
Format: YYYYMMDDHHMMSS_descriptive_name.sql
Examples:
- ✅
20260101120000_add_index_users_email.sql - ✅
20260101120100_create_orders_table.sql - ✅
20260101120200_add_not_null_users_created_at.sql - ❌
migration.sql - ❌
update_db.sql
8. Rollback Strategy
Transactional Migrations (Preferred)
BEGIN;
-- Migration code here
ALTER TABLE users ADD COLUMN age INT;
-- Verification
DO $$
BEGIN
IF NOT EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_name = 'users' AND column_name = 'age'
) THEN
RAISE EXCEPTION 'Column not created';
END IF;
END $$;
COMMIT;
Non-Transactional Operations
For operations that can't be in a transaction (e.g., CREATE INDEX CONCURRENTLY):
- Document manual rollback steps
- Create a companion "down" migration file
- Test rollback locally before production
9. Governance
- Token Budget: ~480 lines (within 500 limit)
- Dependencies: Database-agnostic patterns (examples in PostgreSQL)
- Critical: Failure here causes data loss and downtime
- Maintenance: Update as new database versions add features
- Verification Date: 2026-01-01