name	database-migration-guardian
description	Prevents database migration failures in Sequelize/MySQL environments. Use when adding new models, modifying schemas, creating indexes, or deploying database changes. Catches foreign key errors, index conflicts, zero-downtime migration issues, data loss risks, and rollback problems. Validates migration scripts, suggests optimization, and ensures production safety.

Database Migration Guardian

Mission: Prevent database migration disasters through proactive schema validation and zero-downtime deployment patterns. Catches breaking changes before they reach production.

Activation Triggers

Adding new Sequelize models
Modifying existing model schemas
Creating database indexes
Foreign key relationship changes
Production database updates
"Migration failed" errors
Performance optimization needs
Data integrity issues

Framework Awareness

This skill understands:

Sequelize ORM - Model definitions, migrations, sync patterns
MySQL 8.0 - Constraints, indexes, storage engines
Production Databases - Zero-downtime migrations, rollback strategies
Docker - Database container management, volume persistence

Scan Methodology

1. Initial Context Gathering

Ask if not provided:

"Show me your new/modified model file"
"Show me existing migrations (if any)"
"What's your deployment strategy?" (docker-compose, manual, CI/CD)
"Is this for development or production?"
"Do you have existing data in this table?"

2. Critical Migration Scan

Execute ALL checks in this section.

🔴 CRITICAL: Model Definition Validation

Historical Failure: Missing foreign key caused cascading delete of all user data

Scan for:

Foreign keys have proper onDelete and onUpdate rules
Required fields (allowNull: false) have defaults or are optional in creation
Unique constraints on appropriate fields
Indexes defined for foreign keys and frequently queried fields
Data types match MySQL capabilities (BIGINT for large numbers, JSON for objects)
Enums match across model and migration files
Associations defined in both directions (hasMany/belongsTo)

Red flags:

// ❌ DANGEROUS - No cascade rule
user_id: {
  type: DataTypes.UUID,
  references: {
    model: 'users',
    key: 'id'
  }
  // Missing: onDelete: 'CASCADE' or 'SET NULL'
}

// ❌ DANGEROUS - Required field without default
email: {
  type: DataTypes.STRING,
  allowNull: false
  // Missing: defaultValue or Optional<> in creation interface
}

// ❌ DANGEROUS - No index on foreign key
user_id: {
  type: DataTypes.UUID,
  references: { model: 'users', key: 'id' }
}
// Missing: index definition

Optimization:

// ✅ SAFE - Proper foreign key with cascade
user_id: {
  type: DataTypes.UUID,
  allowNull: false,
  references: {
    model: 'users',
    key: 'id'
  },
  onDelete: 'CASCADE',  // Delete jobs when user deleted
  onUpdate: 'CASCADE'   // Update if user ID changes
}

// ✅ SAFE - Optional in creation, required in schema
interface ModelCreationAttributes
  extends Optional<ModelAttributes, 'id' | 'created_at'> {}

// ✅ SAFE - Index on foreign key
{
  indexes: [
    { fields: ['user_id'] },
    { fields: ['status'] },
    { fields: ['created_at'] }
  ]
}

🔴 CRITICAL: Migration Script Safety

Historical Failure: Migration ran successfully but forgot to copy data, losing all records

Scan for:

Migration uses QueryInterface properly (not sequelize.sync())
All columns from model are in migration CREATE TABLE
Indexes created in migration match model definition
Foreign keys in migration match model references
Enum values match exactly between model and migration
down() migration exists for rollback
Data migration strategy for existing rows (if modifying table)

Red flags:

// ❌ DANGEROUS - Using sync() instead of migrations
await sequelize.sync({ force: true })  // DELETES ALL DATA!

// ❌ DANGEROUS - Missing columns
await queryInterface.createTable('batch_jobs', {
  id: DataTypes.UUID,
  user_id: DataTypes.UUID
  // Missing: all other columns from model
})

// ❌ DANGEROUS - No rollback
export async function down(queryInterface: QueryInterface) {
  // Empty or missing - can't undo migration
}

Optimization:

// ✅ SAFE - Complete migration
export async function up(queryInterface: QueryInterface) {
  await queryInterface.createTable('batch_jobs', {
    id: { type: DataTypes.UUID, primaryKey: true },
    user_id: {
      type: DataTypes.UUID,
      allowNull: false,
      references: { model: 'users', key: 'id' },
      onDelete: 'CASCADE'
    },
    // ... ALL columns from model ...
  })

  // Add indexes
  await queryInterface.addIndex('batch_jobs', ['user_id'])
  await queryInterface.addIndex('batch_jobs', ['status'])
}

// ✅ SAFE - Rollback defined
export async function down(queryInterface: QueryInterface) {
  await queryInterface.dropTable('batch_jobs')
}

🟡 HIGH: Zero-Downtime Migration Patterns

Historical Issue: Adding NOT NULL column without default locked production table for 5 minutes

Scan for:

Adding required columns uses multi-step migration
Indexes created with ALGORITHM=INPLACE where possible
Large data migrations happen in batches
Backward-compatible changes (new code works with old schema)

Migration Patterns:

Adding Required Column (Multi-Step):

// Migration 1: Add column as nullable with default
await queryInterface.addColumn('users', 'new_field', {
  type: DataTypes.STRING,
  allowNull: true,  // Start as optional
  defaultValue: 'default_value'
})

// Deploy new code that handles both null and populated values

// Migration 2: Backfill data
await queryInterface.sequelize.query(
  'UPDATE users SET new_field = "value" WHERE new_field IS NULL'
)

// Migration 3: Make NOT NULL
await queryInterface.changeColumn('users', 'new_field', {
  type: DataTypes.STRING,
  allowNull: false
})

Removing Column (Multi-Step):

// Step 1: Deploy code that stops using column
// Step 2: Run migration to drop column
await queryInterface.removeColumn('users', 'old_field')
// Column already unused, no downtime

Renaming Column (Zero-Downtime):

// DON'T rename - it breaks old code
// Instead: Add new column → Dual write → Remove old column

// Migration 1: Add new column
await queryInterface.addColumn('users', 'new_name', {
  type: DataTypes.STRING,
  allowNull: true
})

// Update code to write to both columns
user.old_name = value
user.new_name = value

// Migration 2: Backfill
await queryInterface.sequelize.query(
  'UPDATE users SET new_name = old_name WHERE new_name IS NULL'
)

// Migration 3: Stop using old_name in code

// Migration 4: Drop old column
await queryInterface.removeColumn('users', 'old_name')

🟡 HIGH: Index Optimization

Historical Issue: Missing index on foreign key caused 10s query times

Scan for:

All foreign keys have indexes
Frequently queried columns have indexes
Composite indexes for multi-column queries
No redundant indexes (e.g., index on user_id AND (user_id, status))
Index cardinality appropriate (high cardinality = good for indexing)

Index Strategy:

// Query: Get user's conversion jobs ordered by date
// SELECT * FROM conversion_jobs WHERE user_id = ? ORDER BY created_at DESC

// ❌ NO INDEX - Full table scan
indexes: []

// ❌ PARTIAL - Index on user_id only
indexes: [{ fields: ['user_id'] }]
// MySQL still needs to sort by created_at

// ✅ OPTIMAL - Composite index
indexes: [
  { fields: ['user_id', 'created_at'] }
  // MySQL can use index for WHERE and ORDER BY
]

// Query: Get jobs by status
// SELECT * FROM conversion_jobs WHERE status = 'completed'

// ❌ POOR - Low cardinality (only 5 statuses)
indexes: [{ fields: ['status'] }]
// Better to filter in application layer

// ✅ BETTER - Composite with high cardinality
indexes: [
  { fields: ['status', 'created_at'] }
  // Useful if you filter by status AND sort by date
]

🟠 MEDIUM: Data Integrity

Scan for:

Timestamps enabled (timestamps: true)
Soft deletes if needed (paranoid: true)
Unique constraints on email, username, etc.
Check constraints for valid values
JSON validation for JSON fields
Cascade deletes configured correctly

Data Integrity Patterns:

// ✅ Email uniqueness
email: {
  type: DataTypes.STRING,
  allowNull: false,
  unique: true,
  validate: {
    isEmail: true
  }
}

// ✅ Enum validation
status: {
  type: DataTypes.ENUM('pending', 'processing', 'completed', 'failed'),
  allowNull: false,
  defaultValue: 'pending'
}

// ✅ Soft deletes (paranoid)
{
  timestamps: true,
  paranoid: true  // Adds deletedAt, doesn't actually delete
}

// ✅ Check constraint (MySQL 8.0+)
progress: {
  type: DataTypes.INTEGER,
  allowNull: false,
  validate: {
    min: 0,
    max: 100
  }
}

3. Sequelize.sync() vs Migrations Decision

Ask user: "Are you using sequelize.sync() or migration files?"

Sequelize.sync() - Development Only:

// ✅ OK for development
await sequelize.sync({ force: false, alter: false })
// Creates tables if they don't exist

// ❌ NEVER in production
await sequelize.sync({ force: true })  // DELETES ALL DATA
await sequelize.sync({ alter: true })  // RISKY - can lose data

Migrations - Production:

// ✅ Production-safe
// Run migration files in order
// 001-create-users.ts
// 002-create-conversion-jobs.ts
// 003-create-subscriptions.ts
// 004-create-payment-logs.ts
// 005-create-batch-jobs.ts

// Each migration is version-controlled and rollback-safe

Recommendation:

Development: sequelize.sync() is fine for rapid iteration
Production: ALWAYS use migration files
Transition: Write migrations for existing sync'd tables

4. Production Pre-Flight Checklist

Before deploying database changes:

Pre-Migration:

Backup database (mysqldump or snapshot)
Test migration on production data copy
Verify rollback (down()) migration works
Check migration locks table during execution
Estimate migration time (run on staging with prod data volume)
Plan maintenance window if needed (or use zero-downtime pattern)

During Migration:

Monitor database connections (SHOW PROCESSLIST)
Watch for lock waits (SELECT * FROM information_schema.innodb_locks)
Have rollback plan ready
Monitor application errors

Post-Migration:

Verify row counts (SELECT COUNT(*) FROM new_table)
Check foreign key relationships (SELECT * FROM table WHERE user_id NOT IN (SELECT id FROM users))
Run EXPLAIN on critical queries to verify index usage
Monitor slow query log

5. Common Migration Errors & Fixes

Error	Cause	Solution
`ER_DUP_FIELDNAME`	Column already exists	Check if migration ran before, use `IF NOT EXISTS`
`ER_BAD_NULL_ERROR`	Required field has null values	Backfill data before adding NOT NULL constraint
`ER_ROW_IS_REFERENCED`	Foreign key constraint violation	Can't delete parent row with child rows, use CASCADE
`ER_NO_REFERENCED_ROW`	Foreign key points to non-existent row	Data integrity issue, clean up orphaned records first
`ER_DUP_ENTRY`	Unique constraint violation	Remove duplicates before adding UNIQUE constraint
`Lock wait timeout exceeded`	Table locked during migration	Use smaller batches, add indexes offline

6. Migration Testing Strategy

Local Testing:

# Reset database
docker-compose down -v
docker-compose up -d

# Run migrations
cd backend
npx tsx src/scripts/run-migration.ts

# Verify tables created
mysql -u pdflab -p pdflab -e "SHOW TABLES"
mysql -u pdflab -p pdflab -e "DESCRIBE batch_jobs"

Staging Testing:

# Copy production data to staging
mysqldump --single-transaction production > prod_copy.sql
mysql staging < prod_copy.sql

# Run migration on staging
node run-migration.js

# Test application works with new schema
npm test

Production Rollout:

# Backup first
mysqldump --single-transaction pdflab > backup_$(date +%Y%m%d_%H%M%S).sql

# Run migration
node run-migration.js

# Verify
mysql -e "SELECT COUNT(*) FROM batch_jobs"

# Monitor
tail -f /var/log/mysql/slow-query.log

Output Format: Auto-Scan Report

═══════════════════════════════════════════════
🛡️ DATABASE MIGRATION GUARDIAN - SCAN RESULTS
═══════════════════════════════════════════════

📊 SCAN SCOPE
• Model: BatchJob
• Table: batch_jobs
• Migration: 005-create-batch-jobs.ts
• Strategy: Sequelize.sync() (development)

🚨 CRITICAL FINDINGS: 0
✅ No critical issues detected

⚠️  HIGH PRIORITY: 2
1. Missing index on frequently queried field
   • Field: status
   • Impact: Slow queries when filtering by status
   • Fix: Add index: { fields: ['status'] }

2. Foreign key without cascade rule
   • Field: user_id
   • Impact: Can't delete users with batch jobs
   • Fix: Add onDelete: 'CASCADE'

💡 OPTIMIZATIONS: 3
1. Composite index opportunity
   • Query: Get user's batches ordered by date
   • Current: Index on user_id only
   • Optimized: { fields: ['user_id', 'created_at'] }
   • Benefit: 10x faster queries

2. JSON field validation
   • Field: options
   • Current: No validation
   • Recommended: Add Joi/Zod schema validation

3. Soft deletes consideration
   • Current: Hard delete
   • Recommended: Add paranoid: true for audit trail

═══════════════════════════════════════════════
PRODUCTION READINESS
═══════════════════════════════════════════════
✅ Migration script complete
✅ Rollback defined
⚠️  Zero-downtime pattern needed for ALTER operations
✅ Indexes defined
⚠️  Test on production data copy recommended

RISK LEVEL: LOW
DEPLOYMENT RECOMMENDATION: Safe to deploy after addressing 2 high-priority issues

NEXT ACTIONS:
1. Add cascade rule to user_id foreign key
2. Add index on status field
3. Test migration on production data copy
4. Add composite index for performance

═══════════════════════════════════════════════

Quick Reference: Migration Commands

# Sequelize CLI (if using)
npx sequelize-cli migration:generate --name create-batch-jobs
npx sequelize-cli db:migrate
npx sequelize-cli db:migrate:undo

# Custom migration (PDFLab pattern)
npx tsx src/scripts/run-migration.ts

# MySQL verification
mysql -u pdflab -p pdflab -e "SHOW CREATE TABLE batch_jobs"
mysql -u pdflab -p pdflab -e "SHOW INDEX FROM batch_jobs"

# Check foreign keys
mysql -u pdflab -p pdflab -e "
  SELECT
    TABLE_NAME,
    COLUMN_NAME,
    REFERENCED_TABLE_NAME,
    REFERENCED_COLUMN_NAME
  FROM information_schema.KEY_COLUMN_USAGE
  WHERE REFERENCED_TABLE_NAME = 'users'
"

Key Principles

Always use migrations in production - Never sync({ alter: true })
Foreign keys need cascade rules - Prevent orphaned records
Index foreign keys - Prevent slow joins
Test with production data - Catch constraint violations early
Zero-downtime migrations - Multi-step for breaking changes
Rollback plan - Every up() needs a down()
Backup first - Murphy's law applies to databases

When to Escalate

Adding columns to tables with >1M rows
Changing primary keys or foreign keys
Renaming tables in production
Complex data migrations requiring ETL
Geo-distributed database changes
Changing storage engines or collation

database-migration-guardian

Install Skill

SKILL.md

Database Migration Guardian

Activation Triggers

Framework Awareness

Scan Methodology

1. Initial Context Gathering

2. Critical Migration Scan

🔴 CRITICAL: Model Definition Validation

🔴 CRITICAL: Migration Script Safety

🟡 HIGH: Zero-Downtime Migration Patterns

🟡 HIGH: Index Optimization

🟠 MEDIUM: Data Integrity

3. Sequelize.sync() vs Migrations Decision

4. Production Pre-Flight Checklist

5. Common Migration Errors & Fixes

6. Migration Testing Strategy

Output Format: Auto-Scan Report

Quick Reference: Migration Commands

Key Principles

When to Escalate