| name | staging-validation-phase |
| description | Guides manual staging validation before production deployment through smoke tests, critical user flow testing, data migration verification, and rollback capability checks. Use when validating staging deployments, running pre-production tests, or preparing for production promotion in staging-prod deployment model. (project) |
This skill orchestrates the staging validation phase, which occurs after /ship-staging and before /ship-prod in the staging-prod deployment workflow.
Core responsibilities:
- Run automated smoke tests on staging environment
- Test critical user flows manually (authentication, core features, payments)
- Verify data migrations executed successfully
- Test rollback capability (actual rollback + roll-forward)
- Document sign-off decision (approve/reject with justification)
- Update state.yaml with validation results
Inputs: Staging deployment (URL, deployment ID, migration results)
Outputs: Validation report, sign-off decision, state.yaml update
Expected duration: 30-60 minutes
Run smoke tests - Execute automated smoke test suite on staging URL
npm run test:smoke -- --url=$STAGING_URL
Verify: homepage loads (200), API health endpoint (200), database connection
Test critical user flows - Manual testing of core functionality
- Authentication (login, logout, password reset)
- Primary user workflow (feature-specific)
- Payment processing (if applicable)
- Data CRUD operations
Verify data migrations - Check staging database for migration results
# Connect to staging database
psql $STAGING_DATABASE_URL -c "SELECT version_num FROM alembic_version;"
# Verify tables, columns, constraints match expectations
Test rollback capability - Execute actual rollback test
# Rollback to previous deployment
vercel rollback $PREVIOUS_DEPLOYMENT_ID
# Verify previous version is live
# Roll forward to current deployment
vercel promote $CURRENT_DEPLOYMENT_ID
Document sign-off - Update state.yaml
manual_gates:
  staging_validation:
    status: approved  # or rejected
    approver: "Your Name"
    timestamp: "2025-11-19T10:30:00Z"
    blockers: []  # or list of issues if rejected
Key principle: Test as if this is production. All failures must be fixed before production deployment.
- Staging deployment completed successfully (from /ship-staging)
- Staging URL is live and accessible
- Deployment ID available (for rollback testing)
- Previous deployment ID available (for rollback test)
- Database migration logs available
- Test credentials available (for authentication flows)
- Smoke tests: What automated tests exist, how to run them, what they verify
- Critical user flows: Which workflows are essential for production (auth, core feature, payments)
- Data migrations: What schema changes were made, how to verify them
- Rollback procedure: How to rollback deployment, how to verify previous version, how to roll forward
- Sign-off criteria: What constitutes approval vs rejection (all tests pass, no blocking bugs)
See deployment-strategy.md in project docs for platform-specific rollback procedures.
Execute smoke test suite on staging environment.
Smoke Test Suite:
# Run smoke tests against staging URL
npm run test:smoke -- --url=$STAGING_URL
# Typical smoke tests include:
# - Homepage loads (HTTP 200, no errors in console)
# - API health endpoint responds (GET /api/health → 200)
# - Database connection established (health check includes DB ping)
# - Static assets load (CSS, JS, images)
# - Authentication page accessible (GET /login → 200)
Success Criteria:
- All smoke tests pass (0 failures)
- No 500 errors in server logs
- No console errors in browser DevTools
- Response times <2s for all endpoints
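The pass/fail criteria above can be encoded as a small check. The sketch below is illustrative, not part of the project's actual suite: the `(endpoint, status, seconds)` result shape is an assumption, while the 200-only and under-2-seconds rules come from the criteria list.

```python
# Sketch: evaluate smoke-test results against the success criteria above.
# The (endpoint, status, seconds) tuples are a hypothetical shape; a real
# suite would collect them from actual HTTP calls against $STAGING_URL.

def evaluate_smoke(results, max_seconds=2.0):
    """Return (passed, failures) for a list of (endpoint, status, seconds)."""
    failures = []
    for endpoint, status, seconds in results:
        if status != 200:
            failures.append(f"{endpoint}: HTTP {status}")
        elif seconds >= max_seconds:
            failures.append(f"{endpoint}: too slow ({seconds:.1f}s)")
    return (len(failures) == 0, failures)

results = [
    ("/", 200, 0.4),
    ("/api/health", 200, 0.2),
    ("/login", 200, 0.6),
]
passed, failures = evaluate_smoke(results)
```

Any non-empty `failures` list maps directly onto the blocker list used in the rejection path.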
If smoke tests fail:
- Document failures in validation report
- Mark sign-off as "rejected" with blocker list
- Return to /implement to fix issues
- Re-deploy to staging
- Re-run validation
Quality Check: Smoke tests provide quick confidence that the deployment is functional; they are not a comprehensive test suite.
Manually test essential user journeys on staging.
Authentication Flow:
1. Navigate to staging URL
2. Click "Login" or navigate to /login
3. Enter test credentials (test@example.com / test-password)
4. Verify successful login (redirects to dashboard, user menu shows)
5. Click "Logout"
6. Verify successful logout (redirects to homepage, user menu gone)
7. Test password reset flow (request reset, receive email, change password)
Core Feature Flow (feature-specific):
Example for "Student Progress Dashboard" feature:
1. Login as teacher
2. Navigate to /dashboard
3. Verify student list loads (check for >0 students)
4. Click on student name
5. Verify progress details load (completion rate, lessons, grades)
6. Test filters (by class, by date range)
7. Verify data accuracy (spot-check 3 students against database)
Payment Processing Flow (if applicable):
1. Add item to cart
2. Proceed to checkout
3. Enter test payment credentials (Stripe test mode)
4. Submit payment
5. Verify success confirmation
6. Verify order appears in user account
7. Verify payment recorded in admin panel
Data CRUD Operations:
1. Create: Add new entity (student, lesson, order)
2. Read: View entity details
3. Update: Edit entity details, save changes
4. Delete: Remove entity, verify removal
5. Verify persistence: Reload page, confirm changes persisted
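The CRUD-plus-persistence sequence above can also be exercised against a database directly. The sketch below uses an in-memory SQLite table as a stand-in for the staging database; the `students` table and its columns are illustrative, not the project's real schema.

```python
import sqlite3

# Stand-in for the staging database; the `students` table is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")

# Create
conn.execute("INSERT INTO students (name) VALUES (?)", ("Ada",))
conn.commit()

# Read
assert conn.execute("SELECT name FROM students WHERE id = 1").fetchone() == ("Ada",)

# Update, then re-read to confirm the change persisted
conn.execute("UPDATE students SET name = ? WHERE id = 1", ("Ada Lovelace",))
conn.commit()
assert conn.execute("SELECT name FROM students WHERE id = 1").fetchone() == ("Ada Lovelace",)

# Delete, then confirm removal
conn.execute("DELETE FROM students WHERE id = 1")
conn.commit()
assert conn.execute("SELECT COUNT(*) FROM students").fetchone() == (0,)
```

On staging the same round-trip happens through the UI, with the "reload and re-check" step standing in for the re-read assertions.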
Success Criteria:
- All critical flows complete without errors
- UI displays correctly (no layout issues or missing data)
- Data persists correctly (create/update/delete operations work)
- No JavaScript errors in console
- Performance acceptable (pages load <3s, interactions responsive)
Quality Check: Test flows that represent 80% of user activity. Don't test every edge case.
Check that database migrations executed successfully in staging.
Migration Verification:
# Connect to staging database
psql $STAGING_DATABASE_URL
-- Check migration version
SELECT version_num FROM alembic_version;
-- Expected: latest migration revision (e.g., 4f3a2b1c5d6e)
-- Verify schema changes
\d+ users
-- Check for expected columns, constraints, indexes
-- Verify data migrations
SELECT COUNT(*) FROM users WHERE email_verified IS NOT NULL;
-- Check backfill operations completed
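These manual checks can also be scripted. The sketch below mimics the version and column checks using an in-memory SQLite database in place of the staging Postgres; the expected revision string and the `users` table layout are assumptions for illustration.

```python
import sqlite3

EXPECTED_VERSION = "4f3a2b1c5d6e"  # assumed latest migration revision

# Mimic what the staging database should look like after migrations ran.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alembic_version (version_num TEXT)")
conn.execute("INSERT INTO alembic_version VALUES (?)", (EXPECTED_VERSION,))
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, email_verified INTEGER)")

# Check 1: migration version matches the expected revision.
(version,) = conn.execute("SELECT version_num FROM alembic_version").fetchone()
assert version == EXPECTED_VERSION

# Check 2: the migrated column exists.
columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
assert "email_verified" in columns
```

Against the real staging database the same two checks would run through the project's Postgres driver instead of `sqlite3`.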
Schema Validation:
- New tables exist (if migrations added tables)
- New columns exist with correct types (if migrations added columns)
- Constraints applied (NOT NULL, UNIQUE, FOREIGN KEY)
- Indexes created (check EXPLAIN ANALYZE on critical queries)
- Old columns removed (if migrations dropped columns)
Data Validation:
- Backfill operations completed (if migrations populated data)
- Default values applied (if migrations set defaults)
- Data integrity maintained (no orphaned records, referential integrity)
Success Criteria:
- Migration version matches expected version
- All schema changes present in staging database
- Data migrations completed (if applicable)
- No migration errors in deployment logs
If migrations failed:
- Check deployment logs for migration errors
- Document failure in validation report
- Mark sign-off as "rejected"
- Return to /implement to fix migration scripts
- Re-deploy to staging (may require manual database cleanup)
Quality Check: Migrations are critical. A failed migration in production is catastrophic.
Execute actual rollback test to verify production safety net.
Rollback Test Procedure:
Step 4a: Identify Previous Deployment:
# For Vercel deployments
vercel list --limit=5
# Find previous production deployment ID
# Store IDs
CURRENT_DEPLOYMENT_ID="<current-staging-deployment>"
PREVIOUS_DEPLOYMENT_ID="<previous-production-deployment>"
Step 4b: Execute Rollback:
# Rollback to previous deployment
vercel rollback $PREVIOUS_DEPLOYMENT_ID --yes
# Or by re-pointing the alias directly:
vercel alias set $PREVIOUS_DEPLOYMENT_ID <staging-alias>
Step 4c: Verify Previous Version Live:
1. Navigate to staging URL
2. Verify previous version is live (check version number, feature presence)
3. Test critical flow to confirm functionality
4. Document: "Rollback successful, previous version ($PREVIOUS_DEPLOYMENT_ID) is live"
Step 4d: Roll Forward:
# Restore current deployment
vercel alias set $CURRENT_DEPLOYMENT_ID <staging-alias>
Step 4e: Verify Current Version Restored:
1. Navigate to staging URL
2. Verify current version is live (feature present)
3. Test critical flow to confirm functionality
4. Document: "Roll-forward successful, current version ($CURRENT_DEPLOYMENT_ID) is live"
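The "verify which version is live" step in 4c and 4e can be partially automated. The sketch below assumes the app exposes its deployment ID somewhere queryable (the `{"deployment_id": ...}` payload is a hypothetical convention, not a documented Vercel API); the comparison logic itself is generic.

```python
# Sketch: confirm the live deployment matches the one we expect.
# The payload shape {"deployment_id": ...} is a hypothetical convention;
# in practice you would fetch it from a version endpoint on the staging URL.

def verify_deployment(live_payload, expected_id):
    """Raise if the live deployment ID differs from the expected one."""
    live_id = live_payload.get("deployment_id")
    if live_id != expected_id:
        raise AssertionError(
            f"expected deployment {expected_id}, but {live_id} is live"
        )
    return True

# After rollback, expect the previous deployment to be live:
verify_deployment({"deployment_id": "dpl_abc123"}, "dpl_abc123")
```

Running the same check after roll-forward (with the current deployment ID) closes the loop for step 4e.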
Success Criteria:
- Rollback completed in <2 minutes
- Previous deployment verified live and functional
- Roll-forward completed successfully
- No data loss during rollback/roll-forward
- No downtime >30 seconds
If rollback test fails:
- Document failure (which step failed, error message)
- Mark sign-off as "rejected" with blocker: "Rollback capability not verified"
- BLOCK production deployment - DO NOT proceed to /ship-prod
- Fix rollback procedure (check deployment IDs, alias configuration, DNS)
- Re-test rollback on staging
Quality Check: Rollback capability is the safety net for production. Must work reliably.
Update state.yaml with validation results and approval decision.
Approval Criteria:
Sign-off as "approved" ONLY if:
- All smoke tests pass (0 failures)
- All critical user flows complete without errors
- Data migrations verified successfully
- Rollback test succeeds (rollback + roll-forward verified)
- No blocking bugs found during manual testing
Rejection Criteria:
Sign-off as "rejected" if ANY of:
- Smoke tests fail
- Critical user flows broken (authentication fails, core feature broken)
- Data migrations failed or incomplete
- Rollback test fails
- Blocking bugs found (security issue, data corruption, critical UX bug)
state.yaml Update:
Approval Example:
manual_gates:
  staging_validation:
    status: approved
    approver: "Jane Smith"
    timestamp: "2025-11-19T14:30:00Z"
    validation_summary:
      smoke_tests: "All passed (8/8)"
      critical_flows: "All verified (authentication, dashboard, payments)"
      migrations: "Version 4f3a2b verified, schema changes confirmed"
      rollback_test: "Successful (rollback to dpl_abc123, roll-forward to dpl_xyz789)"
    blockers: []
Rejection Example:
manual_gates:
  staging_validation:
    status: rejected
    approver: "Jane Smith"
    timestamp: "2025-11-19T14:30:00Z"
    validation_summary:
      smoke_tests: "1 failure (API health endpoint returned 503)"
      critical_flows: "Authentication broken (login redirects to 404)"
      migrations: "Verified"
      rollback_test: "Not attempted (smoke tests failed)"
    blockers:
      - "API health endpoint failing (503 error)"
      - "Login flow broken (404 on redirect)"
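The approval and rejection rules can be checked mechanically before the gate is trusted. The sketch below validates the parsed `staging_validation` entry as a plain dict; the field names follow the examples above, and the consistency rules (blockers empty iff approved) are this skill's own criteria.

```python
# Sketch: validate a parsed staging_validation gate entry.
# `gate` is the dict you would get from parsing state.yaml.

def validate_gate(gate):
    """Return a list of problems; an empty list means the sign-off is well-formed."""
    problems = []
    if gate.get("status") not in ("approved", "rejected"):
        problems.append("status must be 'approved' or 'rejected'")
    if not gate.get("approver"):
        problems.append("approver is required")
    if not gate.get("timestamp"):
        problems.append("timestamp is required")
    blockers = gate.get("blockers", [])
    if gate.get("status") == "approved" and blockers:
        problems.append("approved sign-off must have no blockers")
    if gate.get("status") == "rejected" and not blockers:
        problems.append("rejected sign-off must list blockers")
    return problems
```

Running this in CI (or as a pre-step of /ship-prod) would catch a sign-off that was approved with open blockers, or rejected without any.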
Next Steps After Sign-Off:
- If approved → Run /ship-prod to deploy to production
- If rejected → Return to /implement, fix blockers, re-deploy to staging, re-run validation
Quality Check: Sign-off must be explicit, documented, and traceable. No verbal approvals.
During validation:
- Smoke tests executed and passed
- Authentication flow tested (login, logout, password reset)
- Core feature flow tested (feature-specific)
- Payment flow tested (if applicable)
- Data migrations verified (schema + data)
- Rollback test executed (rollback + roll-forward)
- No blocking bugs found
Post-validation:
- state.yaml updated with sign-off
- Validation summary documented
- Blockers listed (if rejected)
- Next steps clear (ship-prod or return to implement)
- All smoke tests pass (automated verification)
- Critical flows tested thoroughly (manual verification)
- Data migrations verified (database inspection)
- Rollback tested (actual rollback, not assumed)
- Sign-off documented (state.yaml with approver, timestamp)
- Duration: 30-60 minutes (efficient but thorough)
Bad staging validation:
- Only homepage tested (insufficient coverage)
- Rollback assumed to work (not actually tested)
- Verbal approval only (no documented sign-off)
- Blocking bugs ignored ("we'll fix in production")
- Rushed (<15 minutes, corners cut)
Scenario:
Tester: "I checked the homepage, looks good!"
Reality: API returns 500 errors, authentication broken, database connection failing
Result: Production deployment breaks core functionality
Prevention:
- Run full smoke test suite (homepage, API, database, authentication)
- Verify automated tests pass, not just manual homepage check
- Check server logs for errors, not just UI
- Test critical endpoints (health check, auth, core API)
Good Practice:
npm run test:smoke -- --url=$STAGING_URL
# Verifies: homepage (200), API health (200), DB connection (success), auth page (200)
Scenario:
Slack message: "Staging looks good 👍"
Result: No documented approval, unclear who approved, no timestamp, no validation summary
Prevention:
- Always update state.yaml with sign-off
- Include approver name, timestamp, validation summary
- Document blockers if rejected
- Make approval explicit and traceable
Good Practice:
manual_gates:
  staging_validation:
    status: approved
    approver: "Jane Smith"
    timestamp: "2025-11-19T14:30:00Z"
    validation_summary: "All tests pass, rollback verified"
Scenario:
Tester: "Rollback should work, Vercel has rollback feature"
Reality: Rollback deployed but DNS not updated, or deployment ID incorrect, or database migration not reversible
Result: Production incident, attempted rollback fails, extended downtime
Prevention:
- Always test rollback on staging before production deployment
- Execute actual rollback (change alias/DNS)
- Verify previous version is live
- Test roll-forward to confirm current version restored
- Document rollback + roll-forward success
Good Practice:
# Actual rollback test
vercel rollback $PREVIOUS_ID
# Verify previous version live (manual test)
vercel alias set $CURRENT_ID staging
# Verify current version restored (manual test)
Scenario:
Tester: "Login is broken but we'll fix it in a hotfix"
Result: Production users cannot login, support tickets spike, revenue impacted
Prevention:
- Mark validation as "rejected" for any blocking bug
- Blocking bugs: authentication broken, core feature broken, data corruption, security issue
- Fix blocking bugs before production deployment
- No "we'll fix it later" for critical issues
Good Practice:
status: rejected
blockers:
  - "Login redirects to 404 (critical - blocks all users)"
next_steps: "Fix login redirect, re-deploy to staging, re-validate"
Scenario:
Tester: "Validated in 10 minutes, good to go"
Reality: Only tested happy path, missed edge cases, didn't verify migrations
Result: Production deployment fails on edge cases (null values, missing data, concurrent users)
Prevention:
- Allocate 30-60 minutes for thorough validation
- Test critical flows completely (not just happy path)
- Verify data migrations (schema + data)
- Test rollback capability
- Don't rush the last quality gate before production
Good Practice:
30-60 minute validation:
- 10 min: Smoke tests
- 15 min: Critical user flows (auth, core feature, payments)
- 10 min: Data migration verification
- 10 min: Rollback test
- 5 min: Document sign-off
Approach:
- Create smoke test suite that runs against any URL
- Include tests for: homepage, API health, database connection, authentication page
- Run via npm script: npm run test:smoke -- --url=$STAGING_URL
- Verify all tests pass before manual testing
Benefits:
- Catches deployment issues immediately (before manual testing)
- Automated, repeatable, fast (2-3 minutes)
- Provides confidence baseline for manual testing
Example:
// tests/smoke.test.js
describe("Smoke Tests", () => {
const baseURL = process.env.TEST_URL || "http://localhost:3000";
test("homepage loads", async () => {
const response = await fetch(baseURL);
expect(response.status).toBe(200);
});
test("API health endpoint responds", async () => {
const response = await fetch(`${baseURL}/api/health`);
expect(response.status).toBe(200);
const data = await response.json();
expect(data.database).toBe("connected");
});
});
Approach:
- Identify 3-5 critical user flows (authentication, core feature, payments)
- Create checklist for each flow
- Test each flow manually on staging
- Document results in validation summary
Benefits:
- Ensures essential functionality works before production
- Catches UX bugs that automated tests miss
- Provides structured testing approach (no guessing)
Example Checklist:
Authentication Flow:
- [ ] Login with valid credentials succeeds
- [ ] Login with invalid credentials fails (shows error)
- [ ] Logout succeeds (session cleared)
- [ ] Password reset email sent
- [ ] Password reset link works
- [ ] New password accepted
Approach:
- Identify previous production deployment ID
- Execute rollback to previous deployment
- Verify previous version is live (manual test)
- Execute roll-forward to current deployment
- Verify current version restored (manual test)
- Document rollback + roll-forward success
Benefits:
- Verifies safety net works before production deployment
- Builds muscle memory for production rollback procedure
- Identifies rollback issues in safe environment (staging)
Example:
# Rollback test
PREVIOUS_ID=$(vercel list --limit=5 | grep production | head -1 | awk '{print $1}')
vercel rollback $PREVIOUS_ID
# Manual verification: Navigate to staging, confirm previous version live
vercel alias set $CURRENT_ID staging
# Manual verification: Navigate to staging, confirm current version live
- All smoke tests pass (0 failures)
- All critical user flows verified (authentication, core feature, payments)
- Data migrations verified (schema + data correct)
- Rollback test succeeds (rollback + roll-forward verified)
- Sign-off documented in state.yaml (approver, timestamp, validation summary)
- Decision is "approved" (ready for production) OR "rejected" (blockers documented, return to implement)
Ready to proceed when:
- If approved → Run /ship-prod to deploy to production
- If rejected → Return to /implement, fix blockers, re-deploy to staging, re-run validation
Issue: Critical user flow broken (authentication, core feature)
Solution: Mark validation as "rejected", document the blocker, return to /implement to fix, re-deploy to staging, re-validate
Issue: Data migrations not showing in staging database
Solution: Check deployment logs for migration errors, verify migration script syntax, manually run migrations on staging if needed
Issue: Rollback test fails (previous version not live)
Solution: Verify deployment IDs are correct, check alias/DNS configuration, test the rollback procedure manually, update deployment scripts if needed
Issue: Unclear what to test (no critical flows documented)
Solution: Review spec.md for feature requirements, identify essential user workflows (authentication is always critical), create a flow checklist, document it for future validations
Issue: Validation taking >90 minutes (too long)
Solution: Focus on critical flows only (don't test every edge case), automate smoke tests (don't run them manually), parallelize testing where possible, skip exhaustive testing (save it for the QA phase)
Deployment Procedures: Project-specific deployment documentation
- Vercel deployment: See .github/workflows/deploy-staging.yml for deployment automation
- Rollback procedures: See docs/project/deployment-strategy.md for platform-specific rollback steps
- Database migrations: See alembic/README.md for migration best practices
Testing Guides:
- Smoke tests: See tests/smoke/README.md for smoke test suite documentation
- Critical flow testing: See spec.md for feature-specific critical flows
- Performance testing: See docs/performance-budgets.md for performance targets
Quality Gates:
- Pre-flight validation: Completed in /optimize phase (performance, accessibility, security)
- Staging validation: This skill (manual testing, smoke tests, rollback capability)
- Production validation: Post-deployment verification in /ship-prod (health checks, smoke tests on production)
Next phase after staging validation:
- If approved → /ship-prod (deploy to production, run production smoke tests, finalize)
- If rejected → /implement (fix blockers, re-deploy to staging, re-run /validate-staging)