| name | test-scenario-framework |
| description | Define and run end-to-end test scenarios for medical residency scheduling. Use when validating complex scheduling logic, testing edge cases, verifying ACGME compliance under stress, or running regression scenarios. Includes 20+ pre-built scenarios covering N-1 failures, multi-swap operations, holiday conflicts, and moonlighting. |
# Test Scenario Framework Skill

Comprehensive framework for defining, executing, and validating complex scheduling test scenarios.
## When This Skill Activates
- Running regression tests before release
- Validating complex scheduling edge cases
- Testing N-1 contingency scenarios
- Verifying ACGME compliance under load
- Multi-swap operation validation
- Holiday/vacation conflict testing
- Moonlighting schedule validation
- Schedule generation stress testing
- Integration testing of scheduling engine
- Reproducing reported bugs with scenarios
## Overview

The Test Scenario Framework provides a structured approach to testing scheduling operations end-to-end. Each scenario includes:

- **Setup:** Initial state (persons, rotations, assignments)
- **Test Case:** Operations to perform (swaps, generation, validation)
- **Expected Outcome:** Success criteria and expected results
- **Validation:** Automated checks against expectations
## Skill Phases
### Phase 1: Scenario Selection

Identify the appropriate scenario(s) to run (a loading sketch follows the options below).
Options:
1. Select from pre-built scenario library (20+ scenarios)
2. Define custom scenario for specific test case
3. Combine multiple scenarios for integration testing
4. Create scenario from bug report
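As a rough illustration of option 1, selecting from the library might amount to scanning the scenario directory and filtering by tag. The `load_scenarios_by_tag` helper below is hypothetical, only sketching the idea against the directory layout and YAML fields shown later in this document.

```python
from pathlib import Path

import yaml

SCENARIO_DIR = Path("backend/tests/scenarios")


def load_scenarios_by_tag(tag: str) -> list[dict]:
    """Hypothetical helper: collect scenario specs whose tags include `tag`."""
    matches = []
    for spec_path in SCENARIO_DIR.rglob("*.yaml"):
        spec = yaml.safe_load(spec_path.read_text())
        scenario = spec.get("scenario", {})
        if tag in scenario.get("tags", []):
            matches.append(scenario)
    return matches


# Example: pick every scenario tagged "n-1" for a resilience run
n1_scenarios = load_scenarios_by_tag("n-1")
```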
### Phase 2: Scenario Definition

Define or load the scenario specification:

```yaml
scenario:
  name: "N-1 Failure During Holiday Coverage"
  id: "n1-holiday-coverage-001"
  category: "resilience"
  tags: ["n-1", "holiday", "acgme"]

  setup:
    persons:
      - id: "resident-1"
        role: "RESIDENT"
        pgy_level: 2
        max_hours_per_week: 80
      # ... more persons

    assignments:
      - person_id: "resident-1"
        block_date: "2024-12-25"
        rotation: "inpatient"
        hours: 12
      # ... more assignments

  test_case:
    operation: "simulate_unavailability"
    parameters:
      person_id: "resident-1"
      start_date: "2024-12-25"
      end_date: "2024-12-27"
      reason: "illness"

  expected_outcome:
    success: true
    coverage_maintained: true
    acgme_compliant: true
    backup_activated: true
    max_reassignments: 5
```
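A minimal sketch of loading and sanity-checking a spec like the one above before execution. The dataclass and required sections mirror the YAML keys shown here; the real loader in `scenario_fixtures.py` may differ.

```python
from dataclasses import dataclass, field
from pathlib import Path

import yaml


@dataclass
class ScenarioSpec:
    name: str
    id: str
    category: str
    tags: list[str] = field(default_factory=list)
    setup: dict = field(default_factory=dict)
    test_case: dict = field(default_factory=dict)
    expected_outcome: dict = field(default_factory=dict)


def load_spec(path: str) -> ScenarioSpec:
    """Load the `scenario:` block and fail fast if a required section is missing."""
    raw = yaml.safe_load(Path(path).read_text())["scenario"]
    missing = {"setup", "test_case", "expected_outcome"} - raw.keys()
    if missing:
        raise ValueError(f"Scenario {raw.get('id')} is missing sections: {missing}")
    return ScenarioSpec(**raw)
```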
### Phase 3: Scenario Execution

Run the scenario with monitoring and timeout protection:

```python
from app.testing.scenario_executor import ScenarioExecutor

executor = ScenarioExecutor()
result = await executor.run_scenario(
    scenario_id="n1-holiday-coverage-001",
    timeout=300,  # 5 minutes
    capture_metrics=True,
)
```
### Phase 4: Result Validation

Compare results against expected outcomes:

```python
from app.testing.scenario_validator import ScenarioValidator

validator = ScenarioValidator()
validation_result = validator.validate(
    actual=result,
    expected=scenario.expected_outcome,
    tolerance={
        "numeric_fields": 0.05,   # 5% tolerance
        "assignment_count": 2,    # allow ±2 assignments
    },
)
```
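How the tolerance comparison might behave, as a rough sketch (the internals of `ScenarioValidator` are not documented here): numeric fields pass within a relative tolerance, and count fields within an absolute allowance. The `within_tolerance` function is illustrative, not the library's API.

```python
def within_tolerance(expected: float, actual: float,
                     relative: float = 0.05, absolute: int = 0) -> bool:
    """Hypothetical check: pass if actual is within a relative or absolute band."""
    if absolute and abs(actual - expected) <= absolute:
        return True
    if expected == 0:
        return actual == 0
    return abs(actual - expected) / abs(expected) <= relative


# Under the tolerances above, 3.3s against an expected 3.2s passes (about 3% drift),
# and 7 assignments against an expected 6 passes when assignment_count allows ±2.
assert within_tolerance(3.2, 3.3, relative=0.05)
assert within_tolerance(6, 7, absolute=2)
```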
### Phase 5: Reporting

Generate a comprehensive test report:

```markdown
## Scenario Test Report

**Scenario:** N-1 Failure During Holiday Coverage
**Status:** ✅ PASSED
**Execution Time:** 3.2s
**Coverage Impact:** 0 gaps detected

### Validations

✅ Coverage maintained after failure
✅ ACGME compliance preserved
✅ Backup assignments within threshold
✅ No cascade failures detected

### Metrics

- Assignments modified: 3
- Persons affected: 2
- Compliance checks: 12/12 passed
- Performance: 3.2s (target: <5s)
```
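A sketch of how `scenario_reporter.py` might render a result dictionary (shaped like the JSON in the Output section below) into this markdown. The exact reporting API is not documented here, so treat `render_report` as illustrative.

```python
def render_report(result: dict) -> str:
    """Hypothetical renderer: turn a scenario result dict into a markdown report."""
    status_icon = "✅ PASSED" if result["status"] == "passed" else "❌ FAILED"
    lines = [
        "## Scenario Test Report",
        f"**Scenario:** {result['scenario_id']}",
        f"**Status:** {status_icon}",
        f"**Execution Time:** {result['execution_time_seconds']}s",
        "",
        "### Validations",
    ]
    for name, check in result["validations"].items():
        icon = "✅" if check["passed"] else "❌"
        lines.append(f"{icon} {name}")
    return "\n".join(lines)
```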
## Key Files

### Scenario Definition Files

```text
backend/tests/scenarios/
├── n1_failures/            # N-1 contingency scenarios
│   ├── holiday_coverage.yaml
│   ├── multi_absence.yaml
│   └── cascade_failure.yaml
├── swap_scenarios/         # Swap operation scenarios
│   ├── one_to_one_simple.yaml
│   ├── multi_swap_chain.yaml
│   └── absorb_with_coverage.yaml
├── acgme_edge_cases/       # ACGME compliance edge cases
│   ├── 80_hour_boundary.yaml
│   ├── one_in_seven_strict.yaml
│   └── supervision_ratio.yaml
└── integration/            # Multi-operation scenarios
    ├── full_month_generation.yaml
    └── concurrent_swaps.yaml
```

### Execution Engine

```text
backend/app/testing/
├── scenario_executor.py    # Main execution engine
├── scenario_validator.py   # Result validation
├── scenario_fixtures.py    # Test data generators
├── scenario_metrics.py     # Performance tracking
└── scenario_reporter.py    # Report generation
```
## Output

### Successful Scenario Run

```json
{
  "scenario_id": "n1-holiday-coverage-001",
  "status": "passed",
  "execution_time_seconds": 3.2,
  "validations": {
    "coverage_maintained": {"expected": true, "actual": true, "passed": true},
    "acgme_compliant": {"expected": true, "actual": true, "passed": true},
    "backup_activated": {"expected": true, "actual": true, "passed": true}
  },
  "metrics": {
    "assignments_modified": 3,
    "persons_affected": 2,
    "compliance_checks_passed": 12,
    "compliance_checks_total": 12
  },
  "artifacts": {
    "initial_state": "snapshots/scenario-001-initial.json",
    "final_state": "snapshots/scenario-001-final.json",
    "execution_log": "logs/scenario-001.log"
  }
}
```
### Failed Scenario Run

```json
{
  "scenario_id": "multi-swap-chain-003",
  "status": "failed",
  "execution_time_seconds": 2.1,
  "failure_reason": "ACGME compliance violation after swap chain",
  "validations": {
    "swap_chain_completed": {"expected": true, "actual": false, "passed": false},
    "acgme_compliant": {"expected": true, "actual": false, "passed": false}
  },
  "errors": [
    {
      "type": "ACGMEViolation",
      "person_id": "resident-3",
      "rule": "80_hour_rule",
      "actual_hours": 84,
      "max_hours": 80,
      "affected_weeks": ["2024-W52"]
    }
  ]
}
```
## Error Handling

### Timeout Protection

```python
import asyncio

# All scenarios run with a timeout guard
try:
    result = await asyncio.wait_for(
        executor.run_scenario(scenario),
        timeout=scenario.timeout or 300,
    )
except asyncio.TimeoutError:
    return ScenarioResult(
        status="timeout",
        error="Scenario exceeded timeout limit",
    )
```
### State Rollback

```python
# Automatic rollback on failure
async with executor.transaction_context() as ctx:
    try:
        result = await executor.execute_operations(scenario)
        if not result.success:
            await ctx.rollback()
    except Exception:
        await ctx.rollback()
        raise
```
### Artifact Capture

```python
# Always capture state for debugging
executor.capture_snapshot("pre_execution")
try:
    result = await executor.run_scenario(scenario)
finally:
    executor.capture_snapshot("post_execution")
    executor.save_execution_log()
```
## Integration with Other Skills

### With acgme-compliance

Scenario execution automatically triggers ACGME validation:

- Pre-execution compliance check
- Post-execution compliance verification
- Continuous monitoring during operations
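One way the pre/post hooks could be wired, shown only as a sketch: `checker` and its `validate_schedule()` method are assumptions standing in for whatever the acgme-compliance skill actually exposes.

```python
async def run_with_compliance_checks(executor, scenario, checker):
    """Hypothetical wrapper: verify ACGME compliance before and after a scenario."""
    baseline = await checker.validate_schedule()      # pre-execution compliance check
    if not baseline.compliant:
        raise RuntimeError("Schedule is non-compliant before the scenario even runs")

    result = await executor.run_scenario(scenario_id=scenario.id)

    post = await checker.validate_schedule()          # post-execution verification
    result.acgme_compliant = post.compliant           # record the outcome on the result
    return result
```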
### With safe-schedule-generation

Before running schedule generation scenarios (see the sketch after this list):

1. Trigger a database backup
2. Execute the scenario in a transaction
3. Validate the results
4. Roll back if validation fails
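A sketch of that sequence under stated assumptions: `backup_database` is a placeholder for whatever the safe-schedule-generation skill provides, and `validation.passed` is an assumed attribute of the validator's return value.

```python
from app.testing.scenario_validator import ScenarioValidator


async def run_generation_scenario(executor, scenario, backup_database):
    """Hypothetical flow: back up, run in a transaction, roll back on bad results."""
    await backup_database()                                    # 1. trigger backup
    async with executor.transaction_context() as ctx:          # 2. run in a transaction
        result = await executor.execute_operations(scenario)
        validation = ScenarioValidator().validate(             # 3. validate results
            actual=result,
            expected=scenario.expected_outcome,
        )
        if not validation.passed:                              # 4. roll back on failure
            await ctx.rollback()
    return result
```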
### With systematic-debugger

When a scenario fails:

1. Capture the full execution trace
2. Generate a hypothesis list
3. Create a minimal reproduction scenario
4. Debug with context
### With test-writer

Convert successful scenarios to automated tests (a generated-test sketch follows this list):

1. Extract the scenario parameters
2. Generate a pytest test case
3. Add it to the regression suite
4. Configure CI execution
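The kind of test that conversion might emit, as an illustrative sketch: the `scenario` marker matches the commands in the Running Scenarios section, the asserted fields mirror the result JSON above, and `pytest.mark.asyncio` assumes pytest-asyncio is installed.

```python
import pytest

from app.testing.scenario_executor import ScenarioExecutor


@pytest.mark.scenario
@pytest.mark.asyncio
async def test_n1_holiday_coverage_001():
    """Regression test generated from scenario n1-holiday-coverage-001 (sketch)."""
    executor = ScenarioExecutor()
    result = await executor.run_scenario(
        scenario_id="n1-holiday-coverage-001",
        timeout=300,
    )
    # Field access below assumes the result shape shown in the Output section.
    assert result.status == "passed"
    assert result.validations["coverage_maintained"]["passed"]
    assert result.validations["acgme_compliant"]["passed"]
```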
## Workflow References
- Scenario Definition - How to write scenario specs
- Scenario Execution - Running scenarios end-to-end
- Scenario Validation - Validating results
## Reference Library
- Scenario Library - 20+ pre-built scenarios
- Success Criteria - Pass/fail criteria definitions
## Common Scenario Patterns

### Pattern 1: N-1 Failure Scenario

```yaml
# Test system resilience when one person becomes unavailable.
setup:
  - Create schedule with minimum coverage
  - Ensure N-1 backups are defined
test:
  - Mark person as unavailable
  - Trigger contingency activation
validate:
  - Coverage maintained
  - ACGME compliance preserved
  - Backup used efficiently
```
### Pattern 2: Multi-Operation Chain

```yaml
# Test that cascading operations maintain consistency.
setup:
  - Create baseline schedule
test:
  - Execute swap operation
  - Generate new assignments
  - Validate compliance
  - Execute another swap
validate:
  - All operations succeeded
  - Final state is valid
  - No intermediate violations
```
### Pattern 3: Boundary Condition Test

```yaml
# Test exact boundary conditions for rules.
setup:
  - Create schedule at exact limit (e.g., 80 hours)
test:
  - Attempt operation that would exceed limit
validate:
  - Operation rejected or compensated
  - Limit not exceeded
  - Clear error message
```
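A concrete sketch of Pattern 3 in the spec format from Phase 2. The scenario id, the `add_assignment` operation, and the values are illustrative; this is not one of the shipped library files.

```yaml
scenario:
  name: "80-Hour Rule at Exact Boundary"
  id: "acgme-80hour-boundary-sketch"
  category: "acgme_edge_cases"
  tags: ["acgme", "boundary"]

  setup:
    persons:
      - id: "resident-7"
        role: "RESIDENT"
        pgy_level: 1
        max_hours_per_week: 80
    assignments:
      # Pre-load resident-7 to exactly 80 hours for the target week
      # ... assignments totaling 80 hours

  test_case:
    operation: "add_assignment"     # illustrative operation name
    parameters:
      person_id: "resident-7"
      block_date: "2024-12-13"
      rotation: "inpatient"
      hours: 12

  expected_outcome:
    success: false                  # the operation should be rejected
    acgme_compliant: true           # the 80-hour limit must remain intact
```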
## Running Scenarios

### Single Scenario

```bash
cd /home/user/Autonomous-Assignment-Program-Manager/backend

# Run a specific scenario
pytest -m scenario -k "n1_holiday_coverage"

# Run with verbose output
pytest -m scenario -k "n1_holiday_coverage" -v

# Capture artifacts
pytest -m scenario -k "n1_holiday_coverage" --capture-artifacts
```
### Scenario Suite

```bash
# Run all N-1 scenarios
pytest -m scenario -k "n1_"

# Run all swap scenarios
pytest -m scenario -k "swap_"

# Run all ACGME edge cases
pytest -m scenario -k "acgme_edge"

# Run the full regression suite
pytest -m scenario
```
### CI Integration

```yaml
# .github/workflows/scenario-tests.yml
name: Scenario Tests

on: [push, pull_request]

jobs:
  scenarios:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run N-1 scenarios
        run: pytest -m scenario -k "n1_"
      - name: Run swap scenarios
        run: pytest -m scenario -k "swap_"
      - name: Upload artifacts
        uses: actions/upload-artifact@v2
        with:
          name: scenario-artifacts
          path: tests/artifacts/
```
## Escalation Rules

Escalate to a human when:

- A scenario consistently times out (it may need optimization)
- Expected behavior is unclear from the requirements
- A scenario reveals a fundamental design flaw
- Multiple related scenarios are failing (a systemic issue)
- Performance degradation is detected across scenarios

Can handle automatically:

- Running pre-defined scenarios
- Validating against expected outcomes
- Generating test reports
- Creating scenario variants
- Regression testing
## Best Practices

### 1. Scenario Isolation

- Each scenario must be completely independent
- No shared state between scenarios
- Clean database state before each run
- Use fixtures for setup/teardown (see the sketch below)
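A sketch of the isolation idea using a plain sqlite3 database; the real fixtures in `scenario_fixtures.py` presumably go through the project's own database layer, so treat this as an assumption-laden illustration.

```python
import sqlite3

import pytest


@pytest.fixture
def clean_scheduling_db(tmp_path):
    """Give each scenario its own throwaway database file (illustrative sketch)."""
    conn = sqlite3.connect(str(tmp_path / "scenario.sqlite"))  # fresh file per test
    try:
        yield conn   # scenario setup would create persons/assignments here
    finally:
        conn.close()  # teardown: no state leaks between scenario runs
```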
### 2. Deterministic Results

- Avoid randomness unless testing random operations
- Use fixed dates and times
- Set random seeds when needed
- Mock external dependencies (a seeding sketch follows this list)
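A small sketch of pinning randomness and time; the seed value, constant name, and helper are illustrative.

```python
import random
from datetime import date

random.seed(20241225)                 # pin any stochastic choices for reruns
BLOCK_START = date(2024, 12, 25)      # fixed date instead of date.today()


def pick_backup(candidates: list[str]) -> str:
    """Deterministic under the fixed seed, so reruns produce the same assignment."""
    return random.choice(sorted(candidates))
```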
### 3. Clear Naming

Format: `{category}_{operation}_{edge_case}_{number}`

Examples:

- `n1_failure_holiday_coverage_001`
- `swap_multi_chain_acgme_boundary_002`
- `acgme_80hour_strict_limit_001`
### 4. Comprehensive Documentation

```yaml
scenario:
  name: "Clear, descriptive name"
  description: "What this scenario tests and why"
  rationale: "Why this edge case matters"
  related_bugs: ["#123", "#456"]
  related_scenarios: ["swap_001", "acgme_005"]
```
### 5. Progressive Complexity

- Start with simple scenarios
- Build up to complex multi-operation scenarios
- Use scenario composition for integration tests (see the sketch below)
- Keep individual scenarios focused
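What composition could look like in the spec format: an `includes` key like this is not a documented feature of the framework, only a hedged illustration of reusing simpler scenarios inside an integration scenario. The referenced files are from the library listing above.

```yaml
scenario:
  name: "Holiday swap followed by N-1 failure"
  id: "integration-composed-sketch"
  category: "integration"
  includes:                       # hypothetical composition key
    - "swap_scenarios/one_to_one_simple.yaml"
    - "n1_failures/holiday_coverage.yaml"
  expected_outcome:
    acgme_compliant: true
    coverage_maintained: true
```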
## References

- `backend/tests/scenarios/` - Scenario definition files
- `backend/app/testing/` - Execution framework
- `docs/testing/SCENARIO_TESTING.md` - Detailed guide (if it exists)
- `CLAUDE.md` - Testing requirements