| name | radulator-qa-tester |
| description | Automated QA testing for Radulator's 18 medical calculators across radiology, hepatology/liver, and urology specialties. Tests accuracy, collects browser diagnostics, generates Playwright tests, and manages three-branch Git workflow (dev1→test1→main). Use when testing Radulator calculators, reviewing PRs with qa label, verifying medical formulas, generating test reports, or creating comprehensive test suites. |
Radulator QA Tester
Comprehensive automated testing system for the Radulator medical calculator web application. This skill acts as a highly-competent manual tester that reads test specifications, runs the web app, collects diagnostics, verifies calculations, and produces detailed test reports.
Supported Calculators (18 Total)
Radiology (6)
- Adrenal CT Washout
- Adrenal MRI Chemical Shift Index (CSI)
- Prostate Volume & PSA Density
- Renal Cyst (Bosniak Classification)
- Spleen Size (Upper Limit of Normal)
- Hip Dysplasia Indices (DDH)
Hepatology/Liver (9)
- ALBI Score (Albumin-Bilirubin Grade)
- Adrenal Vein Sampling - Cortisol (Cushing)
- Adrenal Vein Sampling - Aldosterone (Hyperaldosteronism)
- BCLC Staging (Barcelona Clinic Liver Cancer)
- Child-Pugh Score
- Milan Criteria (HCC Transplant Eligibility)
- MELD-Na Score
- MR Elastography (Liver Fibrosis Staging)
- Y-90 Radiation Segmentectomy
Urology (3)
- IPSS (International Prostate Symptom Score)
- RENAL Nephrometry Score
- SHIM Score (Sexual Health Inventory for Men)
Quick Start
When a Radulator PR is opened with the qa label:
- Identify which calculator(s) changed
- Run automated tests using Playwright MCP
- Verify calculations against expected formulas
- Collect diagnostics (console logs, network requests, screenshots)
- Generate Playwright regression tests
- Post test report to PR via GitHub MCP
- Merge if tests pass, otherwise request fixes
Prerequisites
This skill requires two MCP servers configured via Docker Desktop MCP Toolkit (recommended):
- Playwright MCP: Browser automation for testing (21 tools)
- GitHub MCP: Repository management and PR operations (40+ tools)
Quick Setup (Claude Code CLI)
- Install Docker Desktop with MCP Toolkit enabled
- Add MCP servers via Docker Desktop UI:
- Playwright (from catalog)
- GitHub Official (from catalog, requires OAuth)
- Create
.mcp.jsonin Radulator project root:{ "mcpServers": { "MCP_DOCKER": { "command": "docker", "args": ["mcp", "gateway", "run"], "type": "stdio" } } } - Connect Claude Code:
docker mcp client connect claude-code - Verify:
docker mcp tools lsshould show ~67 tools
See references/mcp_setup.md for detailed installation and configuration instructions.
Core Testing Process
Step 1: Detect Calculator Changes
When a PR is opened:
Use GitHub MCP to:
1. Get PR details and diff
2. Search diff for: src/components/calculators/*.jsx
3. Extract calculator name from changed files
Step 2: Setup Test Environment
# Clone PR branch
git clone <repo-url> --branch <pr-branch>
cd radulator
# Install dependencies
npm install
# Start dev server
npm run dev
# Wait for "ready" message, note the port (typically 5173)
Step 3: Run Playwright Tests
For each changed calculator:
- Navigate: Use Playwright MCP to open
http://localhost:PORT - Select Calculator: Click sidebar item matching calculator name
- Load Test Data: Read test cases from
references/test_cases.md - Fill Inputs: Use
browser_typeto enter test values - Execute: Click "Calculate" button
- Capture Results: Extract output values from results section
Step 4: Verify Calculations
# Run verification script
python scripts/verify_calculators.py <calculator_name> '<test_case_json>'
# Compare actual vs expected values
# Tolerance: ±0.1 for percentages, ±0.01 for densities
Step 5: Collect Diagnostics
Use Playwright MCP to gather:
- browser_console_messages: Capture JavaScript errors/warnings
- browser_network_requests: Verify no unexpected API calls
- browser_take_screenshot: Document test state
Step 6: Generate Regression Tests
# Use Playwright MCP
browser_generate_playwright_test
# Or use helper script
python scripts/generate_playwright_test.py "<calculator_name>" '<test_data_json>'
# Save to: tests/<calculator-name>.spec.js
# Commit to branch: qa/<calculator-name>
Step 7: Report Results
Create a test report comment on the PR with:
## QA Test Report: <Calculator Name>
### Test Case: <Test Name>
**Inputs:**
- Input 1: <value>
- Input 2: <value>
- ...
**Expected Results:**
- Output 1: <expected> (formula: <formula>)
- Output 2: <expected> (formula: <formula>)
**Actual Results:**
- Output 1: <actual> ✅ / ❌
- Output 2: <actual> ✅ / ❌
**Status:** PASS / FAIL
**Diagnostics:**
- Console Errors: <count> errors (details below)
- Network Requests: <count> requests
- Render Time: <time> ms
<If tests failed:>
**Recommended Fixes:**
1. <specific fix suggestion>
2. <code reference if applicable>
**Screenshots:**

**Console Logs:**
Step 8: Merge or Request Changes
If ALL tests PASS:
- Post comment: "✅ QA: All tests passed"
- Merge PR into test1 via GitHub MCP
If ANY test FAILS:
- Post comment with detailed failure report
- Do NOT merge
- Wait for developer to fix and push updates
Calculator-Specific Testing
Adrenal CT Washout
Test Inputs (from references/test_cases.md):
- Typical adenoma: unenh=10, portal=100, delayed=40
- Non-adenoma: unenh=20, portal=80, delayed=70
Verification Script:
from scripts.verify_calculators import adrenal_ct_washout
result = adrenal_ct_washout(10, 100, 40)
# Expected: absolute_washout=66.7%, relative_washout=60.0%
UI Selectors:
- Input:
input[name="unenh"],input[name="portal"],input[name="delayed"] - Results:
.resultscontainer
Prostate Volume & PSA Density
Test Inputs:
- Normal: length=4, height=3, width=3.5, psa=2
- Elevated: length=3.5, height=2.5, width=3, psa=5.5
Verification Script:
from scripts.verify_calculators import prostate_volume_psa_density
result = prostate_volume_psa_density(4, 3, 3.5, 2)
# Expected: volume=21.84 mL, psa_density=0.092 ng/mL²
Other Calculators
See references/test_cases.md for complete test data for all six calculators:
- Adrenal CT Washout
- Adrenal MRI Chemical Shift
- Prostate Volume & PSA Density
- Renal Cyst (Bosniak)
- Spleen Size (Upper Limit)
- Hip Dysplasia Indices
Three-Branch Workflow
Radulator uses: dev1 → test1 → main
PR from dev1 → test1
- Developer creates PR with
qalabel - This skill runs automated tests
- If tests pass → merge to test1
- If tests fail → request fixes
PR from test1 → main
- After successful test1 merge, create PR to main
- Re-run tests to catch regressions
- If tests pass → merge to production
- Deploy to radulator.com
See references/workflow.md for detailed workflow documentation.
Quality Gates
Before Merging to test1:
- ✅ Calculations accurate within tolerance
- ✅ No console errors
- ✅ Results render in < 500ms
- ✅ All inputs properly validated
- ✅ Interpretation text correct
Before Merging to main:
- ✅ All test1 gates passed
- ✅ No regressions detected
- ✅ Playwright tests committed
- ✅ Documentation updated
- ✅ References section complete
Example Test Report
## QA Test Report: Adrenal CT Washout Calculator
### Test Case: Typical Adenoma
**Inputs:**
- Unenhanced HU: 10
- Portal Venous HU: 100
- Delayed (15 min) HU: 40
**Expected Results:**
- Absolute Washout: 66.7% (formula: ((portal - delayed)/(portal - unenh)) × 100)
- Relative Washout: 60.0% (formula: ((portal - delayed)/portal) × 100)
- Interpretation: "Suggests benign adenoma"
**Actual Results:**
- Absolute Washout: 66.7% ✅
- Relative Washout: 60.0% ✅
- Interpretation: "Suggests benign adenoma" ✅
**Status:** ✅ PASS
**Diagnostics:**
- Console Errors: 0
- Network Requests: 3 (all to localhost)
- Render Time: 142 ms
**Generated Regression Test:**
Committed to: `tests/adrenal-ct-washout.spec.js`
---
✅ **QA: All tests passed. Safe to merge to test1.**
Troubleshooting
Common Issues
Issue: Calculator not found in sidebar
Fix: Check calculator name matches calcDefs array in App.jsx
Issue: Input field selectors not working Fix: Inspect actual input field names/IDs, update selectors
Issue: Results not rendering Fix: Check for console errors, verify Calculate button click registered
Issue: Math precision errors
Fix: Use .toBeCloseTo() with appropriate precision tolerance
Debugging Tools
// Get all input fields
await page.$$eval('input', els => els.map(e => ({name: e.name, value: e.value})))
// Get results container HTML
await page.locator('.results').innerHTML()
// Monitor all console messages
page.on('console', msg => console.log('Browser:', msg.text()))
Best Practices
- Verify MCP connection before starting tests:
docker mcp client ls # Check connection status docker mcp tools ls # Verify tools available - Use Docker Desktop MCP Toolkit for team consistency and CI/CD readiness
- Commit
.mcp.jsonto version control for team collaboration - Run tests in headless mode for CI, headful for debugging
- Take screenshots after every major step for documentation
- Keep test data in sync with medical literature references
- Generate Playwright tests for all calculators to build regression suite
- Document deviations clearly with formula references and code snippets
- Test on multiple browsers (Chrome, Firefox, Safari) using Playwright
- Monitor token usage - simplify Playwright interactions if needed
Resources
- Test Cases:
references/test_cases.md - Workflow Guide:
references/workflow.md - MCP Setup:
references/mcp_setup.md - Verification Scripts:
scripts/verify_calculators.py - Test Generator:
scripts/generate_playwright_test.py
Notes
- Recommended for Claude Code CLI users: This skill uses Docker Desktop MCP Toolkit
- Docker MCP Gateway provides team collaboration and CI/CD-ready configuration
- For CI/CD integration, see
references/workflow.mdGitHub Actions section - Playwright runs identically on macOS, Linux, and Windows via Docker
- All test data based on peer-reviewed medical literature
- Generated Playwright tests can run independently in CI pipelines
.mcp.jsonshould be committed to git for team sharing