| name | starwards-tdd |
| description | Test-driven development for Starwards - write the test first, watch it fail, write minimal code to pass; includes Jest unit tests, Playwright E2E tests, Colyseus state sync, @gameField decorators, Tweakpane UI, and multiplayer scenarios |
| version | 2025-11-04 |
| related_skills | starwards-debugging (when tests fail unexpectedly), starwards-verification (verify all tests pass before completion), starwards-monorepo (understand build workflow), starwards-colyseus (test state sync patterns), using-superpowers (announce skill usage) |
Test-Driven Development for Starwards
Overview
Write the test first. Watch it fail. Write minimal code to pass.
Core principle: "If you didn't watch the test fail, you don't know if it tests the right thing."
Starwards-specific: Test state sync, decorators, multiplayer scenarios, and UI interactions.
When to Use
Always:
- New ship systems (@gameField decorators)
- New space objects (Spaceship, Projectile, etc.)
- Bug fixes in game logic
- UI widget changes (Tweakpane panels)
- Multiplayer scenarios (Colyseus state sync)
- Command handlers (JSON Pointer or typed commands)
Exceptions (ask first):
- Throwaway prototypes
- Configuration files
- Static assets
The Iron Law
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
Write code before the test? Delete it. Start over.
No exceptions:
- Don't keep it as "reference"
- Don't "adapt" it while writing tests
- Don't look at it
- Delete means delete
Implement fresh from tests. Period.
Violating the letter of the rules is violating the spirit of the rules.
Starwards Test Types
1. Unit Tests (Jest) - modules/*/test/*.spec.ts
For: Game logic, state managers, utility functions
Run: npm test (all) or npm test -- modules/core/test/specific.spec.ts
Pattern:
// modules/core/test/shield.spec.ts
import { Shield } from '../src/ship/shield';
describe('Shield', () => {
test('recharges at design rate', () => {
const shield = new Shield();
shield.strength = 500;
shield.design.rechargeRate = 100;
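// Simulate a 1-second update. This inline arithmetic is a placeholder:
// a meaningful test should call the production update logic instead.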
shield.strength += shield.design.rechargeRate * 1.0;
expect(shield.strength).toBeCloseTo(600, 1);
});
});
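The pattern above performs the recharge arithmetic inside the test itself, so it would pass even if the production code were wrong. A minimal sketch of the alternative follows, assuming a hypothetical `recharge(deltaSeconds)` method on a stand-in class. Neither `SketchShield` nor `recharge` is part of Starwards; the point is only that the logic lives in the class and the test exercises it.

```typescript
// Hypothetical stand-in for a ship system; not the real Starwards Shield class.
class SketchShield {
    strength = 500;
    readonly maxStrength = 1000;
    readonly rechargeRate = 100; // strength units per second (assumed)

    // Advance the simulation by deltaSeconds, capping at maxStrength.
    recharge(deltaSeconds: number): void {
        const next = this.strength + this.rechargeRate * deltaSeconds;
        this.strength = Math.min(next, this.maxStrength);
    }
}

// Returns the strength after one simulated step, so a test can assert on it.
function strengthAfter(startStrength: number, deltaSeconds: number): number {
    const shield = new SketchShield();
    shield.strength = startStrength;
    shield.recharge(deltaSeconds);
    return shield.strength;
}
```

In a Jest spec this would read `expect(strengthAfter(500, 1)).toBeCloseTo(600, 1)`, and a second case such as `strengthAfter(950, 1)` would cover the cap at the maximum.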
2. Integration Tests (Jest + Test Harness) - modules/*/test/*.spec.ts
For: Ship systems interaction, multiplayer scenarios, state sync
Use: ShipTestHarness or MultiClientDriver (documented in docs/testing/UTILITIES.md)
Pattern:
// modules/core/test/shield-sync.spec.ts
import { ShipTestHarness } from './ship-test-harness';
describe('Shield state sync', () => {
test('strength syncs to all clients', async () => {
const harness = new ShipTestHarness();
await harness.connect();
// Change shield strength server-side
harness.shipManager.state.shield.strength = 750;
// Wait for sync
await harness.waitForSync();
// Verify client received update
expect(harness.shipDriver.state.shield.strength).toBe(750);
await harness.cleanup();
});
});
3. E2E Tests (Playwright) - modules/e2e/test/*.spec.ts
For: UI interactions, Tweakpane panels, visual verification
Run: npm run test:e2e or npm run test:e2e -- --headed
Pattern:
// modules/e2e/test/shield-panel.spec.ts
import { test, expect } from '@playwright/test';
import { createTestClient } from './test-infrastructure';
test('shield power slider updates strength', async ({ page }) => {
const client = await createTestClient(page);
// Locate shield panel by data-id
const shieldPanel = page.locator('[data-id="Shield Status"]');
await expect(shieldPanel).toBeVisible();
// Get initial strength
const initialStrength = await getPropertyValue(page, 'strength', 'Shield Status');
// Adjust power slider
await shieldPanel.locator('.tp-slider').fill('0.5');
// Poll until the state update reaches the panel, rather than sleeping for a fixed time
await expect.poll(() => getPropertyValue(page, 'strength', 'Shield Status')).not.toBe(initialStrength);
await client.cleanup();
});
Red-Green-Refactor: Starwards Examples
Example 1: New Ship System with @gameField
RED - Write Failing Test:
// modules/core/test/shield.spec.ts
import { Shield } from '../src/ship/shield';
describe('Shield', () => {
test('has max strength from design', () => {
const shield = new Shield();
shield.design.maxStrength = 1000;
expect(shield.strength).toBeLessThanOrEqual(shield.design.maxStrength);
});
});
Verify RED:
$ npm test -- modules/core/test/shield.spec.ts
FAIL: Cannot find module '../src/ship/shield'
Good! Test fails because Shield doesn't exist.
GREEN - Minimal Code:
// modules/core/src/ship/shield.ts
import { SystemState, DesignState } from './system';
import { gameField } from '../game-field';
import { range } from '../range';
class ShieldDesign extends DesignState {
@gameField('float32') maxStrength = 1000;
}
export class Shield extends SystemState {
@gameField(ShieldDesign) design = new ShieldDesign();
@gameField('float32')
@range((t: Shield) => [0, t.design.maxStrength])
strength = 1000;
}
Verify GREEN:
$ npm test -- modules/core/test/shield.spec.ts
PASS ✓ Shield > has max strength from design
REFACTOR: Extract common system patterns if needed.
Example 2: Multiplayer State Sync
RED - Write Failing Test:
// modules/core/test/shield-command.spec.ts
import { ShipTestHarness } from './ship-test-harness';
describe('Shield power command', () => {
test('syncs shield power to clients', async () => {
const harness = new ShipTestHarness();
await harness.connect();
// Send command from client (Colyseus room.send takes the message type first, then the payload)
harness.shipDriver.room.send('/Spaceship/ship-0/shield/power', { value: 0.5 });
await harness.waitForSync();
// Verify server received update
expect(harness.shipManager.state.shield.power).toBe(0.5);
await harness.cleanup();
});
});
Verify RED:
$ npm test -- modules/core/test/shield-command.spec.ts
FAIL: Property 'shield' does not exist on ShipState
GREEN - Add to ShipState:
// modules/core/src/ship/ship-state.ts
@gameField(Shield) shield!: Shield;
Verify GREEN:
$ npm test -- modules/core/test/shield-command.spec.ts
PASS ✓ Shield power command > syncs shield power to clients
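The command type in this example has the shape of a JSON Pointer (RFC 6901) path. As a rough illustration of what applying such a path to a state tree involves, here is a self-contained sketch. `setByPointer` is a hypothetical helper, not the Starwards or Colyseus command handler, and the real handler may resolve the leading `/Spaceship/ship-0` segments differently.

```typescript
type Tree = { [key: string]: unknown };

// Decode one JSON Pointer reference token (RFC 6901: "~1" is "/", "~0" is "~").
function decodeToken(token: string): string {
    return token.replace(/~1/g, '/').replace(/~0/g, '~');
}

// Hypothetical helper: walk `root` along `pointer` and set the final property.
// Throws if an intermediate segment is missing, so bad paths fail loudly.
function setByPointer(root: Tree, pointer: string, value: unknown): void {
    if (pointer.charAt(0) !== '/') {
        throw new Error(`Invalid JSON Pointer: ${pointer}`);
    }
    const tokens = pointer.slice(1).split('/').map(decodeToken);
    const last = tokens.pop() as string;
    let node: Tree = root;
    for (const token of tokens) {
        const child = node[token];
        if (typeof child !== 'object' || child === null) {
            throw new Error(`Path segment not found: ${token}`);
        }
        node = child as Tree;
    }
    node[last] = value;
}

// Example state tree, shaped only for this sketch.
const sketchState: Tree = { shield: { power: 1, strength: 1000 } };
setByPointer(sketchState, '/shield/power', 0.5);
```

Failing loudly on an unknown segment is a deliberate choice for the sketch: a command aimed at a field that does not exist yet is exactly the RED state the test should surface.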
Example 3: Tweakpane UI Widget
RED - Write E2E Test:
// modules/e2e/test/shield-widget.spec.ts
import { test, expect } from '@playwright/test';
import { createTestClient } from './test-infrastructure';
test('shield widget displays current strength', async ({ page }) => {
const client = await createTestClient(page);
// Navigate to ship screen with shield widget
await page.goto('http://localhost:3000/#/ship/ship-0');
// Find shield panel by data-id
const shieldPanel = page.locator('[data-id="Shield Status"]');
await expect(shieldPanel).toBeVisible();
// Check strength label exists
await expect(shieldPanel.locator('label:has-text("strength")')).toBeVisible();
await client.cleanup();
});
Verify RED:
$ npm run test:e2e -- shield-widget.spec.ts
FAIL: Locator not found: [data-id="Shield Status"]
GREEN - Create Widget:
// modules/browser/src/widgets/shield.ts
import { createPane } from '../panel/blades';
import { ShipDriver } from '@starwards/core';
export function renderShield(ship: ShipDriver, container: HTMLElement) {
const pane = createPane({
title: 'Shield Status',
container
});
const shield = ship.state.shield;
pane.addBinding(shield, 'strength', {
readonly: true,
label: 'strength'
});
return container;
}
Register the widget in the Dashboard, rebuild, then rerun the E2E test to verify GREEN.
Starwards-Specific Patterns
Testing @gameField Decorators
test('@gameField syncs float32 values', () => {
const shield = new Shield();
shield.strength = 123.456789;
// Float32 loses precision once the value is serialized, so compare with a tolerance
expect(shield.strength).toBeCloseTo(123.46, 1);
});
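To see why an exact comparison is fragile for `float32` fields, the precision loss can be reproduced without any Starwards code. This sketch uses the standard `Math.fround` to round a number to the nearest 32-bit float, which approximates what happens to a value that passes through a `float32` field. `roundTripFloat32` and `isCloseTo` are illustrative names; `isCloseTo` mirrors the tolerance Jest documents for `toBeCloseTo`.

```typescript
// Round a JS number (64-bit) to the nearest 32-bit float, as a stand-in
// for a value that has passed through a float32 field.
function roundTripFloat32(value: number): number {
    return Math.fround(value);
}

// Same criterion Jest documents for toBeCloseTo(expected, numDigits):
// |expected - received| < 10^(-numDigits) / 2
function isCloseTo(received: number, expected: number, numDigits = 2): boolean {
    return Math.abs(expected - received) < Math.pow(10, -numDigits) / 2;
}

const original = 123.456789;
const synced = roundTripFloat32(original);

// Exact equality fails, tolerance-based comparison succeeds.
const exactMatch = synced === original; // false
const closeMatch = isCloseTo(synced, 123.46, 1); // true
```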
Testing @range Constraints
test('@range clamps shield strength to design max', () => {
const shield = new Shield();
shield.design.maxStrength = 1000;
shield.strength = 1500; // Exceeds max
// @range decorator should clamp
expect(shield.strength).toBe(1000);
});
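The behaviour this test expects can be pictured as a setter that clamps on assignment. The sketch below illustrates that idea with plain accessors only; it is not how Starwards implements `@range`, and `ClampedShield` is a hypothetical class. It also does not settle whether `@range` clamps direct assignments or only values arriving through commands, which is the kind of assumption that watching the test fail first would expose.

```typescript
// Clamp value into the inclusive interval [min, max].
function clamp(value: number, min: number, max: number): number {
    return Math.min(Math.max(value, min), max);
}

// Hypothetical class: strength is clamped to [0, maxStrength] on every write.
class ClampedShield {
    maxStrength = 1000;
    private _strength = 1000;

    get strength(): number {
        return this._strength;
    }

    set strength(value: number) {
        this._strength = clamp(value, 0, this.maxStrength);
    }
}

const clamped = new ClampedShield();
clamped.strength = 1500; // above the maximum, stored as 1000
const high = clamped.strength;
clamped.strength = -20; // below zero, stored as 0
const low = clamped.strength;
```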
Testing Multiplayer with MultiClientDriver
import { MultiClientDriver } from '@starwards/server/test/multi-client-driver';
test('multiple clients see same shield state', async () => {
const driver = new MultiClientDriver();
await driver.start();
const [client1, client2] = await Promise.all([
driver.joinShip('ship-1'),
driver.joinShip('ship-1')
]);
// Change shield on server
driver.getShipManager('ship-1').state.shield.strength = 800;
await driver.waitForSync();
// Both clients see update
expect(client1.state.shield.strength).toBe(800);
expect(client2.state.shield.strength).toBe(800);
await driver.cleanup();
});
Testing UI with Page Object Pattern
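import { test, expect, Page } from '@playwright/test';
import { createTestClient } from './test-infrastructure';
class ShieldPanelPage {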
constructor(private page: Page) {}
async setPower(value: number) {
const slider = this.page.locator('[data-id="Shield Status"] .tp-slider');
await slider.fill(value.toString());
}
async getStrength(): Promise<number> {
return getPropertyValue(this.page, 'strength', 'Shield Status');
}
}
test('adjusting shield power affects strength', async ({ page }) => {
const client = await createTestClient(page);
const shieldPanel = new ShieldPanelPage(page);
await shieldPanel.setPower(0.5);
// Poll until the updated strength arrives, rather than sleeping for a fixed time
await expect.poll(() => shieldPanel.getStrength()).toBeGreaterThan(0);
await client.cleanup();
});
Verification Commands
Unit tests:
npm test # All tests
npm test -- modules/core/test/shield.spec.ts # Specific file
npm test -- --testNamePattern="shield recharge" # Specific test
npm test -- --watch # Watch mode
E2E tests:
npm run test:e2e # Headless
npm run test:e2e -- --headed # With browser
npm run test:e2e -- shield-widget.spec.ts # Specific file
npm run test:e2e -- --update-snapshots # Update screenshots
Full verification:
npm run test:types # TypeScript check
npm run test:format # ESLint + Prettier
npm test # Unit tests
npm run test:e2e # E2E tests
Common Starwards Test Patterns
Float Precision
// WRONG
expect(state.speed).toBe(123.456789);
// CORRECT
expect(state.speed).toBeCloseTo(123.46, 1);
Angle Wrapping
test('angle wraps at 360°', () => {
ship.angle = 370;
expect(ship.angle).toBe(10); // toPositiveDegreesDelta
});
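The wrapping this test relies on is ordinary modular arithmetic. A minimal sketch, assuming a free function named `wrapDegrees` (a hypothetical name; the source only mentions `toPositiveDegreesDelta` in a comment), normalises any angle into the range [0, 360):

```typescript
// Normalise an angle in degrees into [0, 360).
// The double modulo handles negative inputs, since -10 % 360 is -10 in JavaScript.
function wrapDegrees(degrees: number): number {
    return ((degrees % 360) + 360) % 360;
}

const wrapped = [370, -10, 360, 725].map(wrapDegrees);
// wrapped is [10, 350, 0, 5]
```

Negative inputs and exact multiples of 360 are the edge cases worth a test of their own, since a single `%` gets the first wrong.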
Velocity Zero Check
// WRONG
expect(ship.velocity.x === 0 && ship.velocity.y === 0).toBe(true);
// CORRECT
expect(XY.isZero(ship.velocity, 0.01)).toBe(true);
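Exact comparison against zero is brittle because simulated velocities rarely settle at exactly 0. As a sketch of what a tolerance-based check could look like, this version compares the vector length with a threshold. The real `XY.isZero` may use a different criterion; `isNearZero` and `Vec2` are hypothetical stand-ins.

```typescript
interface Vec2 {
    x: number;
    y: number;
}

// True when the vector's length is at most `threshold`.
// Compares squared values to avoid a square root.
function isNearZero(v: Vec2, threshold = 0): boolean {
    return v.x * v.x + v.y * v.y <= threshold * threshold;
}

const drifting: Vec2 = { x: 0.003, y: -0.004 }; // length 0.005
const strictlyZero = drifting.x === 0 && drifting.y === 0; // false
const nearlyZero = isNearZero(drifting, 0.01); // true
```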
System Effectiveness
test('broken system has 0 effectiveness', () => {
shield.broken = true;
expect(shield.effectiveness).toBe(0);
});
test('hacked system reduces effectiveness', () => {
shield.hacked = 0.3;
shield.power = 1.0;
const expected = 1.0 * (1 - 0.3); // power × (1 - hacked)
expect(shield.effectiveness).toBeCloseTo(expected, 2);
});
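Both tests follow from one formula, taken from the comment above: effectiveness is `power × (1 - hacked)`, and a broken system contributes nothing. A sketch as a pure function, with `SystemSnapshot` and `effectivenessOf` as hypothetical names rather than Starwards APIs:

```typescript
interface SystemSnapshot {
    broken: boolean;
    power: number; // assumed range 0..1
    hacked: number; // assumed range 0..1, fraction of output lost to hacking
}

// A broken system yields 0; otherwise power is scaled down by the hacked fraction.
function effectivenessOf(system: SystemSnapshot): number {
    if (system.broken) {
        return 0;
    }
    return system.power * (1 - system.hacked);
}

const hackedShield = effectivenessOf({ broken: false, power: 1.0, hacked: 0.3 });
const brokenShield = effectivenessOf({ broken: true, power: 1.0, hacked: 0 });
```

Because `1 - 0.3` is not exactly representable in floating point, the hacked case should be asserted with `toBeCloseTo`, as the test above does.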
Monorepo Testing Considerations
Run tests from root:
npm test # Runs all module tests
npm test -- --selectProjects core # Only core module (assumes a Jest project named "core"; --projects expects paths)
Module-specific:
cd modules/core && npm test
cd modules/server && npm test
Watch during development:
# Terminal 1: Build core on change
cd modules/core && npm run build:watch
# Terminal 2: Run tests on change
npm test -- --watch
Integration with Other Skills
- starwards-debugging - Use when tests fail unexpectedly
- starwards-verification - Run full test suite before claiming done
- starwards-monorepo - Understand workspace test organization
Common Rationalizations - STOP
| Excuse | Reality |
|---|---|
| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
| "I'll test after" | Tests passing immediately prove nothing. |
| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
Red Flags - STOP and Start Over
- Code before test
- Test after implementation
- Test passes immediately (didn't watch it fail)
- Can't explain why test failed
- Tests added "later"
- Rationalizing "just this once"
- "I already manually tested it"
- "Tests after achieve the same purpose"
- "It's about spirit not ritual"
- "Keep as reference" or "adapt existing code"
- "Already spent X hours, deleting is wasteful"
- "TDD is dogmatic, I'm being pragmatic"
- Skipping Playwright tests for UI changes
- Not using ShipTestHarness for multiplayer tests
- Mocking Colyseus state sync instead of testing it
- "I'll add E2E tests later" for new widgets
All of these mean: Delete code. Start over with TDD.
Final Rule
Production code → test exists and failed first
UI widget → E2E test exists and failed first
Multiplayer feature → integration test with harness exists and failed first
Otherwise → not TDD
No exceptions.