name	Writing Tests
description	Principles for writing effective, maintainable tests. Covers naming conventions, assertion best practices, and comprehensive edge case checklists. Based on BugMagnet by Gojko Adzic.

GitHub Copilot Skill: writing-tests

Note: This skill has been adapted from claude-skillz for use with GitHub Copilot Agent Skills.

Writing Tests

How to write tests that catch bugs, document behavior, and remain maintainable.

Based on BugMagnet by Gojko Adzic. Adapted with attribution.

Critical Rules

🚨 Test names describe outcomes, not actions. "returns empty array when input is null" not "test null input". The name IS the specification.

🚨 Assertions must match test titles. If the test claims to verify "different IDs", assert on the actual ID values—not just count or existence.

🚨 Assert specific values, not types. expect(result).toEqual(['First.', ' Second.']) not expect(result).toBeDefined(). Specific assertions catch specific bugs.

🚨 One concept per test. Each test verifies one behavior. If you need "and" in your test name, split it.

🚨 Bugs cluster together. When you find one bug, test related scenarios. The same misunderstanding often causes multiple failures.

When This Applies

Writing new tests
Reviewing test quality
During TDD RED phase (writing the failing test)
Expanding test coverage
Investigating discovered bugs

Test Naming

Pattern: [outcome] when [condition]

Good Names (Describe Outcomes)

returns empty array when input is null
throws ValidationError when email format invalid
calculates tax correctly for tax-exempt items
preserves original order when duplicates removed

Bad Names (Describe Actions)

test null input           // What about null input?
should work               // What does "work" mean?
handles edge cases        // Which edge cases?
email validation test     // What's being validated?

The Specification Test

Your test name should read like a specification. If someone reads ONLY the test names, they should understand the complete behavior of the system.

Assertion Best Practices

Assert Specific Values

// ❌ WEAK - passes even if completely wrong data
expect(result).toBeDefined()
expect(result.items).toHaveLength(2)
expect(user).toBeTruthy()

// ✅ STRONG - catches actual bugs
expect(result).toEqual({ status: 'success', items: ['a', 'b'] })
expect(user.email).toBe('test@example.com')

Match Assertions to Test Title

// ❌ TEST SAYS "different IDs" BUT ASSERTS COUNT
it('generates different IDs for each call', () => {
  const id1 = generateId()
  const id2 = generateId()
  expect([id1, id2]).toHaveLength(2)  // WRONG: doesn't check they're different!
})

// ✅ ACTUALLY VERIFIES DIFFERENT IDs
it('generates different IDs for each call', () => {
  const id1 = generateId()
  const id2 = generateId()
  expect(id1).not.toBe(id2)  // RIGHT: verifies the claim
})

Avoid Implementation Coupling

// ❌ BRITTLE - tests implementation details
expect(mockDatabase.query).toHaveBeenCalledWith('SELECT * FROM users WHERE id = 1')

// ✅ FLEXIBLE - tests behavior
expect(result.user.name).toBe('Alice')

Test Structure

Arrange-Act-Assert

it('calculates total with tax for non-exempt items', () => {
  // Arrange: Set up test data
  const item = { price: 100, taxExempt: false }
  const taxRate = 0.1

  // Act: Execute the behavior
  const total = calculateTotal(item, taxRate)

  // Assert: Verify the outcome
  expect(total).toBe(110)
})

One Concept Per Test

// ❌ MULTIPLE CONCEPTS - hard to diagnose failures
it('validates and processes order', () => {
  expect(validate(order)).toBe(true)
  expect(process(order).status).toBe('complete')
  expect(sendEmail).toHaveBeenCalled()
})

// ✅ SINGLE CONCEPT - clear failures
it('accepts valid orders', () => {
  expect(validate(validOrder)).toBe(true)
})

it('rejects orders with negative quantities', () => {
  expect(validate(negativeQuantityOrder)).toBe(false)
})

it('sends confirmation email after processing', () => {
  process(order)
  expect(sendEmail).toHaveBeenCalledWith(order.customerEmail)
})

Edge Case Checklists

When testing a function, systematically consider these edge cases based on input types.

Numbers

Zero
Negative numbers
Very large numbers (near MAX_SAFE_INTEGER)
Very small numbers (near MIN_SAFE_INTEGER)
Decimal precision (0.1 + 0.2)
NaN
Infinity / -Infinity
Boundary values (off-by-one at limits)

Strings

Empty string ""
Whitespace only " "
Very long strings (10K+ characters)
Unicode: emojis 👨‍👩‍👧‍👦, RTL text, combining characters
Special characters: quotes, backslashes, null bytes
SQL/HTML/script injection patterns
Leading/trailing whitespace
Mixed case sensitivity

Collections (Arrays, Objects, Maps)

Empty collection [], {}
Single element
Duplicates
Nested structures
Circular references
Very large collections (performance)
Sparse arrays
Mixed types in arrays

Dates and Times

Leap years (Feb 29)
Daylight saving transitions
Timezone boundaries
Midnight (00:00:00)
End of day (23:59:59)
Year boundaries (Dec 31 → Jan 1)
Invalid dates (Feb 30, Month 13)
Unix epoch edge cases
Far future/past dates

Null and Undefined

null input
undefined input
Missing optional properties
Explicit undefined vs missing key

Domain-Specific

Email: valid formats, edge cases (plus signs, subdomains)
URLs: protocols, ports, special characters, relative paths
Phone numbers: international formats, extensions
Addresses: Unicode, multi-line, missing components
Currency: rounding, different currencies, zero amounts
Percentages: 0%, 100%, over 100%

Violated Domain Constraints

These test implicit assumptions in your domain:

Uniqueness violations (duplicate IDs, emails)
Missing required relationships (orphaned records)
Ordering violations (events out of sequence)
Range breaches (age -1, quantity 1000000)
State inconsistencies (shipped but not paid)
Format mismatches (expected JSON, got XML)
Temporal ordering (end before start)

Bug Clustering

When you discover a bug, don't stop—explore related scenarios:

Same function, similar inputs - If null fails, test undefined, empty string
Same pattern, different locations - If one endpoint mishandles auth, check others
Same developer assumption - If off-by-one here, check other boundaries
Same data type - If dates fail at DST, check other time edge cases

When Tempted to Cut Corners

If your test name says "test" or "should work": STOP. What outcome are you actually verifying? Name it specifically.
If you're asserting toBeDefined() or toBeTruthy(): STOP. What value do you actually expect? Assert that instead.
If your assertion doesn't match your test title: STOP. Either fix the assertion or rename the test. They must agree.
If you're testing multiple concepts in one test: STOP. Split it. Future you debugging a failure will thank you.
If you found a bug and wrote one test: STOP. Bugs cluster. What related scenarios might have the same problem?
If you're skipping edge cases because "that won't happen": STOP. It will happen. In production. At 3 AM.

Integration with Other Skills

With TDD Process: This skill guides the RED phase—how to write the failing test well.

With Software Design Principles: Testable code follows design principles. Hard-to-test code often has design problems.

Writing Tests

Install Skill

SKILL.md

GitHub Copilot Skill: writing-tests

Writing Tests

Critical Rules

When This Applies

Test Naming

Good Names (Describe Outcomes)

Bad Names (Describe Actions)

The Specification Test

Assertion Best Practices

Assert Specific Values

Match Assertions to Test Title

Avoid Implementation Coupling

Test Structure

Arrange-Act-Assert

One Concept Per Test

Edge Case Checklists

Numbers

Strings

Collections (Arrays, Objects, Maps)

Dates and Times

Null and Undefined

Domain-Specific

Violated Domain Constraints

Bug Clustering

When Tempted to Cut Corners

Integration with Other Skills