| name | shellspec |
| description | Comprehensive unit testing framework for Bash and POSIX shell scripts using ShellSpec with TDD/BDD best practices. Use when writing tests for shell scripts, debugging test failures, refactoring scripts for testability, setting up test infrastructure, mocking external dependencies, or implementing test-driven development for Bash/shell projects. Covers test structure, isolation, mocking, output capture, coverage, CI integration, and troubleshooting. |
ShellSpec Unit Testing Guide
ShellSpec is a full-featured BDD unit testing framework for bash, ksh, zsh, dash and all POSIX shells. It brings professional-grade Test-Driven Development (TDD) practices to shell scripting.
Think of ShellSpec as: A translator between natural language test intentions and shell execution verification - like having a bilingual interpreter who understands both "what you want to test" and "how shells actually work."
Quick Start
Installation
# Install ShellSpec
curl -fsSL https://git.io/shellspec | sh -s -- -y
# Initialize project
shellspec --init
Basic Test Example
# lib/calculator.sh
add() { echo "$(($1 + $2))"; }
# spec/calculator_spec.sh
Describe 'Calculator'
  Include lib/calculator.sh

  It 'performs addition'
    When call add 2 3
    The output should eq 5
  End
End
Run tests: shellspec
Project Structure
project/
├── .shellspec # Project configuration (mandatory)
├── .shellspec-local # Local overrides (gitignored)
├── lib/ # Production code
│ ├── module1.sh
│ └── module2.sh
├── spec/ # Test specifications
│ ├── spec_helper.sh # Global test setup
│ ├── support/ # Shared test utilities
│ │ ├── mocks.sh
│ │ └── helpers.sh
│ └── lib/
│ ├── module1_spec.sh
│ └── module2_spec.sh
├── coverage/ # Coverage reports (generated)
└── report/ # Test reports (generated)
Test Structure
DSL Hierarchy
Describe 'Feature Name'            # Top-level grouping
  BeforeEach 'setup_function'      # Runs before each test
  AfterEach 'cleanup_function'     # Runs after each test

  Context 'when condition X'       # Scenario grouping
    It 'behaves in way Y'          # Individual test
      # GIVEN: Setup (arrange)
      local input="test data"

      # WHEN: Execute (act)
      When call function_under_test "$input"

      # THEN: Verify (assert)
      The output should equal "expected"
      The status should be success
    End
  End
End
Analogy: Think of tests like a filing cabinet - Describe is the drawer, Context is the folder, It is the document.
Execution Modes
| Mode | Use Case | Isolation | Coverage |
|---|---|---|---|
| When call func | Unit test functions | Same shell (fast) | Yes |
| When run script | Integration test | New process | Yes |
| When run source | Hybrid approach | Subshell | Yes |
Recommended: Use When call for unit tests (fastest), When run script for integration tests.
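For contrast with the unit-level examples elsewhere in this guide, here is a minimal integration-style sketch using When run script (./greet.sh is a hypothetical script assumed to print a greeting and exit 0):
Describe 'greet.sh (integration)'
  It 'runs the whole script in a fresh process'
    When run script ./greet.sh "world"
    The output should eq "Hello, world"
    The status should be success
  End
End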
Making Scripts Testable
Pattern 0: Logger-Driven Testability (Foundation)
Analogy: Like a black box flight recorder - captures execution paths for post-flight analysis, enabling verification of which code branches executed.
To verify script behavior in tests, use the project's logger (.scripts/_logger.sh). Each module declares its own unique logger tag that can be selectively enabled or disabled via the DEBUG environment variable.
Module Setup:
#!/bin/bash
# mymodule.sh
# shellcheck source=.scripts/_logger.sh
source "${E_BASH:-$(dirname "$0")}/_logger.sh"
process_data() {
local input="$1"
if [[ -z "$input" ]]; then
echo:Mymodule "validation failed: empty input"
return 1
fi
echo:Mymodule "processing: $input"
# ... actual processing ...
echo:Mymodule "completed successfully"
return 0
}
# DO NOT allow execution of the code below this line in ShellSpec tests
${__SOURCED__:+return}
# Declare unique logger for this module
logger:init "mymodule" "[mymodule] " ">&2"
Controlling Logger Output:
# Enable specific loggers via DEBUG variable
DEBUG=mymodule ./mymodule.sh # Enable only 'mymodule' logger
DEBUG=mymodule,hooks ./mymodule.sh # Enable multiple loggers
DEBUG=* ./mymodule.sh # Enable all loggers
DEBUG=*,-mymodule ./mymodule.sh # Enable all except 'mymodule'
Test Verification Strategy:
Describe 'my_module'
Include .scripts/_logger.sh
Include my_module.sh
# Helper functions to strip ANSI color codes for comparison
# $1 = stdout, $2 = stderr, $3 = exit status
no_colors_stderr() { echo -n "$2" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
no_colors_stdout() { echo -n "$1" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
BeforeEach 'enable_logger'
enable_logger() {
TAGS[mymodule]=1 # Enable logger output for verification
}
Context 'when input is empty'
It 'logs validation failure and returns error'
When call process_data ""
The status should be failure
# Verify execution path via logger output
The stderr should include "validation failed: empty input"
End
End
Context 'when input is valid'
It 'logs processing steps and succeeds'
When call process_data "test-data"
The status should be success
# Verify which branch executed via log messages
The result of function no_colors_stderr should include "processing: test-data"
The result of function no_colors_stderr should include "completed successfully"
End
End
End
Testability Balance: Achieve comprehensive test coverage by combining:
| Verification Method | Use Case | Example |
|---|---|---|
| Logger output | Verify execution paths, internal decisions | The stderr should include "branch A executed" |
| stdout | Verify user-facing output, function results | The output should eq "result" |
| stderr | Verify error messages, warnings | The error should include "warning" |
| Exit status | Verify success/failure outcomes | The status should be failure |
| Mocks/Spies | Verify external command calls in isolation | Mock curl; ...; End |
Test Isolation Pattern (Mocking Logger Functions):
The practical approach used in tests is to:
- Set DEBUG=tag once at the file level to enable specific logger(s)
- Mock logger:init to suppress logger initialization side effects
- Mock echo:Tag and printf:Tag functions to redirect output for verification
#!/usr/bin/env bash
# spec/mymodule_spec.sh
eval "$(shellspec - -c) exit 1"
# 1. Enable debug output for this module at file level
export DEBUG="mymodule"
# 2. Mock logger initialization to prevent side effects
Mock logger:init
return 0
End
# 3. Mock logger output functions - redirect to stderr for test verification
Mock printf:Mymodule
printf "$@" >&2
End
Mock echo:Mymodule
echo "$@" >&2
End
# Helper to strip ANSI color codes for comparison
# $1 = stdout, $2 = stderr, $3 = exit status
no_colors_stderr() {
echo -n "$2" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '
}
Include ".scripts/mymodule.sh"
Describe 'mymodule'
It 'verifies execution path via logger output'
When call process_data "test"
The status should be success
# Verify which code branch executed via logger messages
The result of function no_colors_stderr should include "processing: test"
End
End
Alternative: Suppress Logger Output Entirely:
When you don't need to verify logger output but want to prevent noise:
# Mock logger functions to suppress all output
Mock echo:Mymodule
: # No-op - silently ignore
End
Mock printf:Mymodule
: # No-op - silently ignore
End
Pros: Precise verification of code paths, debuggable tests, controlled verbosity
Cons: Requires logger discipline in production code
Recommended action: Add unique logger to each module, use DEBUG=tag to control output, verify logs in tests
Pattern 1: Source Guard (Critical)
Analogy: Like a bouncer at a club - decides whether to let execution in based on context.
#!/bin/bash
# my_script.sh
# Testable functions
process_data() {
validate_input "$1" || return 1
transform_data "$1"
}
validate_input() {
[[ -n "$1" ]] || return 1
}
# Source guard - prevents execution when sourced
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
# Only runs when executed, not when sourced for testing
process_data "$@"
fi
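If the script must also run under plain POSIX sh, where BASH_SOURCE is a bashism, the __SOURCED__ guard that ShellSpec provides (already used in Pattern 0 above) is a portable alternative:
# POSIX-friendly variant: ShellSpec sets __SOURCED__ when a spec sources
# this file, so the next line returns early only under test
${__SOURCED__:+return}
process_data "$@"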
Test file:
Describe 'my_script'
Include my_script.sh # Loads functions without executing
It 'validates input'
When call validate_input ""
The status should be failure
End
End
Pros: Functions become unit-testable, zero test pollution in production code
Cons: Requires discipline, legacy scripts need refactoring
Recommended action: Always add source guards to new scripts
Pattern 2: Extract Functions from Pipelines
# ❌ BAD: Untestable inline pipeline
cat /etc/passwd | grep "^${USER}:" | cut -d: -f6
# ✅ GOOD: Testable functions
get_user_home() {
local user="${1:-$USER}"
grep "^${user}:" /etc/passwd | cut -d: -f6
}
process_user_homes() {
local home
home=$(get_user_home "$1")
[[ -n "$home" ]] && check_bashrc "$home"
}
Recommended action: Extract every logical step into a named function
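A spec for the extracted function could then look like this (a sketch: lib/user_utils.sh is an assumed filename, and the expected value relies on root's home being /root, as on typical Linux systems):
Describe 'get_user_home'
  Include lib/user_utils.sh # assumed location of the functions above

  It 'extracts the home directory from /etc/passwd'
    When call get_user_home "root"
    The output should eq "/root"
  End
End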
Dependency Isolation and Mocking
Three-Tier Mocking Strategy
1. Function-Based Mocks (Fastest)
Describe 'function mocking'
date() { echo "2024-01-01 00:00:00"; }
It 'uses mocked date'
When call get_timestamp
The output should eq "2024-01-01 00:00:00"
End
End
2. Mock Block (Cleaner)
Describe 'command mocking'
Mock curl
echo '{"status": "success"}'
return 0
End
It 'handles API response'
When call fetch_data
The output should include "success"
End
End
3. Intercept Pattern (For Scripts Run End-to-End)
Describe 'intercepting script execution'
  Intercept begin
  __begin__() {
    # This hook fires at the script's matching "begin" intercept point
    # (the target script must declare one; see ShellSpec's Intercept docs).
    # Shadow dangerous commands before the script body executes.
    rm() { echo "MOCK: rm intercepted"; }
  }

  It 'safely mocks dangerous operations'
    When run source ./cleanup_script.sh
    The output should include "MOCK: rm intercepted"
  End
End
Decision Matrix:
| Dependency | Mock? | Rationale |
|---|---|---|
| Network (curl, wget) | ✅ Always | Slow, unreliable |
| Date/time | ✅ Always | Reproducibility |
| Random values | ✅ Always | Deterministic tests |
| System commands (grep, sed) | ❌ Rarely | Fast, stable |
| File I/O | ⚠️ Sometimes | Use temp dirs |
Recommended action: Mock boundaries (network, time, random), trust stable commands
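When a test must also verify how a dependency was invoked, a Mock block can double as a simple spy by recording its arguments to a file under ShellSpec's temp base (a sketch; fetch_data is the hypothetical function from the Mock example above):
Describe 'spying on curl'
  Mock curl
    # Record the invocation for later assertion, then return canned data
    echo "curl $*" >> "$SHELLSPEC_TMPBASE/curl-spy.log"
    echo '{"status": "success"}'
  End

  It 'invokes curl as expected'
    When call fetch_data
    The output should include "success"
    The contents of file "$SHELLSPEC_TMPBASE/curl-spy.log" should include "curl"
  End
End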
Output Capture and Comparison
Capturing stdout, stderr, and exit status
It 'captures all output streams'
When call function_with_output
The output should eq "stdout message" # stdout
The error should eq "error message" # stderr
The status should be success # exit code (0)
End
Comparing Without Color Codes
# Helper functions to strip ANSI color codes for comparison
# $1 = stdout, $2 = stderr, $3 = exit status
no_colors_stderr() { echo -n "$2" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
no_colors_stdout() { echo -n "$1" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
It 'compares without colors'
When call colored_output
The result of function no_colors_stdout should eq "Plain text"
The result of function no_colors_stderr should eq "Plain text"
End
# Alternative: Disable colors via environment
It 'disables colors'
BeforeCall 'export NO_COLOR=1'
BeforeCall 'export TERM=dumb'
When call my_command
The output should not include "$(printf '\033')" # no ESC characters at all
End
Recommended action: Set NO_COLOR=1 in test environment to avoid color-stripping
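To apply this project-wide, the exports can live in spec_helper.sh, which shellspec --init wires up via --require spec_helper (a minimal sketch):
# spec/spec_helper.sh
export NO_COLOR=1 # honored by many modern CLIs
export TERM=dumb  # discourages termcap-based coloring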
Test Environment Management
Temporary Directories
eval "$(shellspec - -c) exit 1"
# shellcheck disable=SC2288
% TEST_DIR: "$SHELLSPEC_TMPBASE/tmprepo"
Describe 'isolated environment'
BeforeEach 'setup_test_env'
AfterEach 'cleanup_test_env'
setup_test_env() {
mkdir -p "$TEST_DIR" || true
cd "$TEST_DIR" || exit 1
}
cleanup_test_env() {
cd - >/dev/null
rm -rf "$TEST_DIR"
}
It 'works in isolation'
When call touch test.txt
The path test.txt should be file
End
End
Pros: Complete isolation, no test interference
Cons: Requires discipline in cleanup
Recommended action: Always use temp dirs, never write to fixed paths
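If examples in the same file must not share a directory, a per-example variation using mktemp under ShellSpec's temp base gives each test its own workspace (a sketch):
setup_test_env() {
  TEST_DIR=$(mktemp -d "$SHELLSPEC_TMPBASE/case.XXXXXX") || return 1
  cd "$TEST_DIR" || return 1
}
cleanup_test_env() {
  cd / && rm -rf "$TEST_DIR"
}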
Parameterized Tests
Describe 'data-driven tests'
Parameters
"#1" "valid" "success" 0
"#2" "invalid" "error" 1
"#3" "empty" "missing" 1
End
It "handles $1 input"
When call validate "$2"
The output should include "$3"
The status should eq $4
End
End
Recommended action: Use Parameters to avoid copy-paste test code
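For combinational cases, ShellSpec also provides Parameters:matrix, which expands to the cartesian product of its rows; this sketch generates four examples (a-1, a-2, b-1, b-2):
Describe 'matrix-driven tests'
  Parameters:matrix
    a b
    1 2
  End

  It "joins $1 and $2"
    When call printf '%s-%s' "$1" "$2"
    The output should eq "$1-$2"
  End
End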
Code Coverage
Enable Coverage
# .shellspec configuration
--kcov
--kcov-options "--include-pattern=.sh"
--kcov-options "--exclude-pattern=/spec/,/coverage/"
--kcov-options "--fail-under-percent=80"
# Run with coverage
shellspec --kcov
# View report
open coverage/index.html
Coverage Goals:
- Critical paths: 100%
- Main functionality: 80-90%
- Utility functions: 70-80%
Recommended action: Enable coverage from day 1, set minimum thresholds
Test Execution
Running Tests
# All tests
shellspec
# Specific file
shellspec spec/module_spec.sh
# Specific line
shellspec spec/module_spec.sh:42
# Only previously failed
shellspec --quick
# Stop on first failure
shellspec --fail-fast
# Parallel execution (4 jobs)
shellspec --jobs 4
# With debug trace
shellspec --xtrace
# Focus mode (run only fIt, fDescribe)
shellspec --focus
Focus Mode for TDD
Describe 'feature under development'
fIt 'focused test - runs only this'
When call new_feature
The status should be success
End
It 'other test - skipped during focus'
When call other_feature
The status should be success
End
End
Recommended action: Use --quick and --focus for rapid TDD cycles
CI/CD Integration
JUnit Reports
# .shellspec configuration
--format junit
--output report/junit.xml
# Run in CI
shellspec --kcov --format junit
GitHub Actions Example
name: ShellSpec Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install ShellSpec
        run: curl -fsSL https://git.io/shellspec | sh -s -- -y
      - name: Run tests
        run: shellspec --kcov --format junit
      - name: Upload coverage
        uses: codecov/codecov-action@v3
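To keep the JUnit file itself as a build artifact, an upload step can be appended (a sketch using actions/upload-artifact; the path matches the --output setting above):
      - name: Upload test report
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: junit-report
          path: report/junit.xml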
Recommended action: Generate JUnit reports for CI integration
Troubleshooting
Debug Failed Tests
# Show execution trace
shellspec --xtrace spec/failing_spec.sh
# Check spec syntax
shellspec --syntax-check spec/failing_spec.sh
# See generated shell code
shellspec --translate spec/failing_spec.sh
# Inspect output during test
It 'debugs with Dump'
When call my_function
Dump # Shows stdout, stderr, status
The output should not be blank
End
Common Issues
Problem: Test passes alone but fails in suite
Solution: Check for global state leakage, ensure cleanup in AfterEach
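A common remedy is to snapshot and restore mutable state around every example (a sketch; MY_GLOBAL stands in for whatever variable your suite mutates):
BeforeEach 'save_state'
AfterEach 'restore_state'

save_state() { _saved_path=$PATH; }
restore_state() {
  PATH=$_saved_path
  unset MY_GLOBAL # drop anything an example may have set
}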
Problem: Can't mock external command
Solution: Use Mock block or Intercept for built-ins
Problem: Tests are slow
Solution: Enable parallel execution with --jobs 4
Problem: Coverage doesn't work
Solution: Ensure kcov is installed, check --kcov-options
Best Practices Checklist
- All scripts have source guards
- External dependencies are mocked
- Tests use temporary directories
- Each test verifies one behavior
- Tests follow GIVEN/WHEN/THEN structure
- Coverage is enabled and > 80%
- Tests run in parallel (--jobs)
- No hard-coded paths or commands
- Side effects are documented and tested
Common Anti-Patterns to Avoid
❌ Global state mutation
COUNTER=0 # Bad: mutable global
increment() { COUNTER=$((COUNTER + 1)); }
✅ Return values
increment() { echo $(($1 + 1)); } # Good: pure function
counter=$(increment "$counter")
❌ Testing implementation details
It 'calls grep with -E flag' # Bad: too coupled
✅ Testing behavior
It 'finds files matching pattern' # Good: tests outcome
❌ Unmocked network calls
It 'fetches real data'
When call curl https://api.com # Bad: slow, flaky
End
✅ Mocked dependencies
Mock curl
echo "mock" # Good: fast, reliable
End
Advanced Topics
For deeper coverage of advanced patterns, see:
- Advanced Patterns: See references/advanced-patterns.md for complex mocking, spies, state management
- Troubleshooting Guide: See references/troubleshooting.md for systematic debugging
- Real-World Examples: See references/real-world-examples.md for production patterns from top OSS projects
- Collected Experience: See references/collected-experience.md for lessons learned from multiple projects
Quick Reference
# Common commands
shellspec # Run all tests
shellspec --quick # Re-run failures only
shellspec --xtrace # Debug trace
shellspec --kcov # With coverage
shellspec --format junit # JUnit output
# DSL basics
Describe/Context/It/End # Test structure
When call/run/source # Execution
The output/status/error # Assertions
Mock/Intercept # Mocking
BeforeEach/AfterEach # Hooks
Parameters # Data-driven tests
Dump # Debug helper
Recommended action: Start with simple unit tests, add coverage, then integrate into CI