SKILL.md

---
name: shellspec
description: Comprehensive unit testing framework for Bash and POSIX shell scripts using ShellSpec with TDD/BDD best practices. Use when writing tests for shell scripts, debugging test failures, refactoring scripts for testability, setting up test infrastructure, mocking external dependencies, or implementing test-driven development for Bash/shell projects. Covers test structure, isolation, mocking, output capture, coverage, CI integration, and troubleshooting.
---

ShellSpec Unit Testing Guide

ShellSpec is a full-featured BDD unit testing framework for bash, ksh, zsh, dash and all POSIX shells. It brings professional-grade Test-Driven Development (TDD) practices to shell scripting.

Think of ShellSpec as: A translator between natural language test intentions and shell execution verification - like having a bilingual interpreter who understands both "what you want to test" and "how shells actually work."

Quick Start

Installation

# Install ShellSpec
curl -fsSL https://git.io/shellspec | sh -s -- -y

# Initialize project
shellspec --init

Basic Test Example

# lib/calculator.sh
add() { echo "$(($1 + $2))"; }

# spec/calculator_spec.sh
Describe 'Calculator'
  Include lib/calculator.sh
  
  It 'performs addition'
    When call add 2 3
    The output should eq 5
  End
End

Run tests: shellspec
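A passing run prints an RSpec-style summary, roughly like the following (the exact header and timings vary by shell and ShellSpec version):

Running: /bin/sh [sh]
.

Finished in 0.02 seconds (user 0.02 seconds, sys 0.00 seconds)
1 example, 0 failures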

Project Structure

project/
├── .shellspec              # Project configuration (mandatory)
├── .shellspec-local        # Local overrides (gitignored)
├── lib/                    # Production code
│   ├── module1.sh
│   └── module2.sh
├── spec/                   # Test specifications
│   ├── spec_helper.sh      # Global test setup
│   ├── support/            # Shared test utilities
│   │   ├── mocks.sh
│   │   └── helpers.sh
│   └── lib/
│       ├── module1_spec.sh
│       └── module2_spec.sh
├── coverage/               # Coverage reports (generated)
└── report/                 # Test reports (generated)

Test Structure

DSL Hierarchy

Describe 'Feature Name'              # Top-level grouping
  BeforeEach 'setup_function'        # Runs before each test
  AfterEach 'cleanup_function'       # Runs after each test
  
  Context 'when condition X'         # Scenario grouping
    It 'behaves in way Y'            # Individual test
      # GIVEN: Setup (arrange)
      local input="test data"
      
      # WHEN: Execute (act)
      When call function_under_test "$input"
      
      # THEN: Verify (assert)
      The output should equal "expected"
      The status should be success
    End
  End
End

Analogy: Think of tests like a filing cabinet - Describe is the drawer, Context is the folder, It is the document.

Execution Modes

| Mode              | Use Case            | Isolation         | Coverage |
|-------------------|---------------------|-------------------|----------|
| When call func    | Unit test functions | Same shell (fast) | Yes      |
| When run script   | Integration test    | New process       | Yes      |
| When run source   | Hybrid approach     | Subshell          | Yes      |

Recommended: Use When call for unit tests (fastest), When run script for integration tests.
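A short sketch contrasting the two main modes, assuming a hypothetical my_script.sh that defines a greet function and prints a greeting when executed:

Describe 'execution modes'
  Include ./my_script.sh

  It 'unit-tests greet in the current shell (fast)'
    When call greet "world"
    The output should eq "hello world"
  End

  It 'integration-tests the script in a new process'
    When run script ./my_script.sh "world"
    The output should include "hello world"
  End
End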

Making Scripts Testable

Pattern 0: Logger-Driven Testability (Foundation)

Analogy: Like a black box flight recorder - captures execution paths for post-flight analysis, enabling verification of which code branches executed.

To verify script behavior in tests, use the project's logger (.scripts/_logger.sh). Each module declares its own unique logger tag, which can be selectively enabled or disabled via the DEBUG environment variable.

Module Setup:

#!/bin/bash
# my_module.sh

# shellcheck source=.scripts/_logger.sh
source "${E_BASH:-$(dirname "$0")}/_logger.sh"

process_data() {
  local input="$1"
  
  if [[ -z "$input" ]]; then
    echo:Mymodule "validation failed: empty input"
    return 1
  fi
  
  echo:Mymodule "processing: $input"
  # ... actual processing ...
  echo:Mymodule "completed successfully"
  return 0
}

# Prevent execution of the code below this line when sourced by ShellSpec tests
${__SOURCED__:+return}

# Declare unique logger for this module
logger:init "mymodule" "[mymodule] " ">&2"

Controlling Logger Output:

# Enable specific loggers via DEBUG variable
DEBUG=mymodule ./my_module.sh       # Enable only 'mymodule' logger
DEBUG=mymodule,hooks ./my_module.sh # Enable multiple loggers
DEBUG=* ./my_module.sh              # Enable all loggers
DEBUG=*,-mymodule ./my_module.sh    # Enable all except 'mymodule'

Test Verification Strategy:

Describe 'my_module'
  Include .scripts/_logger.sh
  Include my_module.sh
  
  # Helper functions to strip ANSI color codes for comparison
  # $1 = stdout, $2 = stderr, $3 = exit status
  no_colors_stderr() { echo -n "$2" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
  no_colors_stdout() { echo -n "$1" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
  
  BeforeEach 'enable_logger'
  enable_logger() {
    TAGS[mymodule]=1  # Enable logger output for verification
  }
  
  Context 'when input is empty'
    It 'logs validation failure and returns error'
      When call process_data ""
      The status should be failure
      # Verify execution path via logger output
      The stderr should include "validation failed: empty input"
    End
  End
  
  Context 'when input is valid'
    It 'logs processing steps and succeeds'
      When call process_data "test-data"
      The status should be success
      # Verify which branch executed via log messages
      The result of function no_colors_stderr should include "processing: test-data"
      The result of function no_colors_stderr should include "completed successfully"
    End
  End
End

Testability Balance: Achieve comprehensive test coverage by combining:

| Verification Method | Use Case                                   | Example                                        |
|---------------------|--------------------------------------------|------------------------------------------------|
| Logger output       | Verify execution paths, internal decisions | The stderr should include "branch A executed"  |
| stdout              | Verify user-facing output, function results| The output should eq "result"                  |
| stderr              | Verify error messages, warnings            | The error should include "warning"             |
| Exit status         | Verify success/failure outcomes            | The status should be failure                   |
| Mocks/Spies         | Verify external command calls in isolation | Mock curl; ...; End                            |

Test Isolation Pattern (Mocking Logger Functions):

The practical approach used in tests is to:

  1. Set DEBUG=tag once at the file level to enable specific logger(s)
  2. Mock logger:init to suppress logger initialization side effects
  3. Mock echo:Tag and printf:Tag functions to redirect output for verification
#!/usr/bin/env bash
# spec/mymodule_spec.sh

eval "$(shellspec - -c) exit 1"

# 1. Enable debug output for this module at file level
export DEBUG="mymodule"

# 2. Mock logger initialization to prevent side effects
Mock logger:init
  return 0
End

# 3. Mock logger output functions - redirect to stderr for test verification
Mock printf:Mymodule
  printf "$@" >&2
End

Mock echo:Mymodule
  echo "$@" >&2
End

# Helper to strip ANSI color codes for comparison
# $1 = stdout, $2 = stderr, $3 = exit status
no_colors_stderr() { 
  echo -n "$2" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '
}

Include ".scripts/mymodule.sh"

Describe 'mymodule'
  It 'verifies execution path via logger output'
    When call process_data "test"

    The status should be success
    # Verify which code branch executed via logger messages
    The result of function no_colors_stderr should include "processing: test"
  End
End

Alternative: Suppress Logger Output Entirely:

When you don't need to verify logger output but want to prevent noise:

# Mock logger functions to suppress all output
Mock echo:Mymodule
  :  # No-op - silently ignore
End

Mock printf:Mymodule
  :  # No-op - silently ignore
End

Pros: Precise verification of code paths, debuggable tests, controlled verbosity
Cons: Requires logger discipline in production code
Recommended action: Add a unique logger to each module, use DEBUG=tag to control output, verify logs in tests

Pattern 1: Source Guard (Critical)

Analogy: Like a bouncer at a club - decides whether to let execution in based on context.

#!/bin/bash
# my_script.sh

# Testable functions
process_data() {
  validate_input "$1" || return 1
  transform_data "$1"
}

validate_input() {
  [[ -n "$1" ]] || return 1
}

# Source guard - prevents execution when sourced
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
  # Only runs when executed, not when sourced for testing
  process_data "$@"
fi

Test file:

Describe 'my_script'
  Include my_script.sh  # Loads functions without executing
  
  It 'validates input'
    When call validate_input ""
    The status should be failure
  End
End

Pros: Functions become unit-testable, zero test pollution in production code
Cons: Requires discipline, legacy scripts need refactoring
Recommended action: Always add source guards to new scripts
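Because the guard only suppresses execution when the file is sourced, the same script can still be exercised end to end with When run script. A minimal sketch of the failure path (the success path would also need transform_data, which is not shown above):

Describe 'my_script end-to-end'
  It 'exits non-zero on empty input'
    When run script ./my_script.sh ""
    The status should be failure
  End
End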

Pattern 2: Extract Functions from Pipelines

# ❌ BAD: Untestable inline pipeline
cat /etc/passwd | grep "^${USER}:" | cut -d: -f6

# ✅ GOOD: Testable functions
get_user_home() {
  local user="${1:-$USER}"
  grep "^${user}:" /etc/passwd | cut -d: -f6
}

process_user_homes() {
  local home
  home=$(get_user_home "$1")
  [[ -n "$home" ]] && check_bashrc "$home"
}

Recommended action: Extract every logical step into a named function
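To unit-test the extracted function without reading the real /etc/passwd, one option is to make the data source overridable. The PASSWD_FILE parameter below is an assumption added for this sketch, not part of the original snippet:

# Hypothetical variant with an overridable data source
get_user_home() {
  local user="${1:-$USER}"
  grep "^${user}:" "${PASSWD_FILE:-/etc/passwd}" | cut -d: -f6
}

Describe 'get_user_home'
  setup_fixture() {
    PASSWD_FILE="$SHELLSPEC_TMPBASE/passwd"
    printf 'alice:x:1000:1000::/home/alice:/bin/bash\n' > "$PASSWD_FILE"
  }
  BeforeEach 'setup_fixture'

  It 'returns the home directory for a known user'
    When call get_user_home alice
    The output should eq "/home/alice"
  End
End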

Dependency Isolation and Mocking

Three-Tier Mocking Strategy

1. Function-Based Mocks (Fastest)

Describe 'function mocking'
  date() { echo "2024-01-01 00:00:00"; }
  
  It 'uses mocked date'
    When call get_timestamp
    The output should eq "2024-01-01 00:00:00"
  End
End

2. Mock Block (Cleaner)

Describe 'command mocking'
  Mock curl
    echo '{"status": "success"}'
    return 0
  End
  
  It 'handles API response'
    When call fetch_data
    The output should include "success"
  End
End

3. Intercept Pattern (For Built-ins)

Describe 'intercepting built-ins'
  Intercept command
  
  __command__() {
    if [[ "$2" == "rm" ]]; then
      echo "MOCK: rm intercepted"
      return 0
    fi
    command "$@"
  }
  
  It 'safely mocks dangerous operations'
    When run source ./cleanup_script.sh
    The output should include "MOCK: rm intercepted"
  End
End

Decision Matrix:

| Dependency                  | Mock?        | Rationale           |
|-----------------------------|--------------|---------------------|
| Network (curl, wget)        | ✅ Always    | Slow, unreliable    |
| Date/time                   | ✅ Always    | Reproducibility     |
| Random values               | ✅ Always    | Deterministic tests |
| System commands (grep, sed) | ❌ Rarely    | Fast, stable        |
| File I/O                    | ⚠️ Sometimes | Use temp dirs       |

Recommended action: Mock boundaries (network, time, random), trust stable commands

Output Capture and Comparison

Capturing stdout, stderr, and exit status

It 'captures all output streams'
  When call function_with_output
  The output should eq "stdout message"      # stdout
  The error should eq "error message"        # stderr
  The status should be success               # exit code (0)
End
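For multi-line output, ShellSpec can also assert on individual lines and on the line count:

It 'checks multi-line output'
  When call printf 'first\nsecond\n'
  The lines of output should eq 2
  The line 1 of output should eq "first"
  The line 2 of output should eq "second"
End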

Comparing Without Color Codes

# Helper functions to strip ANSI color codes for comparison
# $1 = stdout, $2 = stderr, $3 = exit status
no_colors_stderr() { echo -n "$2" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }
no_colors_stdout() { echo -n "$1" | sed -E $'s/\x1B\\[[0-9;]*[A-Za-z]//g; s/\x1B\\([A-Z]//g; s/\x0F//g' | tr -s ' '; }

It 'compares without colors'
  When call colored_output
  The result of function no_colors_stdout should eq "Plain text"
  The result of function no_colors_stderr should eq "Plain text"
End

# Alternative: Disable colors via environment
It 'disables colors'
  BeforeCall 'export NO_COLOR=1'
  BeforeCall 'export TERM=dumb'
  When call my_command
  The output should not include "$(printf '\033[')"
End

Recommended action: Set NO_COLOR=1 in the test environment to avoid the need for color-stripping helpers
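A convenient place to apply this for every test is spec/spec_helper.sh (a sketch; adapt to your helper layout):

# spec/spec_helper.sh - loaded before each spec file
export NO_COLOR=1   # many tools honor NO_COLOR and skip ANSI colors
export TERM=dumb    # fallback for tools that inspect TERM instead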

Test Environment Management

Temporary Directories

eval "$(shellspec - -c) exit 1"

# shellcheck disable=SC2288
% TEST_DIR: "$SHELLSPEC_TMPBASE/tmprepo"

Describe 'isolated environment'
  BeforeEach 'setup_test_env'
  AfterEach 'cleanup_test_env'
  
  setup_test_env() {
    mkdir -p "$TEST_DIR" || true
    cd "$TEST_DIR" || exit 1
  }
  
  cleanup_test_env() {
    cd - >/dev/null
    rm -rf "$TEST_DIR"
  }
  
  It 'works in isolation'
    When call touch test.txt
    The path test.txt should be file
  End
End

Pros: Complete isolation, no test interference
Cons: Requires discipline in cleanup
Recommended action: Always use temp dirs, never write to fixed paths

Parameterized Tests

Describe 'data-driven tests'
  Parameters
    "#1" "valid"   "success" 0
    "#2" "invalid" "error"   1
    "#3" "empty"   "missing" 1
  End
  
  It "handles $1 input"
    When call validate "$2"
    The output should include "$3"
    The status should eq $4
  End
End

Recommended action: Use Parameters to avoid copy-paste test code
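Besides the block form, ShellSpec offers Parameters:value for a flat list and Parameters:matrix for cross-products. The process_job function below is hypothetical, used only to illustrate:

Describe 'matrix-driven tests'
  Parameters:matrix
    low high
    fast slow
  End

  # Expands to 4 examples: low/fast, low/slow, high/fast, high/slow
  It "handles $1 priority at $2 speed"
    When call process_job "$1" "$2"
    The status should be success
  End
End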

Code Coverage

Enable Coverage

# .shellspec configuration
--kcov
--kcov-options "--include-pattern=.sh"
--kcov-options "--exclude-pattern=/spec/,/coverage/"
--kcov-options "--fail-under-percent=80"

# Run with coverage
shellspec --kcov

# View report
open coverage/index.html

Coverage Goals:

  • Critical paths: 100%
  • Main functionality: 80-90%
  • Utility functions: 70-80%

Recommended action: Enable coverage from day 1, set minimum thresholds

Test Execution

Running Tests

# All tests
shellspec

# Specific file
shellspec spec/module_spec.sh

# Specific line
shellspec spec/module_spec.sh:42

# Only previously failed
shellspec --quick

# Stop on first failure
shellspec --fail-fast

# Parallel execution (4 jobs)
shellspec --jobs 4

# With debug trace
shellspec --xtrace

# Focus mode (run only fIt, fDescribe)
shellspec --focus

Focus Mode for TDD

Describe 'feature under development'
  fIt 'focused test - runs only this'
    When call new_feature
    The status should be success
  End
  
  It 'other test - skipped during focus'
    When call other_feature
  End
End

Recommended action: Use --quick and --focus for rapid TDD cycles

CI/CD Integration

JUnit Reports

# .shellspec configuration
--format junit
--output report/junit.xml

# Run in CI
shellspec --kcov --format junit

GitHub Actions Example

name: ShellSpec Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Install ShellSpec
      run: curl -fsSL https://git.io/shellspec | sh -s -- -y
    - name: Run tests
      run: shellspec --kcov --format junit
    - name: Upload coverage
      uses: codecov/codecov-action@v3

Recommended action: Generate JUnit reports for CI integration

Troubleshooting

Debug Failed Tests

# Show execution trace
shellspec --xtrace spec/failing_spec.sh

# Check spec syntax
shellspec --syntax-check spec/failing_spec.sh

# See generated shell code
shellspec --translate spec/failing_spec.sh

# Inspect output during test
It 'debugs with Dump'
  When call my_function
  Dump  # Shows stdout, stderr, status
  The output should not be blank
End

Common Issues

Problem: Test passes alone but fails in the suite.
Solution: Check for global state leakage; ensure cleanup in AfterEach.

Problem: Can't mock an external command.
Solution: Use a Mock block, or Intercept for built-ins.

Problem: Tests are slow.
Solution: Enable parallel execution with --jobs 4.

Problem: Coverage doesn't work.
Solution: Ensure kcov is installed and check --kcov-options.
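For the coverage case, kcov must be on PATH. Common install routes (availability depends on your distro and Homebrew, so treat these as likely rather than guaranteed):

# Debian/Ubuntu
sudo apt-get install kcov

# macOS (Homebrew)
brew install kcov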

Best Practices Checklist

  • All scripts have source guards
  • External dependencies are mocked
  • Tests use temporary directories
  • Each test verifies one behavior
  • Tests follow GIVEN/WHEN/THEN structure
  • Coverage is enabled and > 80%
  • Tests run in parallel (--jobs)
  • JUnit reports generated for CI
  • No hard-coded paths or commands
  • Side effects are documented and tested

Common Anti-Patterns to Avoid

Global state mutation

COUNTER=0  # Bad: mutable global
increment() { COUNTER=$((COUNTER + 1)); }

Return values

increment() { echo $(($1 + 1)); }  # Good: pure function
counter=$(increment "$counter")

Testing implementation details

It 'calls grep with -E flag'  # Bad: too coupled

Testing behavior

It 'finds files matching pattern'  # Good: tests outcome

Unmocked network calls

# Bad: slow, flaky - hits the real network
It 'fetches real data'
  When call curl https://api.com
End

Mocked dependencies

# Good: fast, reliable
Mock curl
  echo "mock"
End

Advanced Topics

For deeper coverage of advanced patterns, see the ShellSpec documentation at https://github.com/shellspec/shellspec.

Quick Reference

# Common commands
shellspec                   # Run all tests
shellspec --quick           # Re-run failures only
shellspec --xtrace          # Debug trace
shellspec --kcov            # With coverage
shellspec --format junit    # JUnit output

# DSL basics
Describe/Context/It/End     # Test structure
When call/run/source        # Execution
The output/status/error     # Assertions
Mock/Intercept              # Mocking
BeforeEach/AfterEach        # Hooks
Parameters                  # Data-driven tests
Dump                        # Debug helper

Recommended action: Start with simple unit tests, add coverage, then integrate into CI