name	audio-injection-testing
description	Test Bob The Skull with virtual audio injection instead of speaking. Use when testing wake word detection, STT accuracy, full conversation pipeline, or automated testing. Covers setup, configuration, injection methods, and troubleshooting.
allowed-tools	Read, Edit, Bash, Glob

Audio Injection Testing Skill

Test Bob using virtual audio devices - inject pre-recorded audio instead of saying "Hey Bob" hundreds of times!

When to Use This Skill

"Test wake word detection" - Automated wake word testing
"Test with audio injection" - Use virtual microphone
"Setup virtual audio" - Configure test environment
"Run automated tests" - Test conversation pipeline
"Debug STT accuracy" - Test speech recognition

Quick Reference

Testing Methods Comparison

Method	Platform	Manual Testing	Automated Testing	Setup Complexity
Combined Mic (RECOMMENDED)	Linux	✅ Yes	✅ Yes	Medium
Test-Only Virtual Devices	Linux	❌ No	✅ Yes	Low
VB-Audio Virtual Cable	Windows	❌ No	✅ Yes	Low
Direct File Input	Both	❌ No	✅ Yes	High (code changes)

Method 1: Combined Microphone (Linux - RECOMMENDED)

Best for: Seamless switching between manual and automated testing

Architecture

Real Microphone ──┐
                  ├──> module-loopback ──> null-sink ──> monitor ──> bob_combined_mic
Injected Audio ──┘                         (mixer)                     (Bob reads here)

Setup

# One-time setup
python3 setup_combined_audio.py

# Output shows device index:
# ========================================
# ✓ Combined microphone ready!
#
# Device name: bob_combined_mic
# Device index: 5  <── Use this in config
# ========================================

Configuration

# Edit .env
nano .env

# Set audio input device
BOBTHESKULL_AUDIO_INPUT_DEVICE_INDEX=5  # Use index from setup

# Restart Bob
python BobTheSkull.py

Usage

Manual testing (just talk normally):

# Bob hears your voice through combined mic
# No commands needed - your mic is automatically mixed in!

Automated testing (inject audio):

# Play single test file
python3 test_wake_word_inject.py play --file audio/static/testing/wake_up_bob.mp3

# Run test sequence with delays
python3 test_wake_word_inject.py test --files \
    audio/static/testing/wake_up_bob.mp3 \
    audio/static/testing/what_time_is_it.mp3 \
    audio/static/testing/goodbye_bob.mp3 \
    --delay 3.0

# Test while monitoring logs
tail -f logs/bob.log &
python3 test_wake_word_inject.py test --files audio/static/testing/*.mp3

Both work simultaneously! Inject test audio while retaining ability to interrupt manually.

Cleanup

# Remove virtual devices
python3 setup_combined_audio.py --cleanup

# Revert Bob's audio config to real microphone
nano .env
# Change BOBTHESKULL_AUDIO_INPUT_DEVICE_INDEX back to original

Method 2: Test-Only Virtual Devices (Linux)

Best for: Pure automated testing (no manual fallback needed)

Setup

# Create virtual devices
python3 test_wake_word_inject.py setup

# Output:
# ✓ Virtual audio devices created:
#   - Output: bob_test_output
#   - Input: bob_test_mic

Configuration

# List devices to find index
python3 test_wake_word_inject.py list

# Configure Bob
nano .env
BOBTHESKULL_AUDIO_INPUT_DEVICE_NAME=bob_test_mic
# OR
BOBTHESKULL_AUDIO_INPUT_DEVICE_INDEX=<index>

# Restart Bob

Usage

# Play audio file (Bob will hear it)
python3 test_wake_word_inject.py play --file audio/static/testing/hey_bob.mp3

# Run test sequence
python3 test_wake_word_inject.py test --files \
    audio/static/testing/wake_up_bob.mp3 \
    audio/static/testing/tell_me_a_joke.mp3 \
    --delay 2.0

Cleanup

python3 test_wake_word_inject.py cleanup

Method 3: VB-Audio Virtual Cable (Windows)

Best for: Windows automated testing

Setup

Install VB-Audio Virtual Cable
- Download: https://vb-audio.com/Cable/
- Install: Run setup as Administrator
- Creates "CABLE Input" (playback) and "CABLE Output" (recording)

Configure Bob

# Find device index
python list_audio_devices.py

# Look for "CABLE Output" in input devices
# Edit .env
BOBTHESKULL_AUDIO_INPUT_DEVICE_INDEX=X  # CABLE Output index

Restart Bob

Usage (Windows)

# Play audio to virtual cable using mpv
mpv --audio-device=wasapi/<CABLE-Input-GUID> audio/static/testing/wake_up_bob.mp3

# Or use Python with PyAudio
python test_audio_injection_windows.py --file audio/static/testing/wake_up_bob.mp3

Creating Test Audio Files

Option 1: Generate with ElevenLabs (Bob's voice)

Use static-audio-generation skill to create test audio:

# Create generate_test_audio.py
TEST_PHRASES = [
    "Wake up Bob",
    "Hey Bob",
    "What time is it?",
    "Tell me a joke",
    "Can you speak louder?",
    "What is the weather like today?",
    "Goodbye Bob"
]

# Generate to audio/static/testing/
python generate_test_audio.py

Option 2: Record yourself

# Linux
arecord -d 3 -f S16_LE -r 16000 -c 1 audio/static/testing/wake_up_bob.wav

# Windows
# Use Audacity or Voice Recorder app
# Export as WAV: 16kHz, mono, 16-bit

Option 3: Use espeak (quick but robotic)

espeak "Wake up Bob" --stdout | \
    sox -t wav - -r 16000 -c 1 -b 16 audio/static/testing/wake_up_bob.wav

Option 4: Convert existing audio

# Convert to correct format (16kHz, mono, 16-bit)
sox input.mp3 -r 16000 -c 1 -b 16 audio/static/testing/output.wav

# Or use ffmpeg
ffmpeg -i input.mp3 -ar 16000 -ac 1 audio/static/testing/output.wav

Testing Workflows

Workflow 1: Wake Word Detection Testing

Goal: Verify wake word sensitivity and accuracy

# 1. Create test files (if not exist)
audio/static/testing/wake_up_bob.mp3       # Primary wake word
audio/static/testing/hey_bob.mp3           # Secondary wake word
audio/static/testing/false_positive_*.mp3  # Should NOT trigger

# 2. Setup virtual audio
python3 setup_combined_audio.py

# 3. Configure Bob with combined mic
nano .env  # Set AUDIO_INPUT_DEVICE_INDEX

# 4. Start Bob
python BobTheSkull.py &

# 5. Run test sequence
python3 test_wake_word_inject.py test --files \
    audio/static/testing/wake_up_bob.mp3 \
    audio/static/testing/hey_bob.mp3 \
    --delay 3.0

# 6. Monitor logs for detections
tail -f logs/bob.log | grep -i "wake word"

# 7. Verify web monitor
# Open: http://localhost:5001
# Check event feed for WakeWordDetectedEvent

Workflow 2: STT Accuracy Testing

Goal: Test speech recognition accuracy

# 1. Create test phrases with known text
audio/static/testing/what_time_is_it.mp3
audio/static/testing/tell_me_a_joke.mp3
audio/static/testing/whats_the_weather.mp3

# 2. Create expected results file
cat > audio/static/testing/expected.txt <<EOF
what_time_is_it.mp3|What time is it?
tell_me_a_joke.mp3|Tell me a joke.
whats_the_weather.mp3|What's the weather like today?
EOF

# 3. Run tests and capture STT results
python3 test_wake_word_inject.py test --files audio/static/testing/*.mp3 --delay 5.0

# 4. Check logs for STT transcriptions
grep "SpeechRecognizedEvent" logs/bob.log

# 5. Compare actual vs expected transcriptions
# Manual verification or automated diff script

Workflow 3: Full Conversation Pipeline Testing

Goal: Test wake word → STT → LLM → TTS → response

# 1. Create conversation test sequence
audio/static/testing/conversation_test_1.mp3  # "Wake up Bob"
audio/static/testing/conversation_test_2.mp3  # Wait for greeting
audio/static/testing/conversation_test_3.mp3  # "What time is it?"
audio/static/testing/conversation_test_4.mp3  # Wait for response
audio/static/testing/conversation_test_5.mp3  # "Thank you"
audio/static/testing/conversation_test_6.mp3  # "Goodbye Bob"

# 2. Run full sequence with appropriate delays
python3 test_wake_word_inject.py test --files \
    audio/static/testing/conversation_test_*.mp3 \
    --delay 8.0  # Longer delay for LLM processing

# 3. Monitor full pipeline
tail -f logs/bob.log | grep -E "WakeWord|SpeechRecognized|LLMResponse|SpeakingComplete"

# 4. Verify state transitions
# Watch web monitor for state machine transitions:
# IDLE → WAKE_LISTENING → GREETING → LISTENING → PROCESSING → SPEAKING → WAKE_LISTENING

Workflow 4: Regression Testing

Goal: Ensure changes don't break existing functionality

# 1. Create comprehensive test suite
audio/static/testing/regression/
├── wake_word_tests/
│   ├── wake_up_bob_1.mp3
│   ├── wake_up_bob_2.mp3
│   └── hey_bob_1.mp3
├── stt_tests/
│   ├── simple_question_1.mp3
│   ├── complex_sentence_1.mp3
│   └── multi_sentence_1.mp3
└── conversation_tests/
    ├── full_interaction_1.mp3
    └── full_interaction_2.mp3

# 2. Create test script
cat > run_regression_tests.sh <<'EOF'
#!/bin/bash
echo "=== Bob The Skull Regression Tests ==="
echo "Wake Word Tests..."
python3 test_wake_word_inject.py test --files audio/static/testing/regression/wake_word_tests/*.mp3 --delay 2.0

echo "STT Tests..."
python3 test_wake_word_inject.py test --files audio/static/testing/regression/stt_tests/*.mp3 --delay 5.0

echo "Conversation Tests..."
python3 test_wake_word_inject.py test --files audio/static/testing/regression/conversation_tests/*.mp3 --delay 10.0

echo "=== Tests Complete ==="
EOF

chmod +x run_regression_tests.sh

# 3. Run before major changes
./run_regression_tests.sh > regression_results_$(date +%Y%m%d_%H%M%S).log

# 4. Run after changes and compare logs
diff regression_results_before.log regression_results_after.log

Platform-Specific Notes

Linux (PulseAudio)

Virtual devices: module-null-sink + module-loopback Setup script: setup_combined_audio.py Injection script: test_wake_word_inject.py

Verify devices:

pactl list short sources  # List input devices
pactl list short sinks    # List output devices

Monitor audio flow:

# Install PulseAudio Volume Control
sudo apt install pavucontrol

# Run and check "Recording" tab
pavucontrol

Windows (VB-Audio Cable)

Virtual device: VB-Audio Virtual Cable Download: https://vb-audio.com/Cable/ Device names: "CABLE Input" (output), "CABLE Output" (input)

List devices:

python list_audio_devices.py

Playback to virtual cable:

# Use Windows audio API or mpv with WASAPI
mpv --audio-device=wasapi/CABLE-Input audio/static/testing/wake_up_bob.mp3

Raspberry Pi (ALSA Loopback)

Alternative: ALSA loopback module (if PulseAudio not available)

# Load loopback module
sudo modprobe snd-aloop

# Devices created:
# hw:1,0 - Write audio here (playback)
# hw:1,1 - Bob reads from here (capture)

# Configure Bob
BOBTHESKULL_AUDIO_INPUT_DEVICE_NAME=hw:1,1

# Play test audio
aplay -D hw:1,0 audio/static/testing/wake_up_bob.wav

Troubleshooting

Bob doesn't hear injected audio

Diagnosis:

# Verify virtual device exists
pactl list short sources | grep bob

# Test recording from virtual mic
parecord -d bob_combined_mic test_capture.wav
# Play injected audio while recording
python3 test_wake_word_inject.py play --file audio/static/testing/wake_up_bob.mp3
# Listen to captured audio
paplay test_capture.wav
# Should hear the injected audio

Common causes:

Bob using wrong audio device index
Virtual device not created
Audio format mismatch

Audio plays but Bob doesn't detect wake word

Diagnosis:

# Check wake word sensitivity
grep WAKE_WORD_SENSITIVITY .env

# Try lower sensitivity (more sensitive)
BOBTHESKULL_WAKE_WORD_SENSITIVITY=0.3  # Default is 0.5

# Check audio format of test file
ffprobe audio/static/testing/wake_up_bob.mp3

# Should be: 16kHz or 22kHz, mono or stereo, MP3 or WAV

Solutions:

Increase audio volume in test file
Regenerate test audio with clearer pronunciation
Lower wake word sensitivity threshold

Virtual device not appearing after setup

Diagnosis:

# Check if PulseAudio is running
pulseaudio --check
echo $?  # Should return 0

# List loaded modules
pactl list short modules | grep bob

# Check system logs
journalctl -xe | grep pulse

Solutions:

# Restart PulseAudio
pulseaudio -k
pulseaudio --start

# Rerun setup
python3 setup_combined_audio.py --cleanup
python3 setup_combined_audio.py

Injection audio too quiet or too loud

Solution: Adjust playback volume

# Linux - adjust virtual sink volume
pactl set-sink-volume bob_audio_mixer 150%  # Increase
pactl set-sink-volume bob_audio_mixer 50%   # Decrease

# Or normalize audio file itself
ffmpeg-normalize audio/static/testing/wake_up_bob.mp3 -o wake_up_bob_normalized.mp3

Pro Tips

Use combined mic for development - Allows quick manual overrides during automated testing
Create diverse test corpus - Different voices, speeds, accents, background noise levels
Test silence/noise - Inject silence or white noise to test false positive rate
Automate with CI/CD - Run regression tests on every commit
Record actual user interactions - Convert real usage into test cases (with permission)
Test edge cases - Very quiet audio, loud audio, multiple speakers, music in background
Use delays strategically - Allow time for state transitions (wake→listen→process→speak)
Monitor state machine - Web monitor shows state transitions in real-time
Log everything - Capture full logs during testing for debugging
Parametrize tests - Create test configs with expected outcomes for automated validation

Integration with Other Skills

Works well with:

static-audio-generation - Generate test audio files with ElevenLabs
pi-deployment - Deploy test audio to Pi for remote testing
cross-repo-sync - Sync test audio between repos

Time Savings

Without skill:

15-20 minutes setup per session (figure out virtual audio)
30+ minutes creating test audio manually
Manual testing requires saying phrases repeatedly (exhausting!)

With skill:

5 minutes setup (documented commands)
5-10 minutes creating test audio (documented methods)
Automated testing runs unattended

Estimated time savings: 3x faster, infinite test iterations

References

Setup Scripts:

setup_combined_audio.py - Combined mic setup (Linux)
test_wake_word_inject.py - Audio injection script

Documentation:

TESTING_AUDIO_INJECTION.md - Complete guide
COMBINED_MIC_QUICKSTART.md - Quick setup

Test Audio:

audio/static/testing/ - Test audio files directory

Related Tools:

list_audio_devices.py - List available audio devices
test_audio_output.py - Test speaker output

Install Skill

SKILL.md

Audio Injection Testing Skill

When to Use This Skill

Quick Reference

Testing Methods Comparison

Method 1: Combined Microphone (Linux - RECOMMENDED)

Architecture

Setup

Configuration

Usage

Cleanup

Method 2: Test-Only Virtual Devices (Linux)

Setup

Configuration

Usage

Cleanup

Method 3: VB-Audio Virtual Cable (Windows)

Setup

Usage (Windows)

Creating Test Audio Files

Option 1: Generate with ElevenLabs (Bob's voice)

Option 2: Record yourself

Option 3: Use espeak (quick but robotic)

Option 4: Convert existing audio

Testing Workflows

Workflow 1: Wake Word Detection Testing

Workflow 2: STT Accuracy Testing

Workflow 3: Full Conversation Pipeline Testing

Workflow 4: Regression Testing

Platform-Specific Notes

Linux (PulseAudio)

Windows (VB-Audio Cable)

Raspberry Pi (ALSA Loopback)

Troubleshooting

Bob doesn't hear injected audio

Audio plays but Bob doesn't detect wake word

Virtual device not appearing after setup

Injection audio too quiet or too loud

Pro Tips

Integration with Other Skills

Time Savings

References