| name | static-audio-generation |
| description | Generate and manage static audio files for Bob The Skull using ElevenLabs TTS. Use when creating greetings, startup messages, error responses, or test audio. Handles generation, naming, directory structure, and cross-repo syncing. |
| allowed-tools | Read, Edit, Bash, Glob |
Static Audio Generation Skill
Generate pre-recorded audio files using ElevenLabs TTS for instant playback (no API latency, reduced cost, offline support).
When to Use This Skill
- "Generate greeting audio" - Create greeting responses
- "Add new startup message" - System status audio
- "Create test audio files" - Testing wake word/STT
- "Sync audio to BobFast5" - Cross-repo audio management
- "Generate static TTS" - Any pre-recorded phrases
Quick Reference
Directory Structure
audio/static/
├── greetings/           # Greeting responses ("Yes wizard?", "I'm listening")
│   ├── yes_wizard.mp3
│   ├── im_listening.mp3
│   └── greetings.txt    # Index file
├── startup/             # Startup/shutdown/error messages
│   ├── initializing.mp3
│   ├── startup_complete.mp3
│   └── startup.txt      # Index file
└── testing/             # Test audio for wake word/STT testing
    ├── wake_up_bob.mp3
    ├── hey_bob.mp3
    └── what_time_is_it.mp3
Generation Commands
# Generate all greetings (from predefined list)
python generate_greeting_audio.py
# Generate all startup/shutdown/error messages
python generate_startup_audio.py
# Generate all static audio (comprehensive)
python generate_static_audio.py
Naming Convention
Rule: Lowercase, underscores, descriptive
- "Yes wizard?" → yes_wizard.mp3
- "I'm listening" → im_listening.mp3
- "Startup complete. Listening for wake words." → startup_complete_listening_for_wake_words.mp3
Normalization function (tts/static_audio.py):
from tts.static_audio import normalize_phrase_to_filename
filename = normalize_phrase_to_filename("Yes wizard?") # → "yes_wizard"
Generation Workflows
Workflow 1: Generate Greetings
Script: generate_greeting_audio.py
# 1. Define greetings list (edit script)
GREETINGS = [
    "Yes wizard?",
    "What do you need boss?",
    "I'm listening",
    "Yes?"
]
# 2. Run generation
python generate_greeting_audio.py
# Output:
# audio/static/greetings/yes_wizard.mp3
# audio/static/greetings/what_do_you_need_boss.mp3
# audio/static/greetings/im_listening.mp3
# audio/static/greetings/yes.mp3
# audio/static/greetings/greetings.txt (index)
When to add new greetings:
- Adding personality variety
- Testing different responses
- Supporting new conversation states
Workflow 2: Generate Startup Messages
Script: generate_startup_audio.py
# 1. Define messages (edit script)
STARTUP_PHRASES = [
    "Initializing",
    "Found eye controller",
    "Startup complete. Listening for wake words.",
]
SHUTDOWN_PHRASES = ["Shutting down"]
ERROR_PHRASES = ["Configuration error"]
# 2. Run generation
python generate_startup_audio.py
# Output: audio/static/startup/*.mp3
When to add startup messages:
- New component initialization feedback
- Debugging startup sequence
- User experience improvements
Workflow 3: Generate Test Audio
Purpose: Audio files for automated testing (wake word, STT, full pipeline)
Test audio types:
- Wake word triggers: "Wake up Bob", "Hey Bob"
- Commands: "What time is it?", "Tell me a joke"
- Conversations: Full conversation test sequences
Generation options:
Option A: Use ElevenLabs (Bob's voice)
# Add to generate_static_audio.py or create test-specific script
TEST_PHRASES = [
    "Wake up Bob",
    "Hey Bob",
    "What time is it?",
    "Tell me a joke",
    "Can you speak louder?",
    "What is the weather like today?",
    "Goodbye Bob"
]
# Generate to audio/static/testing/
Option B: Record yourself
# Record 3 seconds
arecord -d 3 -f S16_LE -r 16000 -c 1 audio/static/testing/wake_up_bob.wav
# Convert to MP3 (optional)
ffmpeg -i audio/static/testing/wake_up_bob.wav -b:a 32k audio/static/testing/wake_up_bob.mp3
Option C: Use espeak (quick but robotic)
espeak "Wake up Bob" --stdout | \
sox -t wav - -r 16000 -c 1 -b 16 audio/static/testing/wake_up_bob.wav
Workflow 4: Cross-Repo Sync (BobTheSkull5 → BobFast5)
When: After generating new audio files for testing on vision system
Method 1: Manual copy (Windows)
# Copy specific category
copy audio\static\testing\*.mp3 ..\BobFast5\audio\static\testing\
# Or use xcopy for directory sync
xcopy audio\static\testing ..\BobFast5\audio\static\testing\ /Y /S
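For a portable alternative to copy/xcopy, the same one-way copy can be sketched in Python. This is a minimal sketch, not part of either repository: `sync_audio` is a hypothetical helper, and the paths in the usage comment assume BobFast5 is checked out next to BobTheSkull5, as the commands above imply.

```python
import shutil
from pathlib import Path


def sync_audio(src_dir, dst_dir, pattern="*.mp3"):
    """Copy files matching pattern from src_dir to dst_dir.

    A file is copied only when it is missing from dst_dir or the source
    copy has a newer modification time. Returns the copied file names.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for src_file in sorted(src.glob(pattern)):
        dst_file = dst / src_file.name
        if not dst_file.exists() or src_file.stat().st_mtime > dst_file.stat().st_mtime:
            shutil.copy2(src_file, dst_file)  # copy2 preserves timestamps
            copied.append(src_file.name)
    return copied


# Example (hypothetical layout, run from the BobTheSkull5 root):
# sync_audio("audio/static/testing", "../BobFast5/audio/static/testing")
```

Unlike xcopy /S, this sketch does not recurse into subdirectories and never deletes files from the destination.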
Method 2: Use cross-repo-sync skill
# See cross-repo-sync skill for safe patterns
Method 3: Deploy to Pi (includes audio)
# deploy_to_pi.bat doesn't currently copy audio/ directory
# Add manual step or extend deployment script
pscp -pw peacock7 -r audio/static knarl@192.168.1.44:/home/knarl/BobTheSkull5/audio/
ElevenLabs Configuration
Voice Settings (from BobConfig.py)
ELEVEN_LABS_VOICE_ID = "nPczCjzI2devNBz1zQrb" # Brian (default Bob voice)
ELEVEN_LABS_MODEL = "eleven_turbo_v2_5"
TTS_STABILITY = 0.71
TTS_SIMILARITY_BOOST = 0.5
TTS_STYLE = 0.0
TTS_USE_SPEAKER_BOOST = True
Voice Selection Guide
Brian (default): Deep, authoritative, sarcastic personality
Use for: Greetings, conversation responses, personality-driven content
Alternative voices (if needed):
- Calmer voice for error messages
- Different voice for testing/debugging distinction
Cost Optimization
Strategy: Generate once, reuse forever
- Greetings are played hundreds of times → synthesizing once avoids paying for every playback
- Startup messages play on every boot → pre-generate
- Test audio → generate once, reuse across any number of test runs
Cost: ~$0.18 per 1000 characters (turbo_v2_5)
- Average greeting: ~15 characters ≈ $0.0027 per file
- Generating 10 greetings once ≈ $0.027
- Spread over 1000 playbacks ≈ $0.00003 per use (vs. $0.0027 per dynamic TTS call)
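The arithmetic above can be reproduced with a few lines of Python. This is a sketch for checking the estimates only: the price constant is the approximate figure quoted above, not a value read from the ElevenLabs API, and both function names are hypothetical.

```python
PRICE_PER_1000_CHARS = 0.18  # USD, the approximate figure quoted above


def synthesis_cost(num_chars, price_per_1000=PRICE_PER_1000_CHARS):
    """Cost in USD of synthesizing num_chars characters once."""
    return num_chars / 1000 * price_per_1000


def amortized_cost_per_use(num_chars, num_files, total_uses):
    """One-time generation cost of num_files files spread over total_uses playbacks."""
    return synthesis_cost(num_chars) * num_files / total_uses


per_file = synthesis_cost(15)
batch = per_file * 10
per_use = amortized_cost_per_use(15, num_files=10, total_uses=1000)
print(f"per file: ${per_file:.4f}")
print(f"10 files: ${batch:.3f}")
print(f"per use over 1000 playbacks: ${per_use:.6f}")
```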
Common Use Cases
Use Case 1: Add New Greeting Variant
# 1. Edit generate_greeting_audio.py
GREETINGS = [
    "Yes wizard?",
    "What do you need boss?",
    "I'm listening",
    "Yes?",
    "Speak wizard",  # NEW
]
# 2. Generate
python generate_greeting_audio.py
# 3. Verify
ls audio/static/greetings/
# Should see: speak_wizard.mp3
# 4. Update state_machine.py to use new greeting (if needed)
# Edit GREETINGS list in state_machine/state_machine.py
# 5. Test playback
# Use test_audio_output.py or manually play
Use Case 2: Generate Test Suite Audio
# Create generate_test_audio.py
#!/usr/bin/env python3
from pathlib import Path
from dotenv import load_dotenv
from elevenlabs import ElevenLabs, VoiceSettings
from BobConfig import BobConfig
from tts.static_audio import normalize_phrase_to_filename
load_dotenv()
config = BobConfig()
config.load_from_env()
OUTPUT_DIR = Path("audio/static/testing")
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
TEST_PHRASES = [
    "Wake up Bob",
    "Hey Bob",
    "What time is it?",
    "Tell me a joke",
    "Can you speak louder?",
    "What is the weather like today?",
    "Goodbye Bob"
]
client = ElevenLabs(api_key=config.ELEVEN_LABS_API_KEY)
for phrase in TEST_PHRASES:
    filename = f"{normalize_phrase_to_filename(phrase)}.mp3"
    filepath = OUTPUT_DIR / filename
    print(f"Generating: {phrase} → {filename}")
    audio_generator = client.text_to_speech.convert(
        voice_id=config.ELEVEN_LABS_VOICE_ID,
        text=phrase,
        model_id=config.ELEVEN_LABS_MODEL,
        voice_settings=VoiceSettings(
            stability=config.TTS_STABILITY,
            similarity_boost=config.TTS_SIMILARITY_BOOST,
            style=config.TTS_STYLE,
            use_speaker_boost=config.TTS_USE_SPEAKER_BOOST
        )
    )
    # convert() yields audio chunks; join them into one bytes object
    audio_data = b"".join(audio_generator)
    filepath.write_bytes(audio_data)
    print(f" ✓ Saved ({len(audio_data)/1024:.1f} KB)\n")
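Because the goal is to generate once and reuse, a re-run of a script like the one above need not pay again for phrases that already have a file. The filter below is a minimal sketch of that idea. `phrases_needing_generation` is a hypothetical helper, and the normalizer is passed in as an argument so the example does not depend on the repo's normalize_phrase_to_filename.

```python
from pathlib import Path


def phrases_needing_generation(phrases, output_dir, normalize, suffix=".mp3"):
    """Return the phrases whose target audio file is missing or empty."""
    output_dir = Path(output_dir)
    pending = []
    for phrase in phrases:
        target = output_dir / f"{normalize(phrase)}{suffix}"
        if not target.is_file() or target.stat().st_size == 0:
            pending.append(phrase)
    return pending
```

Looping over `phrases_needing_generation(TEST_PHRASES, OUTPUT_DIR, normalize_phrase_to_filename)` instead of `TEST_PHRASES` would skip files that already exist. Deleting a file forces it to be regenerated on the next run.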
Use Case 3: Batch Regenerate All Audio
# Regenerate everything (after voice change or quality update)
python generate_greeting_audio.py
python generate_startup_audio.py
python generate_static_audio.py
# Verify total file count
find audio/static -name "*.mp3" | wc -l
# Check total size
du -sh audio/static
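On Windows, where find and du are usually unavailable, the same check can be sketched in Python. `audio_summary` is a hypothetical helper, not an existing script in the repo. It sums the sizes of the MP3 files only, so its total can differ slightly from what du reports for the whole directory.

```python
from pathlib import Path


def audio_summary(root="audio/static", pattern="*.mp3"):
    """Return (file_count, total_bytes) for matching files anywhere under root."""
    files = [p for p in Path(root).rglob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


# Example:
# count, total = audio_summary()
# print(f"{count} files, {total / 1024:.1f} KB")
```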
Troubleshooting
Error: "ELEVEN_LABS_API_KEY not found"
Problem: API key not in environment
Solution:
# Check .env file
grep ELEVEN_LABS_API_KEY .env
# Should show:
# BOBTHESKULL_ELEVEN_LABS_API_KEY=sk_...
# If missing, add it:
echo "BOBTHESKULL_ELEVEN_LABS_API_KEY=sk_your-key-here" >> .env
Error: "Audio generation failed"
Problem: API rate limit or network issue
Solution:
# Check API quota at elevenlabs.io dashboard
# Wait 1 minute and retry
# Or add retry logic with delay
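The retry-with-delay suggestion above could look roughly like the following. This is a sketch under assumptions: `with_retries` is a hypothetical helper that does not exist in the repo, and it catches every exception because the specific error class the ElevenLabs SDK raises on rate limiting is not assumed here. Narrow the `except` clause if you know it.

```python
import time


def with_retries(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on any exception wait and retry, doubling the delay each time.

    Re-raises the last exception once max_attempts calls have failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Example (hypothetical): wrap the whole request, including reading the stream.
# audio = with_retries(lambda: b"".join(client.text_to_speech.convert(...)))
```

The `sleep` parameter exists so the delay can be stubbed out in tests. For real rate limits, a `base_delay` closer to the one-minute wait suggested above may be more appropriate.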
Files generated but playback fails
Problem: Incorrect audio format or corrupted file
Solution:
# Check file size (should be >1KB for typical greeting)
ls -lh audio/static/greetings/
# Test playback directly
mpv audio/static/greetings/yes_wizard.mp3
# Regenerate specific file if corrupted
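To apply the size check to a whole directory at once, a small scan like the one below can list candidates for regeneration. It is a sketch only: `find_suspect_files` is a hypothetical name, and the 1 KB default is just the rule of thumb given above, so a small file is a hint rather than proof of corruption.

```python
from pathlib import Path


def find_suspect_files(directory, min_bytes=1024, pattern="*.mp3"):
    """Return the sorted names of matching files smaller than min_bytes."""
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size < min_bytes
    )


# Example:
# print(find_suspect_files("audio/static/greetings"))
```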
Filename normalization incorrect
Problem: Special characters in phrase causing issues
Solution:
# Check normalization
from tts.static_audio import normalize_phrase_to_filename
print(normalize_phrase_to_filename("Your phrase here"))
# Should convert:
# - Spaces → underscores
# - Punctuation → removed
# - Uppercase → lowercase
# Example: "Yes, wizard?" → "yes_wizard"
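For reference, the three rules above can be expressed in a few lines. This is an illustrative sketch, not the code in tts/static_audio.py, which may handle edge cases differently. `normalize_phrase_sketch` is a hypothetical name chosen to avoid confusion with the real function.

```python
import re


def normalize_phrase_sketch(phrase: str) -> str:
    """Lowercase, drop punctuation, and join words with underscores."""
    lowered = phrase.lower()
    # Keep only ASCII letters, digits, and whitespace (this also drops apostrophes).
    stripped = re.sub(r"[^a-z0-9\s]", "", lowered)
    # Collapse each run of whitespace into a single underscore.
    return re.sub(r"\s+", "_", stripped.strip())


print(normalize_phrase_sketch("Yes, wizard?"))
print(normalize_phrase_sketch("I'm listening"))
```

Note that this sketch also removes non-ASCII letters, so accented phrases would need a different character class.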
Pro Tips
- Generate in batches - Create all related audio at once (all greetings, all startup messages)
- Test before deploying - Play generated files locally before syncing to Pi
- Version control audio - Commit generated MP3 files to git (they're small and rarely change)
- Use index files - greetings.txt and startup.txt document what's available
- Consistent voice settings - Don't change TTS settings mid-project or you'll need to regenerate everything
- Organize by category - Use subdirectories (greetings/, startup/, testing/) for clarity
- Name descriptively - startup_complete_listening_for_wake_words.mp3 is better than startup_msg_3.mp3
- Test audio duration - Keep greetings short (1-2 seconds) for a responsive feel
- Create test variants - Generate the same phrase with different emphases for testing
- Document custom scripts - If you create generate_test_audio.py, add it to the repo
Integration with Other Skills
Works well with:
- cross-repo-sync - Syncing audio between BobTheSkull5 and BobFast5
- audio-injection-testing - Using generated test audio for automated testing
- pi-deployment - Deploying audio files to Raspberry Pi
Time Savings
Without skill:
- 10-15 minutes per audio file (setup, generation, naming, placement, verification)
- Frequent errors in naming/directory structure
- Manual cross-repo copying with mistakes
With skill:
- 3-5 minutes per audio file (documented process)
- Consistent naming via normalization function
- Clear cross-repo sync patterns
Estimated time savings: 2-3x faster
References
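Generation Scripts:
- generate_greeting_audio.py - Greeting audio
- generate_startup_audio.py - Startup/shutdown/error messages
- generate_static_audio.py - All static audio (comprehensive)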
Supporting Code:
- tts/static_audio.py - Static audio playback and normalization
- BobConfig.py - ElevenLabs configuration
Audio Directories:
- audio/static/greetings/ - Greeting responses
- audio/static/startup/ - Startup/shutdown/error messages
- audio/static/testing/ - Test audio files