Claude Code Plugins

Community-maintained marketplace

youtube-shorts-creator

@j8647877-hash/claude-code-skills


SKILL.md

name youtube-shorts-creator
description Use this skill when the user asks to create a YouTube Short, generate a short-form vertical video, make a YouTube Shorts video from an idea, or mentions 'youtube-shorts-creator'. Generates a 30-60 second 1080x1920 portrait video with AI-generated video clips (Veo 3.1), static images with Ken Burns effects (Gemini Nano Banana 2), AI voiceover (Gemini TTS), background music, sound effects, optional captions, and a thumbnail. Output is a local MP4 + thumbnail PNG.
version 3.0.0

YouTube Shorts Creator v3.0

Create professional YouTube Shorts (30-60 second vertical videos) from a simple idea input. Uses an audio-first pipeline that generates per-segment voiceover FIRST, measures actual audio durations, then generates visuals to match — ensuring perfect audio-visual sync where no scene ever changes while narration is still playing.

v3.0 New Features:

  • Background music with automatic volume ducking
  • Transition sound effects
  • WhisperX-accurate captions (speech recognition)
  • Advanced transitions (wipe, clock-wipe, flip)
  • Spring-based animations for text overlays
  • Video progress bar
  • Loop ending for seamless rewatching
  • Configurable caption styling
  • Audio normalization to -14 LUFS (YouTube standard)
  • Hook segment optimization

Technology Stack

| Technology | Purpose |
|---|---|
| Veo 3.1 (veo-3.1-generate-001) | Generate 8-second video clips in 9:16 portrait |
| Nano Banana 2 (gemini-3.1-flash-image-preview) | Generate scene images, reference frames, thumbnails |
| Gemini TTS (gemini-2.5-flash-preview-tts) | Primary voiceover narration |
| Google Cloud TTS (Chirp 3 HD) | Fallback TTS |
| FFmpeg | Background music and synthetic sound effects generation |
| Giphy API | Search and download GIFs, memes, and animated stickers |
| @remotion/gif | Render GIF files in Remotion compositions |
| Remotion Framework | Compose all assets into final MP4 with transitions |
| WhisperX (whisper.cpp) | Accurate word-level caption timestamps |
| FFmpeg | Audio processing, normalization, concatenation |

Required Environment Variables

Before running, ensure these are set:

  • GEMINI_API_KEY - Google AI API key (for Veo 3.1 + Gemini image generation + Gemini TTS)
  • GIPHY_API_KEY - (Optional) Giphy API key for GIF segments and sticker overlays (get one at https://developers.giphy.com/)
  • GOOGLE_APPLICATION_CREDENTIALS - (Optional) Google Cloud service account JSON path for TTS fallback

Workflow

Phase 1: Idea Intake & Creative Brief

When the user provides an idea, use the AskUserQuestion tool to ask ALL of the following in a single multi-question prompt. Group them logically. The user can always pick "Other" to customize.

Question 1 — Duration

  • 30 seconds (quick, punchy)
  • 45 seconds (balanced, recommended)
  • 60 seconds (detailed, in-depth)

Question 2 — Visual Style

  • Cinematic (dramatic lighting, film grain, lens flares)
  • Clean & Modern (minimalist, flat design, bright colors)
  • Neon / Cyberpunk (dark backgrounds, glowing colors, futuristic)
  • Nature / Documentary (earthy tones, natural lighting, organic textures)
  • Retro / Vintage (muted colors, film effects, old-school typography)
  • Abstract / Artistic (surreal imagery, bold shapes, creative compositions)

Question 3 — Color Palette / Mood

  • Dark & Dramatic (blacks, deep blues, purples)
  • Bright & Energetic (vivid colors, high saturation, yellows/oranges)
  • Warm & Inviting (ambers, warm whites, soft oranges)
  • Cool & Professional (blues, grays, clean whites)
  • Pastel & Soft (light pinks, lavenders, muted tones)

Question 4 — Voice

  • Sulafat (Female, warm narration)
  • Laomedeia (Female, energetic)
  • Achernar (Female, soft & warm)
  • Charon (Male, informative narration)
  • Alnilam (Male, firm & confident)
  • Iapetus (Male, clear)
  • Fenrir (Male, dynamic)

Question 5 — Pacing / Energy

  • Slow & contemplative (fewer words, longer visual holds)
  • Moderate & balanced (standard YouTube Shorts pacing)
  • Fast & high-energy (rapid-fire facts, quick cuts, dynamic)

Question 6 — Captions

  • TikTok-style (word-by-word highlight, bold, centered — recommended)
  • Subtitle bar (lower-third bar with text)
  • No captions

Question 7 — Text Overlays

  • Yes, show key facts/concepts on screen (recommended for educational/instructional)
  • Minimal (only intro title)
  • No text overlays

Question 8 — Content Type (helps determine segment types and text overlay usage)

  • Educational / Instructional (facts, how-to, explanations — will add text overlays for key concepts)
  • Storytelling / Narrative (story arc, dramatic moments)
  • Listicle / Top N (numbered items, each gets its own segment)
  • Motivational / Inspirational (quotes, uplifting visuals)
  • Entertainment / Fun facts (surprising, engaging, casual tone)

Question 9 — Visual Mode (determines what type of visuals to use)

  • Mixed — images + video clips (recommended, uses both Veo video and Gemini images)
  • Images only (all segments use AI-generated images with Ken Burns effects — faster, no Veo wait)
  • Video only (all segments use Veo 3.1 video clips — more dynamic, longer generation time)

Question 10 — Background Music

  • Yes, add background music (recommended — auto-ducks under narration)
  • No background music

Question 11 — Giphy Content (GIFs, memes, and animated stickers from Giphy)

  • Both GIFs and stickers (full-frame GIF/meme segments + animated sticker overlays)
  • GIFs/memes only (reaction GIFs as full-frame segments — great for humor/reactions)
  • Stickers only (animated sticker overlays on top of other segments)
  • None (no Giphy content — recommended if no GIPHY_API_KEY is set)

After collecting answers, use them to inform every aspect of the storyboard: the visual prompts should reference the chosen style and color palette, the narration pacing should match the energy level, text overlays should be added where appropriate based on the content type, and all segment types must respect the chosen visual mode.

If background music is selected, generate a descriptive music.prompt in the storyboard meta that matches the video's mood, pacing, and content. Examples:

  • Educational: "mysterious ambient documentary background music, subtle synth pads, slow tempo"
  • Energetic: "upbeat electronic background beat, driving rhythm, high energy"
  • Motivational: "inspiring orchestral background music, building intensity, hopeful"
  • Storytelling: "cinematic atmospheric soundtrack, tension building, dramatic strings"

Phase 2: Storyboard Generation

Generate a structured storyboard JSON following the schema in references/storyboard-schema.md. Key rules:

CRITICAL: Audio-First Segment Timing

The storyboard duration_seconds values are INITIAL ESTIMATES only. The actual segment durations will be automatically adjusted after TTS generation to match the real audio length of each segment's narration. Write narration at ~150 words/minute as an estimate, but know that the pipeline will correct timing based on measured audio.

Each segment's final visual duration = actual audio duration + padding (0.3s default).
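
The timing rule above can be sketched in Python. This is a minimal illustration under stated assumptions, not the code in generate_assets.py: the function names are hypothetical, and the 30 fps value is taken from the output report later in this document.

```python
import math

WORDS_PER_MINUTE = 150   # speaking-rate estimate used when writing narration
DEFAULT_PADDING = 0.3    # seconds of padding after each narration
FPS = 30                 # frame rate listed in the final output report


def estimate_duration(narration_text: str) -> float:
    """Initial estimate of speaking time, based on word count only."""
    words = len(narration_text.split())
    return words * 60.0 / WORDS_PER_MINUTE


def final_duration(audio_duration_seconds: float,
                   padding_seconds: float = DEFAULT_PADDING) -> float:
    """Visual duration once the real audio length has been measured."""
    return audio_duration_seconds + padding_seconds


def duration_in_frames(seconds: float, fps: int = FPS) -> int:
    """Round up so the visual never ends before the audio does.

    The inner round() removes floating-point noise before ceil().
    """
    return math.ceil(round(seconds * fps, 6))
```

The estimate is only used while drafting the storyboard; once the measured audio length is available, `final_duration` replaces it.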

CRITICAL: Veo 3.1 Video Clip Duration Constraint

Veo 3.1 clips are EXACTLY 8 seconds long — this is a hard limit from the API. For any segment with type: "video", the narration MUST be written to fit within 8 seconds of speaking time (including padding). This means:

  • Target ~18-19 words per video segment narration (at ~150 wpm, 20 words ≈ 8 seconds, which leaves no room for the default 0.3s padding)
  • The voiceover duration + padding must NOT exceed 8 seconds, or the video clip will end before the narration finishes
  • Write concise, punchy narration for video segments — every word must earn its place
  • If a concept needs more narration time, use an image segment instead (images with Ken Burns can be any duration)
  • The goal is seamless sync: the voiceover plays for the full duration of the video clip with no dead air and no clip looping
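
A quick way to check a draft narration against the 8-second limit is sketched below. It relies only on the ~150 wpm estimate and the padding value described above; the function names are hypothetical, and real TTS output may run faster or slower than the estimate.

```python
VEO_CLIP_SECONDS = 8.0   # clip length stated in this document
WORDS_PER_MINUTE = 150   # speaking-rate estimate


def fits_veo_clip(narration_text: str, padding_seconds: float = 0.3) -> bool:
    """True if estimated speaking time plus padding fits in one clip."""
    words = len(narration_text.split())
    estimated = words * 60.0 / WORDS_PER_MINUTE + padding_seconds
    return estimated <= VEO_CLIP_SECONDS


def max_words_for_clip(padding_seconds: float = 0.3) -> int:
    """Largest word count whose estimate still fits with the given padding."""
    budget = VEO_CLIP_SECONDS - padding_seconds
    return int(budget * WORDS_PER_MINUTE / 60.0)
```

With the default 0.3s padding, the estimate allows 19 words; 20 words fit only when the padding is zero.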

Hook Segment Optimization:

The FIRST segment is critical for YouTube's algorithm. Mark it with "is_hook": true and follow these rules:

  • Keep it 2-3 seconds long with punchy, provocative narration
  • Use a bold visual that immediately grabs attention
  • Open with a question, surprising fact, or bold statement
  • Consider using "spring-pop" or "scale-pop" animation for text overlays
  • Pattern interrupt: subvert expectations in the first frame

Visual Mode Handling:

The visual_mode field in meta controls what segment types are allowed:

  • "mixed" (default) — Use the heuristics table below to decide each segment's type. Mix of video and image segments.
  • "images-only" — ALL segments MUST be type: "image" with Ken Burns presets. No video segments. Vary Ken Burns presets for visual interest.
  • "videos-only" — ALL segments MUST be type: "video". No image segments. Each Veo clip is max 8 seconds; plan segment durations accordingly.

Segment Type Decision Heuristics (used only when visual_mode is "mixed"):

| Content Type | Segment Type | Reason |
|---|---|---|
| Action, motion, transformation | video (Veo 3.1) | Moving subjects need video |
| Establishing shots, landscapes | video (Veo 3.1) | Cinematic environments |
| Static facts, quotes, statistics | image (Gemini) + Ken Burns | Text/data works better as images |
| Diagrams, comparisons, lists | image (Gemini) + Ken Burns | Infographic-style content |
| Abstract concepts, metaphors | image (Gemini) + Ken Burns | Conceptual imagery |
| Intro/outro title cards | image (Gemini) + Ken Burns | Text-heavy frames |
| Reaction moments, humor, pop culture | gif (Giphy) | Recognizable memes/reactions resonate with audiences |

Budget:

  • Mixed mode: 2-4 VIDEO segments and 3-6 IMAGE segments for a 45-second Short.
  • Images-only mode: 6-10 IMAGE segments for a 45-second Short (4-8 seconds each with Ken Burns).
  • Videos-only mode: 4-7 VIDEO segments for a 45-second Short (each Veo clip is max 8 seconds).
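
The visual mode rules can be checked mechanically before the storyboard is shown to the user. The sketch below is an assumption-heavy illustration: the top-level "segments" key is a guess at the schema in references/storyboard-schema.md, and it reads "images-only" strictly, so "gif" segments are rejected in that mode.

```python
# Segment types permitted by each visual_mode value.
# Assumption: "gif" segments are only valid in "mixed" mode.
ALLOWED_TYPES = {
    "mixed": {"video", "image", "gif"},
    "images-only": {"image"},
    "videos-only": {"video"},
}


def invalid_segments(storyboard: dict) -> list[int]:
    """Return 1-based indexes of segments whose type breaks the visual mode."""
    mode = storyboard.get("meta", {}).get("visual_mode", "mixed")
    allowed = ALLOWED_TYPES[mode]
    return [
        index
        for index, segment in enumerate(storyboard.get("segments", []), start=1)
        if segment.get("type") not in allowed
    ]
```

An empty list means every segment respects the chosen mode.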

Narration: Write at ~150 words/minute speaking rate as an estimate. Each segment MUST have its own narration_text. The narration for each segment will be generated as a separate audio file for perfect timing sync.

Per-Segment Audio Fields:

  • audio_filename — Initially null, populated by the pipeline (e.g., "segment-01_audio.mp3")
  • audio_duration_seconds — Initially null, populated with actual measured duration
  • padding_seconds — Gap after narration before next segment (default: 0.3, use 0.5 for dramatic pauses, 0.1 for fast-paced)

Visual Prompts: Write detailed, cinematic prompts. For Veo: include camera movement, lighting, mood. For Gemini images: include style, composition, color palette. Always specify 9:16 portrait orientation in prompts.

Ken Burns Presets (for image segments, alternate between them for variety):

  • zoom-in-slow - Slowly zoom into center
  • zoom-out-slow - Slowly zoom out from center
  • pan-left - Slow pan from right to left
  • pan-right - Slow pan from left to right
  • zoom-in-top-left - Zoom into upper-left corner
  • zoom-out-center - Zoom out from center
  • pan-up - Slow pan from bottom to top
  • pan-down - Slow pan from top to bottom
  • zoom-in-bottom-right - Zoom into bottom-right corner
  • drift-diagonal - Gentle diagonal drift with subtle zoom
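
One simple way to alternate presets is to cycle through the list in order, as in this sketch. The helper name is hypothetical and the "ken_burns_preset" key follows the field name used later in this document; a hand-picked preset per segment is equally valid.

```python
from itertools import cycle

KEN_BURNS_PRESETS = [
    "zoom-in-slow", "zoom-out-slow", "pan-left", "pan-right",
    "zoom-in-top-left", "zoom-out-center", "pan-up", "pan-down",
    "zoom-in-bottom-right", "drift-diagonal",
]


def assign_presets(segments: list[dict]) -> list[dict]:
    """Give each image segment the next preset; other types are left alone."""
    presets = cycle(KEN_BURNS_PRESETS)
    for segment in segments:
        if segment.get("type") == "image":
            segment["ken_burns_preset"] = next(presets)
    return segments
```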

Transitions (between segments — v3.0 adds wipe, clock-wipe, flip):

  • fade - Cross-fade (default, most natural)
  • slide-left - Slide in from right
  • slide-up - Slide up from bottom
  • wipe-left - Wipe from right to left (clean, modern)
  • wipe-right - Wipe from left to right
  • wipe-up - Wipe from bottom to top
  • clock-wipe - Circular clock wipe effect (dramatic)
  • flip - 3D flip transition (dynamic)
  • none - Hard cut

Transition Sound Effects (optional per segment): For segments with transitions, you can add a transition_sfx object:

  • prompt — Descriptive text for generating the SFX (e.g., "deep cinematic whoosh transition sound")
  • volume — Playback volume (default: 0.5)
  • Best for: slide, wipe, and flip transitions. Skip for subtle fades.

Giphy GIF Segments (when Giphy content is enabled):

For type: "gif" segments, set giphy_search with a descriptive query. The pipeline downloads the top matching GIF from Giphy and renders it full-frame with looping. Best for reaction moments, humor, and pop culture references.

  • giphy_search — Descriptive search query (e.g., "mind blown explosion reaction", "surprised pikachu meme")
  • visual_prompt — Still required as documentation, but not used for generation
  • No ken_burns_preset needed (GIFs are animated)
  • GIFs loop automatically for the segment duration

Giphy Sticker Overlays (optional per segment):

Any segment type can have a giphy_overlay object for an animated transparent sticker:

  • search — Giphy sticker search query (e.g., "fire emoji", "thumbs up animated", "sparkle effect")
  • position — Anchor: "top-left", "top-right", "bottom-left", "bottom-right", "center" (default: "bottom-right")
  • scale — Fraction of frame width, 0.0-1.0 (default: 0.25 = 270px on 1080px)
  • offset_x / offset_y — Pixel offset from anchor position (default: 0)
  • delay_frames — Frames to wait before showing (default: 0)
  • duration_frames — How long to show (null = entire segment duration)
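
To make the scale and anchor arithmetic concrete, the sketch below computes a sticker's pixel box on the 1080x1920 frame. How the Remotion template actually places stickers is not described here, so the corner alignment, the offset direction, and the aspect_ratio parameter are all assumptions.

```python
FRAME_W, FRAME_H = 1080, 1920  # portrait frame size used by this skill


def sticker_box(scale=0.25, position="bottom-right",
                offset_x=0, offset_y=0, aspect_ratio=1.0):
    """Return (x, y, width, height) of the sticker in pixels.

    Assumptions: the sticker's corner is aligned to the matching frame
    corner, offsets are added to the top-left coordinate, and
    aspect_ratio is width / height of the downloaded sticker.
    """
    width = round(FRAME_W * scale)
    height = round(width / aspect_ratio)
    if position == "center":
        x = (FRAME_W - width) // 2
        y = (FRAME_H - height) // 2
    else:
        vertical, horizontal = position.split("-")
        x = 0 if horizontal == "left" else FRAME_W - width
        y = 0 if vertical == "top" else FRAME_H - height
    return x + offset_x, y + offset_y, width, height
```

With the defaults, the width is 0.25 x 1080 = 270 pixels, matching the figure quoted above.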

When to use stickers:

  • Emoji reactions to emphasize key moments
  • Animated effects (sparkles, fire, explosions) for visual flair
  • Arrows/pointers to draw attention
  • DO NOT overuse: max 1 sticker per segment, keep them small (0.15-0.3 scale)

Text Overlays (optional per segment, especially for educational/instructional content):

Each segment can have a text_overlay object with these fields:

  • text - The text to display (keep short: key facts, numbers, labels, 1-8 words)
  • position - "top", "center", or "bottom" (default: "center")
  • style - Visual preset:
    • "bold-fact" — Large bold white text with heavy shadow (for key statistics, facts)
    • "subtitle-bar" — Text on a semi-transparent dark bar (for explanations)
    • "callout" — Uppercase gold text (for emphasis, warnings, key terms)
    • "highlight-box" — White text on a colored box with rounded corners (for labels, categories)
    • "minimal" — Small italic translucent text (for subtle annotations)
  • animateIn - Entrance animation:
    • "slide-up" — Slides up into position (recommended)
    • "fade" — Simple fade in
    • "scale-pop" — Pops in with a bounce
    • "typewriter" — Characters appear one by one
    • "spring-pop" — Natural bouncy pop using Remotion spring physics (v3.0)
    • "spring-slide" — Natural slide-up with spring overshoot (v3.0)
  • delayFrames - Delay before appearing (default: 0, use 15-30 to sync with narration)
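
A text_overlay object can be assembled and sanity-checked with a small helper such as the one below. The helper itself is hypothetical; the field names and allowed values come from the list above, and the seconds-to-frames conversion assumes the 30 fps output reported later in this document.

```python
FPS = 30  # assumption: matches the FPS shown in the output report

POSITIONS = {"top", "center", "bottom"}
STYLES = {"bold-fact", "subtitle-bar", "callout", "highlight-box", "minimal"}
ANIMATIONS = {"slide-up", "fade", "scale-pop", "typewriter",
              "spring-pop", "spring-slide"}


def make_text_overlay(text, style, position="center",
                      animate_in="slide-up", delay_seconds=0.0):
    """Build a text_overlay dict, rejecting values outside the documented sets."""
    word_count = len(text.split())
    if not 1 <= word_count <= 8:
        raise ValueError("overlay text should be 1-8 words")
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    if position not in POSITIONS:
        raise ValueError(f"unknown position: {position}")
    if animate_in not in ANIMATIONS:
        raise ValueError(f"unknown animation: {animate_in}")
    return {
        "text": text,
        "position": position,
        "style": style,
        "animateIn": animate_in,
        "delayFrames": round(delay_seconds * FPS),
    }
```

For example, a half-second delay becomes 15 frames, the low end of the 15-30 range suggested for syncing with narration.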

When to use text overlays:

  • Educational content: Show key facts, statistics, definitions (e.g., "6 BILLION TONS", "243 EARTH DAYS")
  • Instructional content: Label steps, highlight concepts, show formulas
  • Listicles: Show the item number/title (e.g., "#3: Venus", "FACT 4")
  • Motivational: Display quotes or key phrases
  • DO NOT clutter: max 1 text overlay per segment, keep text very short

Background Music (when enabled): Set meta.music with a descriptive prompt. The pipeline will:

  1. Generate music via FFmpeg synthetic audio generation
  2. Loop it to match video duration
  3. Normalize to -14 LUFS
  4. Auto-duck volume under narration (default: 0.15 full, 0.05 ducked)
  5. Fade in at start (1s) and fade out at end (2s)
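
The ducking and fade behavior can be pictured as a volume function of time. The sketch below is a simplified model using the default values listed above; the linear fades and the instant switch between full and ducked volume are assumptions, and the real composition may smooth these changes.

```python
def music_volume(t, narration, total,
                 full=0.15, ducked=0.05, fade_in=1.0, fade_out=2.0):
    """Music volume at time t (seconds).

    narration: list of (start, end) intervals where voiceover is playing.
    total: total video duration in seconds.
    """
    speaking = any(start <= t < end for start, end in narration)
    base = ducked if speaking else full
    # Linear ramp up over the first fade_in seconds and down over the
    # last fade_out seconds; 1.0 everywhere in between.
    envelope = min(1.0, t / fade_in, (total - t) / fade_out)
    return base * max(0.0, envelope)
```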

Loop Ending (optional): Set meta.loop_ending: true to cross-dissolve the last 0.5s to the first segment's visual, creating a seamless loop that encourages rewatching.

Progress Bar (optional): Set meta.show_progress_bar: true to show a thin progress indicator at the top. Customize color with meta.progress_bar_color.

Phase 3: User Approval

Present the storyboard in this format:

=== YouTube Short Storyboard ===
Title: "[Title]"
Duration: ~[N] seconds | Segments: [N] ([X] video + [Y] images)
Visual Mode: [mixed / images-only / videos-only]
Voice: [Voice Name] ([Gender], [Style])
Music: [prompt or "None"]
Features: [progress bar, loop ending, etc.]

[0:00-0:05] SEGMENT 1 ([TYPE]) - [Label] [HOOK]
  Narration: "[Narration text]"
  Visual: [Brief visual description]
  [Ken Burns: preset] (if image)
  [Text Overlay: "KEY FACT" (style, position, animation)] (if applicable)
  [Transition SFX: "whoosh sound"] (if applicable)
  Transition: [type]
  Padding: [0.3s]

[0:05-0:12] SEGMENT 2 ([TYPE]) - [Label]
  ...

Thumbnail: [Brief thumbnail description]

Note: Segment durations are estimates. The audio-first pipeline will
adjust each segment to exactly match the actual narration length,
ensuring perfect audio-visual sync.

Wait for user approval before proceeding. The user can modify segments, swap video/image types, change the voice, or adjust narration.

Phase 4: Asset Generation (Audio-First Pipeline)

After approval:

Step 0: Choose output directory

Ask the user where to save the output, or default to the current working directory with a subfolder named after the video title (slugified). Example: ./mind-blowing-space-facts/

Save the storyboard JSON to <output-dir>/storyboard.json.
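
A slug like the one in the example can be produced with a helper along these lines. This is a sketch of one common slugging rule (lowercase, non-alphanumeric runs collapsed to hyphens), not necessarily the rule the skill applies.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of other characters with '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")
```

With this rule, the title "Mind-Blowing Space Facts!" maps to the folder name shown in the example above.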

Step 1: Set up the Remotion project

If the Remotion project does not exist at the output directory:

xcopy /E /I /Y "C:\Users\eckme\.claude\skills\youtube-shorts-creator\assets\remotion-template" "<output-dir>\remotion-project"
cd "<output-dir>\remotion-project"
npm install

Step 2: Run the audio-first asset generation pipeline

python "C:\Users\eckme\.claude\skills\youtube-shorts-creator\scripts\generate_assets.py" "<output-dir>\storyboard.json" "<output-dir>\remotion-project"

This script runs an audio-first pipeline in nine steps (Phases 1 through 6, plus sub-phases 1B, 2B, and 3B):

  1. Phase 1 — Per-Segment TTS: Generates individual audio files for each segment's narration. Measures actual duration of each audio file via ffprobe. Normalizes each to -14 LUFS.
  2. Phase 1B — Background Music: Generates background music via FFmpeg synthetic audio generation (if meta.music is set). Loops to target duration, normalizes to -14 LUFS.
  3. Phase 2 — Update Durations: Adjusts each segment's visual duration to match its actual audio duration + padding, ensuring no scene changes during narration.
  4. Phase 2B — Giphy Download: Downloads GIFs for "gif" segments and sticker overlays from Giphy API (if any exist in storyboard).
  5. Phase 3 — Visual Generation (Parallel): Generates Veo 3.1 video clips + Gemini images + thumbnail in parallel, now knowing exact durations. GIF segments are skipped (already handled by Phase 2B).
  6. Phase 3B — Transition SFX: Generates short sound effect clips for segments with transition_sfx configured.
  7. Phase 4 — Caption Generation: Creates per-segment word-level captions using WhisperX (speech recognition) with proportional fallback.
  8. Phase 5 — Audio Concatenation: Merges per-segment audio files into a single voiceover reference file.
  9. Phase 6 — Storyboard Update: Writes all asset filenames, actual durations, and metadata back to the storyboard JSON.
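
For reference, the duration measurement in Phase 1 can be done with a standard ffprobe invocation. The sketch below shows one way to wrap it; the function names are hypothetical, and generate_assets.py may call ffprobe differently.

```python
import subprocess


def ffprobe_duration_cmd(path: str) -> list[str]:
    """Arguments that make ffprobe print only the container duration."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def measure_duration(path: str) -> float:
    """Run ffprobe (must be on PATH) and return the duration in seconds."""
    result = subprocess.run(
        ffprobe_duration_cmd(path),
        capture_output=True, text=True, check=True,
    )
    return float(result.stdout.strip())
```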

Monitor the output and relay progress to the user.

Phase 5: Render

After all assets are generated:

cd "<output-dir>\remotion-project"
npx remotion render YouTubeShort "..\output.mp4" --props="..\storyboard.json" --codec=h264 --pixel-format=yuv420p --audio-codec=aac --crf=18

Copy the thumbnail:

copy "<output-dir>\remotion-project\public\thumbnail.png" "<output-dir>\thumbnail.png"

Phase 6: Output

Report the results:

=== YouTube Short Created! ===

Video: <path>/output.mp4
Thumbnail: <path>/thumbnail.png

Duration: [N] seconds | Resolution: 1080x1920 | FPS: 30
Segments: [N] ([X] video clips + [Y] images with Ken Burns)
Voiceover: [Voice Name] (Gemini TTS) — per-segment sync
Music: [Yes/No] — auto-ducking enabled
Captions: [WhisperX / Proportional / Disabled]
Audio: Normalized to -14 LUFS
Audio Sync: Perfect (audio-driven segment durations)

Gemini TTS Voice Options

| Voice Name | Gender | Style |
|---|---|---|
| Sulafat | Female | Warm, narration |
| Laomedeia | Female | Energetic |
| Achernar | Female | Soft, warm |
| Charon | Male | Informative, narration |
| Alnilam | Male | Firm, confident |
| Iapetus | Male | Clear |
| Fenrir | Male | Dynamic |

Error Handling

  • Veo timeout/block: Substitute with Gemini image + Ken Burns effect (rotates through varied presets)
  • Gemini TTS failure: Fall back to Google Cloud TTS Chirp 3 HD
  • Image generation failure: Uses gradient background with narration text overlay
  • Caption generation failure: Render without captions
  • WhisperX failure: Falls back to proportional captions per-segment (mixed mode: some Whisper, some proportional)
  • Per-segment TTS failure: Uses estimated duration from word count as fallback
  • Audio concatenation failure: Calculates total duration from individual files
  • Background music failure: Video renders without music (graceful skip)
  • SFX generation failure: Transition plays without sound effect
  • Giphy search failure: GIF segments use gradient fallback, sticker overlays are skipped
  • Missing GIPHY_API_KEY: Warning printed, Giphy features gracefully skipped
  • LUFS normalization failure: Audio used as-is (graceful skip)
  • All API calls: Retry up to 3 times with exponential backoff
  • Storyboard backup: Original storyboard is saved as *_original.json before updates
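
The retry rule can be illustrated with a generic helper. This sketch reads "up to 3 times" as three attempts in total and uses a 1-second base delay that doubles each time; both choices are assumptions, as is the helper's name.

```python
import time


def call_with_retries(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay, 2*base_delay, ... and retry.

    The last exception is re-raised once max_attempts calls have failed.
    The sleep argument exists so tests can avoid real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap each API request in a zero-argument function or lambda and, if the final attempt still raises, apply the matching fallback from the list above.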

Key Architecture: Audio-First Sync

The v3.0 pipeline ensures perfect audio-visual synchronization:

  1. Each segment gets its own audio file (e.g., segment-01_audio.mp3)
  2. All audio normalized to -14 LUFS (YouTube loudness standard)
  3. Each segment's visual duration is set to actual audio duration + padding
  4. In the Remotion composition, each segment's audio plays at its exact start frame
  5. Background music auto-ducks when narration is playing
  6. Captions generated using WhisperX speech recognition for accurate word timing
  7. No global voiceover track — each segment is self-contained for perfect sync

This means:

  • A scene never changes while its narration is still playing
  • There is always a small gap (padding) between narration and the next scene
  • Captions are accurately timed to the actual words spoken
  • Video clips are 8 seconds max — narration for video segments should be written to fit within 8 seconds (avoid looping)
  • Background music volume drops automatically during narration

v3.0 Feature Details

Background Music

  • Generated via FFmpeg synthetic audio generation from a descriptive prompt
  • Automatically looped with FFmpeg if video exceeds clip length
  • Volume ducking: full volume (0.15) during pauses, reduced (0.05) during narration
  • Fade in (1s) at video start, fade out (2s) at video end
  • Skipped gracefully if generation fails

Transition Sound Effects

  • Short SFX clips (1.5s) generated per-segment via FFmpeg
  • Play at the transition point between segments
  • Configurable volume per-segment
  • Best paired with dynamic transitions (wipe, flip, slide)

WhisperX Captions

  • Uses whisper.cpp via @remotion/install-whisper-cpp for speech recognition
  • Per-segment processing: each audio file transcribed individually
  • Mixed mode: segments that fail Whisper use proportional fallback
  • One-time model download (~500MB medium.en model)
  • Falls back to proportional estimation if Node.js/whisper unavailable
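
The proportional fallback is not specified in detail here. One plausible reading, sketched below, splits the measured audio duration across words in proportion to their character counts; the weighting rule and the output keys ("text", "start", "end") are assumptions.

```python
def proportional_captions(narration_text: str, audio_duration: float) -> list[dict]:
    """Estimate word timings without speech recognition.

    Each word gets a share of the audio duration proportional to its length.
    """
    words = narration_text.split()
    if not words:
        return []
    total_chars = sum(len(word) for word in words)
    captions = []
    cursor = 0.0
    for word in words:
        span = audio_duration * len(word) / total_chars
        captions.append({"text": word, "start": cursor, "end": cursor + span})
        cursor += span
    return captions
```

Timings produced this way drift wherever the speaker pauses, which is why speech recognition is preferred when it is available.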

Advanced Transitions

  • wipe-left / wipe-right / wipe-up: Clean directional wipe effects
  • clock-wipe: Circular clock-hand wipe (dramatic reveals)
  • flip: 3D card flip transition
  • All use Remotion's @remotion/transitions library

Spring Animations

  • spring-pop: Bouncy scale-up using Remotion spring physics
  • spring-slide: Natural slide-up with overshoot
  • More organic, professional motion than linear easing

Progress Bar

  • Thin bar at top showing video progress (0% to 100%)
  • Customizable color via meta.progress_bar_color
  • Subtle glow effect for visibility

Loop Ending

  • Cross-dissolves to first segment's image in last 0.5s
  • Creates seamless loop that encourages repeat watching
  • YouTube algorithm favors videos that get rewatched

Audio Normalization

  • Two-pass FFmpeg loudnorm filter targeting -14 LUFS
  • Applied to all TTS audio and background music
  • Ensures consistent volume across the video
  • Gracefully skipped if FFmpeg unavailable
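
A two-pass loudnorm run first measures the input, then feeds the measurements back into the filter. The sketch below builds both FFmpeg command lines; the -14 LUFS target comes from this document, while the true-peak (TP) and loudness-range (LRA) values and the function names are assumptions.

```python
import json

TARGET = "I=-14:TP=-1.5:LRA=11"  # TP and LRA values are assumed, not from the skill


def first_pass_cmd(src: str) -> list[str]:
    """Measurement pass: analyze only, print stats as JSON on stderr."""
    return ["ffmpeg", "-hide_banner", "-i", src,
            "-af", f"loudnorm={TARGET}:print_format=json",
            "-f", "null", "-"]


def parse_measurements(stderr_text: str) -> dict:
    """Extract the JSON object that loudnorm prints at the end of stderr."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_cmd(src: str, dst: str, measured: dict) -> list[str]:
    """Normalization pass using the values measured in the first pass."""
    filt = (f"loudnorm={TARGET}"
            f":measured_I={measured['input_i']}"
            f":measured_TP={measured['input_tp']}"
            f":measured_LRA={measured['input_lra']}"
            f":measured_thresh={measured['input_thresh']}"
            f":offset={measured['target_offset']}"
            ":linear=true")
    return ["ffmpeg", "-y", "-i", src, "-af", filt, dst]
```

Each command list can be passed to `subprocess.run`, reading stderr from the first pass to obtain the measurements.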

Backward Compatibility

All v3.0 features default to OFF. A v2.0 storyboard (without music, transition_sfx, caption_style, giphy_search, giphy_overlay, etc.) will render identically to v2.0. New features only activate when their respective fields are present and non-null in the storyboard JSON. Giphy features require GIPHY_API_KEY — if not set, GIF segments fall back to gradient backgrounds and stickers are skipped.

References

Load these as needed for API details:

  • references/veo-api.md - Veo 3.1 API endpoints and parameters
  • references/storyboard-schema.md - Full storyboard JSON schema (v3.0)
  • references/style-templates.md - Pre-built visual style templates (load when user requests a specific style or says "neon rim", "holographic", etc.)

Style Templates

Pre-built visual styles are available in references/style-templates.md. When the user mentions a style by name or trigger phrase, load the reference file and use its templates for all visual prompts. This ensures consistent aesthetics across all segments.

Available styles:

  • Neon-Rim Holographic Documentary — Dark 3D renders with red-orange/cyan-blue color-coded glowing figures on pure black backgrounds. Best for: history, true crime, military, biker culture, noir, mystery, biographical content. Trigger: "neon rim", "holographic", "glowing figures", "dark documentary"

When a style template is active, skip the "Visual Style" question in Phase 1 (it's already decided) and pre-fill the meta.style, meta.caption_style, and meta.progress_bar_color from the template's config.


Business CMS Integration

MANDATORY: After rendering the final video:

  1. Create CMS folder:

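    # Root of the business CMS tree (machine-specific path)
    BUSINESS_ROOT="C:/Users/eckme/OneDrive/Documents/New folder (2)/Business"
    # Date-based subfolder, e.g. 2025/01/31
    DATE_PATH=$(date +"%Y/%m/%d")
    # Replace VIDEO_TITLE with the actual title; lowercases it and turns spaces into hyphens
    VIDEO_SLUG=$(echo "VIDEO_TITLE" | tr '[:upper:]' '[:lower:]' | tr ' ' '-')
    mkdir -p "$BUSINESS_ROOT/Videos/YouTube-Shorts/$DATE_PATH/$VIDEO_SLUG"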
    
  2. Copy deliverables:

    • output.mp4 → final video
    • thumbnail.png → thumbnail
    • storyboard.json → composition spec

  3. Create script.md with the full narration script

  4. Create metadata.md with type (YouTube Short), title, created time, status (Draft), duration, voice used, visual style, and file list

  5. Create/update _manifest.md in the date folder

  6. Tell the user the exact path where the video is saved