Claude Code Plugins

Community-maintained marketplace

Feedback

Assemble final video from audio, images, and subtitles

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name Video Assembly
tier 3
load_policy task-specific
description Assemble final video from audio, images, and subtitles
version 1.0.0
parent_skill production-operations

Video Assembly Skill

The Visual Wrapper for the Audio Journey

This skill handles assembling the final video from mastered audio, scene images, and VTT subtitles.


Purpose

Create YouTube-ready video files that complement the hypnotic audio experience.


Video Standards

Parameter Standard
Resolution 1920x1080 (Full HD)
Aspect Ratio 16:9
Codec H.264
Frame Rate 24 fps (cinematic)
Audio Codec AAC 320kbps
Container MP4

Input Requirements

Input File Required
Master Audio {session}_MASTER.mp3 Yes
Scene Images images/uploaded/*.png Yes
Subtitles output/subtitles.vtt Yes

Scene Image Generation

Primary Method: Stable Diffusion (Default)

python3 scripts/core/generate_scene_images.py sessions/{session}/

Alternative: Midjourney Prompts

python3 scripts/core/generate_scene_images.py sessions/{session}/ --midjourney-only

Alternative: Stock Images

python3 scripts/core/generate_scene_images.py sessions/{session}/ --method stock

Image Specifications

Property Requirement
Resolution 1920x1080 minimum
Format PNG or JPEG
Aspect Ratio 16:9
Naming scene_01.png, scene_02.png, etc.

VTT Subtitle Generation

python3 scripts/ai/vtt_generator.py sessions/{session}

VTT Format

WEBVTT
Kind: captions
Language: en

1
00:00:00.000 --> 00:00:05.500
Welcome to this healing journey.

2
00:00:06.000 --> 00:00:12.000
Find a comfortable position and
allow your eyes to close.

Subtitle Guidelines

Guideline Value
Max lines per caption 2
Max characters per line ~80
Min duration 1.5 seconds
Max duration 7 seconds

Video Assembly Command

python3 scripts/core/assemble_session_video.py sessions/{session}/

This automatically:

  • Sequences images based on script sections
  • Adds cross-fade transitions
  • Syncs subtitles to audio
  • Outputs to output/video/session_final.mp4

Manual FFmpeg Assembly

For custom control:

# Create video from images with audio
ffmpeg -y \
  -framerate 1/10 \
  -pattern_type glob -i 'images/uploaded/*.png' \
  -i output/{session}_MASTER.mp3 \
  -c:v libx264 -r 24 -pix_fmt yuv420p \
  -c:a aac -b:a 320k \
  -shortest \
  output/video/session_final.mp4

Image-to-Section Mapping

Images should correspond to script sections:

Image Section Timing
scene_01.png Pre-Talk 0:00-3:00
scene_02.png Induction 3:00-8:00
scene_03.png Deepening 8:00-12:00
scene_04.png Journey Start 12:00-17:00
scene_05.png Journey Core 17:00-22:00
scene_06.png Helm/Deepest 22:00-25:00
scene_07.png Integration 25:00-28:00
scene_08.png Emergence 28:00-30:00

Transition Effects

Transition Duration Use For
Cross-dissolve 2-3 seconds Section transitions
Fade from black 3 seconds Opening
Fade to black 3 seconds Closing

Output Files

File Location Purpose
Final video output/video/session_final.mp4 Direct use
YouTube copy output/youtube_package/final_video.mp4 Upload ready

Video Overlay Generation

Generate supporting graphics:

python3 scripts/core/generate_video_images.py sessions/{session}/ --all

Creates:

  • title_card.png - Video intro screen
  • sections/section_*.png - Chapter transitions
  • outro.png - End screen
  • social_preview.png - Social sharing

Quality Verification

After assembly:

# Check video properties
ffprobe -v error -show_format -show_streams output/video/session_final.mp4

# Play with VLC to verify sync
vlc output/video/session_final.mp4

Quality Checklist

  • Resolution is 1920x1080
  • Frame rate is 24 fps
  • Audio syncs with subtitles
  • Transitions are smooth
  • No visible artifacts
  • Duration matches audio
  • File size reasonable (<2GB for 30 min)

Troubleshooting

Issue Cause Solution
Audio/video desync Different durations Use -shortest flag
Pixelated video Wrong pixel format Use -pix_fmt yuv420p
Green frames Image format issue Convert images to PNG
Subtitle timing off VTT not scaled Regenerate VTT with actual audio duration
File too large Bitrate too high Use -crf 23 for smaller file

Integration with Pipeline

Before (dependencies):

  • Audio mastered ({session}_MASTER.mp3)
  • Scene images ready (images/uploaded/)
  • VTT subtitles generated (output/subtitles.vtt)

After (next steps):

  • YouTube packaging

Related Resources

  • Skill: tier3-production/audio-mixing/ (input)
  • Skill: tier3-production/youtube-packaging/ (next step)
  • Doc: docs/STOCK_IMAGE_SOP.md
  • Script: scripts/core/assemble_session_video.py