Claude Code Plugins

Community-maintained marketplace

Feedback

Video production workflow orchestrator for Veo 3. Guides users through creating professional video prompts via two paths - direct text-to-video OR image-to-video pipeline (Imagen 3/4 → Veo 3). Validates prompt completeness, checks conflicts, ensures all mandatory components present. Integrates camera-movements, great-prompt-anatomy, short-prompt-guide, long-prompt-guide, and imagine skills.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name arch-v
description Video production workflow orchestrator for Veo 3. Guides users through creating professional video prompts via two paths - direct text-to-video OR image-to-video pipeline (Imagen 3/4 → Veo 3). Validates prompt completeness, checks conflicts, ensures all mandatory components present. Integrates camera-movements, great-prompt-anatomy, short-prompt-guide, long-prompt-guide, and imagine skills.

ARCH-V: Video Production Orchestrator

Professional video prompt creation for Veo 3 with two production paths.

Overview

ARCH-V guides you through creating production-ready video prompts by:

  1. Determining optimal workflow path
  2. Loading appropriate reference skills
  3. Validating mandatory components
  4. Checking for conflicts
  5. Delivering final validated prompts

Two Production Paths

Path 1: Text-to-Video (Direct Veo 3)

When to use: You have clear vision for entire video including motion, audio, and can describe it in text.

Output: Single Veo 3 prompt ready for text-to-video generation

Skills used:

  • great-prompt-anatomy (8 mandatory components)
  • camera-movements (standardized vocabulary)
  • short-prompt-guide OR long-prompt-guide (based on complexity)

Path 2: Image-to-Video (Imagen → Veo 3)

When to use: You want precise control over starting visual, or have specific still image composition in mind before adding motion.

Output: Two prompts

  1. Imagen 3/4 prompt for still image generation
  2. Veo 3 image-to-video prompt for animation/motion

Skills used:

  • Stage A (Imagen): imagine skill + great-prompt-anatomy (visual components)
  • Stage B (Veo 3): camera-movements + great-prompt-anatomy (motion components)

Workflow Process

Stage 0: Path Determination

ARCH-V asks:

Which production path?

1. Text-to-Video (Direct Veo 3)
   - Describe entire video in single prompt
   - Faster workflow
   - Less control over initial composition

2. Image-to-Video (Imagen → Veo 3 pipeline)
   - Create perfect still image first
   - Then add motion and animation
   - Maximum control over visual composition
   - Two-step process

User chooses path → ARCH-V loads appropriate skills and guides accordingly


Path 1 Workflow: Text-to-Video

Stage 1: Prompt Type Determination

ARCH-V asks:

What type of video prompt?

SHORT PROMPT (for):
- Filler shots, B-roll
- Atmospheric scenes
- Quick establishing shots
- <3 sentences to describe

LONG PROMPT (for):
- Dialogue scenes
- Character continuity
- Multi-beat sequences (>3 beats)
- Complex choreography

User chooses → ARCH-V loads short-prompt-guide OR long-prompt-guide

Stage 2: Mandatory Components Check

ARCH-V validates all 8 components from great-prompt-anatomy:

Checklist:

    1. Subject (who/what in shot)
    1. Setting (where/when)
    1. Action (what's happening)
    1. Style/Genre (aesthetic)
    1. Camera/Composition (shot size, angle, movement)
    1. Lighting/Mood (light sources, emotional tone)
    1. Audio (dialogue, ambience, music)
    1. Constraints (prohibitions, exact requirements)

If camera movement mentioned: Load camera-movements for vocabulary validation

Stage 3: Validation & Conflict Checking

ARCH-V checks for:

Time/Weather Conflicts:

  • ❌ "Golden hour" with "midnight"
  • ❌ "Harsh noon sun" with "soft evening light"
  • ✅ Consistent time of day throughout

Camera Movement Conflicts:

  • ❌ "Dolly in while arc left" (multiple movements per beat)
  • ✅ One movement per beat from standardized vocabulary

Spatial Coherence (if long prompt):

  • ✅ FG/MG/BG structure defined
  • ✅ Color anchors consistent (3-5 colors repeated)
  • ✅ Continuity rules explicit

Stage 4: Output

If all valid:

✅ PROMPT READY

[Final Veo 3 text-to-video prompt displayed]

Ready to use in Veo 3!

If validation fails:

⚠️ PROMPT-LOCKED

Missing/Conflicting:
- [specific issue 1]
- [specific issue 2]

Suggested fixes:
- [actionable fix 1]
- [actionable fix 2]

Would you like me to help resolve these?

Path 2 Workflow: Image-to-Video

Stage A: Imagen Prompt Creation

ARCH-V loads: imagine skill for Imagen 3/4 structure

Focus on VISUAL components from great-prompt-anatomy:

Stage A Checklist:

    1. Subject (detailed visual description)
    1. Setting (environment, placement)
    1. Style/Genre (photographic or artistic style)
    1. Camera/Composition (shot size, angle - STATIC, no movement)
    1. Lighting/Mood (light sources, color palette)
    1. Constraints (visual prohibitions)

NOT included in Stage A:

  • ❌ Action (no motion in still image)
  • ❌ Audio (images have no sound)
  • ❌ Camera movements (static composition)

Imagine-specific guidance:

  • Subject-Context-Style framework
  • Technical photography specs (lens, lighting quality)
  • Art style references (Science SARU by default)
  • Natural language verbose descriptions
  • Token limit: 480 tokens

Stage A Output:

✅ IMAGEN PROMPT READY

[Detailed Imagen 3/4 prompt for still image]

Generate this image in Imagen 3/4 first.
Once you have the image, proceed to Stage B.

Stage B: Veo 3 Motion Prompt Creation

ARCH-V asks:

You now have your still image. Let's add motion!

What type of motion complexity?

SHORT MOTION (for):
- Simple camera movement
- Atmospheric animation
- Single motion element

LONG MOTION (for):
- Complex choreography
- Multiple action beats
- Character animation with timing

User chooses → ARCH-V loads appropriate prompt guide

Stage B Checklist (MOTION components):

    1. Action (what motion/animation occurs)
    1. Camera/Composition (movement from static image)
    1. Audio (sound design for video)

Additional from Stage A (maintained):

  • ✅ Subject (already defined in image)
  • ✅ Setting (already defined in image)
  • ✅ Style (must match image aesthetic)
  • ✅ Lighting (must match image lighting)

Camera movements: Load camera-movements for vocabulary

Validation checks:

  • Motion must be achievable from static image
  • Camera movement must respect image composition
  • Action must fit subject/setting from Stage A
  • Audio must match visual style

Stage B Output:

✅ VEO 3 IMAGE-TO-VIDEO PROMPT READY

[Veo 3 prompt referencing your Imagen-generated image]

Use this prompt in Veo 3 with your generated image!

Integration Pattern

Skills Cross-Reference

great-prompt-anatomy (8 components):

  • Used in ALL workflows
  • Path 1: All 8 components
  • Path 2A (Imagen): Components 1,2,4,5,6,8 (visual only)
  • Path 2B (Veo 3 motion): Components 3,5,7 + reference to image

camera-movements:

  • Path 1: For camera movement specification
  • Path 2A: NOT used (static image)
  • Path 2B: CRITICAL (motion from static)

short/long-prompt-guide:

  • Path 1: Guide entire prompt structure
  • Path 2B: Guide motion/animation structure

imagine:

  • Path 1: NOT used
  • Path 2A: PRIMARY guide for Imagen structure

Workflow Decision Tree

User: "I want to create a video"
    ↓
ARCH-V: "Text-to-Video or Image-to-Video?"
    ↓
Path 1: Text-to-Video
    ↓
    "Short or Long prompt?"
    ↓
    Load: great-prompt-anatomy + camera-movements + [short/long]-guide
    ↓
    Validate 8 components + conflicts
    ↓
    Output: Veo 3 prompt

Path 2: Image-to-Video
    ↓
    STAGE A: "Create still image"
    ↓
    Load: imagine + great-prompt-anatomy (visual components)
    ↓
    Validate visual components
    ↓
    Output: Imagen prompt
    ↓
    User generates image
    ↓
    STAGE B: "Add motion to image"
    ↓
    "Short or Long motion?"
    ↓
    Load: camera-movements + great-prompt-anatomy (motion components)
    ↓
    Validate motion feasibility + conflicts
    ↓
    Output: Veo 3 image-to-video prompt

Validation Rules

Universal Validations (All Paths)

Mandatory Component Presence:

  • All required components for chosen path present
  • No empty placeholders
  • Specific details provided (not vague)

Camera Movement:

  • ONE movement per beat/timestamp
  • Uses standardized vocabulary from camera-movements skill
  • Movement appropriate for shot type

Style Consistency:

  • Style/aesthetic maintained throughout
  • Color palette specified (3-5 color anchors)
  • Lighting approach consistent

Path 1 Specific Validations

Time/Weather Continuity:

  • Single time of day throughout (unless intentional transition)
  • Weather consistent (no sudden rain → sun)
  • Lighting quality matches time/weather

Audio Appropriateness:

  • Dialogue formatted correctly (Character: "Text")
  • Ambient sounds match environment
  • Music style fits mood

Path 2 Specific Validations

Stage A (Imagen) Validations:

  • No motion words (running, flying, moving) - image is static
  • No audio descriptions - images silent
  • Technical photography specs appropriate
  • Token count under 480

Stage B (Veo 3 motion) Validations:

  • Motion achievable from static starting point
  • Camera movement respects image composition
  • Action fits subject capabilities
  • Audio matches visual aesthetic from Stage A

Error Messages & Fixes

Common Issues

Issue: Missing Components

⚠️ PROMPT-LOCKED

Missing mandatory components:
- Audio not specified
- Lighting/Mood undefined

Fix: Add audio description (dialogue/ambience/music)
Fix: Specify light sources and mood

Issue: Camera Movement Conflict

⚠️ PROMPT-LOCKED

Conflict detected:
- "0-4s: Dolly in while panning left"

Fix: Choose ONE movement per beat
- Option A: "0-4s: Dolly in"
- Option B: "0-4s: Pan left"

Issue: Time/Weather Conflict

⚠️ PROMPT-LOCKED

Conflict detected:
- Setting: "Golden hour sunset"
- Lighting: "Harsh midday sun"

Fix: Make lighting consistent with time
- "Golden hour: warm, low-angle sun, soft shadows"

Issue: Path 2A - Motion in Static Image

⚠️ IMAGEN PROMPT LOCKED

Error: Motion detected in static image prompt
- "person running across field"

Fix: Describe static composition
- "person mid-stride in running pose"
Then add motion in Stage B (Veo 3)

Example Workflows

Example 1: Path 1 Short Prompt

User: "I want a coffee shop morning scene"

ARCH-V: Path 1 (Text-to-Video) → Short Prompt

Guides through:

  • Subject: Barista pouring latte art
  • Setting: Morning café, 8:30 AM
  • Action: Pours latte, heart shape forms
  • Style: Cinematic B-roll, warm grade
  • Camera: Overhead, slow tilt down
  • Lighting: Natural window light, warm
  • Audio: Espresso machine, milk steaming
  • Constraints: (none specified)

Validates: ✅ All 8 components present

Output:

Format & style: Cinematic B-roll, warm color grade
Barista pours latte art, heart shape forming in foam
Morning café, 8:30 AM, soft window light
Overhead, slow tilt down
Natural window key, warm practical fills
Espresso machine hiss, milk steaming, gentle café chatter

Example 2: Path 1 Long Prompt

User: "Romantic rain scene with couple under umbrella"

ARCH-V: Path 1 (Text-to-Video) → Long Prompt (has dialogue)

Loads: long-prompt-guide Production Brief framework

Guides through all 11 blocks, validates:

  • FG/MG/BG structure ✅
  • Color anchors (5 colors) ✅
  • Camera beats (3 beats, ONE movement each) ✅
  • Dialogue format ✅
  • Continuity rules ✅

Output: Complete Production Brief (see long-prompt-guide examples)

Example 3: Path 2 Image-to-Video

User: "I want precise control over Jakarta street scene, then animate it"

ARCH-V: Path 2 (Image-to-Video)

STAGE A - Imagen:

Subject: Young street vendor, simplified Science SARU style
Setting: Dense Jakarta street, golden hour
Style: Science SARU animation aesthetic
Camera: Wide establishing, low angle (STATIC)
Lighting: Golden hour warm, dramatic shadows
Constraints: No motion, frozen moment

[Validates: No motion words, visual only]

✅ IMAGEN PROMPT:
"Science SARU animation style aesthetic. Young Indonesian street 
vendor character with simplified design, elastic limbs, large 
expressive eyes, worn earth tone clothing. Dense layered Jakarta 
street scene at golden hour, geometric shophouses, vibrant warm 
palette transitioning to orange-pink sunset gradient, dramatic 
long shadows, wet reflective surfaces. Wide establishing shot, 
low angle, dramatic composition, flat color blocks with gradients, 
painterly watercolor texture, melancholic atmosphere."

User generates image in Imagen 3/4

STAGE B - Veo 3 Motion:

[Image uploaded to Veo 3]

Action: Vendor turns head slowly toward camera, slight smile
Camera: Slow dolly in from wide to medium close-up
Audio: Street ambience, distant traffic, vendor humming melody

[Validates: Motion achievable, camera respects composition]

✅ VEO 3 IMAGE-TO-VIDEO PROMPT:
"From this still image: Vendor turns head slowly toward camera, 
slight smile forming. Slow dolly in from wide establishing to 
medium close-up over 8 seconds. Jakarta street ambience with 
distant traffic, motorbike sounds, vendor humming traditional 
Indonesian melody. Maintain golden hour lighting, warm atmosphere, 
and Science SARU aesthetic from image."

Tips for Using ARCH-V

Choose the right path:

  • Path 1 faster for straightforward videos
  • Path 2 better for complex visual compositions or when you need perfect still first

Be specific early:

  • Detailed subject descriptions help ARCH-V guide you better
  • Vague inputs → more back-and-forth questions

Trust the validation:

  • PROMPT-LOCKED means real conflicts exist
  • Suggested fixes based on proven patterns
  • Resolve conflicts before generating

Use skill references:

  • ARCH-V will load appropriate skills automatically
  • You can reference them directly for inspiration
  • Cross-skill integration is intentional

Iterate progressively:

  • Start with minimum viable (4 blocks for long prompts)
  • Add detail as needed
  • ARCH-V guides what's optional vs mandatory

Technical Notes

Token Efficiency:

  • ARCH-V itself: ~2,000 tokens
  • Loads reference skills on-demand only
  • Total system: ~8,600 tokens vs ~15,000 monolithic
  • 43% token savings with full capability

Skill Integration:

  • All 5 reference skills work together seamlessly
  • Cross-references validated automatically
  • No duplication between skills

Progressive Disclosure:

  • ARCH-V SKILL.md loaded when user asks for video prompts
  • Reference skills loaded based on path/choices
  • Optimal token usage throughout workflow

Quick Start

For beginners: Let ARCH-V guide you with questions

For experienced users: Specify path and prompt type upfront:

  • "Text-to-video short prompt for product shot"
  • "Image-to-video, create Imagen prompt first for portrait"

For complex projects: Use Path 2 with long motion prompts for maximum control

ARCH-V adapts to your workflow preference!