Claude Code Plugins

Community-maintained marketplace

Feedback

summarize-transcripts

@dparedesi/YTScribe
6
0

Generate AI summaries for downloaded YouTube transcripts. Use when you want to add summaries to existing transcript files, batch process transcripts for summarization, summarize a channel's videos, generate frontmatter summaries, or when the user mentions summarize, AI summaries, OpenRouter, or transcript metadata.

Install Skill

1Download skill
2Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify skill by going through its instructions before using it.

SKILL.md

name summarize-transcripts
description Generate AI summaries for downloaded YouTube transcripts. Use when you want to add summaries to existing transcript files, batch process transcripts for summarization, summarize a channel's videos, generate frontmatter summaries, or when the user mentions summarize, AI summaries, OpenRouter, or transcript metadata.

Summarize Transcripts

Why? Transcript files without summaries are difficult to scan. This skill adds ~500-word AI-generated summaries to transcript frontmatter, enabling quick content discovery and organization.

Quick Start

# Summarize a specific folder
transcript-summarize OpenAI

# Summarize ALL folders
transcript-summarize --all

# Preview what would be summarized
transcript-summarize --all --dry-run

Workflow

1. Verification & Mode Selection

Check for OpenRouter API key securely (do not print it):

# Check if .env contains the key definition (silently)
if [ -f .env ] && grep -q "^[[:space:]]*OPENROUTER_API_KEY=" .env; then echo "✅ Key configured in .env"; else echo "❌ Key missing"; fi

Decision:

  • If key exists: Proceed to Step 2A (Automated Batch Mode). This is preferred for speed and volume.
  • If key is MISSING: Proceed to Step 2B (Agentic Fallback Mode). You will summarize the files manually.

2A. Automated Batch Mode (With API Key)

Use the CLI tool to process folders efficiently.

1. Load Environment & Run:

export $(grep -v '^#' .env | xargs) && transcript-summarize <FOLDER_NAME>

2. Verify & Interpret Output:

Check the summary statistics at the end of the command output:

Summarization Complete!
  Success: 0
  Skipped: 15  <-- This means files were already summarized (Idempotent)
  Errors: 0
  Total: 15

Decision Logic:

  • Success > 0: Work was done. Task success.
  • Skipped == Total: All files are already up to date. Task success. Do NOT retry or look for "missing" files.
  • Errors > 0: Check the error logs (401/403/429).

2B. Agentic Fallback Mode (No API Key)

If the user has no API key, YOU are the summarizer.

Constraints:

  • Process small batches (1-5 files) to manage your context window.
  • DO NOT use the transcript-summarize command (it will fail).
  • You must read, summarize, and update the files using your tools.

Workflow:

  1. List Files:

    ls data/<FOLDER>/transcripts/*.md
    
  2. Notify User (Polite Fallback):

    • Inform the user: "I see the API key is missing. For future reference, you can set this up following the instructions in README.md to enable faster automated summarization. For now, I will proceed with manual summarization of this batch."
  3. Process Loop (Iterate through files):

    • Read the transcript file: view_file <path>

    • Generate Summary (Internal Monologue):

      • Target ~500 words.
      • Format: Single continuous paragraph (no line breaks, no bullets).
      • Style: Neutral, informative, dense. No "This video is about..." intro.
    • Update File:

      • Insert the summary into the frontmatter summary: field.
      • Use replace_file_content to add the summary block.

      Example Code Edit:

      summary: |
        Here is the generated summary text...
      ---
      

3. Ask to Continue: After processing a batch, ask the user if they want you to continue with the next batch.

2. Identify Target Folders

Determine scope based on user request:

User Request Command
Specific channel transcript-summarize <FOLDER_NAME>
All channels transcript-summarize --all
Preview only transcript-summarize --all --dry-run

[!TIP] Always run --dry-run first when processing many folders. This shows exactly how many transcripts need summaries.

3. Execute Summarization

Run the command with environment variables loaded:

# Load .env and run summarization
export $(grep -v '^#' .env | xargs) && transcript-summarize <FOLDER_NAME>

# Example: Summarize the a16z folder
export $(grep -v '^#' .env | xargs) && transcript-summarize a16z

Expected output:

Processing folder: a16z
Found 42 transcripts, 38 need summaries
Summarizing: how_to_build_a_startup.txt (1/38)
[========================================] 38/38 complete
Summary: Processed 38 files, 0 errors

4. Verify Results

Check that summaries were added:

# View a summarized transcript's frontmatter
head -30 data/a16z/how_to_build_a_startup.txt

Success criteria:

  • summary: field exists in frontmatter
  • Summary is ~500 words (configurable via --max-words)
  • No API errors in output

Command Reference

transcript-summarize [FOLDER] [OPTIONS]
Option Description Default
FOLDER Specific folder in data/ to process (Required unless --all)
--all Process ALL folders in data/ False
--dry-run Show what would happen without changes False
--force Re-summarize files that already have summaries False
--delay Seconds between API requests (min: 4s) 4.0
--model OpenRouter model to use xiaomi/mimo-v2-flash:free
--max-words Target summary length 500

[!TIP] The default model xiaomi/mimo-v2-flash:free is free and high-quality. No paid account needed, just an OpenRouter API key.


Examples

Summarize a single channel:

export $(grep -v '^#' .env | xargs) && transcript-summarize OpenAI
# Processes only data/OpenAI/transcripts/*.md files

Dry run to preview work:

export $(grep -v '^#' .env | xargs) && transcript-summarize --all --dry-run
# Output: "Would process 156 files across 12 folders"

Force re-summarize with custom model:

export $(grep -v '^#' .env | xargs) && transcript-summarize a16z --force --model moonshotai/kimi-k2:free
# Overwrites existing summaries with fresh ones

Slower rate for unstable connections:

export $(grep -v '^#' .env | xargs) && transcript-summarize LexFridman --delay 10
# 10 second delay between API calls

Common Mistakes

Mistake Why It's Wrong Correct Approach
Running without --dry-run first May process hundreds of files unexpectedly Always preview with --dry-run when using --all
Using --force without reason Wastes API calls on already-summarized files Only use --force when changing models or max-words
Setting --delay below 4s Will trigger rate limits Keep delay at 4s minimum
Running from wrong directory Command won't find data/ folder Always run from project root (where pyproject.toml is)

Troubleshooting

Issue Cause Solution
OPENROUTER_API_KEY not set .env file exists but variables not exported to shell Use `export $(grep -v '^#' .env
Rate limited (429) Too many requests too fast Script auto-retries with backoff. If persistent, increase --delay to 8-10s
No folders found in data/ Running from wrong directory or empty data folder cd to project root; verify data/ contains channel folders
Transcript too short (skipped) Transcript under 100 words Expected behavior; very short transcripts skip summarization
Model not available Model ID typo or model deprecated Check OpenRouter docs for current model IDs
Summary in wrong language Model defaulted to source language Most free models default to English; use a multilingual model if needed

Quality Checklist

Before considering summarization complete:

  • Ran --dry-run first to preview scope
  • Verified API key is configured
  • Command completed without errors
  • Spot-checked 2-3 summaries for quality
  • Summary length is appropriate (~500 words)

[!WARNING] If summaries appear truncated or low-quality, try a different model. Quality varies by model and transcript content.