name	summarize-transcripts
description	Generate AI summaries for downloaded YouTube transcripts. Use when you want to add summaries to existing transcript files, batch process transcripts for summarization, summarize a channel's videos, generate frontmatter summaries, or when the user mentions summarize, AI summaries, OpenRouter, or transcript metadata.

Summarize Transcripts

Why? Transcript files without summaries are difficult to scan. This skill adds ~500-word AI-generated summaries to transcript frontmatter, enabling quick content discovery and organization.

Quick Start

# Summarize a specific folder
transcript-summarize OpenAI

# Summarize ALL folders
transcript-summarize --all

# Preview what would be summarized
transcript-summarize --all --dry-run

Workflow

1. Verification & Mode Selection

Check for OpenRouter API key securely (do not print it):

# Check if .env contains the key definition (silently)
if [ -f .env ] && grep -q "^[[:space:]]*OPENROUTER_API_KEY=" .env; then echo "✅ Key configured in .env"; else echo "❌ Key missing"; fi

Decision:

If key exists: Proceed to Step 2A (Automated Batch Mode). This is preferred for speed and volume.
If key is MISSING: Proceed to Step 2B (Agentic Fallback Mode). You will summarize the files manually.

2A. Automated Batch Mode (With API Key)

Use the CLI tool to process folders efficiently.

1. Load Environment & Run:

export $(grep -v '^#' .env | xargs) && transcript-summarize <FOLDER_NAME>

2. Verify & Interpret Output:

Check the summary statistics at the end of the command output:

Summarization Complete!
  Success: 0
  Skipped: 15  <-- This means files were already summarized (Idempotent)
  Errors: 0
  Total: 15

Decision Logic:

Success > 0: Work was done. Task success.
Skipped == Total: All files are already up to date. Task success. Do NOT retry or look for "missing" files.
Errors > 0: Check the error logs (401/403/429).

2B. Agentic Fallback Mode (No API Key)

If the user has no API key, YOU are the summarizer.

Constraints:

Process small batches (1-5 files) to manage your context window.
DO NOT use the transcript-summarize command (it will fail).
You must read, summarize, and update the files using your tools.

Workflow:

List Files:
```
ls data/<FOLDER>/transcripts/*.md
```
Notify User (Polite Fallback):
- Inform the user: "I see the API key is missing. For future reference, you can set this up following the instructions in README.md to enable faster automated summarization. For now, I will proceed with manual summarization of this batch."
Process Loop (Iterate through files):
- Read the transcript file: view_file <path>
- Generate Summary (Internal Monologue):
  - Target ~500 words.
  - Format: Single continuous paragraph (no line breaks, no bullets).
  - Style: Neutral, informative, dense. No "This video is about..." intro.
- Update File:
  - Insert the summary into the frontmatter summary: field.
  - Use replace_file_content to add the summary block.
  Example Code Edit:
```
summary: |
  Here is the generated summary text...
---
```

3. Ask to Continue: After processing a batch, ask the user if they want you to continue with the next batch.

2. Identify Target Folders

Determine scope based on user request:

User Request	Command
Specific channel	`transcript-summarize <FOLDER_NAME>`
All channels	`transcript-summarize --all`
Preview only	`transcript-summarize --all --dry-run`

[!TIP] Always run --dry-run first when processing many folders. This shows exactly how many transcripts need summaries.

3. Execute Summarization

Run the command with environment variables loaded:

# Load .env and run summarization
export $(grep -v '^#' .env | xargs) && transcript-summarize <FOLDER_NAME>

# Example: Summarize the a16z folder
export $(grep -v '^#' .env | xargs) && transcript-summarize a16z

Expected output:

Processing folder: a16z
Found 42 transcripts, 38 need summaries
Summarizing: how_to_build_a_startup.txt (1/38)
[========================================] 38/38 complete
Summary: Processed 38 files, 0 errors

4. Verify Results

Check that summaries were added:

# View a summarized transcript's frontmatter
head -30 data/a16z/how_to_build_a_startup.txt

Success criteria:

summary: field exists in frontmatter
Summary is ~500 words (configurable via --max-words)
No API errors in output

Command Reference

transcript-summarize [FOLDER] [OPTIONS]

Option	Description	Default
`FOLDER`	Specific folder in `data/` to process	(Required unless `--all`)
`--all`	Process ALL folders in `data/`	False
`--dry-run`	Show what would happen without changes	False
`--force`	Re-summarize files that already have summaries	False
`--delay`	Seconds between API requests (min: 4s)	4.0
`--model`	OpenRouter model to use	`xiaomi/mimo-v2-flash:free`
`--max-words`	Target summary length	500

[!TIP] The default model xiaomi/mimo-v2-flash:free is free and high-quality. No paid account needed, just an OpenRouter API key.

Examples

Summarize a single channel:

export $(grep -v '^#' .env | xargs) && transcript-summarize OpenAI
# Processes only data/OpenAI/transcripts/*.md files

Dry run to preview work:

export $(grep -v '^#' .env | xargs) && transcript-summarize --all --dry-run
# Output: "Would process 156 files across 12 folders"

Force re-summarize with custom model:

export $(grep -v '^#' .env | xargs) && transcript-summarize a16z --force --model moonshotai/kimi-k2:free
# Overwrites existing summaries with fresh ones

Slower rate for unstable connections:

export $(grep -v '^#' .env | xargs) && transcript-summarize LexFridman --delay 10
# 10 second delay between API calls

Common Mistakes

Mistake	Why It's Wrong	Correct Approach
Running without `--dry-run` first	May process hundreds of files unexpectedly	Always preview with `--dry-run` when using `--all`
Using `--force` without reason	Wastes API calls on already-summarized files	Only use `--force` when changing models or max-words
Setting `--delay` below 4s	Will trigger rate limits	Keep delay at 4s minimum
Running from wrong directory	Command won't find `data/` folder	Always run from project root (where `pyproject.toml` is)

Troubleshooting

Issue	Cause	Solution
`OPENROUTER_API_KEY not set`	`.env` file exists but variables not exported to shell	Use `export $(grep -v '^#' .env
`Rate limited (429)`	Too many requests too fast	Script auto-retries with backoff. If persistent, increase `--delay` to 8-10s
`No folders found in data/`	Running from wrong directory or empty data folder	`cd` to project root; verify `data/` contains channel folders
`Transcript too short` (skipped)	Transcript under 100 words	Expected behavior; very short transcripts skip summarization
`Model not available`	Model ID typo or model deprecated	Check OpenRouter docs for current model IDs
Summary in wrong language	Model defaulted to source language	Most free models default to English; use a multilingual model if needed

Quality Checklist

Before considering summarization complete:

Ran --dry-run first to preview scope
Verified API key is configured
Command completed without errors
Spot-checked 2-3 summaries for quality
Summary length is appropriate (~500 words)

[!WARNING] If summaries appear truncated or low-quality, try a different model. Quality varies by model and transcript content.

summarize-transcripts

Install Skill

SKILL.md