| name | id-generator |
| description | Generate intelligent session IDs based on detected content source type. Analyzes ContentSummary and creates meaningful IDs (podcast-xyz, transcript-abc, etc.). |
ID Generator Skill
Expert at generating meaningful session IDs based on content source type and characteristics.
What This Skill Does
- Reads ContentSummary JSON with detectedSource field
- Analyzes source type and main topic/theme
- Generates human-readable, meaningful session ID
- Returns ID with confidence score and rationale
- Enables reusable content identification
ID Generation Process
Follow these 5 steps:
Step 1: Validate Input
- Confirm ContentSummary has required fields
- Check if detectedSource field exists
- Note the content's main topic/headline/category
Step 2: Analyze Source Type
- Review detectedSource value (podcast, transcript, article, youtube, twitter, text, other)
- Consider content characteristics:
- Podcast: Audio transcript with timestamps, speaker markers, conversational flow
- Transcript: Conversational content, dialogue, timestamps, multiple speakers
- Article: Written structure, sections, headlines, formal tone
- YouTube: Video description, channel info, timestamps
- Twitter: Social media context, short-form, engagement metrics
- Text: Unstructured thoughts, raw content, raw notes
- Other: Any other content type
Step 3: Extract Topic Keywords
- Identify main topic from headline, category, or keyThemes
- Select 1-2 most significant keywords
- Avoid generic terms, prefer specific subject matter
- Examples: "healthcare", "ai", "python", "productivity"
Step 4: Generate Base ID
- Format:
{source-type}-{date}-{topic} - Source prefix from detectedSource (lowercase)
- Date in YYYY-MM-DD format
- Topic as 1-2 words (hyphen-separated, lowercase)
- Example:
podcast-2024-12-08-ai-healthcare
Step 5: Quality Check & Return
- Ensure ID is lowercase, hyphen-separated
- Validate length (20-50 characters preferred)
- Return JSON with:
contentId: Generated IDdetectedSource: Confirmed source typesourceConfidence: 0.0-1.0 confidence in source detectionrationale: Brief explanation of why this source type
ID Generation Rules
Source-Based Naming
- Podcast:
podcast-{date}-{topic} - Transcript:
transcript-{date}-{topic} - Article:
article-{date}-{topic} - YouTube:
youtube-{date}-{topic} - Twitter:
tweet-{date}-{topic} - Text:
text-{date}-{topic} - Other:
content-{date}-{topic}
Topic Selection
- Extract from headline (first 2-3 words)
- If headline too generic, use category or first keyTheme
- Avoid articles (a, the, and, or)
- Use only letters, numbers, hyphens
- Maximum 3 words
Date Handling
- Use current date in YYYY-MM-DD format
- Or extract from context field if content has original date
- Format consistently
Examples
Input: Podcast about AI in Healthcare (2024-12-08)
Output: podcast-2024-12-08-ai-healthcare
Input: Whisper transcript of React conference talk (2024-12-07)
Output: transcript-2024-12-07-react-conference
Input: Article on productivity hacks (2024-12-06)
Output: article-2024-12-06-productivity-hacks
Input: YouTube tutorial on Python (2024-12-05)
Output: youtube-2024-12-05-python-tutorial
Output Format
Return structured JSON:
{
"contentId": "podcast-2024-12-08-ai-healthcare",
"detectedSource": "podcast",
"sourceConfidence": 0.95,
"rationale": "Audio transcript with speaker markers, timestamps, and conversational flow indicates podcast source."
}
Important Rules
- Always return valid JSON
- Confidence score reflects certainty in source detection (0.5-1.0 typical range)
- Rationale should be brief (1-2 sentences)
- ID must be unique and reusable
- Never include spaces or special characters (except hyphens)
- Generate ID even if source confidence is moderate (>0.5)
- Prefer readable IDs over random strings
- Keep ID length under 60 characters when possible