Prompt Manager

@sunholo-data/ailang

Optimize and manage AILANG teaching prompts for maximum conciseness and accuracy. Use when user asks to create/update prompts, optimize prompt length, or verify prompt accuracy.

SKILL.md

name: Prompt Manager
description: Optimize and manage AILANG teaching prompts for maximum conciseness and accuracy. Use when user asks to create/update prompts, optimize prompt length, or verify prompt accuracy.

Prompt Manager

Mission: Create concise, accurate teaching prompts with maximum information density.

Core Principle: Token Efficiency

Target: ~4000 tokens per prompt (currently ~8000+)
Strategy: Reference external docs, use tables, consolidate examples
Validation: Maintain eval success rates while reducing prompt size

When to Use This Skill

Invoke this skill when the user mentions:

  • "Create new prompt" / "update prompt" / "optimize prompt"
  • "Make prompt more concise" / "reduce prompt length"
  • "Fix prompt documentation" / "prompt-code mismatch"

Also invoke it proactively:

  • After implementing language features (to keep the prompt synchronized)
  • Before eval baselines (to verify accuracy)

Quick Reference Scripts

Create New Version

.claude/skills/prompt-manager/scripts/create_prompt_version.sh <new_version> <base_version> "<description>"

Creates a versioned prompt file, computes its hash, and updates versions.json

Update Hash

.claude/skills/prompt-manager/scripts/update_hash.sh <version>

Recomputes the SHA256 hash after edits
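
The mechanics are simple enough to sketch. Below is a minimal, self-contained illustration of what the script automates; the file names and the versions.json schema are assumptions for illustration, and the real script may differ:

```shell
# Sketch only: recompute a prompt's SHA256 and splice it into a
# versions.json. Paths and the JSON shape are illustrative assumptions.
dir=$(mktemp -d)
printf 'prompt body\n' > "$dir/v0.3.17.md"
printf '{"v0.3.17": {"sha256": "stale"}}\n' > "$dir/versions.json"

hash=$(sha256sum "$dir/v0.3.17.md" | cut -d' ' -f1)

# Rewrite the JSON with python3 rather than fragile text substitution.
python3 - "$dir/versions.json" "$hash" <<'PY'
import json, sys
path, new_hash = sys.argv[1], sys.argv[2]
with open(path) as f:
    data = json.load(f)
data["v0.3.17"]["sha256"] = new_hash
with open(path, "w") as f:
    json.dump(data, f, indent=2)
PY

echo "recorded: $hash"
```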

Verify Accuracy

.claude/skills/eval-analyzer/scripts/verify_prompt_accuracy.sh <version>

Catches prompt-code mismatches and false limitations

Check Examples Coverage

.claude/skills/prompt-manager/scripts/check_examples_coverage.sh <version>

Verifies that features used in working examples are documented in the prompt
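
As a rough sketch of the idea, the check amounts to grepping the prompt for each feature the examples exercise; the feature list, paths, and output format here are made up for illustration:

```shell
# Toy coverage check: flag features used in examples but absent from the
# prompt. The feature list and files are illustrative assumptions.
dir=$(mktemp -d)
printf 'map and filter are documented here\n' > "$dir/prompt.md"

missing=0
for feature in map filter foldl; do
  if ! grep -qw "$feature" "$dir/prompt.md"; then
    echo "undocumented: $feature"
    missing=$((missing + 1))
  fi
done
echo "missing=$missing"
```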

Analyze Size & Optimization Opportunities

.claude/skills/prompt-manager/scripts/analyze_prompt_size.sh prompts/v0.3.17.md

Shows: word count, section sizes, code blocks, tables, optimization opportunities
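
The core measurements are easy to approximate with standard tools. A hedged sketch against a tiny sample prompt (the real script's output format differs):

```shell
# Count words, fenced code blocks, and table rows in a sample prompt.
# The sample file is created here so the sketch is self-contained.
dir=$(mktemp -d)
printf '%s\n' \
  '# Sample' \
  'Short prose line.' \
  '```ailang' \
  'println("hi")' \
  '```' \
  '| fn | doc |' \
  '| map | applies f |' > "$dir/sample.md"

words=$(wc -w < "$dir/sample.md" | tr -d ' ')
fences=$(grep -c '^```' "$dir/sample.md")
blocks=$((fences / 2))          # one opening + one closing fence per block
tables=$(grep -c '^|' "$dir/sample.md")
echo "words=$words blocks=$blocks table_rows=$tables"
```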

Test Prompt Effectiveness

.claude/skills/prompt-manager/scripts/test_prompt.sh v0.3.18

Runs AILANG-only eval (no Python) with dev models to test prompt effectiveness

Optimization Workflow

1. Analyze Current Prompt

.claude/skills/prompt-manager/scripts/analyze_prompt_size.sh prompts/v0.3.16.md

Sample output:

Total words: 4358 (target: <4000)
Total lines: 1214 (target: <200)
⚠️  OVER TARGET by 358 words (8%)

Code blocks: 60 (target: 5-10 comprehensive)
Table rows: 0 (target: 10+ tables)

Top sections by size:
  719 words - Effect System
  435 words - List Operations
  368 words - Algebraic Data Types

High-ROI optimization areas identified by script:

  • 60 code blocks → consolidate to 5-10 comprehensive examples
  • 0 tables → convert builtin/syntax docs to tables
  • Large sections → link details to external docs

2. Create Optimized Version

.claude/skills/prompt-manager/scripts/create_prompt_version.sh v0.3.17 v0.3.16 "Optimize for conciseness (-50% tokens)"

3. Apply Optimization Strategies

Reference resources/prompt_optimization.md for:

  • Tables vs prose (builtin docs)
  • Consolidating examples
  • Linking to external docs
  • Progressive disclosure patterns

Key techniques:

  1. Replace prose with tables - Builtin functions, syntax rules
  2. Consolidate examples - 8 comprehensive > 24 scattered
  3. Link to docs - Type system details → docs/guides/types.md
  4. Quick reference - 1-screen summary at top
  5. Remove redundancy - Historical notes → CHANGELOG.md

4. Validate Optimization

⚠️ CRITICAL: Must validate AFTER each optimization step!

# 1. CHECK ALL CODE EXAMPLES (NEW REQUIREMENT!)
# Extract and test every AILANG code block in the prompt
# This catches syntax errors that cause regressions
.claude/skills/prompt-manager/scripts/validate_all_code.sh prompts/v0.3.17.md

# 2. Check new size
.claude/skills/prompt-manager/scripts/analyze_prompt_size.sh prompts/v0.3.17.md

# 3. Verify accuracy (no false limitations)
.claude/skills/eval-analyzer/scripts/verify_prompt_accuracy.sh v0.3.17

# 4. Check examples coverage (NEW - v0.4.1+)
.claude/skills/prompt-manager/scripts/check_examples_coverage.sh v0.3.17
# Ensures working examples are documented in prompt

# 5. Update hash
.claude/skills/prompt-manager/scripts/update_hash.sh v0.3.17

# 6. TEST PROMPT EFFECTIVENESS (CRITICAL!)
.claude/skills/prompt-manager/scripts/test_prompt.sh v0.3.17
# This runs AILANG-only eval (no Python baseline) with dev models
# Target: >40% AILANG success rate
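
The idea behind the code-validation step can be sketched with awk: pull each fenced AILANG block out of the prompt into its own file so it can be fed to the parser. The prompt content and file layout below are illustrative, and the actual validation command is whatever the AILANG toolchain provides:

```shell
# Extract every ```ailang fenced block from a prompt into numbered files.
dir=$(mktemp -d)
printf '%s\n' \
  'Intro prose.' \
  '```ailang' \
  'println("one")' \
  '```' \
  'More prose.' \
  '```ailang' \
  'println("two")' \
  '```' > "$dir/prompt.md"

awk -v out="$dir" '
  /^```ailang/ { n++; keep = 1; next }   # opening fence of an AILANG block
  /^```/       { keep = 0; next }        # closing fence (or another language)
  keep         { print > (out "/block_" n ".ail") }
' "$dir/prompt.md"

ls "$dir"/block_*.ail
# Each block_N.ail would then be handed to the AILANG parser/runner
# (exact command assumed; substitute whatever the toolchain provides).
```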

Success criteria:

  • ✅ Token reduction: 10-20% per iteration (NOT >50% in one step!)
  • ✅ AILANG success rate: >40% (if <40%, revert and try smaller optimization)
  • ✅ All external links resolve
  • ✅ No increase in compilation errors
  • ✅ Examples still work in REPL

⚠️ If success rate drops >10%, REVERT and try smaller optimization
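
That guard is easy to automate in the workflow. A small sketch of the decision; the rates are hard-coded here for illustration, but in practice they would come from eval output:

```shell
# Decide pass/revert from baseline vs. new AILANG success rates.
# Thresholds mirror the criteria above: at least ~40% absolute success
# and no more than a 10-point drop from baseline.
baseline=40.0
new=35.2   # illustrative value from an optimized prompt's eval run

verdict=$(awk -v b="$baseline" -v n="$new" 'BEGIN {
  v = (n >= 40 && b - n <= 10) ? "PASS" : "REVERT"
  print v
}')
echo "$verdict"
```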

5. Document Optimization

Add header to optimized prompt:

---
Version: v0.3.17
Optimized: 2025-10-22
Token reduction: -54% (8200 → 3800 tokens)
Baseline: v0.3.16→v0.3.17 success rate maintained
---

6. Commit

git add prompts/v0.3.17.md prompts/versions.json
git commit -m "feat: Optimize v0.3.17 prompt for conciseness

- Reduced tokens: 8200 → 3800 (-54%)
- Builtin docs: prose → tables + reference ailang builtins list
- Examples: 24 scattered → 8 consolidated comprehensive
- Type system: moved details to docs/guides/types.md
- Added quick reference section at top
- Validated: eval success rate maintained"

Optimization Strategies (Summary)

Full guide: resources/prompt_optimization.md

Quick Wins

  1. Tables > Prose - Builtin docs, syntax rules (-67% tokens)
  2. Consolidate Examples - 8 comprehensive > 24 scattered (-56% tokens)
  3. Link to Docs - Move detailed explanations to external docs (-76% tokens)
  4. Quick Reference - 1-screen summary at top
  5. Remove Redundancy - Historical notes → CHANGELOG.md, implementation details → code links

Anti-Patterns

  • ❌ Explaining "why" (move to design docs)
  • ❌ Historical context (move to changelog)
  • ❌ Implementation details (link to code)
  • ❌ Verbose examples (show, don't tell)
  • ❌ Apologetic limitations (be direct)

Optimization Checklist

  • Token count <4000 words
  • All external links resolve
  • Examples work in REPL
  • Eval baseline success rate maintained
  • Hash updated in versions.json
  • Optimization metrics documented in prompt header

Common Tasks

Detailed workflows: resources/workflow_guide.md

Fix False Limitation

Create version → Remove "❌ NO X" → Add "✅ X" with examples → Verify → Commit

Add Feature

Create version → Add to capabilities table → Add consolidated example → Verify → Commit

Optimize for Conciseness

Analyze size → Identify high-ROI sections → Apply techniques → Validate success rate → Document metrics → Commit

Progressive Disclosure

  1. Always loaded: skill.md (this file - workflow + optimization principles)
  2. Load for optimization: resources/prompt_optimization.md (detailed strategies)
  3. Load for workflows: resources/workflow_guide.md (detailed examples)
  4. Execute as needed: Scripts (create_prompt_version.sh, update_hash.sh)

Integration

  • eval-analyzer: verify_prompt_accuracy.sh catches mismatches
  • post-release: Run baselines after optimization
  • ailang builtins list: Reference instead of duplicating
  • docs/guides/: Link to instead of explaining

Success Metrics

Target prompt profile:

  • Tokens: <4000 (~30-40% reduction from current, in 3 iterations)
  • Lines: <300 (currently 500+)
  • Examples: 40-50 (not <30!)
  • Tables: 10+ for reference data
  • AILANG success rate: >40%

⚠️ Lessons from v0.3.18 and v0.3.20 Failures

v0.3.18 Failure: Over-Optimization

What happened: Optimized v0.3.17 → v0.3.18 with a -59% token reduction (5189 → 2126 words)
Result: AILANG success rate collapsed to 4.8% (from an expected ~40-60%)

Root causes:

  1. Too aggressive - removed >50% content in one step
  2. Over-consolidated - 64 → 21 examples (lost pattern variety)
  3. Tables replaced prose - lost explanatory context for syntax rules
  4. Removed negatives - "what NOT to do" examples are critical
  5. No incremental validation - didn't test after each change

v0.3.20 Failure: Incorrect Syntax in Examples

What happened: The prompt contained 3 syntax errors:

  1. match { | pattern => (invalid match syntax)
  2. import "std/io" (invalid import syntax)
  3. let (x, y) = tuple (invalid destructuring)

Result: -4.8% regression (40.0% → 35.2%); 18 benchmarks failed with PAR_001 compile errors

Root cause: No validation that code examples in prompt actually work with AILANG parser

Critical lessons:

  • ❌ DON'T optimize >20% per iteration
  • ❌ DON'T reduce examples below 40 total
  • ❌ DON'T replace all syntax prose with tables
  • ❌ DON'T link critical syntax to external docs (AIs can't follow links)
  • ❌ DON'T skip eval testing between iterations
  • ❌ DON'T trust code examples without testing them (NEW!)
  • ✅ DO optimize incrementally (3 iterations of 10-15% each)
  • ✅ DO keep negative examples ("what NOT to do")
  • ✅ DO validate with test_prompt.sh after EACH change
  • ✅ DO maintain pattern repetition (models need to see things 3-5 times)
  • ✅ DO extract and test ALL code blocks in prompt (NEW!)

Full analysis: OPTIMIZATION_FAILURE_ANALYSIS.md