Claude Code Plugins

Community-maintained marketplace


Production-grade caching strategies skill for Redis patterns, CDN configuration, cache invalidation, and performance optimization

Install Skill

1. Download the skill.
2. Enable skills in Claude: open claude.ai/settings/capabilities and find the "Skills" section.
3. Upload to Claude: click "Upload skill" and select the downloaded ZIP file.

Note: Verify the skill by reading through its instructions before using it.

SKILL.md

name: caching-strategies
description: Production-grade caching strategies skill for Redis patterns, CDN configuration, cache invalidation, and performance optimization
sasmp_version: 1.3.0
bonded_agent: 05-caching-cdn
bond_type: PRIMARY_BOND

Caching Strategies Skill

Purpose: Atomic skill for caching architecture with comprehensive invalidation patterns and performance metrics.

Skill Identity

| Attribute | Value |
| --- | --- |
| Scope | Redis, CDN, Cache Invalidation |
| Responsibility | Single: Caching layer design and optimization |
| Invocation | Skill("caching-strategies") |

Parameter Schema

Input Validation

parameters:
  caching_context:
    type: object
    required: true
    properties:
      use_case:
        type: string
        enum: [session, api_response, database, static_assets, compute]
        required: true
      data_profile:
        type: object
        required: true
        properties:
          size_per_item: { type: string, pattern: "^\\d+[KMG]?B?$" }
          total_items: { type: integer, minimum: 1 }
          update_frequency: { type: string, enum: [real_time, seconds, minutes, hours, days] }
          access_pattern: { type: string, enum: [uniform, hot_cold, temporal] }
      requirements:
        type: object
        properties:
          hit_rate_target: { type: number, minimum: 0, maximum: 100 }
          max_latency_ms: { type: integer, minimum: 1 }
          consistency: { type: string, enum: [strict, eventual] }
          budget_monthly: { type: string }

validation_rules:
  - name: "hit_rate_feasibility"
    rule: "hit_rate_target <= 99.9"
    error: "Hit rate targets above 99.9% are not achievable in practice"
  - name: "memory_estimate"
    rule: "size_per_item * total_items <= available_memory"
    warning: "May require cache eviction or sharding"
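
The two validation rules can be sketched in Python as follows. This is a minimal illustration, not the skill's actual validator: `parse_size`, `check_rules`, the decimal unit table, and the `available_memory` argument (in bytes) are all assumptions introduced here, since the schema does not say how size strings are converted or where available memory comes from.

```python
import re

# Assumed unit table (decimal multiples); the schema does not define units.
_UNIT_BYTES = {
    "": 1, "B": 1,
    "K": 10**3, "KB": 10**3,
    "M": 10**6, "MB": 10**6,
    "G": 10**9, "GB": 10**9,
}
_SIZE_RE = re.compile(r"^(\d+)([KMG]?B?)$")


def parse_size(text: str) -> int:
    """Convert a size string such as '1KB' into a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_BYTES[unit]


def check_rules(hit_rate_target, size_per_item, total_items, available_memory):
    """Return (errors, warnings) as lists of the rule names that fired."""
    errors, warnings = [], []
    if hit_rate_target > 99.9:
        errors.append("hit_rate_feasibility")
    if parse_size(size_per_item) * total_items > available_memory:
        warnings.append("memory_estimate")
    return errors, warnings
```

Returning rule names rather than message strings keeps the check independent of the exact wording in `validation_rules`.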

Output Schema

output:
  type: object
  properties:
    architecture:
      type: object
      properties:
        layers: { type: array }
        technology: { type: string }
        topology: { type: string }
    configuration:
      type: object
      properties:
        memory_allocation: { type: string }
        eviction_policy: { type: string }
        ttl_strategy: { type: object }
        connection_pool: { type: object }
    invalidation:
      type: object
      properties:
        strategy: { type: string }
        triggers: { type: array }
        implementation: { type: string }
    metrics:
      type: object
      properties:
        expected_hit_rate: { type: number }
        memory_usage: { type: string }
        latency_p99: { type: string }

Core Patterns

Cache Layers

L1: Application Memory
├── Technology: Caffeine, Guava
├── Latency: ~0.1ms
├── Size: MB range
├── TTL: Seconds
└── Use: Hot data, thread-local

L2: Distributed Cache
├── Technology: Redis, Memcached
├── Latency: 1-5ms
├── Size: GB-TB range
├── TTL: Minutes to hours
└── Use: Shared state, sessions

L3: CDN Edge
├── Technology: CloudFront, Fastly
├── Latency: 5-50ms (network)
├── Size: Effectively unbounded (provider-managed)
├── TTL: Hours to days
└── Use: Static assets, API responses

L4: Database Cache
├── Technology: Query cache, buffer pool
├── Latency: ~1ms
├── Size: GB range
├── TTL: Until invalidated
└── Use: Query results
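
As a rough sketch of how the first two layers cooperate on a read, the class below checks an in-process dict (standing in for L1), then a second dict (standing in for a distributed L2 client), and only then calls the origin. `TieredCache` and `load_from_origin` are hypothetical names; TTLs and eviction are left out to keep the lookup order visible.

```python
from typing import Callable, Optional


class TieredCache:
    """Look up L1 first, then L2, then the origin loader."""

    def __init__(self, l2: dict, load_from_origin: Callable[[str], Optional[str]]):
        self.l1: dict = {}   # stands in for an in-process cache
        self.l2 = l2         # stands in for a distributed cache client
        self.load_from_origin = load_from_origin

    def get(self, key: str) -> Optional[str]:
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            self.l1[key] = self.l2[key]   # promote to the faster layer
            return self.l1[key]
        value = self.load_from_origin(key)
        if value is not None:
            # Populate both layers so later reads stop earlier.
            self.l2[key] = value
            self.l1[key] = value
        return value
```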

Cache Patterns

Cache-Aside (Lazy Loading):
├── Read: Check cache → Miss → DB → Store → Return
├── Write: Update DB → Invalidate cache
├── Pros: Simple, resilient to cache failure
├── Cons: Cache miss penalty, stale if DB changes without invalidation
└── Use: Read-heavy, tolerance for staleness

Write-Through:
├── Write: Update cache + DB synchronously in the same operation
├── Read: Always from cache
├── Pros: Cache always fresh
├── Cons: Write latency, complexity
└── Use: Read-after-write needed

Write-Behind:
├── Write: Update cache → Async DB write
├── Pros: Low write latency
├── Cons: Data loss risk, complexity
└── Use: Write-heavy, acceptable loss

Read-Through:
├── Read: Cache handles DB fetch on miss
├── Pros: Simplified application
├── Cons: Cache dependency
└── Use: Predictable access patterns
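
The cache-aside read and write paths listed above can be illustrated with plain dicts. This is a sketch under simplifying assumptions: `CacheAside` is a hypothetical class, the two dicts stand in for a real database and a real cache client, and there is no TTL or concurrency handling.

```python
class CacheAside:
    def __init__(self, db: dict):
        self.db = db      # stands in for the database
        self.cache = {}   # stands in for Redis or a similar store

    def read(self, key):
        if key in self.cache:          # hit: return the cached value
            return self.cache[key]
        value = self.db.get(key)       # miss: fall back to the DB
        if value is not None:
            self.cache[key] = value    # store for later reads
        return value

    def write(self, key, value):
        self.db[key] = value           # update the DB first
        self.cache.pop(key, None)      # then invalidate the cached copy
```

Invalidating rather than updating the cache on write means the next read repopulates it from the database, which is what keeps the pattern simple.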

Invalidation Strategies

TTL-Based:
├── Simple time expiry
├── Formula: TTL = max_acceptable_staleness
├── Jitter: TTL * (1 + random(-0.1, 0.1))
└── Prevents: Thundering herd
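
The jitter formula above translates directly into a small helper. The name `apply_jitter` matches the one used in the unit test templates, but this body is only an assumed implementation; the optional `rng` parameter is added here so the behavior can be made reproducible.

```python
import random
from typing import Optional


def apply_jitter(base_ttl: float, factor: float = 0.1,
                 rng: Optional[random.Random] = None) -> float:
    """Return base_ttl scaled by a random factor in [1 - factor, 1 + factor]."""
    rng = rng or random.Random()
    return base_ttl * (1 + rng.uniform(-factor, factor))
```

Spreading expiry times this way keeps keys that were written together from all expiring in the same instant.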

Event-Based:
├── Invalidate on data change
├── Implementation: CDC, Pub/Sub
├── Latency: Near real-time
└── Complexity: Event system required
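
A minimal sketch of the consumer side of event-based invalidation follows. It assumes events arrive as dicts with a `type` and an `id`, and that a static table maps each event type to the key templates it affects. `KEYS_BY_EVENT` and this version of `process_invalidation_event` are illustrative assumptions; a real setup would receive events from CDC or a Pub/Sub subscription.

```python
# Assumed mapping from event type to the cache key templates it affects.
KEYS_BY_EVENT = {
    "user_updated": ["user:{id}", "user:{id}:profile"],
}


def process_invalidation_event(cache: dict, event: dict) -> list:
    """Delete the keys affected by a change event; return the keys removed."""
    removed = []
    for template in KEYS_BY_EVENT.get(event["type"], []):
        key = template.format(**event)
        if key in cache:
            del cache[key]
            removed.append(key)
    return removed
```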

Version-Based:
├── Key: user:{id}:v{version}
├── Bump version on change
├── Old versions expire naturally (via TTL or eviction)
└── Benefit: No explicit invalidation
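
Version-based keys can be sketched as below. `VersionedKeys` is a hypothetical helper; in practice the per-entity version counter would be stored somewhere shared (the database or the cache itself) rather than in a local dict.

```python
class VersionedKeys:
    def __init__(self):
        # Entity id -> current version. Held locally only for illustration.
        self.versions = {}

    def key(self, user_id) -> str:
        """Build the cache key for the entity's current version."""
        return f"user:{user_id}:v{self.versions.get(user_id, 1)}"

    def bump(self, user_id) -> str:
        """Advance the version so readers stop seeing the old entry."""
        self.versions[user_id] = self.versions.get(user_id, 1) + 1
        return self.key(user_id)
```

After a bump, readers build a new key and miss, while the entry under the old key is simply never requested again.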

Tag-Based:
├── Associate keys with tags
├── Invalidate by tag
├── Example: Tag "product:123" on all related
└── Use: Related data groups
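
One way to implement tag-based invalidation is to keep a reverse index from each tag to the keys that carry it. The `TaggedCache` class below is an assumed in-memory sketch; with Redis, the reverse index could be kept as one set per tag, but that wiring is not shown here.

```python
from collections import defaultdict


class TaggedCache:
    def __init__(self):
        self.data = {}
        self.keys_by_tag = defaultdict(set)   # tag -> keys carrying that tag

    def set(self, key, value, tags=()):
        self.data[key] = value
        for tag in tags:
            self.keys_by_tag[tag].add(key)

    def get(self, key):
        return self.data.get(key)

    def invalidate_by_tag(self, tag) -> int:
        """Remove every key associated with the tag; return how many."""
        keys = self.keys_by_tag.pop(tag, set())
        for key in keys:
            self.data.pop(key, None)
        return len(keys)
```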

Retry Logic

Cache Operation Retry

retry_config:
  cache_read:
    max_attempts: 2
    timeout_ms: 50
    on_failure: proceed_without_cache

  cache_write:
    max_attempts: 3
    timeout_ms: 100
    on_failure: log_and_continue

  redis_connection:
    max_attempts: 5
    initial_delay_ms: 100
    max_delay_ms: 5000
    multiplier: 2.0

  circuit_breaker:
    failure_threshold: 5
    reset_timeout_seconds: 30
    half_open_requests: 1
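
To make the `redis_connection` and `circuit_breaker` settings concrete, here is a small sketch that uses the same parameter names. `backoff_delays` and `CircuitBreaker` are hypothetical helpers, not part of any Redis client. The breaker is simplified: once the reset timeout has passed it allows probes without capping them at `half_open_requests`.

```python
import time


def backoff_delays(max_attempts=5, initial_delay_ms=100,
                   max_delay_ms=5000, multiplier=2.0):
    """Delays (ms) to wait before each retry; the first attempt has no delay."""
    delay = initial_delay_ms
    delays = []
    for _ in range(max_attempts - 1):
        delays.append(min(delay, max_delay_ms))
        delay *= multiplier
    return delays


class CircuitBreaker:
    def __init__(self, failure_threshold=5, reset_timeout_seconds=30,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout_seconds
        self.clock = clock        # injectable so the timeout can be simulated
        self.failures = 0
        self.opened_at = None     # None means the circuit is closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        # Open: allow a probe only after the reset timeout (half-open).
        return self.clock() - self.opened_at >= self.reset_timeout

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

With the defaults from `redis_connection`, the waits before the four retries are 100, 200, 400, and 800 ms.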

Logging & Observability

Log Format

log_schema:
  level: { type: string }
  timestamp: { type: string, format: ISO8601 }
  skill: { type: string, value: "caching-strategies" }
  event:
    type: string
    enum:
      - cache_hit
      - cache_miss
      - cache_set
      - cache_invalidate
      - cache_evict
      - ttl_expired
      - circuit_open
  context:
    type: object
    properties:
      key: { type: string }
      ttl_seconds: { type: integer }
      latency_ms: { type: number }
      size_bytes: { type: integer }

example:
  level: INFO
  event: cache_hit
  context:
    key: "user:123:profile"
    latency_ms: 0.5
    size_bytes: 1024
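
A record matching this schema could be built along these lines. `make_log_record` and `ALLOWED_EVENTS` are illustrative names, and using UTC for the ISO 8601 timestamp is an assumption the schema does not state.

```python
import json
from datetime import datetime, timezone

# Event names copied from the log_schema enum.
ALLOWED_EVENTS = {
    "cache_hit", "cache_miss", "cache_set", "cache_invalidate",
    "cache_evict", "ttl_expired", "circuit_open",
}


def make_log_record(event: str, key: str, level: str = "INFO", **context) -> dict:
    """Build a log record dict; extra keyword arguments go into `context`."""
    if event not in ALLOWED_EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    return {
        "level": level,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "skill": "caching-strategies",
        "event": event,
        "context": {"key": key, **context},
    }


def format_log_line(record: dict) -> str:
    """Serialize a record as a single JSON line."""
    return json.dumps(record, sort_keys=True)
```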

Metrics

metrics:
  - name: cache_requests_total
    type: counter
    labels: [operation, result]  # hit, miss, error

  - name: cache_latency_seconds
    type: histogram
    labels: [operation]
    buckets: [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05]

  - name: cache_memory_bytes
    type: gauge
    labels: [cache_name]

  - name: cache_evictions_total
    type: counter
    labels: [policy]

  - name: cache_hit_ratio
    type: gauge
    labels: [cache_name]
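
The relationship between `cache_requests_total` and `cache_hit_ratio` can be shown with a standard-library counter. `CacheStats` is a hypothetical in-process tracker and the operation name `"get"` is an assumption. A real deployment would more likely export the counters through a metrics library and derive the ratio at query time.

```python
from collections import Counter


class CacheStats:
    def __init__(self):
        self.requests = Counter()   # (operation, result) -> count

    def record(self, operation: str, result: str) -> None:
        self.requests[(operation, result)] += 1

    def hit_ratio(self) -> float:
        """Hits divided by hits plus misses for reads; 0.0 with no traffic."""
        hits = self.requests[("get", "hit")]
        misses = self.requests[("get", "miss")]
        total = hits + misses
        return hits / total if total else 0.0
```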

Troubleshooting

Common Issues

| Issue | Cause | Resolution |
| --- | --- | --- |
| Low hit rate | TTL too short | Increase TTL, analyze patterns |
| High latency | Cache miss + DB | Warm cache, optimize DB |
| Memory pressure | Too much data | Increase memory, evict |
| Stale data | TTL mismatch | Event-based invalidation |
| Thundering herd | Mass expiry | Jittered TTL, singleflight |
| Hot key | Popularity skew | Replicate, local cache |

Debug Checklist

□ Hit rate measured (>90% target)?
□ Memory usage within limits?
□ Eviction rate acceptable?
□ Latency p99 within SLA?
□ Invalidation working?
□ Cluster health OK?
□ Connection pool sized right?

Unit Test Templates

Cache Configuration Tests

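# test_caching_strategies.py
# Assumes validate_parameters, estimate_memory, and apply_jitter are importable
# from the skill's implementation (not shown here).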

def test_valid_caching_context():
    params = {
        "caching_context": {
            "use_case": "session",
            "data_profile": {
                "size_per_item": "1KB",
                "total_items": 1000000,
                "update_frequency": "minutes",
                "access_pattern": "hot_cold"
            },
            "requirements": {
                "hit_rate_target": 99,
                "max_latency_ms": 5,
                "consistency": "eventual"
            }
        }
    }
    result = validate_parameters(params)
    assert result.valid

def test_memory_estimation():
    result = estimate_memory(
        size_per_item="1KB",
        total_items=1000000,
        overhead_factor=1.5  # Redis overhead
    )
    assert result.total == "1.5GB"
    assert result.recommended_allocation == "2GB"  # 25% buffer (1.875GB), rounded up

def test_ttl_with_jitter():
    base_ttl = 3600
    jittered = apply_jitter(base_ttl, factor=0.1)
    assert 3240 <= jittered <= 3960  # ±10%

def test_hit_rate_infeasibility():
    params = {
        "caching_context": {
            "requirements": {
                "hit_rate_target": 100  # Impossible
            }
        }
    }
    result = validate_parameters(params)
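    assert not result.valid
    # Other required fields are missing too, so search all errors for the hit-rate one.
    assert any("not achievable" in error for error in result.errors)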

Invalidation Tests

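# MockCache, advance_time, process_invalidation_event, and fetch_with_singleflight
# are assumed test helpers (not shown here).
def test_ttl_invalidation():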
    cache = MockCache()
    cache.set("key", "value", ttl=60)

    assert cache.get("key") == "value"
    advance_time(61)
    assert cache.get("key") is None

def test_event_based_invalidation():
    cache = MockCache()
    cache.set("user:123", {"name": "Alice"})

    # Simulate update event
    event = {"type": "user_updated", "id": 123}
    process_invalidation_event(cache, event)

    assert cache.get("user:123") is None

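def test_tag_based_invalidation():
    cache = MockCache()
    # Placeholder payloads; any values work for this test.
    data = {"name": "Widget"}
    reviews = [{"rating": 5}]
    cache.set("product:123:details", data, tags=["product:123"])
    cache.set("product:123:reviews", reviews, tags=["product:123"])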

    cache.invalidate_by_tag("product:123")

    assert cache.get("product:123:details") is None
    assert cache.get("product:123:reviews") is None

import asyncio
import pytest

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_thundering_herd_prevention():
    cache = MockCache()

    # Simulate concurrent requests for the same key
    requests = [fetch_with_singleflight(cache, "hot_key") for _ in range(100)]
    results = await asyncio.gather(*requests)

    # Only one DB call should have been made
    assert cache.db_calls == 1
    assert all(r == results[0] for r in results)
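
The thundering-herd test above relies on a singleflight helper that is not shown. One possible asyncio sketch is given below. `SingleFlight` and its `do` method are assumed names, and `loader` is any coroutine function that fetches the value for a key. Concurrent callers for the same key share one in-flight task instead of each hitting the database.

```python
import asyncio


class SingleFlight:
    def __init__(self):
        self._inflight = {}   # key -> Task for the load currently in progress

    async def do(self, key, loader):
        task = self._inflight.get(key)
        if task is None:
            # First caller for this key starts the load.
            task = asyncio.ensure_future(loader(key))
            self._inflight[key] = task
            # Forget the task once it finishes so later calls reload.
            task.add_done_callback(lambda _: self._inflight.pop(key, None))
        # Every caller, first or not, waits on the same task.
        return await task
```

A caller would wrap its cache-miss path, for example `await sf.do("hot_key", load_from_db)`, where `load_from_db` is a hypothetical coroutine.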

Version History

| Version | Date | Changes |
| --- | --- | --- |
| 2.0.0 | 2025-01 | Production-grade rewrite with invalidation patterns |
| 1.0.0 | 2024-12 | Initial release |