Caching Strategies Skill
Purpose: Atomic skill for caching architecture with comprehensive invalidation patterns and performance metrics.
Skill Identity
| Attribute |
Value |
| Scope |
Redis, CDN, Cache Invalidation |
| Responsibility |
Single: Caching layer design and optimization |
| Invocation |
Skill("caching-strategies") |
Parameter Schema
Input Validation
parameters:
caching_context:
type: object
required: true
properties:
use_case:
type: string
enum: [session, api_response, database, static_assets, compute]
required: true
data_profile:
type: object
required: true
properties:
size_per_item: { type: string, pattern: "^\\d+[KMGB]?B?$" }
total_items: { type: integer, minimum: 1 }
update_frequency: { type: string, enum: [real_time, seconds, minutes, hours, days] }
access_pattern: { type: string, enum: [uniform, hot_cold, temporal] }
requirements:
type: object
properties:
hit_rate_target: { type: number, minimum: 0, maximum: 100 }
max_latency_ms: { type: integer, minimum: 1 }
consistency: { type: string, enum: [strict, eventual] }
budget_monthly: { type: string }
validation_rules:
- name: "hit_rate_feasibility"
rule: "hit_rate_target <= 99.9"
error: "100% hit rate is not achievable in practice"
- name: "memory_estimate"
rule: "size_per_item * total_items <= available_memory"
warning: "May require cache eviction or sharding"
Output Schema
output:
type: object
properties:
architecture:
type: object
properties:
layers: { type: array }
technology: { type: string }
topology: { type: string }
configuration:
type: object
properties:
memory_allocation: { type: string }
eviction_policy: { type: string }
ttl_strategy: { type: object }
connection_pool: { type: object }
invalidation:
type: object
properties:
strategy: { type: string }
triggers: { type: array }
implementation: { type: string }
metrics:
type: object
properties:
expected_hit_rate: { type: number }
memory_usage: { type: string }
latency_p99: { type: string }
Core Patterns
Cache Layers
L1: Application Memory
├── Technology: Caffeine, Guava
├── Latency: ~0.1ms
├── Size: MB range
├── TTL: Seconds
└── Use: Hot data, thread-local
L2: Distributed Cache
├── Technology: Redis, Memcached
├── Latency: 1-5ms
├── Size: GB-TB range
├── TTL: Minutes to hours
└── Use: Shared state, sessions
L3: CDN Edge
├── Technology: CloudFront, Fastly
├── Latency: 5-50ms (network)
├── Size: Unlimited
├── TTL: Hours to days
└── Use: Static assets, API responses
L4: Database Cache
├── Technology: Query cache, buffer pool
├── Latency: ~1ms
├── Size: GB range
├── TTL: Until invalidated
└── Use: Query results
Cache Patterns
Cache-Aside (Lazy Loading):
├── Read: Check cache → Miss → DB → Store → Return
├── Write: Update DB → Invalidate cache
├── Pros: Simple, resilient to cache failure
├── Cons: Cache miss penalty, stale on DB update
└── Use: Read-heavy, tolerance for staleness
Write-Through:
├── Write: Update cache + DB atomically
├── Read: Always from cache
├── Pros: Cache always fresh
├── Cons: Write latency, complexity
└── Use: Read-after-write needed
Write-Behind:
├── Write: Update cache → Async DB write
├── Pros: Low write latency
├── Cons: Data loss risk, complexity
└── Use: Write-heavy, acceptable loss
Read-Through:
├── Read: Cache handles DB fetch on miss
├── Pros: Simplified application
├── Cons: Cache dependency
└── Use: Predictable access patterns
Invalidation Strategies
TTL-Based:
├── Simple time expiry
├── Formula: TTL = max_acceptable_staleness
├── Jitter: TTL * (1 + random(-0.1, 0.1))
└── Prevents: Thundering herd
Event-Based:
├── Invalidate on data change
├── Implementation: CDC, Pub/Sub
├── Latency: Near real-time
└── Complexity: Event system required
Version-Based:
├── Key: user:{id}:v{version}
├── Bump version on change
├── Old versions expire naturally
└── Benefit: No explicit invalidation
Tag-Based:
├── Associate keys with tags
├── Invalidate by tag
├── Example: Tag "product:123" on all related
└── Use: Related data groups
Retry Logic
Cache Operation Retry
retry_config:
cache_read:
max_attempts: 2
timeout_ms: 50
on_failure: proceed_without_cache
cache_write:
max_attempts: 3
timeout_ms: 100
on_failure: log_and_continue
redis_connection:
max_attempts: 5
initial_delay_ms: 100
max_delay_ms: 5000
multiplier: 2.0
circuit_breaker:
failure_threshold: 5
reset_timeout_seconds: 30
half_open_requests: 1
Logging & Observability
Log Format
log_schema:
level: { type: string }
timestamp: { type: string, format: ISO8601 }
skill: { type: string, value: "caching-strategies" }
event:
type: string
enum:
- cache_hit
- cache_miss
- cache_set
- cache_invalidate
- cache_evict
- ttl_expired
- circuit_open
context:
type: object
properties:
key: { type: string }
ttl_seconds: { type: integer }
latency_ms: { type: number }
size_bytes: { type: integer }
example:
level: INFO
event: cache_hit
context:
key: "user:123:profile"
latency_ms: 0.5
size_bytes: 1024
Metrics
metrics:
- name: cache_requests_total
type: counter
labels: [operation, result] # hit, miss, error
- name: cache_latency_seconds
type: histogram
labels: [operation]
buckets: [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05]
- name: cache_memory_bytes
type: gauge
labels: [cache_name]
- name: cache_evictions_total
type: counter
labels: [policy]
- name: cache_hit_ratio
type: gauge
labels: [cache_name]
Troubleshooting
Common Issues
| Issue |
Cause |
Resolution |
| Low hit rate |
TTL too short |
Increase TTL, analyze patterns |
| High latency |
Cache miss + DB |
Warm cache, optimize DB |
| Memory pressure |
Too much data |
Increase memory, evict |
| Stale data |
TTL mismatch |
Event-based invalidation |
| Thundering herd |
Mass expiry |
Jittered TTL, singleflight |
| Hot key |
Popularity skew |
Replicate, local cache |
Debug Checklist
□ Hit rate measured (>90% target)?
□ Memory usage within limits?
□ Eviction rate acceptable?
□ Latency p99 within SLA?
□ Invalidation working?
□ Cluster health OK?
□ Connection pool sized right?
Unit Test Templates
Cache Configuration Tests
# test_caching_strategies.py
def test_valid_caching_context():
params = {
"caching_context": {
"use_case": "session",
"data_profile": {
"size_per_item": "1KB",
"total_items": 1000000,
"update_frequency": "minutes",
"access_pattern": "hot_cold"
},
"requirements": {
"hit_rate_target": 99,
"max_latency_ms": 5,
"consistency": "eventual"
}
}
}
result = validate_parameters(params)
assert result.valid == True
def test_memory_estimation():
result = estimate_memory(
size_per_item="1KB",
total_items=1000000,
overhead_factor=1.5 # Redis overhead
)
assert result.total == "1.5GB"
assert result.recommended_allocation == "2GB" # 25% buffer
def test_ttl_with_jitter():
base_ttl = 3600
jittered = apply_jitter(base_ttl, factor=0.1)
assert 3240 <= jittered <= 3960 # ±10%
def test_hit_rate_infeasibility():
params = {
"caching_context": {
"requirements": {
"hit_rate_target": 100 # Impossible
}
}
}
result = validate_parameters(params)
assert result.valid == False
assert "not achievable" in result.errors[0]
Invalidation Tests
def test_ttl_invalidation():
cache = MockCache()
cache.set("key", "value", ttl=60)
assert cache.get("key") == "value"
advance_time(61)
assert cache.get("key") is None
def test_event_based_invalidation():
cache = MockCache()
cache.set("user:123", {"name": "Alice"})
# Simulate update event
event = {"type": "user_updated", "id": 123}
process_invalidation_event(cache, event)
assert cache.get("user:123") is None
def test_tag_based_invalidation():
cache = MockCache()
cache.set("product:123:details", data, tags=["product:123"])
cache.set("product:123:reviews", reviews, tags=["product:123"])
cache.invalidate_by_tag("product:123")
assert cache.get("product:123:details") is None
assert cache.get("product:123:reviews") is None
def test_thundering_herd_prevention():
cache = MockCache()
# Simulate concurrent requests for same key
requests = [fetch_with_singleflight(cache, "hot_key") for _ in range(100)]
results = await asyncio.gather(*requests)
# Only one DB call should have been made
assert cache.db_calls == 1
assert all(r == results[0] for r in results)
Version History
| Version |
Date |
Changes |
| 2.0.0 |
2025-01 |
Production-grade rewrite with invalidation patterns |
| 1.0.0 |
2024-12 |
Initial release |