---
name: context-packing-memory-management
description: Systematic context window optimization and cross-session memory management for long-running multi-agent tasks. Use when working on projects spanning multiple sessions, managing large codebases with 50+ files, conducting extended research, or coordinating context between multiple agents. Includes token allocation strategies, intelligent compaction procedures, persistent memory schemas, and agent handoff protocols.
---
# Context Packing & Memory Management

## Overview
Context window management is the foundational constraint governing multi-agent system performance. With Claude Sonnet 4.5's 200,000-token context window, efficient utilization determines whether agents can maintain project continuity across sessions, coordinate effectively, and execute complex tasks without information loss.
This skill provides quantitative, automation-ready strategies for:
- Optimal token allocation across context types
- Systematic context compaction when approaching capacity
- Cross-session memory persistence with structured schemas
- Hierarchical information organization for rapid navigation
- Multi-agent context coordination and handoff protocols
Target outcome: Maintain 95%+ critical information retention while operating within token constraints across extended sessions.
## Core Principles
- Token Budget Discipline: Treat context window as a scarce, shared resource with explicit allocation rules
- Progressive Disclosure: Load information just-in-time based on relevance scoring, not preemptively
- Compression Without Loss: Preserve architectural decisions and critical state while eliminating redundancy
- Hierarchical Navigation: Organize information in layers enabling quick jumps without full context traversal
- Persistent Memory: Extract and store session-independent knowledge for rehydration in future sessions
- Automation-First Design: Use threshold-based rules and scoring algorithms, not subjective judgment
- Multi-Agent Coordination: Explicit ownership and handoff protocols prevent context fragmentation
## Context Window Optimization

### Token Allocation Strategy
Claude Sonnet 4.5 Context Budget: 200,000 tokens
Allocate tokens according to this distribution:
| Context Type | Token Allocation | Percentage | Purpose |
|---|---|---|---|
| Knowledge Base & System Instructions | 60,000-80,000 | 30-40% | Skills, system prompts, core procedures |
| Active Task Context | 80,000-100,000 | 40-50% | Current files, recent outputs, working state |
| Session Memory | 20,000-30,000 | 10-15% | Architectural decisions, persistent state |
| Buffer/Overhead | 10,000-20,000 | 5-10% | Tool outputs, safety margin |
Rationale: Knowledge base is static and necessary. Active context is dynamic and scales with task complexity. Session memory grows slowly. Buffer prevents hard limits.
Implementation:
```python
MAX_CONTEXT = 200000
KNOWLEDGE_BASE_MAX = 80000   # 40%
ACTIVE_CONTEXT_MAX = 100000  # 50%
SESSION_MEMORY_MAX = 30000   # 15%
BUFFER_MIN = 10000           # 5% minimum safety
```
### Token Estimation Techniques
Using Claude API Tokenization (Recommended for accuracy):
```python
# Anthropic API call for exact token count
import anthropic

def count_tokens(text: str) -> int:
    client = anthropic.Anthropic()
    response = client.messages.count_tokens(
        model="claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": text}]
    )
    return response.input_tokens
```
Cost-Benefit: API tokenization costs ~$0.0001 per call. For context management in long sessions (>4 hours), the marginal cost ($0.01-0.05 total) is justified by preventing context overflow errors that waste entire sessions.
Approximation Formulas (Use when API calls are impractical):
- Code files: 0.75 tokens/character (includes syntax, whitespace)
- Documentation: 0.65 tokens/character (prose is more compact)
- JSON/structured data: 0.85 tokens/character (brackets, quotes add overhead)
- Log files: 0.70 tokens/character (mixed content)
Validation: Test approximations against API counts for your specific content mix. Adjust formulas if error exceeds ±10%.
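As a sketch, the approximation formulas above can be wrapped in a single helper. The ratios are the ones listed; the `content_type` labels are illustrative, and the fallback ratio for unknown types is an assumption.

```python
# Tokens-per-character approximations from the list above.
# These are heuristics; validate against API counts for your content mix.
TOKENS_PER_CHAR = {
    "code": 0.75,  # includes syntax, whitespace
    "docs": 0.65,  # prose is more compact
    "json": 0.85,  # brackets, quotes add overhead
    "logs": 0.70,  # mixed content
}

def estimate_tokens(text: str, content_type: str = "docs") -> int:
    """Approximate token count without an API call."""
    ratio = TOKENS_PER_CHAR.get(content_type, 0.70)  # assume mixed content if unknown
    return round(len(text) * ratio)
```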
### File Loading Prioritization
Relevance Scoring Algorithm:
Each file receives a score from 0-100 based on:
```
FILE_SCORE = (RELEVANCE_SCORE × 0.50) +
             (RECENCY_SCORE × 0.30) +
             (DEPENDENCY_SCORE × 0.20)
```
Relevance Score (0-50 points):
- Mentioned in current task description: +25
- Modified in last 5 operations: +15
- Contains unresolved issues/TODOs: +10
- Core architectural file (config, schema): +20
- Utility/helper file: +5
Recency Score (0-30 points):
- Modified in last hour: +30
- Modified in last 4 hours: +20
- Modified in last 24 hours: +10
- Modified in last week: +5
- Older: 0
Dependency Score (0-20 points):
- Direct dependency of active file: +20
- Second-degree dependency: +10
- Imports/references active file: +15
- No relationship: 0
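One way to read the weighting, sketched below: each component is already expressed on its weighted scale (50/30/20 points), so the total is a capped sum on a 0-100 scale. The caps are an assumption needed because the itemized relevance bonuses can sum past 50.

```python
def file_score(relevance: int, recency: int, dependency: int) -> int:
    # Components are pre-scaled to their weights (50/30/20 points),
    # so the total is a simple capped sum on a 0-100 scale.
    # Caps are needed: the relevance bonuses alone can sum to 75.
    return min(relevance, 50) + min(recency, 30) + min(dependency, 20)
```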
Loading Strategy:
- Sort files by score (descending)
- Load files until reaching 70% of ACTIVE_CONTEXT_MAX
- Reserve remaining 30% for:
- Tool outputs (15%)
- Dynamic context expansion (10%)
- Safety buffer (5%)
Just-In-Time vs. Preloading Decision:
| Condition | Strategy | Rationale |
|---|---|---|
| Score ≥ 70 | Preload | High probability of need |
| Score 40-69 | Just-in-time | Moderate probability |
| Score < 40 | On-demand only | Low probability |
| File size > 10,000 tokens | Just-in-time | Large footprint |
| Total context > 80% | Just-in-time all | Capacity constraint |
### Capacity Monitoring
Warning Thresholds:
| Level | Context Used | Action Required |
|---|---|---|
| Green | 0-79% (0-158k tokens) | Normal operation |
| Yellow | 80-89% (160k-178k) | Begin planning compaction |
| Orange | 90-94% (180k-188k) | Initiate compaction immediately |
| Red | 95-100% (190k-200k) | Emergency compaction + shed low-priority |
Monitoring Implementation:
```python
def check_capacity_status(current_tokens: int) -> str:
    usage_pct = (current_tokens / MAX_CONTEXT) * 100
    if usage_pct < 80:
        return "GREEN"
    elif usage_pct < 90:
        return "YELLOW: Plan compaction within 10 operations"
    elif usage_pct < 95:
        return "ORANGE: Compact now before next major operation"
    else:
        return "RED: Emergency compaction required"
```
Capacity Alert Responses:
- Yellow: Generate compaction plan, identify candidates for removal
- Orange: Execute compaction procedure (see below), defer non-critical file loads
- Red: Aggressive compaction, shed all files with score < 30, summarize verbose outputs
## Intelligent Context Compaction

### When to Compact
Automatic Triggers:
- Context usage reaches 80% (Yellow threshold)
- Planning session completion (natural break point)
- Before loading large file set (>20,000 tokens)
- Agent handoff initiation (clean context for recipient)
- Every 50 operations (proactive maintenance)
Manual Triggers:
- User requests context summary
- Performance degradation observed (slow responses)
- Before critical decision-making operations
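The automatic triggers above combine naturally into a single predicate; a sketch, with parameter names assumed for illustration:

```python
def should_compact(usage_pct: float, ops_since_compaction: int,
                   pending_load_tokens: int, handoff_pending: bool) -> bool:
    """Combine the automatic compaction triggers listed above."""
    return (
        usage_pct >= 80                    # Yellow threshold reached
        or ops_since_compaction >= 50      # proactive maintenance interval
        or pending_load_tokens > 20_000    # about to load a large file set
        or handoff_pending                 # clean context for recipient agent
    )
```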
### Preservation Rules
MUST PRESERVE (Critical retention score: 100):
Architectural Decisions:
- Decision description with timestamp
- Rationale and alternatives considered
- Expected impact and validation criteria
- Author agent identifier (if multi-agent)
Example:

```
[2025-11-04T14:23:00Z] DECISION: Adopt microservices architecture
Rationale: Enables independent team scaling, better fault isolation
Alternatives: Monolith (rejected: scaling limits), Serverless (rejected: vendor lock-in)
Impact: 3-month migration timeline, reduced coupling by 60%
Validated: Service isolation tests pass, deployment time reduced 45%
Agent: Architecture-Planner-001
```

Active Bugs and Unresolved Issues:
- Issue ID, description, reproduction steps
- Impact severity (Critical/High/Medium/Low)
- Current investigation status
- Attempted fixes and results
Example:

```
BUG-2847 [CRITICAL]: Auth service timeout under load
Repro: 100+ concurrent requests → 30% timeout rate
Status: Root cause identified (connection pool exhaustion)
Attempted: Increased pool size (no effect), added retry logic (partial improvement)
Next: Implement connection queueing with backpressure
```

Critical Implementation Details:
- Non-obvious algorithm choices with rationale
- Performance-critical optimizations
- Security considerations and threat model assumptions
- Data integrity constraints
Example:

```
CRITICAL: User.email uses case-insensitive unique index
Rationale: Prevent bob@example.com vs Bob@example.com duplicates
Implementation: PostgreSQL LOWER(email) functional index
Query pattern: WHERE LOWER(email) = LOWER($1)
```

Recent File Modifications (Last 5 operations):
- File path, modification timestamp
- Change summary (1-2 sentences)
- Reason for change
- Related files impacted
Example:

```
[2025-11-04T15:47:00Z] Modified: src/auth/jwt_handler.py
Changes: Added refresh token rotation, increased expiry to 7 days
Reason: Support mobile offline mode per FEATURE-892
Impact: Affects src/api/auth_routes.py (refresh endpoint updated)
```

Current Task State:
- Active task description and acceptance criteria
- Completion percentage (with substep breakdown)
- Next 3 planned actions with dependencies
- Blockers and resolution strategies
Example:

```
TASK: Implement user profile API endpoints
Progress: 65% complete
✓ GET /profile (done)
✓ PUT /profile (done)
⧖ DELETE /profile (in progress - cascade logic remaining)
☐ PATCH /profile (not started)
Next actions:
1. Complete delete cascade to related tables (blocks: finalize schema)
2. Implement PATCH with partial update support
3. Add rate limiting to all endpoints
Blockers: DB migration approval needed from DBA team
```
### Discard Rules
CAN SAFELY DISCARD (Retention score: 0-30):
Redundant Tool Outputs:
- Duplicate search results with same information
- Repeated file listings showing unchanged directories
- Multiple passes of same linting output
- Successful operation confirmations without actionable data
Deduplication algorithm:
```python
import re

def normalize(text):
    # Remove timestamps and other non-semantic variations
    return re.sub(r'\d{4}-\d{2}-\d{2}T[\d:]+Z', '', text)

def is_duplicate_output(new_output, existing_outputs):
    # Exact-match check via hash, then semantic similarity
    new_hash = hash(normalize(new_output))
    for existing in existing_outputs:
        if hash(normalize(existing)) == new_hash:
            return True
        if compute_similarity(new_output, existing) > 0.85:  # 85% threshold
            return True
    return False
```

Resolved Issues with Confirmed Fixes:
- Bugs marked "RESOLVED" with passing tests
- Completed tasks with acceptance criteria validated
- Questions answered with no follow-up needed
Retention criteria: Keep resolved issues for 10 operations, then discard if no re-mention
Exploratory Attempts That Didn't Lead Anywhere:
- Dead-end implementation approaches explicitly abandoned
- Failed experiments with documented negative results
- Prototype code replaced by production implementation
Preserve as lessons learned: Extract 1-sentence summary before discarding details
Verbose Debug Logs:
- Stack traces after issue is identified and fixed
- Verbose logging output when summary captures key points
- Intermediate computation steps when only result matters
Preserve: Error message and root cause (discard trace). Preserve: Summary statistics (discard raw logs).
Successful Operation Confirmations:
- "File saved successfully" (retain only file path + timestamp)
- "Tests passed" (retain only pass count, discard individual test output)
- "Build completed" (retain only artifact location, discard build logs)
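A sketch of the "retain only the summary" rule for test output, assuming a pytest-style summary line (the format matched is an assumption, and unknown formats are kept verbatim rather than lost):

```python
import re

def summarize_test_output(raw: str) -> str:
    # Keep only the pass/fail counts from verbose test output
    # (assumes a "N passed[, M failed]" summary line somewhere in the text).
    m = re.search(r"(\d+) passed(?:, (\d+) failed)?", raw)
    if not m:
        return raw  # unknown format: keep as-is rather than lose information
    passed, failed = m.group(1), m.group(2) or "0"
    return f"Tests: {passed} passed, {failed} failed"
```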
### Deduplication Strategies
Text-Based Deduplication:
- Exact match elimination: Hash-based identification of identical content
- Semantic clustering: Group similar outputs, keep most recent representative
- Incremental diff preservation: For similar file versions, store only deltas
Implementation:
```python
from difflib import SequenceMatcher

def semantic_similarity(text1: str, text2: str) -> float:
    """Returns similarity score 0.0-1.0"""
    return SequenceMatcher(None, text1, text2).ratio()

def deduplicate_outputs(outputs: list[str]) -> list[str]:
    """Returns deduplicated list, preserving most recent unique items"""
    unique = []
    seen_hashes = set()
    for output in reversed(outputs):  # Process newest first
        output_hash = hash(output)
        if output_hash in seen_hashes:
            continue
        # Check semantic similarity against existing unique items
        is_duplicate = False
        for unique_item in unique:
            if semantic_similarity(output, unique_item) > 0.85:
                is_duplicate = True
                break
        if not is_duplicate:
            unique.append(output)
            seen_hashes.add(output_hash)
    return list(reversed(unique))  # Return in chronological order
```
Tool Output Consolidation:
Instead of:
```
[Search 1] Found 12 files matching "auth"
- auth_handler.py
- auth_routes.py
- ...
[Search 2] Found 12 files matching "auth"
- auth_handler.py
- auth_routes.py
- ...
```
Consolidate to:
```
[Searches 1-2] Found 12 files matching "auth" (checked 2x, unchanged)
- auth_handler.py
- auth_routes.py
- ...
```
### Compaction Process
Step-by-step procedure:
**Snapshot Current State** (safety first):

```python
def create_pre_compaction_snapshot():
    snapshot = {
        'timestamp': current_time(),
        'total_tokens': estimate_current_context_size(),
        'file_list': list_loaded_files(),
        'decision_count': count_architectural_decisions(),
        'issue_count': count_unresolved_issues()
    }
    save_snapshot(snapshot)
    return snapshot
```

**Score All Context Elements**:
- Apply preservation rules (score: 100)
- Apply discard rules (score: 0-30)
- Score remaining elements (score: 31-99) by:
- Recency (30 points)
- Reference count (20 points)
- Task relevance (30 points)
- Information density (20 points)
**Calculate Token Savings Target**:

```python
current_usage = estimate_current_context_size()
target_usage = MAX_CONTEXT * 0.65  # Target 65% after compaction
tokens_to_remove = current_usage - target_usage
```

**Remove Low-Score Elements** (ascending score order):

```python
removed_tokens = 0
sorted_elements = sort_by_score(context_elements)
for element in sorted_elements:
    if element.score >= 31:  # Never remove preserved content
        break
    if removed_tokens >= tokens_to_remove:
        break
    remove_from_context(element)
    removed_tokens += element.token_count
```

**Deduplicate Remaining Content**:
- Apply deduplication algorithms to tool outputs
- Consolidate similar findings
- Merge redundant sections
**Generate Compaction Summary**:

```python
tokens_before = pre_snapshot['total_tokens']
tokens_after = estimate_current_context_size()
summary = {
    'tokens_before': tokens_before,
    'tokens_after': tokens_after,
    'tokens_saved': tokens_before - tokens_after,
    'elements_removed': count_removed_elements(),
    'preservation_validation': validate_critical_content_present()
}
```

**Validate No Critical Loss**:
- Check all architectural decisions still present
- Verify unresolved issues retained
- Confirm current task state intact
- Validate recent modifications preserved
Validation Checklist:
- All decisions with timestamp in last 7 days preserved
- All CRITICAL and HIGH severity issues preserved
- Last 5 file modifications with full context preserved
- Current task state with next actions preserved
- Agent ownership information preserved (if multi-agent)
- Context reduction achieved (target: 20-35% token savings)
- No preservation-scored (100) items removed
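The machine-checkable parts of the checklist can be validated against the pre-compaction snapshot; a minimal sketch, assuming `before`/`after` are snapshot dicts with the count and token fields shown:

```python
def validate_compaction(before: dict, after: dict) -> list[str]:
    """Check post-compaction invariants; returns a list of failure descriptions.
    Assumes snapshot dicts with decision_count, issue_count, total_tokens keys."""
    failures = []
    if after["decision_count"] < before["decision_count"]:
        failures.append("architectural decisions lost")
    if after["issue_count"] < before["issue_count"]:
        failures.append("unresolved issues lost")
    savings = 1 - after["total_tokens"] / before["total_tokens"]
    if not 0.20 <= savings <= 0.35:
        failures.append(f"savings {savings:.0%} outside 20-35% target")
    return failures
```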
### Compaction Examples
Example 1: Research Session Compaction
Before Compaction (185,000 tokens - 92.5% capacity):
```
[Web Search 1] "machine learning optimization" - 15 results (4,500 tokens)
[Web Fetch 1] Article: "Gradient Descent Variants" (12,000 tokens full text)
[Web Search 2] "machine learning optimization" - 15 results (4,500 tokens - DUPLICATE)
[Web Fetch 2] Article: "Adam Optimizer Explained" (8,000 tokens full text)
[Web Search 3] "neural network architectures" - 20 results (5,500 tokens)
[Web Fetch 3] Article: "CNN Architectures Review" (15,000 tokens full text)
[Analysis 1] "Compare optimization algorithms" (3,000 tokens)
[Analysis 2] "Evaluate CNN architectures" (2,500 tokens)
[Chat History] 45 exchanges of clarifying questions (22,000 tokens)
[System Prompts & Skills] (80,000 tokens)
[Working Notes] Architectural decision log (8,000 tokens)
```
After Compaction (128,000 tokens - 64% capacity, 31% savings):
```
[Web Search - Deduplicated] Combined results on ML optimization (4,500 tokens)
[Key Findings Summary]
- Gradient Descent: Vanilla approach, slow convergence
- Adam: Adaptive learning rate, fastest convergence (85% of cases)
- RMSprop: Good for RNNs, second choice
(Extracted from 20,000 tokens of full articles → 800 tokens summary)
[Architecture Evaluation]
- CNN: Best for image tasks (preserved from 15k token article)
- RNN: Sequential data (not needed for current task - discarded)
- Transformer: NLP focus (not relevant - discarded)
(2,000 tokens preserved from 15k)
[Analyses Consolidated] Merged overlapping sections (2,500 tokens)
[Chat History] Retained last 10 exchanges + key decisions (8,000 tokens)
[System Prompts & Skills] (80,000 tokens - unchanged)
[Working Notes] (8,000 tokens - unchanged, contains decisions)
```
Token savings: 57,000 tokens (31%)
Critical information retained: 100% (decisions, current task, key findings)
Example 2: Code Development Compaction
Before Compaction (178,000 tokens - 89% capacity):
```
[Files Loaded]
- src/main.py (5,000 tokens)
- src/auth.py (8,000 tokens)
- src/database.py (12,000 tokens)
- src/utils.py (3,000 tokens)
- tests/test_auth.py (6,000 tokens)
- tests/test_database.py (8,000 tokens)
- Old version: src/auth_old_backup.py (8,000 tokens - DISCARD)
- Experimental: src/auth_prototype_v2.py (7,000 tokens - DISCARD)
[Test Outputs]
- Test run 1: All passed (2,500 tokens of verbose output)
- Test run 2: All passed (2,500 tokens - DUPLICATE)
- Test run 3: 1 failure in auth (2,500 tokens)
- Test run 4: All passed after fix (2,500 tokens)
[Linting Results]
- Run 1: 23 issues found (3,000 tokens detailed output)
- Run 2: 15 issues remain (2,500 tokens - PARTIAL DUPLICATE)
- Run 3: All clear (500 tokens)
[Git Logs] Last 50 commits (15,000 tokens)
[Documentation] API reference loaded but unused (12,000 tokens)
[System Prompts & Skills] (70,000 tokens)
```
After Compaction (115,000 tokens - 57.5% capacity, 35% savings):
```
[Files Loaded - Current Versions Only]
- src/main.py (5,000 tokens)
- src/auth.py (8,000 tokens)
- src/database.py (12,000 tokens)
- src/utils.py (3,000 tokens)
- tests/test_auth.py (6,000 tokens) [Referenced in recent work]
Removed: Old backups, prototypes (15,000 tokens saved)
Deferred: test_database.py (not modified recently, load if needed)
[Test Summary]
Latest status: All tests passing (4 runs consolidated)
Critical: Test run 3 showed auth timeout bug → Fixed in run 4
(Preserved: Bug description + fix. Discarded: Verbose passing test logs)
(2,500 tokens preserved from 10,000 tokens)
[Code Quality]
Linting: Clean (23 initial issues resolved)
Critical fixes: SQL injection prevention added to database.py
(500 tokens preserved from 6,000 tokens)
[Recent Changes Summary]
Last 5 commits relevant to current task:
- Fixed auth timeout bug (critical, preserved)
- Added rate limiting (relevant, preserved)
- Updated dependencies (not critical, summarized)
(2,000 tokens preserved from 15,000 tokens)
[System Prompts & Skills] (70,000 tokens - unchanged)
```
Token savings: 63,000 tokens (35%)
Critical information retained: 100% (bug fix, architecture, current files)
Example 3: Multi-Agent Coordination Compaction
Before Agent Handoff (172,000 tokens - 86% capacity):
```
[Agent A - Research Phase Context]
- Market analysis (15,000 tokens)
- Competitor research (12,000 tokens)
- Technology evaluation (18,000 tokens)
- 30 web searches with full results (35,000 tokens)
- Working notes and intermediate analyses (20,000 tokens)
[Agent A - Decisions Made]
- Technology stack selected: React + Node.js
- Database: PostgreSQL (rationale: ACID + JSON support)
- Architecture: Microservices (rationale: team scaling)
(2,000 tokens)
[Shared Project Context]
- Requirements document (8,000 tokens)
- Project timeline (1,000 tokens)
- Team assignments (500 tokens)
[System Prompts & Skills] (60,000 tokens)
```
After Compaction for Handoff to Agent B - Implementation (98,000 tokens - 49% capacity):
```
[Handoff Package for Agent B]
- Key Decisions Summary:
  • Tech Stack: React + Node.js (rationale: team expertise, ecosystem maturity)
  • Database: PostgreSQL (rationale: ACID compliance, JSON support for flexibility)
  • Architecture: Microservices (rationale: enables independent team scaling)
  (800 tokens - extracted from 2,000 tokens)
- Critical Constraints:
  • Must support 10k concurrent users (performance requirement)
  • 99.9% uptime SLA (reliability requirement)
  • GDPR compliance mandatory (legal requirement)
  (300 tokens - extracted from research)
- Implementation Priorities:
  1. Auth service (foundation for other services)
  2. User profile service
  3. Core business logic service
  (200 tokens)
- Resources for Agent B:
  • Requirements doc (8,000 tokens - full preservation)
  • Technology decision rationale (800 tokens)
  • Project timeline (1,000 tokens - full preservation)
  (9,800 tokens total)
[Agent A Research - Archived]
Compressed summary stored in persistent memory:
"Evaluated 5 tech stacks, 3 databases, 2 architectures.
Final selections justified by: team capabilities (60% weight),
ecosystem maturity (25% weight), scalability (15% weight).
Detailed analysis available in session archive."
(Stored externally, not loaded unless explicitly needed)
[Shared Project Context] (9,500 tokens - preserved)
[System Prompts & Skills] (60,000 tokens - unchanged)
```
Token savings: 74,000 tokens (43%)
Agent B receives: Actionable decisions, critical constraints, clear priorities
Agent A work preserved: Archived in persistent memory for future reference
## Cross-Session Memory Persistence

### Memory Schema
Abstract storage format (implementation-agnostic):
```python
class PersistentMemory:
    """
    Base schema for persistent memory objects.
    Implementation can be: JSON files, database, MCP memory server, etc.
    """
    # Required fields
    memory_id: str      # Unique identifier (UUID)
    memory_type: str    # "decision" | "issue" | "state" | "learning"
    timestamp: str      # ISO 8601 format
    project_id: str     # Links memories to specific projects
    agent_id: str       # Agent that created this memory

    # Content fields
    title: str          # Brief description (max 100 chars)
    content: str        # Full content (structured based on memory_type)
    tags: list[str]     # Searchable tags

    # Metadata
    importance: int                 # 1-10 scale (10 = critical)
    expires_at: str | None          # Optional expiration timestamp
    parent_memory_id: str | None    # Links related memories

    # State management
    is_archived: bool   # False = active, True = archived
    access_count: int   # Number of times retrieved
    last_accessed: str  # Last retrieval timestamp


class DecisionMemory(PersistentMemory):
    """Architectural and significant decisions"""
    content: {
        'decision': str,                      # What was decided
        'rationale': str,                     # Why this decision
        'alternatives_considered': list[str], # What else was evaluated
        'expected_impact': str,               # Predicted outcomes
        'validation_criteria': str,           # How to verify success
        'context': str                        # Situational factors
    }


class IssueMemory(PersistentMemory):
    """Bugs, blockers, and unresolved problems"""
    content: {
        'description': str,           # Issue description
        'severity': str,              # "critical" | "high" | "medium" | "low"
        'reproduction_steps': str,    # How to reproduce
        'investigation_status': str,  # Current understanding
        'attempted_fixes': list[str], # What's been tried
        'next_actions': str           # Planned resolution steps
    }


class StateMemory(PersistentMemory):
    """Project state and progress tracking"""
    content: {
        'task_description': str,      # Current task
        'completion_percentage': int, # 0-100
        'substeps_completed': list[str],
        'substeps_remaining': list[str],
        'blockers': list[str],
        'next_actions': list[str]
    }


class LearningMemory(PersistentMemory):
    """Lessons learned and insights"""
    content: {
        'lesson': str,        # What was learned
        'context': str,       # Situation where learned
        'applicability': str, # When to apply this lesson
        'evidence': str       # Supporting data/results
    }
```
### Session Summary Generation
Trigger: End of session (explicit user end, or after 2 hours of inactivity)
Algorithm:
```python
def generate_session_summary(session_context) -> dict:
    """
    Extract critical information for cross-session persistence.
    Target: 5,000-10,000 tokens for typical session.
    """
    summary = {
        'session_metadata': {
            'session_id': generate_uuid(),
            'start_time': session_context.start_time,
            'end_time': current_time(),
            'duration_minutes': calculate_duration(),
            'agent_ids': session_context.active_agents,
            'project_id': session_context.project_id
        },

        'decisions_made': extract_decisions(session_context),
        # Returns list of DecisionMemory objects
        # Criteria: Any explicit "DECISION:" markers, architecture changes,
        # technology selections, design pattern adoptions

        'issues_encountered': extract_issues(session_context),
        # Returns list of IssueMemory objects
        # Criteria: Unresolved bugs, blockers, questions pending answers
        # Excludes: Resolved issues unless resolution method is noteworthy

        'current_state': extract_state(session_context),
        # Returns StateMemory object
        # Captures: Task progress, file modifications, next planned actions

        'learnings': extract_learnings(session_context),
        # Returns list of LearningMemory objects
        # Criteria: Failed approaches (save future time), unexpected behaviors,
        # performance insights, useful patterns discovered

        'file_modifications': {
            'created': list_files_created(),
            'modified': list_files_modified_with_summary(),
            'deleted': list_files_deleted()
        },

        'metrics': {
            'tokens_used': session_context.total_tokens,
            'files_accessed': len(session_context.accessed_files),
            'operations_performed': session_context.operation_count,
            'compactions_executed': session_context.compaction_count
        }
    }
    return summary
```
Extraction Criteria:
Decision Extraction:
- Scan for explicit decision markers: "DECISION:", "We will", "Chose X because"
- Identify architecture changes in code structure
- Detect technology/library selections
- Score importance: 8-10 (preserve forever), 5-7 (preserve 30 days), 1-4 (preserve 7 days)
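The decision-marker scan can be sketched as a line filter; the regex patterns below mirror the markers listed above and are illustrative, not exhaustive:

```python
import re

# Patterns mirroring the explicit decision markers listed above (illustrative).
DECISION_MARKERS = re.compile(r"^(DECISION:|We will\b|Chose .+ because)")

def extract_decision_lines(transcript: str) -> list[str]:
    """Pull transcript lines that carry an explicit decision marker."""
    return [line for line in transcript.splitlines()
            if DECISION_MARKERS.match(line)]
```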
Issue Extraction:
- Identify unresolved "BUG-" or "ISSUE-" markers
- Detect error messages not followed by resolution
- Find "TODO:" comments in code with context
- Capture blockers mentioned in task status
State Extraction:
- Parse task description and completion percentage
- List last 5 file modifications with change summaries
- Extract "Next actions:" or "Next steps:" sections
- Identify blockers and dependencies
Learning Extraction:
- Find "Lesson:" or "Note:" markers
- Detect failed approaches with analysis
- Identify performance improvements with measurements
- Capture useful patterns or anti-patterns discovered
### Decision Logging Template
Usage: Apply this template whenever an architectural or significant decision is made.
```markdown
## DECISION: [Brief decision title]

**Decision ID**: DEC-[YYYY-MM-DD]-[Sequential Number]
**Timestamp**: [ISO 8601 timestamp]
**Agent**: [Agent identifier]
**Project**: [Project identifier]
**Importance**: [1-10 score]

### Decision
[Clear statement of what was decided, 2-3 sentences max]

### Rationale
[Why this decision was made, addressing:]
- Primary factors driving the decision
- Key constraints or requirements satisfied
- Expected benefits

### Alternatives Considered
1. **[Alternative 1]**: [Why rejected/not selected]
2. **[Alternative 2]**: [Why rejected/not selected]
3. **[Alternative 3]**: [Why rejected/not selected]

### Expected Impact
[Predicted outcomes, including:]
- Performance implications (quantitative if possible)
- Development timeline effects
- Team/process changes required
- Technical debt or trade-offs accepted

### Validation Criteria
[How to verify this was the right decision:]
- Metric 1: [Measurable criterion]
- Metric 2: [Measurable criterion]
- Timeframe: [When to evaluate]

### Context
[Situational factors relevant to this decision:]
- Project phase, team composition, constraints, deadlines
- Assumptions made
- Related decisions (reference Decision IDs)

### Tags
[Searchable tags]: architecture, performance, security, etc.
```
Example:
```markdown
## DECISION: Adopt PostgreSQL over MongoDB

**Decision ID**: DEC-2025-11-04-003
**Timestamp**: 2025-11-04T14:32:00Z
**Agent**: Database-Architect-002
**Project**: ecommerce-platform-v2
**Importance**: 9

### Decision
Selected PostgreSQL as the primary database for the e-commerce platform, replacing the initially proposed MongoDB solution.

### Rationale
- ACID compliance critical for order processing and payment transactions
- Complex relational queries needed for inventory management across warehouses
- JSONB support provides document-store flexibility where needed
- Team has 5 years PostgreSQL experience vs. 1 year MongoDB experience

### Alternatives Considered
1. **MongoDB**: Rejected due to lack of multi-document ACID transactions (required for order processing) and weaker consistency guarantees
2. **MySQL**: Rejected due to inferior JSON handling and less robust full-text search capabilities
3. **Hybrid (PostgreSQL + MongoDB)**: Rejected due to operational complexity and data synchronization challenges

### Expected Impact
- Development: Estimated 2 weeks faster due to team expertise
- Performance: 95th percentile query latency < 100ms (validated in prototype)
- Scalability: Vertical scaling to 10k transactions/sec before sharding needed
- Cost: $800/month for managed service (vs. $1,200/month for MongoDB Atlas equivalent tier)

### Validation Criteria
- Order processing maintains ACID guarantees under load testing (10k orders/hour)
- Complex inventory queries execute in < 200ms at 95th percentile
- Database operational overhead < 5 hours/week after 3 months
- Timeframe: Validate after 3 months in production

### Context
- Project in initial architecture phase, no existing database commitment
- Team composition: 3 developers with strong PostgreSQL background
- Requirement: Support 10k concurrent users at launch, scale to 50k within 12 months
- Compliance: GDPR and PCI-DSS requirements mandate ACID transactions
- Timeline: 6 months to production launch

### Tags
database, architecture, postgresql, acid-transactions, ecommerce
```
### Memory Rehydration Process
Trigger: Starting a new session in an existing project
Procedure:
```python
def rehydrate_memory_for_session(project_id: str, agent_id: str) -> dict:
    """
    Load relevant persistent memories into new session context.
    Target: 20,000-30,000 tokens for memory context.
    """
    # Step 1: Load active state
    active_state = load_memories(
        project_id=project_id,
        memory_type="state",
        is_archived=False,
        sort_by="timestamp",
        limit=1  # Most recent state
    )

    # Step 2: Load recent decisions (importance-weighted)
    decisions = load_memories(
        project_id=project_id,
        memory_type="decision",
        is_archived=False,
        importance_gte=7,              # High importance only
        timestamp_after=days_ago(30),  # Last 30 days
        sort_by="importance DESC, timestamp DESC",
        limit=10
    )

    # Step 3: Load unresolved issues
    issues = load_memories(
        project_id=project_id,
        memory_type="issue",
        is_archived=False,
        content_severity_in=["critical", "high"],  # Critical/high only
        sort_by="severity DESC, timestamp ASC",    # Oldest critical issues first
        limit=15
    )

    # Step 4: Load relevant learnings
    learnings = load_memories(
        project_id=project_id,
        memory_type="learning",
        is_archived=False,
        timestamp_after=days_ago(60),  # Last 60 days
        sort_by="access_count DESC",   # Most referenced first
        limit=5
    )

    # Step 5: Update access metadata
    for memory in (active_state + decisions + issues + learnings):
        increment_access_count(memory.memory_id)
        update_last_accessed(memory.memory_id, current_time())

    # Step 6: Format for context loading
    rehydrated_context = {
        'project_summary': generate_project_summary(project_id),
        'current_state': format_state_for_context(active_state),
        'key_decisions': format_decisions_for_context(decisions),
        'open_issues': format_issues_for_context(issues),
        'applicable_learnings': format_learnings_for_context(learnings),
        'session_continuity': {
            'last_session_end': get_last_session_end_time(project_id),
            'time_since_last_session': calculate_time_gap(),
            'sessions_in_project': count_total_sessions(project_id)
        }
    }
    return rehydrated_context
```
**Formatting for Context:**

```python
def format_for_context(memories: list[PersistentMemory], max_tokens: int) -> str:
    """Convert memory objects to human-readable context text."""
    output = []
    token_count = 0
    for memory in memories:
        formatted = f"""
### {memory.title}
**ID**: {memory.memory_id} | **Importance**: {memory.importance}/10 | **Date**: {memory.timestamp}

{format_memory_content(memory)}
---
"""
        tokens = estimate_tokens(formatted)
        if token_count + tokens > max_tokens:
            break
        output.append(formatted)
        token_count += tokens
    return "\n".join(output)
```
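The formatter above calls an `estimate_tokens` helper that this document does not define. A minimal sketch, assuming a characters-per-token heuristic (roughly 4 characters per token for English text is a common rule of thumb; for exact counts, use the provider's token-counting API):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate the token count of a string.

    The 4-chars-per-token ratio is a heuristic for English prose,
    not the model's real tokenizer; treat results as estimates only.
    """
    return max(1, round(len(text) / chars_per_token))
```

With this heuristic, a 2,000-character summary is budgeted at roughly 500 tokens; an exact counter can be swapped in later without changing call sites.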
### Memory Search and Retrieval

**Query Interface:**

```python
def search_memories(
    project_id: str,
    query: str = None,            # Full-text search
    memory_type: str = None,      # Filter by type
    tags: list[str] = None,       # Tag-based filtering
    importance_gte: int = None,   # Minimum importance
    timestamp_after: str = None,  # Time-based filtering
    timestamp_before: str = None,
    is_archived: bool = False,
    sort_by: str = "relevance DESC",
    limit: int = 10
) -> list[PersistentMemory]:
    """Flexible memory search supporting multiple query patterns."""
    pass
```

**Usage Examples:**

```python
# Find all high-importance decisions about authentication
auth_decisions = search_memories(
    project_id="proj-123",
    query="authentication security",
    memory_type="decision",
    importance_gte=7,
    tags=["security", "authentication"]
)

# Find recent critical bugs
critical_bugs = search_memories(
    project_id="proj-123",
    memory_type="issue",
    timestamp_after=days_ago(7),
    sort_by="severity DESC, timestamp DESC"
)

# Find learnings related to performance
perf_learnings = search_memories(
    project_id="proj-123",
    memory_type="learning",
    tags=["performance", "optimization"],
    sort_by="access_count DESC"
)
```
**Search Ranking Algorithm:**

```python
def calculate_relevance_score(memory: PersistentMemory, query: str) -> float:
    """Score a memory's relevance to a search query. Returns a 0.0-1.0 score."""
    score = 0.0

    # Text similarity (40% weight)
    title_similarity = text_similarity(query, memory.title)
    content_similarity = text_similarity(query, str(memory.content))
    score += (title_similarity * 0.15) + (content_similarity * 0.25)

    # Importance (25% weight)
    score += (memory.importance / 10) * 0.25

    # Recency (20% weight)
    days_old = (current_time() - memory.timestamp).days
    recency_factor = 1 / (1 + days_old / 30)  # Decay over 30 days
    score += recency_factor * 0.20

    # Access frequency (15% weight)
    access_factor = min(memory.access_count / 10, 1.0)  # Cap at 10 accesses
    score += access_factor * 0.15

    return score
```
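To make the weighting concrete, here is a runnable sketch of the same formula with a simple word-overlap (Jaccard) similarity standing in for `text_similarity` — a real system would use embeddings or BM25; the Jaccard helper is an illustrative assumption:

```python
def jaccard_similarity(query: str, text: str) -> float:
    """Word-set overlap as a crude stand-in for semantic similarity."""
    a, b = set(query.lower().split()), set(text.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def relevance_score(query: str, title: str, content: str,
                    importance: int, days_old: int, access_count: int) -> float:
    """Same weights as above: 40% text, 25% importance, 20% recency, 15% access."""
    score = jaccard_similarity(query, title) * 0.15
    score += jaccard_similarity(query, content) * 0.25
    score += (importance / 10) * 0.25
    score += (1 / (1 + days_old / 30)) * 0.20   # recency halves at 30 days old
    score += min(access_count / 10, 1.0) * 0.15
    return score

# A fresh, important, frequently used memory outranks a stale, unrelated one
fresh = relevance_score("auth tokens", "JWT auth tokens",
                        "use RS256 for auth tokens", 9, 0, 10)
stale = relevance_score("auth tokens", "logging config",
                        "rotate logs daily", 3, 180, 0)
```

The bounded components guarantee the total stays in [0, 1], so scores from different memory types remain directly comparable.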
### Memory Lifecycle Management

**Archival Rules:**
| Memory Type | Archive Condition | Rationale |
|---|---|---|
| Decision | Importance ≤ 5 AND age > 90 days | Low-importance decisions lose relevance |
| Decision | Superseded by newer conflicting decision | Keep history but archive old decision |
| Issue | Status = "resolved" AND age > 30 days | Resolved issues rarely need review |
| State | Superseded by newer state | Only current state needs active loading |
| Learning | Access count = 0 AND age > 180 days | Unused learnings are not applicable |
**Deletion Rules** (permanent removal):
| Memory Type | Delete Condition | Rationale |
|---|---|---|
| Any | Marked for deletion by user | User override |
| Decision | Importance ≤ 3 AND archived > 365 days | Very low importance + very old |
| Issue | Resolved AND archived > 180 days | Unlikely to recur after 6 months |
| State | Superseded AND age > 90 days | Historical state not needed |
| Learning | Access count = 0 AND age > 365 days | Not useful after 1 year of no use |
**Lifecycle Automation:**

```python
def execute_memory_lifecycle_maintenance(project_id: str):
    """Run periodically (daily) to manage the memory lifecycle."""
    # Archive candidates
    archive_candidates = find_memories_matching(
        project_id=project_id,
        is_archived=False,
        conditions=[
            "memory_type='decision' AND importance<=5 AND age_days>90",
            "memory_type='issue' AND content->>'status'='resolved' AND age_days>30",
            "memory_type='learning' AND access_count=0 AND age_days>180"
        ]
    )
    for memory in archive_candidates:
        archive_memory(memory.memory_id)
        log_lifecycle_action("ARCHIVED", memory.memory_id, reason="Age and importance criteria")

    # Delete candidates
    delete_candidates = find_memories_matching(
        project_id=project_id,
        is_archived=True,
        conditions=[
            "memory_type='decision' AND importance<=3 AND archived_days>365",
            "memory_type='issue' AND archived_days>180",
            "memory_type='state' AND archived_days>90",
            "memory_type='learning' AND access_count=0 AND age_days>365"
        ]
    )
    for memory in delete_candidates:
        delete_memory(memory.memory_id)
        log_lifecycle_action("DELETED", memory.memory_id, reason="Retention period expired")
```
## Hierarchical Information Organization

### Organization Template

**Four-Level Hierarchy:**

```
Level 1: Executive Summary (Target: 500-1,000 tokens)
├─ Level 2: Section Summaries (Target: 2,000-4,000 tokens)
│  ├─ Level 3: Detailed Information (Target: 10,000-20,000 tokens)
│  │  └─ Level 4: Raw Code/Data (Variable: 20,000-80,000 tokens)
```
**Level 1 - Executive Summary:**
- Purpose: 30-second read for project orientation
- Contents:
  - Project goal (1 sentence)
  - Current status and completion percentage
  - Top 3 priorities
  - Critical blockers (if any)
  - Key architectural decisions (links to Level 2)

```markdown
# PROJECT EXECUTIVE SUMMARY
**Goal**: Build scalable e-commerce API supporting 50k concurrent users
**Status**: 45% complete - Authentication and user management done, payment processing in progress
**Priorities**:
1. Complete payment service integration [BLOCKED: pending PCI compliance review]
2. Implement order management service
3. Set up production infrastructure
**Architecture**: Microservices on Kubernetes, PostgreSQL database, Redis caching
[Details: §2.1-Architecture](#21-architecture)

**Last Updated**: 2025-11-04T15:30:00Z | **Agent**: Implementation-Lead-003
```
**Level 2 - Section Summaries:**
- Purpose: 2-minute read for context establishment
- Contents:
  - Subsystem summaries (3-5 sentences each)
  - Key decisions per subsystem with rationale
  - Current progress and next actions
  - Links to Level 3 detailed information

```markdown
## 2.1 Architecture
**Overview**: Microservices architecture with 5 core services (Auth, User, Payment, Order, Inventory) deployed on Kubernetes. API Gateway handles routing and rate limiting. PostgreSQL for transactional data, Redis for caching and sessions.

**Key Decisions**:
- Microservices over monolith for independent scaling [DEC-2025-10-15-001]
- PostgreSQL over MongoDB for ACID compliance [DEC-2025-10-20-003]
- Kubernetes for orchestration supporting multi-region deployment [DEC-2025-10-22-005]

**Status**: Architecture validated through proof-of-concept. Auth and User services deployed to staging. Payment service 60% complete.

**Next Actions**: Complete payment service, deploy order service, implement circuit breakers between services.

[Detailed Architecture Docs: §3.1](#31-architecture-details)
```
**Level 3 - Detailed Information:**
- Purpose: Deep technical context for implementation
- Contents:
  - Complete technical specifications
  - Implementation details with code snippets
  - Decision rationale with alternatives
  - Known issues and workarounds
  - References to Level 4 raw files

```markdown
## 3.1 Architecture Details

### 3.1.1 Service Communication Pattern
Services communicate via synchronous REST APIs for critical paths and asynchronous message queues (RabbitMQ) for non-blocking operations.

**Synchronous Patterns** (Request-Response):
- Auth validation: Gateway → Auth Service
- User data retrieval: Any Service → User Service
- Payment processing: Order Service → Payment Service

**Asynchronous Patterns** (Event-Driven):
- Order confirmation: Order Service → Email Service (via queue)
- Inventory updates: Order Service → Inventory Service (via queue)
- Analytics events: All Services → Analytics Pipeline (via queue)

**Rationale**: Synchronous for operations requiring an immediate response (user-facing). Asynchronous for fire-and-forget operations, improving perceived performance.

**Implementation**: See [gateway/routing_config.yaml](§4.gateway.routing) for routing rules, [services/order/queue_handlers.py](§4.services.order.queue) for message handling.

### 3.1.2 Database Schema Design
[Detailed schema documentation with ER diagrams, constraints, indexing strategies]
[See full schema: §4.database.schema](#4-database-schema)
```
**Level 4 - Raw Code/Data:**
- Purpose: Source of truth for implementation
- Contents:
  - Complete source files
  - Configuration files
  - Database schemas
  - API specifications
  - Test suites

```python
# §4.services.auth.jwt_handler
# Complete implementation file loaded from: src/services/auth/jwt_handler.py
import jwt
from datetime import datetime, timedelta
from typing import Dict, Optional

class JWTHandler:
    """
    JWT token generation and validation for the authentication service.

    Design decisions:
    - 15-minute access token expiry (security vs. UX balance)
    - 7-day refresh token expiry (mobile offline support)
    - RS256 algorithm (asymmetric for multi-service verification)
    """

    def __init__(self, private_key: str, public_key: str):
        self.private_key = private_key
        self.public_key = public_key
        self.access_token_expiry = timedelta(minutes=15)
        self.refresh_token_expiry = timedelta(days=7)

    def generate_access_token(self, user_id: str, permissions: list[str]) -> str:
        """Generate a short-lived access token carrying user permissions."""
        payload = {
            'user_id': user_id,
            'permissions': permissions,
            'token_type': 'access',
            'exp': datetime.utcnow() + self.access_token_expiry,
            'iat': datetime.utcnow()
        }
        return jwt.encode(payload, self.private_key, algorithm='RS256')

    # ... [Additional 200 lines of implementation]
```
### Navigation System

**Section Markers:**

Use consistent notation for cross-references:

```
§1.0         - Level 1 (Executive Summary)
§2.1         - Level 2 (Section Summary)
§3.1.2       - Level 3 (Detailed Information)
§4.file.path - Level 4 (Raw Files)
```
**Quick Reference Index** (insert at top of context):

```markdown
# NAVIGATION INDEX

## Executive Summary
[§1.0 - Project Overview](#10-project-overview) - Goal, status, priorities, blockers

## Section Summaries
[§2.1 - Architecture](#21-architecture) - System design, key decisions
[§2.2 - Authentication](#22-authentication) - Auth service implementation
[§2.3 - Payment](#23-payment) - Payment integration status [IN PROGRESS]
[§2.4 - Infrastructure](#24-infrastructure) - Deployment and operations

## Detailed Documentation
[§3.1 - Architecture Details](#31-architecture-details) - Deep technical specs
[§3.2 - Database Schema](#32-database-schema) - Tables, relationships, indexes
[§3.3 - API Specifications](#33-api-specifications) - Endpoint definitions

## Source Files (Level 4)
[§4.services.auth.*](#4-services-auth) - Authentication service code
[§4.services.payment.*](#4-services-payment) - Payment service code
[§4.database.migrations.*](#4-database-migrations) - DB migration scripts
[§4.tests.*](#4-tests) - Test suites
```

---
**Navigation Shortcuts:**

```markdown
<!-- Quick jump to specific information -->
Need: JWT implementation → [§4.services.auth.jwt_handler](#jwt-handler)
Need: Decision rationale for PostgreSQL → [§2.1 Architecture - Key Decisions](#21-architecture)
Need: Current blockers → [§1.0 Executive Summary](#10-project-overview)
Need: Payment service status → [§2.3 Payment](#23-payment)
```
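Because § markers are plain text, an orchestrator can detect on-demand load requests in agent output mechanically. A minimal sketch (the regex and function name are illustrative assumptions, not part of the skill's defined interface):

```python
import re

# Matches markers like §1.0, §3.1.2, and §4.services.auth.jwt_handler
SECTION_MARKER = re.compile(r"§(\d+(?:\.[\w-]+)*)")

def extract_section_refs(text: str) -> list[str]:
    """Return every §-style section reference found in a block of text."""
    return ["§" + m for m in SECTION_MARKER.findall(text)]
```

Running it over an agent reply such as "Please load §3.3.2 and §4.services.payment.api" yields exactly the markers to hydrate into the next message.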
### Summary Guidelines

**Level 1 (Executive) Summary Rules:**
- Maximum 1,000 tokens
- No technical jargon - understandable by non-technical stakeholders
- Focus on outcomes, not implementation details
- Always include: Goal, Status %, Top priorities, Blockers
- Update every major milestone (≥10% progress change)
**Level 2 (Section) Summary Rules:**
- 200-500 tokens per section
- Moderate technical detail - understandable by technical generalists
- Include: Overview, Key decisions with IDs, Current status, Next actions
- Link to Level 3 for details
- Update when section changes significantly
**Level 3 (Detailed) Summary Rules:**
- 1,000-3,000 tokens per detailed section
- Full technical depth - for implementation
- Include: Specifications, Rationale, Implementation notes, References to code
- Link to Level 4 raw files
- Update when implementation details change
**Level 4 (Raw) Summary Rules:**
- No summary - raw content only
- Include file metadata: path, last modified, size, purpose
- For large files (>5,000 tokens), provide section markers within file
- Keep synchronized with actual files
### Information Scoping Rules

**What belongs at each level:**
| Information Type | Level 1 | Level 2 | Level 3 | Level 4 |
|---|---|---|---|---|
| Project goals | ✓ | ✓ | - | - |
| Current status | ✓ | ✓ | ✓ | - |
| Critical blockers | ✓ | ✓ | ✓ | - |
| Architectural decisions | Summary only | Title + rationale | Full analysis | - |
| Implementation details | - | Brief mention | ✓ | ✓ |
| Code snippets | - | - | Key excerpts | Full files |
| Configuration | - | - | Critical settings | Full configs |
| API endpoints | Count only | List | Specifications | Implementation |
| Database schema | Mention only | Table names | Full schema | Migration files |
| Test results | Pass/fail status | Summary | Detailed results | Raw logs |
### Maintenance During Updates

**Propagation Rules:**

When updating information, propagate changes according to this matrix:
| Update Type | Update L4 | Update L3 | Update L2 | Update L1 |
|---|---|---|---|---|
| Code change | Always | If significant | If impacts status | If changes status % |
| Bug fix | Always | If notable | If critical bug | If was blocker |
| Config change | Always | If architecture | If impacts design | Rarely |
| New feature | Always | Always | Always | If major feature |
| Progress update | - | Status only | Status + % | Status + % |
| Decision made | Reference | Full details | Summary | Link only |
**Update Procedure:**

```python
def update_hierarchical_documentation(change_type: str, scope: str, details: dict):
    """Propagate documentation updates through the hierarchy."""
    # Always update Level 4 (raw files) first
    update_level_4(details['file_path'], details['changes'])

    # Determine propagation based on change type and scope
    propagate_to_level_3 = should_propagate_to_level_3(change_type, scope)
    propagate_to_level_2 = should_propagate_to_level_2(change_type, scope)
    propagate_to_level_1 = should_propagate_to_level_1(change_type, scope)

    if propagate_to_level_3:
        update_level_3_section(
            section=details['section'],
            change_summary=details['technical_summary']
        )
    if propagate_to_level_2:
        update_level_2_summary(
            section=details['section'],
            status_change=details['status_impact'],
            next_actions_change=details['next_actions']
        )
    if propagate_to_level_1:
        update_executive_summary(
            status_pct_change=details['completion_delta'],
            priority_change=details['priority_impact'],
            blocker_change=details['blocker_status']
        )

    # Update the navigation index if the structure changed
    if details.get('structure_change', False):
        regenerate_navigation_index()
```
**Consistency Checks:**

Run these validations after updates:

```python
def validate_hierarchy_consistency() -> list[str]:
    """Return a list of inconsistencies found."""
    issues = []

    # Check 1: All Level 3 references to Level 4 are valid
    for ref in extract_level_4_references_from_level_3():
        if not file_exists(ref.file_path):
            issues.append(f"Broken L3→L4 reference: {ref}")

    # Check 2: All Level 2 summaries have corresponding Level 3 details
    for summary in get_level_2_summaries():
        if not has_level_3_details(summary.section_id):
            issues.append(f"L2 summary without L3 details: {summary.section_id}")

    # Check 3: Status percentages are consistent across levels
    l1_status = get_level_1_status_percentage()
    l2_aggregated = aggregate_level_2_status_percentages()
    if abs(l1_status - l2_aggregated) > 5:  # Allow 5% tolerance
        issues.append(f"Status mismatch: L1={l1_status}%, L2 aggregate={l2_aggregated}%")

    # Check 4: Referenced decision IDs exist in memory
    for decision_ref in extract_all_decision_references():
        if not memory_exists(decision_ref):
            issues.append(f"Referenced decision not found: {decision_ref}")

    return issues
```
## Multi-Agent Context Coordination

### Agent Ownership Model

**Context Ownership Principles:**
- Primary Owner: One agent has write access to a context section at a time
- Read-Only Access: Other agents can read but not modify owned sections
- Ownership Transfer: Explicit handoff protocol required to transfer ownership
- Shared Sections: Common reference materials (Level 1, Level 2 summaries) are read-only to all
**Ownership Tracking:**

```python
class ContextOwnership:
    """Track which agent owns which context sections."""
    section_id: str              # e.g., "§2.3-payment", "§4.services.auth"
    owner_agent_id: str          # Current owner
    ownership_type: str          # "exclusive" | "shared-write" | "read-only"
    acquired_at: str             # When ownership was acquired
    expires_at: str | None       # Optional expiration for auto-release
    previous_owner: str | None   # For the audit trail
    lock_reason: str             # Why this section is owned

class ContextSection:
    """A section of context with ownership metadata."""
    section_id: str
    level: int                   # 1-4 hierarchy level
    content: str
    ownership: ContextOwnership
    last_modified_by: str
    last_modified_at: str
    modification_count: int
    agents_with_read_access: list[str]
```
### Agent Handoff Protocols

**Handoff Trigger Conditions:**
| Condition | Action | Example |
|---|---|---|
| Agent A completes assigned task | Automatic handoff to coordinator | Research agent → Implementation agent |
| Agent A encounters blocker outside expertise | Request handoff to specialist | Backend agent → Database agent for schema design |
| Agent A reaches token capacity | Compress and handoff to fresh agent | Long-running agent → Continuation agent |
| Scheduled rotation | Planned handoff at milestone | Phase 1 agent → Phase 2 agent |
| Agent A timeout/failure | Emergency handoff to recovery agent | Failed agent → Supervisor agent |
**Handoff Protocol Procedure:**

```python
def execute_agent_handoff(
    from_agent: str,
    to_agent: str,
    handoff_type: str,  # "complete" | "specialist" | "continuation" | "emergency"
    context_sections: list[str]
) -> dict:
    """
    Execute a structured handoff between agents.
    Returns the handoff package for the recipient agent.
    """
    # Step 1: Compact context for handoff
    if handoff_type == "continuation":
        # Heavy compaction for token refresh
        target_reduction = 0.40  # 40% reduction
    else:
        # Light compaction to preserve relevant details
        target_reduction = 0.20  # 20% reduction

    compacted_context = compact_context_for_handoff(
        sections=context_sections,
        reduction_target=target_reduction
    )

    # Step 2: Generate handoff package
    handoff_package = {
        'handoff_metadata': {
            'from_agent_id': from_agent,
            'to_agent_id': to_agent,
            'handoff_type': handoff_type,
            'timestamp': current_time(),
            'reason': get_handoff_reason(),
            'context_token_count': estimate_tokens(compacted_context)
        },
        'agent_expertise_match': {
            'required_skills': identify_required_skills(context_sections),
            'to_agent_capabilities': get_agent_capabilities(to_agent),
            'skill_match_score': calculate_skill_match(to_agent, context_sections)
        },
        'context_summary': {
            'executive_summary': extract_level_1_summary(),
            'critical_decisions': extract_decisions_from_context(importance_gte=7),
            'open_issues': extract_unresolved_issues(),
            'current_task_state': extract_current_state(),
            'blocking_dependencies': identify_blockers()
        },
        'work_products': {
            'completed': list_completed_items_by_agent(from_agent),
            'in_progress': list_in_progress_items(from_agent),
            'not_started': list_pending_items()
        },
        'ownership_transfers': [
            ContextOwnership(
                section_id=section,
                owner_agent_id=to_agent,
                ownership_type="exclusive",
                acquired_at=current_time(),
                previous_owner=from_agent,
                lock_reason=f"Handoff from {from_agent}"
            )
            for section in context_sections
        ],
        'next_actions': {
            'immediate': extract_immediate_next_actions(),
            'short_term': extract_short_term_goals(),
            'success_criteria': extract_acceptance_criteria()
        },
        'agent_specific_notes': {
            'from_agent_observations': collect_agent_observations(from_agent),
            'suggested_approach': get_suggested_approach(from_agent),
            'known_pitfalls': list_known_pitfalls_for_task(),
            'useful_resources': list_helpful_resources()
        },
        'compacted_context': compacted_context
    }

    # Step 3: Record handoff in persistent memory
    handoff_memory = create_handoff_memory(handoff_package)
    store_memory(handoff_memory)

    # Step 4: Update ownership records
    for transfer in handoff_package['ownership_transfers']:
        update_ownership_record(transfer)

    # Step 5: Notify the coordinator (if multi-agent orchestration)
    notify_coordinator({
        'event': 'agent_handoff',
        'from': from_agent,
        'to': to_agent,
        'sections': context_sections,
        'timestamp': current_time()
    })

    return handoff_package
```
**Handoff Package Token Budget:**
| Handoff Type | Target Token Budget | Rationale |
|---|---|---|
| Complete task | 15,000-25,000 | Full context transfer including learnings |
| Specialist consultation | 5,000-10,000 | Focused problem scope only |
| Continuation (token refresh) | 30,000-40,000 | Preserve max context for continuity |
| Emergency recovery | 10,000-15,000 | Critical state only, fast recovery |
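Budget selection can be table-driven from these ranges. A sketch (the function name and clamping policy are illustrative choices, not mandated by the protocol):

```python
# Token budget ranges per handoff type, copied from the table above.
HANDOFF_BUDGETS = {
    "complete": (15_000, 25_000),
    "specialist": (5_000, 10_000),
    "continuation": (30_000, 40_000),
    "emergency": (10_000, 15_000),
}

def handoff_token_budget(handoff_type: str, available_tokens: int) -> int:
    """Target the top of the range, clamped to what the sending agent
    can actually spare; degrade below the minimum only when forced."""
    low, high = HANDOFF_BUDGETS[handoff_type]
    if available_tokens < low:
        return available_tokens  # degraded handoff: send what we have
    return min(high, available_tokens)
```

Passing the result as the compaction target keeps the handoff package size predictable regardless of how much context the sender accumulated.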
### Context Coordination Rules

**Read Access Rules:**
- Level 1 (Executive Summary): Always readable by all agents
- Level 2 (Section Summaries): Readable by all agents in project
- Level 3 (Detailed Info): Readable by agents with task relevance
- Level 4 (Raw Files): Readable only by owner + agents with explicit access
**Write Access Rules:**
- Exclusive Ownership: Only owner can modify owned sections
- Shared Write Sections: Multiple agents can write if designated "shared-write"
- Conflict Resolution: Last-write-wins with conflict detection
- Audit Trail: All modifications logged with agent ID and timestamp
**Conflict Prevention:**

```python
def acquire_section_ownership(
    agent_id: str,
    section_id: str,
    operation: str  # "read" | "write"
) -> bool:
    """
    Attempt to acquire ownership of (or access to) a section.
    Returns True if successful, False if denied.
    """
    current_ownership = get_section_ownership(section_id)

    # Read operations: always allowed for Levels 1-2; check access for 3-4
    if operation == "read":
        if section_id.startswith("§1") or section_id.startswith("§2"):
            return True
        return agent_id in get_section_read_access_list(section_id)

    # Write operations: check ownership
    if operation == "write":
        # No current owner - acquire ownership
        if current_ownership is None:
            set_section_ownership(section_id, agent_id, "exclusive")
            return True
        # Shared-write section
        if current_ownership.ownership_type == "shared-write":
            return True
        # Exclusive ownership already held by this agent
        if current_ownership.owner_agent_id == agent_id:
            return True
        # Owned by another agent - denied
        return False
```
**Coordination State Synchronization:**

For distributed multi-agent systems, maintain synchronization:

```python
class CoordinationState:
    """Shared state for multi-agent coordination."""
    project_id: str
    active_agents: list[str]
    section_ownership_map: dict[str, ContextOwnership]
    agent_task_assignments: dict[str, list[str]]
    global_blockers: list[str]
    shared_resources: dict[str, any]
    last_sync_timestamp: str
    sync_version: int  # Optimistic locking

def synchronize_coordination_state(agent_id: str) -> CoordinationState:
    """Fetch the latest coordination state and resolve any conflicts."""
    local_state = get_local_coordination_state(agent_id)
    remote_state = fetch_remote_coordination_state()

    # Detect conflicts
    if local_state.sync_version != remote_state.sync_version:
        # Conflict: resolve using the configured strategy
        resolved_state = resolve_coordination_conflict(
            local_state,
            remote_state,
            resolution_strategy="remote-wins-on-ownership"
        )
        # Apply the resolved state locally
        apply_coordination_state(agent_id, resolved_state)
        return resolved_state

    return remote_state
```
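The `sync_version` field is a classic optimistic lock: read the version, prepare an update, and commit only if the version is unchanged. A minimal self-contained sketch of the compare-and-bump step (class and function names here are illustrative, not the skill's defined API):

```python
from dataclasses import dataclass, field

@dataclass
class SharedState:
    """Stand-in for the shared coordination store."""
    sync_version: int = 0
    owner_map: dict = field(default_factory=dict)

def try_commit(store: SharedState, base_version: int, new_owner_map: dict) -> bool:
    """Commit only if nobody updated the store since we read it
    (compare on sync_version, then bump it)."""
    if store.sync_version != base_version:
        return False  # conflict: caller must re-fetch and resolve
    store.owner_map = new_owner_map
    store.sync_version += 1
    return True
```

The first writer against a given version wins; a second writer holding the stale version is rejected and must re-synchronize, which is exactly the conflict path `synchronize_coordination_state` handles.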
## Integration with Prompt Engineering (SKILL-003)

### Context Organization in System Prompts

**System Prompt Structure** (leveraging SKILL-003 principles):

```markdown
# SYSTEM PROMPT - Agent ID: Implementation-Lead-003

## Role and Capabilities
[Agent role definition - 500 tokens]

## Project Context (Hierarchical)
[Level 1 Executive Summary - 1,000 tokens]
- Automatically included in every interaction
- Provides constant orientation

## Active Task Context
[Current task from Level 2 - 2,000 tokens]
- Dynamically updated based on current work
- Links to Level 3 details as needed

## Critical Knowledge
[Key decisions and constraints - 3,000 tokens]
- Architecture decisions with IDs
- Critical issues and blockers
- Must-follow constraints

## Available Resources
[Links to Level 3, Level 4 content]
- Load on-demand using section markers
- "For database schema details, see §3.2"
- "For JWT implementation, see §4.services.auth.jwt_handler"

## Success Criteria
[Acceptance criteria for current task - 1,000 tokens]
```

Total system prompt: ~7,500 tokens (3.75% of the context window)
**Dynamic Context Loading:**

```python
def construct_system_prompt_with_context(
    agent_id: str,
    project_id: str,
    current_task: str
) -> str:
    """
    Build a system prompt with appropriate context for the agent and task.
    Target: 5,000-10,000 tokens for the system prompt portion.
    """
    # Core agent definition (static)
    agent_definition = load_agent_definition(agent_id)             # ~500 tokens

    # Level 1 summary (always included)
    executive_summary = get_level_1_summary(project_id)            # ~1,000 tokens

    # Task-relevant Level 2 sections
    relevant_sections = identify_relevant_sections(current_task)
    section_summaries = load_level_2_summaries(relevant_sections)  # ~2,000 tokens

    # Critical decisions and constraints
    critical_knowledge = load_critical_knowledge(
        project_id=project_id,
        importance_gte=8,
        relevance_to_task=current_task
    )                                                              # ~3,000 tokens

    # Acceptance criteria
    success_criteria = extract_acceptance_criteria(current_task)   # ~1,000 tokens

    # Resource links (Level 3, Level 4)
    resource_index = generate_resource_index(relevant_sections)    # ~500 tokens

    prompt = f"""
{agent_definition}

# PROJECT CONTEXT
{executive_summary}

# CURRENT FOCUS
{section_summaries}

# CRITICAL KNOWLEDGE
{critical_knowledge}

# SUCCESS CRITERIA
{success_criteria}

# AVAILABLE RESOURCES
{resource_index}
---
"""
    return prompt
```
### Dynamic State in User Messages

**User Message Context Loading Strategy:**

Instead of loading everything into the system prompt, include context dynamically in user messages:
```python
def construct_user_message_with_context(
    user_query: str,
    required_context: list[str]
) -> str:
    """Augment a user query with just-in-time context."""
    # Score and prioritize context items
    scored_context = [
        (ctx, calculate_relevance_score(ctx, user_query))
        for ctx in required_context
    ]

    # Sort by relevance and load until the token budget is reached
    scored_context.sort(key=lambda x: x[1], reverse=True)
    context_sections = []
    token_count = estimate_tokens(user_query)
    max_tokens = 50000  # Reserve 50k tokens for user-message context

    for context_item, score in scored_context:
        if score < 0.3:  # Relevance threshold
            break
        content = load_context_content(context_item)
        content_tokens = estimate_tokens(content)
        if token_count + content_tokens > max_tokens:
            break
        context_sections.append(content)
        token_count += content_tokens

    # Construct the message
    message = f"""
{user_query}

<relevant_context>
{''.join(context_sections)}
</relevant_context>
"""
    return message
```
### Token-Aware Prompt Design

**Prompt Engineering Patterns for Context Management:**

1. **Progressive Disclosure Pattern:**

```
System Prompt: "You have access to detailed documentation via section markers (§).
When you need specific information, indicate which section you need, and it will be
loaded into context. Do not request all sections at once."

User Message: "Implement the payment service endpoint."

Agent Response: "I'll need the payment service specifications. Please load §3.3.2
Payment API Specifications."

[System loads §3.3.2 into the next user message]
```

2. **Context Pruning Pattern:**

```
System Prompt: "Periodically review your context and identify information that is
no longer needed. When you identify such information, explicitly state:
'PRUNE: [section_id] - [reason]' and it will be removed to free tokens."

Agent: "PRUNE: §4.services.user.old_implementation - Replaced by new version,
no longer needed for reference."

[System removes the pruned section]
```

3. **Summary Elevation Pattern:**

```
System Prompt: "When working with large files (>10,000 tokens), first generate
a 500-token summary and propose working with the summary. Only load the full file
if the summary is insufficient."

Agent: "I've analyzed §4.database.migration_001 (15,000 tokens). Here's a summary:
[500-token summary]. This should be sufficient for the current task. Load the full
file only if we need to modify the migration."
```
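The pruning pattern only works if the orchestrator can detect PRUNE statements reliably in agent output. A sketch parser for the statement format shown above (the regex is an illustrative assumption):

```python
import re

# Matches statements of the form: PRUNE: <section_id> - <reason>
PRUNE_RE = re.compile(r"PRUNE:\s*(§?[\w.\-]+)\s*-\s*(.+)")

def parse_prune_statements(agent_output: str) -> list[tuple[str, str]]:
    """Extract (section_id, reason) pairs from an agent's output."""
    return PRUNE_RE.findall(agent_output)
```

Logging the returned reason alongside each removal preserves an audit trail, so a pruned section can be re-loaded later if the agent's judgment turns out to be wrong.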
## Best Practices Checklist

**Context Window Optimization:**
- Token allocation follows 40% knowledge / 50% active / 10% session / 5% buffer distribution
- File loading uses relevance scoring algorithm (50% relevance, 30% recency, 20% dependency)
- Capacity monitoring implemented with thresholds (80% yellow, 90% orange, 95% red)
- Just-in-time loading strategy for files scoring < 70
- Token estimation uses Claude API or validated approximation formulas
**Context Compaction:**
- Compaction triggered automatically at 80% capacity
- All preservation rules (score: 100) enforced - no critical data discarded
- Architectural decisions preserved with full context (decision, rationale, alternatives, impact)
- Unresolved issues (critical/high severity) retained with investigation status
- Last 5 file modifications preserved with change summaries
- Current task state includes completion %, substeps, next actions, blockers
- Deduplication applied to tool outputs (85% similarity threshold)
- Compaction achieves 20-35% token savings target
- Validation confirms no score-100 items removed
**Cross-Session Memory:**
- Session summary generated at session end capturing decisions, issues, state, learnings
- Decision logging uses complete template with all required fields
- Persistent memory schema implemented with all required fields (memory_id, type, timestamp, project_id, agent_id, content, importance)
- Memory rehydration loads: active state (most recent), high-importance decisions (last 30 days), critical/high issues (unresolved), relevant learnings (last 60 days)
- Memory search supports query, type filtering, tag filtering, importance filtering, time filtering
- Memory lifecycle automation runs daily (archive aged/low-importance, delete expired)
- Access count tracking implemented for usage-based retention
**Hierarchical Organization:**
- Four-level hierarchy implemented (Executive → Section → Detailed → Raw)
- Level 1 (Executive) ≤ 1,000 tokens with goal, status, priorities, blockers
- Level 2 (Section) 200-500 tokens per section with overview, decisions, status, next actions
- Level 3 (Detailed) 1,000-3,000 tokens per section with specs, rationale, implementation notes
- Level 4 (Raw) contains complete source files with metadata
- Navigation index provided with section markers (§) for all levels
- Section markers used consistently (§1.0, §2.1, §3.1.2, §4.file.path)
- Update propagation rules followed (L4 → L3 → L2 → L1 based on change significance)
- Consistency validation run after updates (references valid, status percentages aligned)
**Multi-Agent Coordination:**
- Context ownership tracking implemented per section
- Ownership types defined (exclusive | shared-write | read-only)
- Agent handoff protocol implemented with structured handoff package
- Handoff package includes: metadata, context summary, work products, ownership transfers, next actions, agent notes, compacted context
- Handoff compaction: 40% reduction for continuation, 20% for other types
- Read access rules enforced (L1/L2 readable by all, L3/L4 restricted)
- Write access rules enforced (exclusive ownership required for writes)
- Coordination state synchronized across agents with conflict resolution
**Prompt Engineering Integration:**
- System prompt structure follows SKILL-003 principles
- System prompt includes: role, L1 summary, active task, critical knowledge, resource links
- System prompt token budget: 5,000-10,000 tokens (2.5-5% of context window)
- Dynamic context loading in user messages based on relevance scoring
- Progressive disclosure pattern implemented (load details on-demand)
- Context pruning pattern enabled (explicit PRUNE statements)
- Summary elevation pattern used for large files (>10,000 tokens)
**General:**
- All quantitative thresholds explicitly defined (no vague guidance)
- All algorithms include implementation details
- All schemas include complete field specifications
- All procedures are step-by-step executable
- Automation-friendly rules (threshold-based, not subjective)
- Examples provided with actual token counts
- Integration with SKILL-003 clearly documented
## Common Pitfalls to Avoid

**Premature Loading**: Loading all files at session start without relevance assessment
- Problem: Wastes 30-40% of context window on unused files
- Solution: Use the file loading prioritization algorithm: preload only for score ≥ 70, load scores 40-69 just-in-time, fetch scores < 40 on-demand
No Capacity Monitoring: Ignoring context usage until hitting hard limit
- Problem: Emergency compaction loses information, disrupts workflow
- Solution: Implement monitoring with thresholds, compact proactively at 80%
Discarding Architectural Decisions: Removing decisions during compaction to save tokens
- Problem: Loss of rationale leads to contradictory future decisions
- Solution: Always preserve preservation-score 100 items, use validation checklist
Verbose Tool Output Retention: Keeping complete logs of successful operations
- Problem: Redundant confirmations consume 10-15% of context
- Solution: Summarize successful operations, keep only actionable data
No Cross-Session Memory: Starting each session from scratch
- Problem: Repeatedly re-analyzing same codebase, forgetting past decisions
- Solution: Implement session summary generation and memory rehydration
Flat Information Structure: Organizing all information at same detail level
- Problem: Cannot navigate quickly, must read entire context for any query
- Solution: Use 4-level hierarchy with navigation markers
Missing Decision Rationale: Recording "what" without "why"
- Problem: Future agents/sessions don't understand constraints behind decisions
- Solution: Use complete decision logging template with alternatives considered
Over-Aggressive Compaction: Targeting >50% token reduction
- Problem: Loses important context details, breaks continuity
- Solution: Target 20-35% reduction, focus on deduplication and discard rules
No Agent Handoff Protocol: Informal context transfer between agents
- Problem: Knowledge loss, duplicated work, contradictory approaches
- Solution: Use structured handoff package with ownership transfers
Static System Prompts: Loading all context in system prompt regardless of task
- Problem: Wastes tokens on irrelevant information
- Solution: Dynamic context loading based on current task relevance
Ignoring Token Costs: Using approximations when accuracy critical
- Problem: 10-15% estimation errors lead to context overflow
- Solution: Use Claude API tokenization for long sessions (justified marginal cost)
No Memory Lifecycle: Accumulating memories indefinitely
- Problem: Memory search becomes slow, outdated information pollutes results
- Solution: Implement archival and deletion rules with automated maintenance
Duplicate Information Across Levels: Repeating same details in L1, L2, L3
- Problem: Wastes tokens, creates update inconsistency
- Solution: Follow information scoping rules, link between levels instead of duplicating
Poor Section Marker Discipline: Inconsistent or missing navigation markers
- Problem: Cannot implement progressive disclosure, forced to load everything
- Solution: Use consistent §X.Y.Z notation, maintain navigation index
No Validation After Compaction: Trusting compaction didn't lose critical data
- Problem: Silently loses architectural decisions, unresolved issues
- Solution: Run validation checklist, verify preservation-scored items present
Token Budget Examples
Example 1: Small Feature Implementation (Single session, 2-4 hours)
Total Budget: 200,000 tokens
Allocation:
- System prompts & skills: 60,000 (30%)
• Agent definition: 5,000
• Prompt engineering skill: 8,000
• Context management skill: 12,000
• Language-specific skills: 15,000
• Other skills: 20,000
- Active context: 90,000 (45%)
• Level 1 executive summary: 1,000
• Level 2 section summaries: 5,000
• Level 3 relevant details: 10,000
• Level 4 active files (3-5 files): 35,000
• Tool outputs: 15,000
• Working notes: 10,000
• Task state: 2,000
• Recent modifications: 5,000
• Buffer: 7,000
- Session memory: 30,000 (15%)
• Architectural decisions: 8,000
• Unresolved issues: 5,000
• Critical implementation notes: 7,000
• Recent change history: 10,000
- Buffer/overhead: 20,000 (10%)
• Safety margin: 20,000
Compaction Strategy: Likely not needed for a single-session feature
Memory Persistence: Generate session summary at end (~5,000 tokens)
Example 2: Medium Complexity Project (Multi-session, 2-3 days, 8-12 hours total)
Session 1 Budget: 200,000 tokens
Initial Allocation:
- System prompts & skills: 70,000 (35%)
• Increased due to multi-session requirements
- Active context: 85,000 (42.5%)
• Level 1-2: 6,000
• Level 3: 15,000
• Level 4 files (10-15 files): 50,000
• Tool outputs: 14,000
- Session memory: 25,000 (12.5%)
• Decisions: 10,000
• Issues: 8,000
• State: 7,000
- Buffer: 20,000 (10%)
Compaction Timeline:
- Session 1, Hour 3: 80% capacity → Compact to 65% (-30,000 tokens)
- Session 1 end: Generate summary (8,000 tokens)
Session 2 Budget: 200,000 tokens
Rehydrated Allocation:
- System prompts & skills: 70,000 (35%)
- Rehydrated memory: 20,000 (10%)
• Session 1 summary: 8,000
• Persisted decisions: 7,000
• Unresolved issues: 5,000
- Active context: 90,000 (45%)
- Session memory: 20,000 (10%)
Example 3: Large Codebase Analysis (Research phase, multi-session, 1 week)
Session 1 (Initial exploration) Budget: 200,000 tokens
Allocation:
- System prompts & skills: 65,000 (32.5%)
- Active context: 95,000 (47.5%)
• Hierarchical navigation: 15,000
• File samples (20+ files): 60,000
• Web search results: 15,000
• Analysis notes: 5,000
- Session memory: 25,000 (12.5%)
- Buffer: 15,000 (7.5%)
Compaction Events:
- Hour 2: 85% → Deduplicate web search results (-12,000)
- Hour 4: 82% → Compact file samples, keep summaries (-25,000)
- Session end: Generate comprehensive summary (15,000 tokens)
Sessions 2-5 (Deep dive):
Each session:
- Rehydrate 25,000 tokens from previous sessions
- Compact every 3-4 hours
- Generate summary with learnings (10,000 tokens each)
Final Session (Synthesis):
Budget: 200,000 tokens
Rehydrated: 50,000 tokens (compressed from 5 sessions)
- Key decisions from all sessions: 15,000
- Critical findings: 20,000
- Architecture summary: 15,000
Active work: 120,000 tokens
- Synthesizing final report
- Creating architectural diagrams
- Documenting decisions
Example 4: Multi-Agent Development (Implementation phase, coordinated team)
Project Total: 5 agents × 200,000 = 1,000,000 tokens available
Shared Context (Replicated across all agents): 80,000 tokens
- Level 1 executive summary: 2,000
- Level 2 complete: 10,000
- Critical architectural decisions: 20,000
- System-wide constraints: 8,000
- Agent coordination state: 5,000
- Shared resources: 15,000
- Navigation index: 5,000
- Multi-agent protocols: 15,000
Per-Agent Allocation: 120,000 tokens individual context
- Agent-specific system prompts: 20,000
- Agent task context: 60,000
- Agent working memory: 25,000
- Agent buffer: 15,000
Agent Handoff Budget: 25,000 tokens per handoff
- Handoff metadata: 1,000
- Context summary: 8,000
- Work products: 10,000
- Next actions: 3,000
- Agent notes: 3,000
Coordination Overhead: 40,000 tokens
- Ownership tracking: 10,000
- Conflict resolution state: 10,000
- Global blockers: 5,000
- Agent task queue: 15,000
Total Effective Usage: 80,000 (shared) + (5 × 120,000) (agents) + 40,000 (coordination) = 720,000 tokens
Efficiency: 72% (the remaining 28% covers shared-context replication and protocol overhead, an acceptable cost of coordination)
Quick Reference
Context Allocation (200k tokens)
- 30-40%: Knowledge base & system instructions (60k-80k)
- 40-50%: Active task context (80k-100k)
- 10-15%: Session memory (20k-30k)
- 5-10%: Buffer/overhead (10k-20k)
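The allocation bands above are mechanical enough to verify automatically. A minimal sketch of a budget checker; the dict keys and function name are illustrative assumptions, not part of the skill:

```python
# Target fraction bands per category for a 200k-token window.
ALLOCATION_BANDS = {
    "knowledge_base": (0.30, 0.40),
    "active_task": (0.40, 0.50),
    "session_memory": (0.10, 0.15),
    "buffer": (0.05, 0.10),
}

def check_allocation(budget: dict[str, int], window: int = 200_000) -> list[str]:
    """Return one violation message per category outside its target band."""
    problems = []
    for category, (lo, hi) in ALLOCATION_BANDS.items():
        fraction = budget.get(category, 0) / window
        if not lo <= fraction <= hi:
            problems.append(f"{category}: {fraction:.1%} outside {lo:.0%}-{hi:.0%}")
    return problems
```

For example, the Example 2 session-1 split (70k / 85k / 25k / 20k) falls inside every band and produces no violations.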
Capacity Thresholds
- Green (0-79%): Normal operation
- Yellow (80-89%): Plan compaction within 10 operations
- Orange (90-94%): Compact immediately before next major operation
- Red (95-100%): Emergency compaction, shed low-priority content
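These tiers map directly to a threshold function that monitoring automation can call after each operation. A sketch, with the action strings and function name as assumptions:

```python
def capacity_action(used_tokens: int, window: int = 200_000) -> str:
    """Map current context usage to the threshold tier defined above."""
    pct = used_tokens / window * 100
    if pct < 80:
        return "green: normal operation"
    if pct < 90:
        return "yellow: plan compaction within 10 operations"
    if pct < 95:
        return "orange: compact immediately before next major operation"
    return "red: emergency compaction, shed low-priority content"
```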
File Loading Prioritization
SCORE = (RELEVANCE × 0.50) + (RECENCY × 0.30) + (DEPENDENCY × 0.20)
- Score ≥ 70: Preload
- Score 40-69: Just-in-time
- Score < 40: On-demand only
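The scoring formula and tier cutoffs can be combined into one decision function. A sketch assuming each component is already normalized to a 0-100 scale:

```python
def file_load_decision(relevance: float, recency: float, dependency: float) -> tuple[float, str]:
    """Weighted loading score (components on 0-100) and the resulting tier."""
    score = relevance * 0.50 + recency * 0.30 + dependency * 0.20
    if score >= 70:
        tier = "preload"
    elif score >= 40:
        tier = "just-in-time"
    else:
        tier = "on-demand"
    return score, tier
```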
Compaction Preservation (Score: 100)
Must preserve:
- Architectural decisions (with timestamp, rationale, alternatives, impact)
- Active bugs and unresolved issues (Critical/High severity)
- Critical implementation details (security, performance, data integrity)
- Recent file modifications (last 5 operations)
- Current task state (completion %, substeps, next actions, blockers)
Compaction Discard (Score: 0-30)
Can safely discard:
- Redundant tool outputs (85%+ similarity)
- Resolved issues with confirmed fixes (30+ days old)
- Exploratory attempts explicitly abandoned
- Verbose debug logs when summary captures key points
- Successful operation confirmations (keep only summary)
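The 85%-similarity discard rule needs a concrete similarity metric; the skill does not name one, so `difflib.SequenceMatcher` is used here as one reasonable assumption:

```python
import difflib

def is_redundant(new_output: str, kept_outputs: list[str], threshold: float = 0.85) -> bool:
    """Discard rule sketch: a tool output is redundant when it is
    >= 85% similar to an output already retained in context."""
    return any(
        difflib.SequenceMatcher(None, new_output, prev).ratio() >= threshold
        for prev in kept_outputs
    )
```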
Memory Schema Fields (Required)
memory_id: str # UUID
memory_type: str # "decision" | "issue" | "state" | "learning"
timestamp: str # ISO 8601
project_id: str
agent_id: str
title: str # Max 100 chars
content: dict # Type-specific structured content
tags: list[str]
importance: int # 1-10
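The required fields above translate naturally into a dataclass. Field names follow the schema; the defaults and validation in `__post_init__` are illustrative assumptions:

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MemoryRecord:
    memory_type: str   # "decision" | "issue" | "state" | "learning"
    project_id: str
    agent_id: str
    title: str         # max 100 chars
    content: dict      # type-specific structured content
    importance: int    # 1-10
    tags: list[str] = field(default_factory=list)
    memory_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def __post_init__(self):
        if self.memory_type not in {"decision", "issue", "state", "learning"}:
            raise ValueError(f"unknown memory_type: {self.memory_type}")
        if not 1 <= self.importance <= 10:
            raise ValueError("importance must be 1-10")
        if len(self.title) > 100:
            raise ValueError("title exceeds 100 chars")
```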
Memory Lifecycle
Archive:
- Decisions: Importance ≤5 AND age >90 days
- Issues: Resolved AND age >30 days
- Learnings: Access count=0 AND age >180 days
Delete:
- Decisions: Importance ≤3 AND archived >365 days
- Issues: Resolved AND archived >180 days
- Learnings: Access count=0 AND age >365 days
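These lifecycle rules are threshold-based and can run as an automated maintenance pass. A sketch; the function name and the "keep" fallback are assumptions:

```python
def lifecycle_action(memory_type: str, importance: int, age_days: int,
                     resolved: bool = False, access_count: int = 0,
                     archived_days: int = 0) -> str:
    """Apply the archive/delete thresholds above; 'keep' when no rule fires."""
    if memory_type == "decision":
        if importance <= 3 and archived_days > 365:
            return "delete"
        if importance <= 5 and age_days > 90:
            return "archive"
    elif memory_type == "issue":
        if resolved and archived_days > 180:
            return "delete"
        if resolved and age_days > 30:
            return "archive"
    elif memory_type == "learning":
        if access_count == 0 and age_days > 365:
            return "delete"
        if access_count == 0 and age_days > 180:
            return "archive"
    return "keep"
```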
Hierarchy Levels
- L1 Executive: ≤1,000 tokens - Goal, status, priorities, blockers
- L2 Section: 200-500 tokens/section - Overview, decisions, status, next actions
- L3 Detailed: 1,000-3,000 tokens/section - Specs, rationale, implementation
- L4 Raw: Variable - Complete source files
Navigation Markers
§1.0 - Level 1 (Executive)
§2.1 - Level 2 (Section)
§3.1.2 - Level 3 (Detailed)
§4.file.path - Level 4 (Raw)
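Because the leading digit of a § marker names its hierarchy level, routing a marker to the right level is a one-regex job. A helper sketch (the regex and error handling are assumptions):

```python
import re

# §<level digit> followed by dot-separated segments, e.g. §3.1.2 or §4.file.path
MARKER_RE = re.compile(r"§(\d+)(?:\.[\w-]+)*")

def marker_level(marker: str) -> int:
    """Infer the hierarchy level (1-4) from a § navigation marker."""
    m = MARKER_RE.fullmatch(marker)
    if not m:
        raise ValueError(f"not a section marker: {marker}")
    return int(m.group(1))
```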
Agent Handoff Token Budget
- Complete task: 15k-25k tokens
- Specialist: 5k-10k tokens
- Continuation: 30k-40k tokens
- Emergency: 10k-15k tokens
Handoff Compaction
- Continuation: 40% reduction
- Other types: 20% reduction
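Combining the per-type budgets with the compaction percentages gives a target size for any handoff package. A sketch; clamping the compacted size into the type's budget range is an assumption about how the two tables interact:

```python
# Token budget ranges per handoff type (from the table above).
HANDOFF_BUDGET = {
    "complete": (15_000, 25_000),
    "specialist": (5_000, 10_000),
    "continuation": (30_000, 40_000),
    "emergency": (10_000, 15_000),
}

def compacted_handoff_size(raw_tokens: int, handoff_type: str) -> int:
    """Apply the 40%/20% handoff-compaction rule, then clamp to the budget."""
    reduction = 0.40 if handoff_type == "continuation" else 0.20
    target = round(raw_tokens * (1 - reduction))
    low, high = HANDOFF_BUDGET[handoff_type]
    return max(low, min(high, target))
```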
Context Ownership Types
- exclusive: Single agent write access
- shared-write: Multiple agents can write
- read-only: All agents can read, none can write
Token Estimation (Approximation)
- Code: 0.75 tokens/char
- Documentation: 0.65 tokens/char
- JSON/data: 0.85 tokens/char
- Logs: 0.70 tokens/char
For critical long sessions: Use Claude API tokenization (justified cost)
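The approximation rates above reduce to a one-line estimator; remember the 10-15% error margin noted earlier when using it near capacity thresholds. A sketch with illustrative names:

```python
# Approximate tokens-per-character rates by content type (from the table above).
TOKEN_RATE = {"code": 0.75, "documentation": 0.65, "json": 0.85, "logs": 0.70}

def estimate_tokens(text: str, content_type: str = "documentation") -> int:
    """Character-count approximation; expect ~10-15% error vs. real tokenization."""
    return round(len(text) * TOKEN_RATE[content_type])
```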
System Prompt Budget
Target: 5,000-10,000 tokens (2.5-5% of context window)
- Agent definition: 500
- L1 summary: 1,000
- Task context: 2,000
- Critical knowledge: 3,000
- Success criteria: 1,000
- Resource index: 500
Progressive Disclosure Patterns
- Load on-demand: Reference §markers, load when needed
- Prune explicitly: State "PRUNE: §X.Y - reason"
- Summarize first: 500-token summary before loading large files (>10k tokens)
Document Version: 1.0.0
Last Updated: 2025-11-04
Total Token Count: ~58,000 tokens
Integration: SKILL-003 (Prompt Engineering)