| name | extracting-session-data |
| description | Locates, lists, filters, and extracts structured data from Claude Code native session logs. Supports both single and multiple session analysis. |
Extracting Session Data Skill
Core Responsibility
Provide raw access to Claude Code session logs stored in ~/.claude/projects/{project-identifier}/{session-id}.jsonl.
Key Principle: This skill only extracts data. Return raw data to calling skills for analysis; do not analyze or interpret it within this skill.
Available Scripts
All scripts are located in the scripts/ subdirectory relative to this skill.
1. locate-logs.sh
Find log directory or specific session file path.
# Get logs directory for current working directory
scripts/locate-logs.sh
# Get logs directory for specific project
scripts/locate-logs.sh /path/to/project
# Get specific session log file path
scripts/locate-logs.sh /path/to/project abc123-session-id
Use when: Building dynamic paths, verifying logs exist before processing.
2. list-sessions.sh
Enumerate all sessions with metadata (ID, size, lines, date, branch).
# List all sessions (table format)
scripts/list-sessions.sh
# JSON output
scripts/list-sessions.sh --format json
# Sort by size or lines
scripts/list-sessions.sh --sort size
scripts/list-sessions.sh --sort lines
# Specific project
scripts/list-sessions.sh /path/to/project
Output formats: table, json, csv
Sort options: date, size, lines
Use when: Starting retrospective, showing available sessions to user, checking for recent sessions.
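To make the size and line metadata concrete, here is a minimal sketch of how one row could be computed for a single log file. The helper name session_row and the tab-separated layout are hypothetical; list-sessions.sh may gather and format these values differently.

```shell
# Hypothetical helper: print "id<TAB>bytes<TAB>lines" for one session log file.
session_row() {
    local file=$1
    local id bytes lines
    # The session ID is the file name without the .jsonl extension.
    id=$(basename "$file" .jsonl)
    # Arithmetic expansion strips the padding some wc implementations emit.
    bytes=$(( $(wc -c < "$file") ))
    lines=$(( $(wc -l < "$file") ))
    printf '%s\t%s\t%s\n' "$id" "$bytes" "$lines"
}
```

Calling session_row on each .jsonl file in a logs directory would yield one row per session.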
3. extract-data.sh
Parse JSONL logs and extract specific data types.
Available extraction types:
- metadata: Session info (ID, timestamps, branch, working dir)
- user-prompts: All user messages
- tool-usage: Tool call statistics
- errors: Failed tool calls with timestamps
- thinking: Thinking blocks (if extended thinking enabled)
- text-responses: Assistant text responses only
- statistics: Session metrics (message counts, tool calls, errors)
- all: Combined extraction
# Extract from specific session
scripts/extract-data.sh --type statistics --session SESSION_ID
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID
# Extract from all sessions (omit --session)
scripts/extract-data.sh --type statistics
# Limit output
scripts/extract-data.sh --type user-prompts --limit 10
# Different project
scripts/extract-data.sh --type metadata --project /path/to/project
Use when: Need specific data without loading entire log, generating metrics, identifying errors.
4. filter-sessions.sh
Find sessions matching criteria.
Filter options:
- --since DATE: Sessions modified since date ("2 days ago", "2025-10-20")
- --until DATE: Sessions modified until date
- --branch NAME: Sessions on specific git branch
- --min-size SIZE: Minimum file size ("1M", "500K")
- --max-size SIZE: Maximum file size
- --min-lines N: Minimum line count
- --max-lines N: Maximum line count
- --has-errors: Only sessions with failed tool calls
- --keyword WORD: Sessions containing keyword
Output formats: list, paths, json
# Recent sessions
scripts/filter-sessions.sh --since "2 days ago"
# Large sessions with errors
scripts/filter-sessions.sh --min-lines 500 --has-errors
# Sessions on main branch in last week
scripts/filter-sessions.sh --branch main --since "7 days ago"
# Sessions containing keyword
scripts/filter-sessions.sh --keyword "authentication"
# Get paths only (for piping)
scripts/filter-sessions.sh --since "1 day ago" --format paths
Use when: User requests analysis of recent sessions, finding sessions for specific feature/branch, identifying problematic sessions.
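As a rough illustration of what a keyword filter involves, the sketch below lists the .jsonl files in a directory that contain a literal string. The function name sessions_with_keyword is hypothetical, and the literal, case-sensitive match is an assumption; filter-sessions.sh may match differently.

```shell
# Hypothetical helper: print the paths of session logs containing a keyword.
sessions_with_keyword() {
    local dir=$1
    local keyword=$2
    # -l prints only the names of matching files; -F treats the keyword literally.
    # Errors (for example, no .jsonl files in the directory) are suppressed.
    grep -lF -- "$keyword" "$dir"/*.jsonl 2>/dev/null
}
```

The function prints one path per line and returns non-zero when nothing matches, so it can be used directly in an if condition.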
Working Process
Single Session Analysis
# 1. Verify session exists and get metadata
scripts/extract-data.sh --type metadata --session SESSION_ID
# 2. Get session statistics (to determine size)
scripts/extract-data.sh --type statistics --session SESSION_ID
# 3. Extract specific data as needed
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID
Multiple Session Analysis
# 1. Filter to find relevant sessions
scripts/filter-sessions.sh --since "7 days ago" --branch main
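# 2. Extract data from all sessions in the project (omitting --session does not apply the filter above)
scripts/extract-data.sh --type statistics
# 3. Or iterate through a filtered subset (reads one path per line, so paths with spaces survive)
scripts/filter-sessions.sh --has-errors --format paths |
while IFS= read -r session; do
    SESSION_ID=$(basename "$session" .jsonl)
    # Redirect stdin so the script cannot consume the remaining paths
    scripts/extract-data.sh --type errors --session "$SESSION_ID" </dev/null
done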
Integration Pattern for Calling Skills
When another skill (like retrospecting) needs session data:
- Discovery: Use list-sessions.sh or filter-sessions.sh to find relevant sessions
- Size Check: Use extract-data.sh --type statistics to determine session complexity
- Targeted Extraction: Use extract-data.sh with specific types for needed data
- Return Raw Data: Return extracted data to caller for analysis
Example:
# Get latest session ID
LATEST=$(scripts/list-sessions.sh --format json --sort date | jq -r '.[0].sessionId')
# Check size before processing
STATS=$(scripts/extract-data.sh --type statistics --session "$LATEST")
LINE_COUNT=$(echo "$STATS" | awk '/Total Lines:/ {print $3}')
# Extract based on size
if [ "$LINE_COUNT" -lt 500 ]; then
    # Small session: extract detail
    scripts/extract-data.sh --type errors --session "$LATEST"
    scripts/extract-data.sh --type tool-usage --session "$LATEST"
else
    # Large session: summary only
    scripts/extract-data.sh --type statistics --session "$LATEST"
fi
Context Budget Management
CRITICAL: This skill is designed for context efficiency.
Use Bash Processing, Not Read Tool
# GOOD: Extract via bash, stays in bash context
STATS=$(scripts/extract-data.sh --type statistics)
# Process $STATS in bash
# BAD: Reading full log files
Read ~/.claude/projects/-path/session.jsonl
# Loads entire file into context unnecessarily
Check Session Size Before Loading
Never load full session logs into context without checking size first.
# Always check statistics first
scripts/extract-data.sh --type statistics --session SESSION_ID
# Shows total lines, message counts, etc.
# Decision rules:
# - Small (<500 lines): Can extract detail safely
# - Medium (500-2000 lines): Use selective extraction
# - Large (>2000 lines): Statistics only, offer targeted deep-dives
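The decision rules above can be written as a small helper. This is a sketch: the function name size_tier and the tier labels are hypothetical, and treating exactly 500 and exactly 2000 lines as medium follows the ranges given in the comments.

```shell
# Hypothetical helper: map a line count to a size tier.
size_tier() {
    local lines=$1
    if [ "$lines" -lt 500 ]; then
        echo small      # extract detail safely
    elif [ "$lines" -le 2000 ]; then
        echo medium     # use selective extraction
    else
        echo large      # statistics only, offer targeted deep-dives
    fi
}
```

A caller could branch on the result with a case statement to choose which extraction types to run.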
Return Raw Data to Caller
This skill should:
- Execute bash scripts to extract data
- Return raw text output to calling skill
- Let calling skill manage context for analysis
- Avoid interpretation or analysis within this skill
Output Format
Return raw extracted data with minimal formatting:
# Statistics output
Session: abc123-def456-ghi789
Total Lines: 450
User Messages: 12
Assistant Messages: 23
Tool Calls: 45
Errors: 2
# Tool usage output
=== Tool Usage: abc123-def456-ghi789 ===
Read 15
Bash 12
Edit 8
Grep 5
Write 3
No analysis, no interpretation - just data extraction.
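Because the statistics output uses a "Label: value" layout, a calling skill can pull out a single field without loading anything else. The sketch below assumes the layout shown above; the helper name stat_field is hypothetical.

```shell
# Hypothetical helper: read statistics text on stdin and print the value
# for the given label (e.g. "Total Lines"). Prints nothing if the label is absent.
stat_field() {
    awk -F': ' -v label="$1" '$1 == label { print $2; exit }'
}
```

For example, piping the statistics output through stat_field "Errors" would print only the error count.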
Error Handling
All scripts exit with a non-zero status on error and write the error message to stderr.
Check exit status before processing:
if ! scripts/locate-logs.sh /path/to/project &>/dev/null; then
# Handle: logs directory doesn't exist
echo "Project has no session logs yet"
fi
if ! scripts/extract-data.sh --type metadata --session abc123 &>/dev/null; then
# Handle: session doesn't exist
echo "Session not found"
fi
Common error messages:
- Error: Logs directory not found: ~/.claude/projects/-path
- Error: Session file not found: ~/.claude/projects/-path/session-id.jsonl
- Error: --type is required
- Error: jq is required but not installed. Install with: brew install jq
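When the error text matters as well as the exit status, one option is a small wrapper that captures stderr and reports it only on failure. This is a sketch under the assumption stated above (non-zero status, message on stderr); the name run_checked is hypothetical and not part of this skill's scripts.

```shell
# Hypothetical wrapper: run a command, pass its stdout through, and on failure
# report the exit status together with whatever the command wrote to stderr.
run_checked() {
    local err_file status
    err_file=$(mktemp) || return 1
    "$@" 2>"$err_file"
    status=$?
    if [ "$status" -ne 0 ]; then
        printf 'command failed (exit %s): %s\n' "$status" "$(cat "$err_file")" >&2
    fi
    rm -f "$err_file"
    # Preserve the original exit status for the caller.
    return "$status"
}
```

A caller would prefix a script invocation with run_checked and test its return value as usual.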
Path Calculation
Claude Code stores sessions using this pattern:
~/.claude/projects/{project-identifier}/{session-id}.jsonl
Where {project-identifier} is calculated by replacing all / with - in the absolute working directory path:
# Example: /Users/user/project → -Users-user-project
PROJECT_ID=$(echo "${PWD}" | sed 's|/|-|g')
LOGS_DIR="${HOME}/.claude/projects/${PROJECT_ID}"
All scripts use locate-logs.sh internally for consistent path calculation.
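For illustration only, the same calculation can be wrapped in functions that build the full path to a session file. The names project_id and session_log_path are hypothetical; this is a sketch of the pattern described above, not the contents of locate-logs.sh.

```shell
# Hypothetical helper: turn an absolute directory path into a project identifier
# by replacing every "/" with "-".
project_id() {
    printf '%s' "$1" | tr '/' '-'
}

# Hypothetical helper: build the expected log path for a project and session ID.
session_log_path() {
    printf '%s/.claude/projects/%s/%s.jsonl\n' "$HOME" "$(project_id "$1")" "$2"
}
```

The result can be tested with [ -f ... ] to confirm a session file exists before processing it.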
Anti-Patterns to Avoid
Don't:
- Load full session logs into context without checking size
- Parse JSONL manually - use extract-data.sh
- Hardcode log paths - use locate-logs.sh
- Analyze or interpret data - return raw data to caller
- Process large logs synchronously without user awareness
Do:
- Check session size with --type statistics before processing
- Use appropriate extraction type for specific needs
- Filter sessions before extraction for efficiency
- Stream/pipe data when processing multiple sessions
- Return raw data for caller to analyze
Success Criteria
Effective use of this skill means:
- Efficient Discovery: Quickly find relevant sessions without manual searching
- Targeted Extraction: Get exactly the data needed, nothing more
- Context Preservation: Avoid loading unnecessary data into context
- Raw Data Focus: Return unprocessed data for caller to analyze
- Multi-Session Support: Handle analysis across timeframes or branches efficiently
Dependencies
Required:
- bash (v4.0+)
- jq (JSON parser)
Scripts check for jq and provide installation instructions if missing:
Error: jq is required but not installed. Install with: brew install jq
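A dependency check of this kind is typically a few lines. The sketch below shows one common way to write it; the function name require_cmd is hypothetical, and the actual check in the scripts may differ.

```shell
# Hypothetical helper: fail with a message on stderr if a command is missing.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "Error: $1 is required but not installed." >&2
        return 1
    fi
}
```

A script could call require_cmd jq near the top and exit early when it returns non-zero.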