| name | sql-reader |
| description | Query production PostgreSQL database with read-only credentials. Use for investigating data, debugging issues, or analyzing application state. |
SQL Reader Skill
Query production PostgreSQL database using read-only credentials stored in Arsenal environment configuration.
🚨 MANDATORY: Read Data Quirks First
BEFORE querying the database, you MUST read the data quirks documentation:
cat docs/sql/DATA_QUIRKS.md
Why: The database has critical semantic quirks that will cause incorrect queries if not understood:
- NULL values have clinical meaning (not "missing data")
- Data inconsistencies require special query patterns
- Volume normalization is required for accuracy
DO NOT skip this step. Read the file every time before querying.
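As a generic illustration of the NULL point above (the table and column names here are placeholders, not taken from DATA_QUIRKS.md): `COUNT(*)` and `COUNT(column)` answer different questions once NULL carries meaning, so choose deliberately rather than filtering NULLs out by habit.

```sql
-- Hypothetical sketch only: some_table / some_value are placeholders, not real schema.
-- When NULL carries meaning, decide explicitly whether to count it, exclude it, or report it separately.
SELECT
    COUNT(*)                                   AS all_rows,
    COUNT(some_value)                          AS rows_with_a_value,
    COUNT(*) FILTER (WHERE some_value IS NULL) AS rows_where_null_is_meaningful
FROM some_table;
```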
When to Use
Use this skill when you need to:
- Investigate data issues in production
- Debug application state
- Analyze user data or conversation history
- Verify database schema or table contents
- Count records or check data integrity
- Understand database structure and relationships
Database Selection: Production vs Development
CRITICAL: Always prefer PRODUCTION database for usage/analytics questions.
The arsenal/.env file should be configured with production read-only credentials by default.
When to Use Production Database (DEFAULT)
- ✅ User analytics: "How many messages in the last day?"
- ✅ Usage patterns: "Any requests for codel/wren?"
- ✅ Data investigation: "What conversations exist?"
- ✅ Debugging user issues: "Why didn't user X get a message?"
- ✅ Schema exploration: Understanding production data model
- ✅ ANY question about real users or real usage
When to Use Development Database (RARE)
- ⚠️ Code debugging only: Testing migration scripts locally
- ⚠️ Local development: Verifying local Docker database setup
- ⚠️ Test data verification: Checking test fixtures
How to Switch Databases
Production (default):
# arsenal/.env should have:
PGHOST=codel-prod.cluster-crsis4w6ckj7.us-east-2.rds.amazonaws.com
PGDATABASE=codelprod
PGUSER=metabase_readonly_prod
Development (only when explicitly needed):
# Temporarily change arsenal/.env to:
PGHOST=localhost
PGDATABASE=codel
PGUSER=codel
⚠️ WARNING: Always switch back to production credentials after development queries.
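Before running anything heavier, it is worth confirming which database the session is actually pointed at. A minimal check (plain SQL; run it through `connect.sh` or `psql` as described below):

```sql
-- Confirms which database and role this session is actually using
-- (inet_server_addr() returns NULL for local Unix-socket connections)
SELECT current_database() AS database,
       current_user       AS role,
       inet_server_addr() AS server_ip;
```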
Data Model Quickstart (Run These First!)
IMPORTANT: When first exploring the database or debugging an issue, ALWAYS run these 6 quickstart commands to understand the data model:
⚠️ NOTE: Command 3 shows 5 core tables. The deprecated codel_conversations and codel_conversation_recipients tables (stopped receiving data Aug 27, 2025) are NOT included in the bootstrap commands. For recent interventions, use intervention + intervention_message tables instead.
1. See All Tables with Sizes
arsenal/dot-claude/skills/sql-reader/connect.sh "
SELECT table_name,
pg_size_pretty(pg_total_relation_size('public.' || table_name)) as size,
(SELECT COUNT(*) FROM information_schema.columns c
WHERE c.table_name = t.table_name) as columns
FROM information_schema.tables t
WHERE table_schema = 'public'
AND table_type = 'BASE TABLE'
ORDER BY pg_total_relation_size('public.' || table_name) DESC
LIMIT 20;
"
2. See Foreign Key Columns (Relationships)
arsenal/dot-claude/skills/sql-reader/connect.sh "
SELECT DISTINCT
table_name,
column_name,
data_type
FROM information_schema.columns
WHERE column_name LIKE '%_id'
AND table_schema = 'public'
ORDER BY table_name, column_name;
"
3. Describe Key Table Structures
# Show structure of critical tables to answer common questions
arsenal/dot-claude/skills/sql-reader/connect.sh "
-- persons table (user data)
SELECT 'persons' as table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'persons'
ORDER BY ordinal_position;
"
arsenal/dot-claude/skills/sql-reader/connect.sh "
-- conversation_participant table (who is in which conversation, with what role)
SELECT 'conversation_participant' as table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'conversation_participant'
ORDER BY ordinal_position;
"
arsenal/dot-claude/skills/sql-reader/connect.sh "
-- message table (all messages sent)
SELECT 'message' as table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'message'
ORDER BY ordinal_position;
"
arsenal/dot-claude/skills/sql-reader/connect.sh "
-- intervention table (intervention decisions/triggers - logs when AI decides intervention is needed)
SELECT 'intervention' as table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'intervention'
ORDER BY ordinal_position;
"
arsenal/dot-claude/skills/sql-reader/connect.sh "
-- intervention_message table (actual intervention messages sent to users)
SELECT 'intervention_message' as table_name, column_name, data_type, is_nullable
FROM information_schema.columns
WHERE table_name = 'intervention_message'
ORDER BY ordinal_position;
"
4. Read the Enums (Types, States, Roles)
cat api/src/data/models/enums.py
5. See All Data Models
grep "^class.*Base" api/src/data -r --include="*.py" | grep -v test
6. View Intervention Logic (CRITICAL for Understanding Data Model)
# First refresh the prompt cache to get latest intervention conditions
cd .claude/skills/langfuse-prompt-and-trace-debugger
uv run python refresh_prompt_cache.py group_message_intervention_conditions_yaml
# Then view the intervention logic
cat ../../docs/cached_prompts/group_message_intervention_conditions_yaml_production.txt
Why this matters:
- Shows when interventions trigger
- Explains timing rules (last minute, last 6 hours, etc.)
- Maps intervention types to Langfuse prompts used
- Reveals the SQL logic behind intervention decisions
- Critical for understanding the `intervention` and `intervention_message` tables (see the sketch below)
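To see that mapping in the data itself, a minimal sketch that joins the two tables using only the columns documented in the schema notes below:

```sql
-- Which prompt keys and intervention types have actually been sent recently
SELECT im.prompt_key, i.type, COUNT(*) AS sent
FROM intervention_message im
JOIN intervention i ON i.id = im.intervention_id
WHERE im.created_at >= NOW() - INTERVAL '7 days'
GROUP BY im.prompt_key, i.type
ORDER BY sent DESC;
```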
🚨 CRITICAL SCHEMA INSIGHT - Schema Migrations (Late August 2025):
Multiple schema migrations occurred in late August 2025. The old tables were NOT migrated - they contain historical data only.
INTERVENTION TRACKING MIGRATION (Aug 27-28, 2025):
DEPRECATED (stopped being written Aug 27, 2025):
- ❌ `codel_conversations` table - frozen, contains 6,274 interventions from Feb 24 - Aug 27, 2025
- ❌ `codel_conversation_recipients` table - related to old schema
CURRENT (active since Aug 28, 2025):
- ✅ `intervention` table - logs intervention decisions/triggers (has columns: id, source_message_id, type, status, created_at)
- ✅ `intervention_message` table - tracks actual messages sent to users (has columns: id, intervention_id, message_id, conversation_id, status, role, created_at, prompt_key)
What this means for queries:
- For interventions after Aug 27, 2025: Use `intervention` + `intervention_message` tables
- For interventions before Aug 28, 2025: Use `codel_conversations` table (historical data only)
- For queries spanning the migration: Need UNION of both schemas (sketched below)
- When investigating "recent interventions", ALWAYS query `intervention_message`, NOT `codel_conversations`
- The `intervention_message` table links to the `message` table via `message_id` to get actual message content
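A minimal sketch of the cross-migration UNION, assuming the legacy `codel_conversations` table also has a `created_at` timestamp (verify with `\d codel_conversations` before relying on it):

```sql
-- Interventions per day across the Aug 27-28, 2025 migration boundary
SELECT DATE(created_at) AS day, COUNT(*) AS interventions
FROM intervention
WHERE created_at >= '2025-08-28'
GROUP BY 1
UNION ALL
SELECT DATE(created_at) AS day, COUNT(*) AS interventions  -- created_at on the legacy table is an assumption
FROM codel_conversations
WHERE created_at < '2025-08-28'
GROUP BY 1
ORDER BY day DESC;
```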
PERSON FACTS MIGRATION (Aug 28, 2025):
DEPRECATED (stopped being written Aug 28, 2025):
- ❌ `person_facts` table (plural) - frozen, contains 3,007 facts from Jul 2 - Aug 28, 2025
CURRENT (active since Aug 28, 2025):
- ✅ `person_fact` table (singular) - currently active, contains 7,440+ facts from Jul 2 onwards
What this means for queries:
- For facts after Aug 28, 2025: Use `person_fact` (singular)
- For facts before Aug 28, 2025: Use `person_facts` (plural) for historical data
- Both tables have identical schemas, but `person_fact` is more space-efficient
- The singular form is the current active table (a quick count comparison is sketched below)
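A quick sketch for comparing the two tables side by side (they share an identical schema per the note above):

```sql
-- Row counts on either side of the Aug 28, 2025 facts migration
SELECT 'person_facts (legacy)' AS source, COUNT(*) AS facts FROM person_facts
UNION ALL
SELECT 'person_fact (current)' AS source, COUNT(*) AS facts FROM person_fact;
```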
After running these 6 commands, you should be able to answer:
- What tables exist and their sizes? (Command 1)
- Which tables relate to which? (Command 2 - look for person_id, message_id, etc.)
- What columns does each core table have? (Command 3 - persons, conversation_participant, message, intervention, intervention_message)
- What enums/types are used? (Command 4)
- What SQLAlchemy models exist? (Command 5)
- How do interventions work? (Command 6 - intervention conditions and timing)
Example questions you can now answer:
- "What is sam odio's user id?" β Query persons table, join conversation_participant to find relationships (Command 3)
- "Who got the last intervention?" β Query intervention_message + join to message table with created_at timestamp (Command 3)
- "What message triggered it?" β Join intervention.source_message_id to message.id (Commands 2 & 3)
- "How many interventions yesterday?" β Count from intervention table with created_at filter (Command 3)
- "What intervention types have been sent?" β Query intervention.type for distinct values (Commands 3 & 6)
- "Show me recent timeout interventions" β Query intervention WHERE type = 'timeout_intervention_escalation' and join intervention_message (Command 3)
Prerequisites
This skill requires:
- psql (PostgreSQL client) installed
  - Ubuntu/Debian: `sudo apt-get install postgresql-client`
  - macOS: `brew install postgresql`
- Database credentials configured in `arsenal/.env`
The credentials should be set in arsenal/.env:
PGHOST=your-database-host.rds.amazonaws.com
PGPORT=5432
PGDATABASE=your_database_name
PGUSER=readonly_user
PGPASSWORD=your_readonly_password
PGSSLMODE=require
Primary Workflow
Use the Helper Script (Recommended)
The easiest way to query the database:
# Interactive psql session
arsenal/dot-claude/skills/sql-reader/connect.sh
# Run a single query
arsenal/dot-claude/skills/sql-reader/connect.sh "SELECT COUNT(*) FROM persons;"
# Complex query with formatting
arsenal/dot-claude/skills/sql-reader/connect.sh "
SELECT
table_name,
pg_size_pretty(pg_total_relation_size('public.' || table_name)) as size
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY pg_total_relation_size('public.' || table_name) DESC
LIMIT 10;
"
Load Environment Variables Manually
If you prefer to work directly with psql:
# Load credentials from arsenal/.env
set -a
source arsenal/.env
set +a
# Now use psql directly
psql
Common Queries
List all tables
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;
Get table sizes
SELECT
table_name,
pg_size_pretty(pg_total_relation_size('public.' || table_name)) as size,
pg_size_pretty(pg_relation_size('public.' || table_name)) as table_size,
pg_size_pretty(pg_total_relation_size('public.' || table_name) - pg_relation_size('public.' || table_name)) as index_size
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY pg_total_relation_size('public.' || table_name) DESC
LIMIT 20;
Describe a table
-- Using psql command
\d table_name
-- Or using information_schema
SELECT
column_name,
data_type,
character_maximum_length,
is_nullable,
column_default
FROM information_schema.columns
WHERE table_schema = 'public'
AND table_name = 'your_table'
ORDER BY ordinal_position;
Count records
SELECT COUNT(*) FROM your_table;
-- Count by group
SELECT status, COUNT(*) as count
FROM your_table
GROUP BY status
ORDER BY count DESC;
View recent records
SELECT *
FROM your_table
ORDER BY created_at DESC
LIMIT 10;
Find table relationships
-- Find columns that look like foreign keys
SELECT DISTINCT
table_name,
column_name
FROM information_schema.columns
WHERE column_name LIKE '%_id'
AND table_schema = 'public'
ORDER BY table_name, column_name;
Relationship Coaching Schema Patterns
Common patterns when working with relationship coaching data:
Find a Couple by Names
Problem: Multiple people can have the same name (e.g., multiple "Daniel" or "Camily" users).
Solution: Join through conversation_participant to find couples, rank by message count:
-- Find a couple by names (handles duplicate names)
SELECT
c.id AS conversation_id,
p1.id AS person1_id, p1.name AS person1_name,
p2.id AS person2_id, p2.name AS person2_name,
COUNT(DISTINCT m.id) AS message_count
FROM conversation c
JOIN conversation_participant cp1 ON cp1.conversation_id = c.id
JOIN persons p1 ON cp1.person_id = p1.id
JOIN conversation_participant cp2 ON cp2.conversation_id = c.id
JOIN persons p2 ON cp2.person_id = p2.id
LEFT JOIN message m ON m.conversation_id = c.id
WHERE c.type = 'GROUP'
AND cp1.role = 'MEMBER' AND cp2.role = 'MEMBER'
AND cp1.person_id < cp2.person_id
AND (LOWER(p1.name) LIKE '%daniel%' AND LOWER(p2.name) LIKE '%camily%')
GROUP BY c.id, p1.id, p1.name, p2.id, p2.name
ORDER BY message_count DESC;
Join Messages to People
Schema gotcha: Messages link to person_contacts, not directly to persons.
-- Get messages with sender names
SELECT
m.id,
m.content, -- NOTE: Use 'content' not 'body'
m.provider_timestamp,
p.name AS sender_name,
p.id AS sender_person_id
FROM message m
JOIN person_contacts pc ON m.sender_person_contact_id = pc.id
JOIN persons p ON pc.person_id = p.id
WHERE m.conversation_id = {{conversation_id}}
ORDER BY m.provider_timestamp DESC
LIMIT 10;
Get User Reactions
CRITICAL Schema Gotcha: Reactions are stored as SEPARATE message records, NOT as arrays on the original message!
Each reaction creates a new message with:
- `provider_data->>'reaction_id'` set to a unique ID
- `content` starting with the reaction type: "Liked", "Loved", "Disliked", "Laughed at", "Emphasized", "Questioned"
-- Find reactions with message type (intervention, cron, direct response)
-- This query identifies what type of coach message was reacted to
WITH reaction_messages AS (
SELECT
m.id as reaction_id,
m.conversation_id,
CASE
WHEN m.content LIKE 'Liked %' THEN 'Liked'
WHEN m.content LIKE 'Loved %' THEN 'Loved'
WHEN m.content LIKE 'Disliked %' THEN 'Disliked'
WHEN m.content LIKE 'Laughed at %' THEN 'Laughed at'
WHEN m.content LIKE 'Emphasized %' THEN 'Emphasized'
WHEN m.content LIKE 'Questioned %' THEN 'Questioned'
END as reaction_type,
TRIM(BOTH '"' FROM SUBSTRING(m.content FROM POSITION('"' IN m.content))) as quoted_content,
p.name as sender_name,
m.created_at
FROM message m
JOIN conversation c ON m.conversation_id = c.id
JOIN person_contacts pc ON m.sender_person_contact_id = pc.id
JOIN persons p ON pc.person_id = p.id
WHERE m.provider_data->>'reaction_id' IS NOT NULL
AND c.type = 'ONE_ON_ONE'
AND m.created_at >= NOW() - INTERVAL '96 hours' -- Adjust timeframe as needed
),
original_messages AS (
SELECT DISTINCT ON (rm.reaction_id)
rm.reaction_id,
orig.id as orig_message_id,
orig.content as orig_content
FROM reaction_messages rm
JOIN message orig ON orig.conversation_id = rm.conversation_id
AND rm.quoted_content LIKE '%' || LEFT(orig.content, 50) || '%'
AND orig.id <> rm.reaction_id
ORDER BY rm.reaction_id, orig.created_at DESC
)
SELECT
rm.reaction_id,
rm.reaction_type,
rm.sender_name,
CASE
WHEN im.prompt_key IS NOT NULL THEN im.prompt_key
WHEN om.orig_content LIKE 'Here''s a thought%' THEN 'scheduled_cron'
WHEN om.orig_content LIKE 'Here''s a curious%' THEN 'scheduled_cron'
WHEN om.orig_content LIKE 'Hey %' AND om.orig_content LIKE '%react%' THEN 'feedback_request'
ELSE 'direct_response'
END as message_type,
LEFT(om.orig_content, 60) as content_preview,
rm.created_at
FROM reaction_messages rm
LEFT JOIN original_messages om ON om.reaction_id = rm.reaction_id
LEFT JOIN intervention_message im ON im.message_id = om.orig_message_id
ORDER BY rm.created_at DESC;
Message type values:
- `group_msg_*` - Intervention triggered by partner's message (e.g., `group_msg_needs_affirmation`, `group_msg_intervention_needed_recipient`)
- `scheduled_cron` - Daily scheduled check-in ("Here's a thought/curious question...")
- `feedback_request` - Message asking user to react for feedback
- `direct_response` - Direct coach response to user's message
- `oh-by-the-way` - Scheduled reminder about upcoming events
-- Simple reaction count by type
SELECT
CASE
    WHEN m.content LIKE 'Liked %' THEN '👍 Liked'
    WHEN m.content LIKE 'Loved %' THEN '❤️ Loved'
    WHEN m.content LIKE 'Disliked %' THEN '👎 Disliked'
    WHEN m.content LIKE 'Laughed at %' THEN '😂 Laughed at'
END as reaction_type,
COUNT(*) as count
FROM message m
JOIN conversation c ON m.conversation_id = c.id
WHERE m.provider_data->>'reaction_id' IS NOT NULL
AND c.type = 'ONE_ON_ONE'
GROUP BY reaction_type
ORDER BY count DESC;
Get Message Enrichment Data
The message_enrichment table contains AI classifications:
-- View messages with affect and conflict classifications
SELECT
m.id,
m.provider_timestamp,
p.name AS sender,
m.content,
me.affect, -- Partner-Affection, Partner-Contempt, etc.
me.conflict_state, -- 'New Conflict', 'Escalation', 'No conflict', etc.
me.subject, -- 'Partner', 'Self', 'Other'
me.topic -- 'Housework criticism', 'Planning', etc.
FROM message m
JOIN person_contacts pc ON m.sender_person_contact_id = pc.id
JOIN persons p ON pc.person_id = p.id
LEFT JOIN message_enrichment me ON me.message_id = m.id
WHERE m.conversation_id = {{conversation_id}}
AND m.provider_timestamp >= CURRENT_DATE - INTERVAL '56 days'
ORDER BY m.provider_timestamp;
Schema Gotchas to Remember
- Message content column: Use `message.content`, not `message.body` (doesn't exist)
- Person lookup: Multiple people can have identical names - always join through `conversation_participant` to identify couples
- Message→Person join: Goes through the `person_contacts` table: `message.sender_person_contact_id → person_contacts.id → person_contacts.person_id → persons.id`
- Couples: `conversation.type = 'GROUP'` and `conversation_participant.role = 'MEMBER'` (should have exactly 2 members; see the sanity check below)
- Message enrichment: Not all messages have enrichment data - use LEFT JOIN
- Reactions: Stored as SEPARATE message records (not arrays). Look for messages where `provider_data->>'reaction_id' IS NOT NULL`. Content starts with "Liked", "Loved", "Disliked", "Laughed at", etc.
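A quick sketch for the "exactly 2 members" expectation, flagging any GROUP conversation that deviates from it:

```sql
-- GROUP conversations whose MEMBER count is not exactly 2
SELECT cp.conversation_id, COUNT(*) AS member_count
FROM conversation_participant cp
JOIN conversation c ON c.id = cp.conversation_id
WHERE c.type = 'GROUP'
  AND cp.role = 'MEMBER'
GROUP BY cp.conversation_id
HAVING COUNT(*) <> 2;
```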
Get database overview
-- Table count and total size
SELECT
COUNT(*) as table_count,
pg_size_pretty(SUM(pg_total_relation_size('public.' || table_name))) as total_size
FROM information_schema.tables
WHERE table_schema = 'public'
AND table_type = 'BASE TABLE';
Safety Notes
⚠️ READ-ONLY ACCESS: These credentials have SELECT-only permissions. You cannot:
- INSERT, UPDATE, or DELETE data
- CREATE or DROP tables
- Modify schemas or permissions
✅ Safe to use: This skill is safe for production investigation because:
- Uses dedicated read-only credentials
- Cannot modify data
- Queries are logged in database audit logs
- Connection requires SSL/TLS
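If you want to confirm the "cannot modify data" point for yourself, a quick sketch using standard PostgreSQL privilege functions (the `message` table is just a convenient target; any table works):

```sql
-- Should return false for the read-only role
SELECT has_table_privilege(current_user, 'message', 'INSERT') AS can_insert;
```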
psql Quick Reference
When in interactive mode:
\dt -- List all tables
\d table_name -- Describe a table
\di -- List indexes
\df -- List functions
\dv -- List views
\du -- List users/roles
\l -- List databases
\c database -- Connect to different database
\q -- Quit
\? -- Help
\h SQL_COMMAND -- Help on SQL command
Troubleshooting
"command not found: psql"
Install PostgreSQL client:
# macOS
brew install postgresql
# Ubuntu/Debian
sudo apt-get install postgresql-client
# Amazon Linux
sudo yum install postgresql
Connection timeout
- Verify you're on a network that can reach the database (VPN may be required)
- Check security groups allow your IP address
- Verify database instance is running
"FATAL: password authentication failed"
- Verify credentials in `arsenal/.env` are correct
- Password may have been rotated in secrets manager
- Check that PGHOST, PGPORT, PGDATABASE, PGUSER, PGPASSWORD are all set
"arsenal/.env not found"
- Ensure you're running the script from the project root
- Verify the file exists: `ls -la arsenal/.env`
- If missing, copy credentials from secrets manager
Integration with Claude Code
When using this skill in Claude Code:
- Announce usage: "I'm using the sql-reader skill to investigate..."
- Run Data Model Quickstart FIRST: Always start by running the 6 quickstart commands to understand the data structure
- Run targeted queries: Use the helper script for specific investigations
- Report findings: Share relevant results with the user
Mandatory Workflow (ALWAYS start with this):
# Step 1: See all tables with sizes
# Step 2: See foreign key columns (relationships)
# Step 3: Describe key table structures
# Step 4: Read enums: cat api/src/data/models/enums.py
# Step 5: See all models: grep "^class.*Base" api/src/data -r --include="*.py" | grep -v test
# Step 6: View intervention logic (group_message_intervention_conditions_yaml)
# Then investigate specific issues
arsenal/dot-claude/skills/sql-reader/connect.sh "SELECT COUNT(*) FROM message WHERE created_at > NOW() - INTERVAL '24 hours';"
Best practices:
- Always announce you're using the sql-reader skill
- Be specific about what you're investigating
- Use read-only queries (SELECT only)
- Format results clearly for the user
- Include relevant context (table names, counts, time ranges)
Examples
Investigate recent activity
arsenal/dot-claude/skills/sql-reader/connect.sh "
SELECT
DATE(created_at) as date,
COUNT(*) as message_count
FROM message
WHERE created_at > NOW() - INTERVAL '7 days'
GROUP BY DATE(created_at)
ORDER BY date DESC;
"
Find table dependencies
arsenal/dot-claude/skills/sql-reader/connect.sh "
SELECT
tc.table_name as from_table,
kcu.column_name as from_column,
ccu.table_name AS to_table,
ccu.column_name AS to_column
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage AS ccu
ON ccu.constraint_name = tc.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
AND tc.table_schema = 'public'
ORDER BY tc.table_name;
"
Analyze data distribution
arsenal/dot-claude/skills/sql-reader/connect.sh "
SELECT
table_name,
(SELECT COUNT(*) FROM information_schema.columns c
WHERE c.table_name = t.table_name) as column_count,
pg_size_pretty(pg_total_relation_size('public.' || table_name)) as size
FROM information_schema.tables t
WHERE table_schema = 'public'
ORDER BY pg_total_relation_size('public.' || table_name) DESC
LIMIT 15;
"
🛑 Before Responding With Query Results
You queried production data. Now before you respond to the user:
- ✅ Did you show actual query output (not "approximately" or "based on schema")?
- ✅ Did you read DATA_QUIRKS.md first?
- ✅ Are your numbers exact from the query, not estimates?
- ✅ Did you include citations for ALL entity IDs? (persons, conversations, messages)
CRITICAL: Get manager approval before responding.
Your response with data MUST include: ✅ approve_7f3d8a2e********
Get the full approval token from .claude/skills/manager-review/SKILL.md
Common Mistake #4 in the manager's table is "Guessed at production data" - using words like "approximately" or "based on schema" instead of actual query results. The manager WILL catch this.
Common Mistake #10 is "Missing citations for entity IDs" - if your query returns person_id, conversation_id, or message_id, you MUST include clickable links:
Person Samuel ([view](https://admin.prod.cncorp.io/persons/1)) has 3 conversations.
Conversation 456 ([view](https://admin.prod.cncorp.io/conversations/456)) started yesterday.
Citation URL patterns:
- Person: `https://admin.prod.cncorp.io/persons/{id}` (see the sketch below)
- Conversation: `https://admin.prod.cncorp.io/conversations/{id}`
- Messages: `https://admin.prod.cncorp.io/conversations/{id}/messages`
- Specific message: `...messages?messageId={id}`
- Time range: `...messages?start={iso}&end={iso}`
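When a query needs to produce citation-ready output, the URL pattern can be built directly in SQL; a small sketch using the `persons` table and the Person URL pattern above:

```sql
-- Emit admin links alongside IDs so citations can be pasted directly into the response
SELECT p.id,
       p.name,
       'https://admin.prod.cncorp.io/persons/' || p.id::text AS admin_url
FROM persons p
LIMIT 5;
```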