---
name: pg-data
description: Generate safe, read-only PostgreSQL queries from natural language. Use when users need to query blog_small, ecommerce_medium, or saas_crm_large databases. Supports query generation, execution, and result analysis with confidence scoring.
---
# PostgreSQL Data Query Generator
Generate safe, production-ready PostgreSQL SELECT queries from natural language descriptions. This skill analyzes user requests, matches them to the appropriate database schema, generates secure SQL, tests execution, and validates results.
## When to Use This Skill
Use this skill when the user wants to:
- Query data from blog_small, ecommerce_medium, or saas_crm_large databases
- Analyze data with aggregations, joins, or complex filters
- Generate reports from the available databases
- Explore database content through natural language queries
## Available Databases
| Database | Description | Use Cases |
|---|---|---|
| blog_small | Blog platform with users, posts, comments, categories, tags | Content analytics, author stats, engagement metrics |
| ecommerce_medium | E-commerce with products, orders, inventory, reviews | Sales analysis, inventory management, customer insights |
| saas_crm_large | Multi-tenant CRM with leads, deals, accounts, tickets | Pipeline analysis, sales performance, support metrics |
## Security Requirements
CRITICAL: All generated SQL must comply with these security rules:
### Allowed Operations
- SELECT statements only
- Read-only aggregate functions (COUNT, SUM, AVG, MIN, MAX, etc.)
- JOINs, subqueries, CTEs, window functions
- WHERE, GROUP BY, HAVING, ORDER BY, LIMIT clauses
### Prohibited Operations
- NO INSERT, UPDATE, DELETE, TRUNCATE, DROP, ALTER, CREATE
- NO EXECUTE, CALL, DO blocks
- NO pg_sleep, pg_terminate_backend, or any system functions
- NO COPY, \copy, or file operations
- NO SET, RESET, or configuration changes
- NO Comments containing sensitive data
- NO Dynamic SQL or string concatenation for queries
- NO Access to pg_catalog system tables (except for metadata queries)
- NO Sensitive data exposure (passwords, API keys, tokens, hashes)
### SQL Injection Prevention
- Use parameterized queries when possible (see the sketch after this list)
- Validate and sanitize any user-provided values
- Never concatenate user input directly into SQL
- Escape special characters properly
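For illustration, here is a minimal sketch of passing a user-supplied value through a psql variable instead of concatenating it into the SQL string. The table and columns (posts.status, posts.title, posts.published_at) follow the blog_small example queries later in this skill and may differ from the real schema.

```bash
# Sketch: parameterize a value with a psql variable; :'post_status' is
# interpolated as a properly quoted literal, so the raw value never gets
# spliced into the SQL text.
PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d blog_small \
  -v post_status="published" <<'SQL'
SELECT id, title, published_at
FROM posts
WHERE status = :'post_status'
LIMIT 100;
SQL
```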
## Workflow
### Step 1: Analyze User Request
Parse the user's natural language query to determine:
**Target database**: Which database(s) does this query relate to?
- Blog content, posts, comments, authors → `blog_small`
- Products, orders, sales, inventory, reviews → `ecommerce_medium`
- Leads, deals, accounts, contacts, tickets, subscriptions → `saas_crm_large`
**Query intent**: What data does the user want?
- Listing/searching records
- Aggregation/statistics
- Relationships/joins
- Time-based analysis
**Output format**: Does the user want SQL only or results?
- If user says "give me the SQL" or "show me the query" → Return SQL only
- Otherwise → Execute and return results (default)
### Step 2: Read Database Reference
Read the appropriate reference file(s) to understand the schema:
```bash
# Read the reference file for the target database
cat .claude/skills/pg-data/references/<database_name>.md
```
Reference files contain:
- Table structures with columns and types
- Foreign key relationships
- Available views
- Common query patterns
### Step 3: Generate SQL
Generate a SQL query following these guidelines:
- Start with the primary table that best matches the user's intent
- Add JOINs only when necessary for the requested data
- Use appropriate indexes - check the reference for available indexes
- Include sensible defaults:
  - Add `LIMIT 100` for unbounded queries
  - Use meaningful column aliases
  - Order results logically
#### SQL Template
```sql
-- Purpose: <brief description of what this query does>
SELECT
    <columns with meaningful aliases>
FROM <primary_table> <alias>
[JOIN <related_table> ON <condition>]
[WHERE <filters>]
[GROUP BY <grouping>]
[HAVING <aggregate_filters>]
[ORDER BY <ordering>]
[LIMIT <reasonable_limit>];
```
### Step 4: Validate SQL Safety
Before execution, verify the query:
1. **Check for prohibited keywords**: INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, TRUNCATE, EXECUTE, CALL, DO, COPY, SET, RESET, pg_sleep, pg_terminate (see the check sketch below)
2. **Verify read-only nature**: the query must only retrieve data
3. **Check for sensitive data exposure**: exclude password_hash, api_key, and token fields
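A coarse pre-flight check can be expressed as a shell one-liner before the query ever reaches psql. This is an illustrative sketch, not the full validation: the word boundaries avoid false positives on column names such as created_at, but keywords inside string literals could still trip it.

```bash
# Sketch: reject a candidate query if it contains any prohibited keyword.
SQL="SELECT COUNT(*) FROM posts;"   # hypothetical candidate query
if echo "$SQL" | grep -qiE '\b(INSERT|UPDATE|DELETE|DROP|ALTER|CREATE|TRUNCATE|EXECUTE|CALL|DO|COPY|SET|RESET|pg_sleep|pg_terminate\w*)\b'; then
  echo "Rejected: prohibited keyword detected" >&2
else
  echo "Keyword check passed"
fi
```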
### Step 5: Execute and Test
Execute the SQL using psql:
```bash
PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d <database_name> -c "<SQL_QUERY>"
```
If execution fails:
1. Analyze the error message
2. Check for:
   - Invalid column/table names
   - Type mismatches
   - Syntax errors
   - Missing JOINs
3. Regenerate the SQL with corrections
4. Return to Step 5 (max 3 retries; see the retry sketch below)
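The retry flow can be pictured as a small loop around the psql call. This is a sketch only: it assumes psql exits non-zero when the statement fails, the database name and query are placeholders, and the actual regeneration happens by re-reading the reference file and rewriting the SQL.

```bash
# Sketch: attempt the generated query up to 3 times, keeping the error text
# so the next attempt can be corrected against the schema reference.
DB="blog_small"                       # placeholder target database
SQL="SELECT COUNT(*) FROM posts;"     # placeholder generated query
for attempt in 1 2 3; do
  if RESULT=$(PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d "$DB" -c "$SQL" 2>&1); then
    echo "$RESULT"
    break
  fi
  echo "Attempt $attempt failed: $RESULT" >&2
  # ...regenerate $SQL from the error and the reference file before retrying...
done
```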
### Step 6: Analyze Results and Score Confidence
Evaluate the query results and assign a confidence score (0-10):
| Score | Meaning | Action |
|---|---|---|
| 10 | Perfect match, expected results | Return results |
| 8-9 | Good match, minor uncertainties | Return results with notes |
| 7 | Acceptable but could be improved | Return results with caveats |
| 5-6 | Uncertain if this answers the question | Regenerate or ask for clarification |
| 1-4 | Likely wrong interpretation | Regenerate SQL |
| 0 | Completely wrong or empty results | Regenerate or explain issue |
Scoring Criteria:
- Does the result count seem reasonable?
- Do the columns match what was asked?
- Are the values in expected ranges?
- Does the data make logical sense?
If score < 7:
- Analyze what might be wrong
- Consider alternative interpretations
- Regenerate SQL with different approach
- Return to Step 5
### Step 7: Return Results
Based on user's preference:
If returning results (default):
## Query Results
**Database**: <database_name>
**Confidence**: <score>/10
### SQL Executed
\`\`\`sql
<the_sql_query>
\`\`\`
### Results
<formatted_results_table>
### Analysis
<brief explanation of what the results show>
If returning SQL only:
## Generated SQL
**Database**: <database_name>
**Purpose**: <what_this_query_does>
\`\`\`sql
<the_sql_query>
\`\`\`
### Usage
\`\`\`bash
PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d <database_name> -c "<sql>"
\`\`\`
## Example Queries
### Blog Database Examples
User: "Show me the top 10 authors by post count"
```sql
SELECT
    u.username,
    u.full_name,
    COUNT(p.id) AS post_count,
    SUM(p.view_count) AS total_views
FROM users u
JOIN posts p ON u.id = p.author_id
WHERE p.status = 'published'
GROUP BY u.id, u.username, u.full_name
ORDER BY post_count DESC
LIMIT 10;
```
User: "Find posts with the most comments in the last 30 days"
```sql
SELECT
    p.title,
    p.slug,
    u.username AS author,
    COUNT(c.id) AS comment_count,
    p.view_count
FROM posts p
JOIN users u ON p.author_id = u.id
LEFT JOIN comments c ON p.id = c.post_id
WHERE p.published_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY p.id, u.username
ORDER BY comment_count DESC
LIMIT 20;
```
### E-commerce Database Examples
User: "What are the best selling products this month?"
```sql
SELECT
    p.name AS product_name,
    p.sku,
    c.name AS category,
    SUM(oi.quantity) AS units_sold,
    SUM(oi.total_price) AS revenue
FROM products p
JOIN order_items oi ON p.id = oi.product_id
JOIN orders o ON oi.order_id = o.id
LEFT JOIN categories c ON p.category_id = c.id
WHERE o.created_at >= DATE_TRUNC('month', CURRENT_DATE)
  AND o.status NOT IN ('cancelled', 'refunded')
GROUP BY p.id, p.name, p.sku, c.name
ORDER BY revenue DESC
LIMIT 20;
```
User: "Show me customers who haven't ordered in 90 days"
```sql
SELECT
    u.email,
    u.first_name,
    u.last_name,
    MAX(o.created_at) AS last_order_date,
    COUNT(o.id) AS total_orders,
    SUM(o.total_amount) AS lifetime_value
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.email, u.first_name, u.last_name
HAVING MAX(o.created_at) < CURRENT_DATE - INTERVAL '90 days'
    OR MAX(o.created_at) IS NULL
ORDER BY lifetime_value DESC NULLS LAST
LIMIT 50;
```
### CRM Database Examples
User: "Show the sales pipeline for organization 1"
```sql
SELECT
    ps.name AS stage_name,
    ps.stage_type,
    COUNT(d.id) AS deal_count,
    SUM(d.amount) AS total_value,
    AVG(d.probability) AS avg_probability,
    SUM(d.amount * d.probability / 100) AS weighted_value
FROM pipeline_stages ps
LEFT JOIN deals d ON ps.id = d.stage_id
JOIN pipelines p ON ps.pipeline_id = p.id
WHERE p.organization_id = 1
GROUP BY ps.id, ps.name, ps.stage_type, ps.display_order
ORDER BY ps.display_order;
```
User: "Find all overdue tickets with high priority"
```sql
SELECT
    t.ticket_number,
    t.subject,
    t.priority,
    t.status,
    a.name AS account_name,
    u.first_name || ' ' || u.last_name AS assigned_to,
    t.created_at,
    CURRENT_DATE - DATE(t.created_at) AS days_open
FROM tickets t
LEFT JOIN accounts a ON t.account_id = a.id
LEFT JOIN users u ON t.assigned_to = u.id
WHERE t.status NOT IN ('resolved', 'closed')
  AND t.priority IN ('high', 'urgent')
ORDER BY
    CASE t.priority WHEN 'urgent' THEN 1 WHEN 'high' THEN 2 END,
    t.created_at
LIMIT 50;
```
## Handling Ambiguous Requests
When the user's request is unclear:
1. **Ask clarifying questions** if critical information is missing:
   - Which database to query?
   - What time period?
   - What filters to apply?
2. **Make reasonable assumptions** and document them:
   - Default to recent data (last 30 days)
   - Use sensible limits (LIMIT 100)
   - Include common filters (active records, published posts)
3. **Suggest alternatives** if multiple interpretations exist:
   - "I interpreted this as X. If you meant Y, let me know."
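As an illustration of the default assumptions above, a vague request such as "show me recent posts" against blog_small might be answered with the query below. The filters shown (published status, last 30 days, LIMIT 100) are the documented defaults; the column names follow the blog example queries rather than a verified schema.

```bash
PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d blog_small -c "
SELECT title, published_at, view_count
FROM posts
WHERE status = 'published'
  AND published_at >= CURRENT_DATE - INTERVAL '30 days'
ORDER BY published_at DESC
LIMIT 100;"
```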
## Reference Files
- blog_small.md - Blog platform schema
- ecommerce_medium.md - E-commerce schema
- saas_crm_large.md - SaaS CRM schema
## Error Recovery
### Common Errors and Fixes
| Error | Cause | Fix |
|---|---|---|
| `column does not exist` | Wrong column name | Check reference file for correct name |
| `relation does not exist` | Wrong table name | Verify table exists in schema |
| `operator does not exist` | Type mismatch | Cast to correct type (see example below) |
| `syntax error` | Invalid SQL | Review query structure |
| `permission denied` | Attempting a write operation | Ensure SELECT only |
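For the type-mismatch case, one possible fix is an explicit cast so both sides of an expression share a type. The sketch below follows the CRM ticket example; casting created_at to date makes the subtraction return a whole number of days.

```bash
PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d saas_crm_large -c "
SELECT ticket_number, CURRENT_DATE - created_at::date AS days_open
FROM tickets
LIMIT 10;"
```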
### Retry Logic
- Parse the error message
- Identify the issue
- Consult the reference file
- Regenerate corrected SQL
- Allow a maximum of 3 retry attempts
- If still failing, explain the issue to the user
## Notes
- Always respect multi-tenant boundaries in saas_crm_large (filter by organization_id)
- Use views when available for common query patterns
- Consider query performance for large result sets
- Results are truncated to the first 100 rows by default