---
name: optimizing-queries
description: Analyzes and optimizes SQL/NoSQL queries for performance. Use when reviewing query performance, optimizing slow queries, analyzing EXPLAIN output, suggesting indexes, identifying N+1 problems, recommending query rewrites, or improving database access patterns. Supports PostgreSQL, MySQL, SQLite, MongoDB, Redis, DynamoDB, and Elasticsearch.
---
# Query Optimization
Analyze database queries and provide actionable optimization recommendations.
## Workflow

Copy and track progress:

```
Query Optimization:
- [ ] Identify database type
- [ ] Analyze query structure
- [ ] Check for red flags
- [ ] Provide recommendations
- [ ] Suggest verification
```
- Identify database type from syntax or user context
- Analyze query structure - parse SELECT, FROM, WHERE, JOIN, ORDER BY, GROUP BY clauses
- Check for red flags (see Quick Reference)
- Provide concrete recommendations with copy-pasteable code
- Suggest verification - EXPLAIN command or metric to confirm
For complex queries (3+ JOINs or nested subqueries):
- Decompose into logical units
- Analyze each unit's access pattern
- Synthesize recommendations in priority order
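For instance, a hypothetical three-join query can be annotated unit by unit before synthesizing recommendations (table and column names are illustrative):

```sql
SELECT c.name, o.total, s.carrier
FROM customers c                         -- Unit 1: driving table; how is it filtered?
JOIN orders o ON o.customer_id = c.id    -- Unit 2: is orders(customer_id) indexed?
JOIN shipments s ON s.order_id = o.id    -- Unit 3: is shipments(order_id) indexed?
WHERE c.region = 'EU'                    -- Filter: selective enough to index customers(region)?
ORDER BY o.created_at DESC               -- Sort: can an index satisfy the ordering?
LIMIT 50;
```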
## Analysis Checklist
For each query, evaluate:
- Index usage: Missing or inefficient indexes
- Scan type: Full scans on large tables
- Join efficiency: Join order, type, missing conditions
- Filtering: WHERE placement, function usage on columns
- Projection: SELECT * vs specific columns
- Pagination: OFFSET vs keyset/cursor (see the sketch after this list)
- N+1 detection: Loop patterns in code context
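For the pagination item, a before/after sketch of the keyset rewrite, assuming a hypothetical `orders` table with `created_at` and `id` columns (the row-value comparison as written works in PostgreSQL and MySQL):

```sql
-- OFFSET: the database still reads and discards the first 10,000 rows
SELECT * FROM orders
ORDER BY created_at DESC, id DESC
LIMIT 20 OFFSET 10000;

-- Keyset/cursor: seeks straight to the next page;
-- :last_created_at and :last_id come from the last row of the previous page
SELECT * FROM orders
WHERE (created_at, id) < (:last_created_at, :last_id)
ORDER BY created_at DESC, id DESC
LIMIT 20;
```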
## Quick Reference

### SQL Red Flags

- `SELECT *` on large tables
- `WHERE function(column)` prevents index use (rewrite sketched below)
- `LIKE '%pattern'` (leading wildcard)
- `OR` conditions spanning different columns
- Large `OFFSET` values
- `NOT IN` with nullable subquery (rewrite sketched below)
- Missing `LIMIT` on unbounded queries
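Two of these flags have mechanical rewrites. A minimal sketch, assuming a hypothetical `orders` table and a hypothetical `banned_customers` lookup:

```sql
-- WHERE function(column): wrapping the column defeats an index on created_at
SELECT * FROM orders WHERE DATE(created_at) = '2024-01-01';        -- flagged
SELECT * FROM orders
WHERE created_at >= '2024-01-01' AND created_at < '2024-01-02';    -- sargable rewrite

-- NOT IN with nullable subquery: a single NULL id makes NOT IN match zero rows
SELECT * FROM orders
WHERE customer_id NOT IN (SELECT id FROM banned_customers);        -- flagged
SELECT * FROM orders o
WHERE NOT EXISTS (
  SELECT 1 FROM banned_customers b WHERE b.id = o.customer_id
);                                                                 -- safe rewrite
```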
### NoSQL Red Flags

- MongoDB: COLLSCAN, `$where`, unbounded arrays
- Redis: `KEYS *`, missing TTL, wrong data structure
- DynamoDB: Scan operations, hot partitions
- Elasticsearch: Deep pagination, wildcard prefix
## Examples

### Example 1: Missing Index

Input:

```sql
SELECT * FROM orders WHERE customer_email = 'user@example.com';
```

Output:

```
## Issue [HIGH]
Full table scan on `orders` - no index on `customer_email`

## Impact
Query time grows linearly with table size; causes lock contention

## Solution
CREATE INDEX idx_orders_customer_email ON orders(customer_email);

## Verification
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_email = 'user@example.com';
-- Should show "Index Scan" instead of "Seq Scan"
```
### Example 2: N+1 Query Pattern

Input:

```python
users = db.query("SELECT * FROM users")
for user in users:
    orders = db.query(f"SELECT * FROM orders WHERE user_id = {user.id}")
```

Output:

```
## Issue [HIGH]
N+1 query pattern - executes 1 + N queries instead of 1

## Impact
100 users = 101 queries; network latency multiplied

## Solution
SELECT u.*, o.* FROM users u
LEFT JOIN orders o ON u.id = o.user_id;

## Verification
Check query count before/after; should reduce from N+1 to 1
```
### Example 3: When NOT to Optimize

Input:

```sql
SELECT * FROM config WHERE key = 'app_version';
```

Output:

```
## No Issues Detected
This query on a small lookup table (typically <100 rows) is acceptable.
Adding an index would add write overhead without meaningful read benefit.

## When to Reconsider
- If table grows beyond 10,000 rows
- If query is called >100 times/second
```
### Example 4: Incomplete Information

Input:

```sql
SELECT * FROM large_table WHERE status = ?;
```

Output:

```
## Issue [MEDIUM]
Preliminary analysis without schema information

## Potential Issues
1. `SELECT *` - may fetch unnecessary columns
2. `status` column - if low cardinality (e.g., 5 values), index may not help

## Information Needed
- Table row count
- Cardinality of `status` column
- Existing indexes: `SHOW INDEX FROM large_table;`
```
### Example 5: MongoDB COLLSCAN

Input:

```javascript
db.orders.find({ customer_email: "user@example.com" })
```

Output:

```
## Issue [HIGH]
COLLSCAN on `orders` - no index on `customer_email`

## Impact
Query scans entire collection; O(n) performance

## Solution
db.orders.createIndex({ customer_email: 1 })

## Verification
db.orders.find({ customer_email: "..." }).explain("executionStats")
// Should show "stage": "IXSCAN" instead of "COLLSCAN"
```
## Response Format

Structure recommendations as:

```
## Issue [HIGH/MEDIUM/LOW]
[Specific problem identified]

## Impact
[Performance/resource cost]

## Solution
[Concrete fix with code]

## Verification
[EXPLAIN command or metric to confirm]
```
## Confidence Levels
When uncertainty exists, indicate confidence:
- HIGH: Pattern clearly matches known anti-pattern
- MEDIUM: Likely issue, depends on data distribution/schema
- LOW: Potential issue, needs EXPLAIN to confirm
## Response Constraints
- Max 5 recommendations per query - prioritize by impact
- Order by severity: HIGH → MEDIUM → LOW
- One primary fix per issue - mention alternatives briefly
- Code must be copy-pasteable - no placeholders like `<table_name>`
- Verification commands must be complete - include all flags
## Prioritization Matrix
| Issue Type | Impact | Priority |
|---|---|---|
| Missing JOIN index | Query blocks | P0 |
| Full table scan (>100k rows) | Slow response | P0 |
| N+1 pattern | Latency × N | P1 |
| SELECT * | Memory/bandwidth | P1 |
| Missing LIMIT | Resource exhaustion | P1 |
| Large OFFSET | Slow pagination | P2 |
| Suboptimal ORDER BY | Sort overhead | P2 |
## Reference Files
Consult these for detailed patterns (use grep to find specific sections):
- `references/sql-patterns.md` - SQL optimization
  - Grep: `## Index`, `## Query Rewrite`, `## Join`, `## Aggregation`
- `references/nosql-patterns.md` - MongoDB, Redis, DynamoDB, Elasticsearch
  - Grep: `## MongoDB`, `## Redis`, `## DynamoDB`, `## Elasticsearch`
- `references/explain-analysis.md` - Execution plans
  - Grep: `## PostgreSQL`, `## MySQL`, `## MongoDB`
- `references/common-scenarios.md` - Common patterns
  - Grep: `## Slow pagination`, `## N+1`, `## Dashboard`, `## Bulk`
- `references/orm-patterns.md` - ORM fixes
  - Grep: `### Django`, `### SQLAlchemy`, `### ActiveRecord`, `### Prisma`
## Utility Scripts
All scripts use Python standard library only (no pip install required).
Run these for automated analysis:
### Query Analysis

`analyze_query.py` - Detect anti-patterns and score complexity:

```bash
python scripts/analyze_query.py "SELECT * FROM orders WHERE ..."
python scripts/analyze_query.py --file query.sql --json
```

`suggest_index.py` - Generate index recommendations:

```bash
python scripts/suggest_index.py "SELECT ... FROM orders WHERE status = 'pending'"
```
### EXPLAIN & Logs

`parse_explain.py` - Analyze EXPLAIN output:

```bash
python scripts/parse_explain.py --pg explain_output.txt
python scripts/parse_explain.py --mysql explain_output.txt
```

`diff_explain.py` - Compare before/after EXPLAIN outputs:

```bash
python scripts/diff_explain.py --pg before.txt after.txt
python scripts/diff_explain.py --mysql before.txt after.txt --json
```

`parse_slow_log.py` - Analyze slow query logs:

```bash
python scripts/parse_slow_log.py --pg /var/log/postgresql.log --top 20
python scripts/parse_slow_log.py --mysql slow-query.log --json
```
### Schema & Code

`analyze_schema.py` - Find schema optimization opportunities:

```bash
python scripts/analyze_schema.py schema.sql
```

`detect_orm_issues.py` - Find ORM anti-patterns in code:

```bash
python scripts/detect_orm_issues.py app.py
python scripts/detect_orm_issues.py --dir ./models
```

`check_migration.py` - Check migration safety:

```bash
python scripts/check_migration.py --pg migration.sql
python scripts/check_migration.py --mysql migration.sql
```
## Workflow with Scripts
For query optimization:
1. `analyze_query.py` → detect issues
2. `suggest_index.py` → generate indexes
3. User runs EXPLAIN (before)
4. User applies optimization
5. User runs EXPLAIN (after)
6. `diff_explain.py` → verify improvement
For verifying optimization impact:
```bash
# Save before state
psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT ..." > before.txt

# Apply optimization (add index, rewrite query, etc.)

# Save after state
psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT ..." > after.txt

# Compare
python scripts/diff_explain.py --pg before.txt after.txt
```
For codebase audit:
1. `detect_orm_issues.py --dir ./` → find N+1 patterns
2. `analyze_schema.py` → check schema
3. Prioritize fixes by severity
For migration review:
1. `check_migration.py` → safety check
2. Fix critical issues before deployment
3. Use safe alternatives (CONCURRENTLY, etc.), as sketched below
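A sketch of the PostgreSQL-specific safe alternative, using a hypothetical `orders` table (`CREATE INDEX CONCURRENTLY` cannot run inside a transaction block):

```sql
-- Plain CREATE INDEX blocks writes to the table while the index builds
CREATE INDEX idx_orders_status ON orders(status);

-- PostgreSQL: builds the index without blocking writes
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
```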
## EXPLAIN Commands

Provide appropriate commands:

```sql
-- PostgreSQL
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;

-- MySQL
EXPLAIN ANALYZE SELECT ...;

-- SQLite
EXPLAIN QUERY PLAN SELECT ...;
```

```javascript
// MongoDB
db.collection.find({...}).explain("executionStats")
```
## Index Recommendation Format

When suggesting indexes:

```sql
-- PostgreSQL/MySQL
CREATE INDEX [CONCURRENTLY] idx_table_columns
ON table(col1, col2, col3);
-- Rationale: [why this column order, covering considerations]
```

```javascript
// MongoDB (ESR: Equality, Sort, Range)
db.collection.createIndex({ equality_field: 1, sort_field: -1, range_field: 1 })
```
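A hypothetical worked example of the SQL format above, with the column-order rationale spelled out:

```sql
-- Query to support (illustrative):
--   SELECT id, total FROM orders
--   WHERE status = 'pending' AND created_at > '2024-01-01'
--   ORDER BY created_at;
CREATE INDEX idx_orders_status_created_at
ON orders(status, created_at);
-- Rationale: equality column (status) first, then the range/sort column
-- (created_at); in PostgreSQL 11+, INCLUDE (total) could make it covering.
```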
## Handling Missing Information
When user provides incomplete context:
No schema provided:
"To give precise recommendations, I need the table schema. Could you share:
- Column data types
- Existing indexes
- Approximate row counts
Meanwhile, here's what I can infer from the query..."
No EXPLAIN output:
"Run this command and share the output:
EXPLAIN (ANALYZE, BUFFERS) <their query>;This will show actual vs estimated rows and scan types."
Unknown database type:
"Which database are you using? (PostgreSQL, MySQL, SQLite, MongoDB, etc.) The optimization strategies differ significantly between them."
## Conditional Workflows
Analyzing a query:
1. Run `analyze_query.py` first
2. Check for anti-patterns
3. Suggest fixes with examples
Interpreting EXPLAIN output:
1. Run `parse_explain.py` with the appropriate flag
2. Identify warnings
3. Correlate with query structure
Full optimization request:
- Analyze query → identify issues
- Suggest indexes → generate CREATE statements
- Request EXPLAIN → interpret results
- Provide before/after comparison
Complex optimization (3+ issues or architectural decisions): Use extended reasoning to:
- Map full query execution path
- Identify all optimization opportunities
- Evaluate trade-offs between approaches
- Prioritize by impact and implementation cost
## Script Error Handling
If scripts fail:
- Check Python version (requires 3.7+)
- Verify input format (SQL string, valid file path)
- Fall back to manual analysis using reference files
Common errors:
- `FileNotFoundError` → Check file path, use absolute paths
- `json.JSONDecodeError` → Input not valid JSON, check --pg/--mysql flags
- Empty output → Query may be valid, no issues detected
When scripts are unavailable: Perform manual analysis using Quick Reference red flags and reference files.
## Unsupported Databases
For databases not explicitly covered (Oracle, SQL Server, CockroachDB, etc.):
- Apply general SQL patterns from `references/sql-patterns.md`
- Note: "Optimization patterns based on PostgreSQL/MySQL; verify syntax for [database]"
- Recommend user consult database-specific documentation for EXPLAIN syntax and index creation