Claude Code Plugins

Community-maintained marketplace

query-optimizer

@CuriousLearner/devkit

Analyze and optimize SQL queries for better performance and efficiency.

Install Skill

1. Download skill
2. Enable skills in Claude

Open claude.ai/settings/capabilities and find the "Skills" section

3. Upload to Claude

Click "Upload skill" and select the downloaded ZIP file

Note: Please verify the skill by reviewing its instructions before using it.

SKILL.md

name: query-optimizer
description: Analyze and optimize SQL queries for better performance and efficiency.

Query Optimizer Skill

Analyze and optimize SQL queries for better performance and efficiency.

Instructions

You are a database performance optimization expert. When invoked:

  1. Analyze Query Performance:

    • Use EXPLAIN/EXPLAIN ANALYZE to understand execution plan
    • Identify slow queries from logs
    • Measure query execution time
    • Detect full table scans and missing indexes (see the sketch after this list)
  2. Identify Bottlenecks:

    • Find N+1 query problems
    • Detect inefficient JOINs
    • Identify missing or unused indexes
    • Spot suboptimal WHERE clauses
  3. Optimize Queries:

    • Add appropriate indexes
    • Rewrite queries for better performance
    • Suggest caching strategies
    • Recommend query restructuring
  4. Provide Recommendations:

    • Index creation suggestions
    • Query rewriting alternatives
    • Database configuration tuning
    • Monitoring and alerting setup
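
For step 1, one quick way to surface tables that are read mostly via full table scans is PostgreSQL's pg_stat_user_tables view; a minimal sketch (the LIMIT is arbitrary):

-- Tables scanned sequentially more often than via an index
SELECT relname, seq_scan, COALESCE(idx_scan, 0) AS idx_scan, seq_tup_read
FROM pg_stat_user_tables
WHERE seq_scan > COALESCE(idx_scan, 0)
ORDER BY seq_tup_read DESC
LIMIT 20;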

Supported Databases

  • SQL: PostgreSQL, MySQL, MariaDB, SQL Server, SQLite
  • Analysis Tools: EXPLAIN, EXPLAIN ANALYZE, Query Profiler
  • Monitoring: pg_stat_statements, slow query log, performance schema

Usage Examples

@query-optimizer
@query-optimizer --analyze-slow-queries
@query-optimizer --suggest-indexes
@query-optimizer --explain SELECT * FROM users WHERE email = 'test@example.com'
@query-optimizer --fix-n-plus-one

Query Analysis Tools

PostgreSQL - EXPLAIN ANALYZE

-- Basic EXPLAIN
EXPLAIN
SELECT u.id, u.username, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.active = true
GROUP BY u.id, u.username;

-- EXPLAIN ANALYZE - actually runs the query
EXPLAIN ANALYZE
SELECT u.id, u.username, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.active = true
GROUP BY u.id, u.username;

-- EXPLAIN with all options (PostgreSQL)
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT JSON)
SELECT * FROM orders
WHERE user_id = 123
  AND created_at >= '2024-01-01';

Reading EXPLAIN Output:

Seq Scan on users  (cost=0.00..1234.56 rows=10000 width=32)
  Filter: (active = true)

-- Seq Scan = Sequential Scan (full table scan) - BAD for large tables
-- cost=0.00..1234.56 = startup cost..total cost
-- rows=10000 = estimated rows
-- width=32 = average row size in bytes

Index Scan using idx_users_email on users  (cost=0.29..8.30 rows=1 width=32)
  Index Cond: (email = 'test@example.com'::text)

-- Index Scan = Using index - GOOD
-- Much lower cost than Seq Scan
-- rows=1 = accurate estimate

MySQL - EXPLAIN

-- MySQL EXPLAIN
EXPLAIN
SELECT u.id, u.username, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.active = true
GROUP BY u.id, u.username;

-- EXPLAIN with execution stats (MySQL 8.0+)
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 123;

-- Show warnings for optimization info
EXPLAIN
SELECT * FROM users WHERE email = 'test@example.com';
SHOW WARNINGS;
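
-- For richer plan detail as JSON, including cost estimates (MySQL 5.6+):
EXPLAIN FORMAT=JSON
SELECT * FROM users WHERE email = 'test@example.com';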

MySQL EXPLAIN Output:

+----+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra       |
+----+-------------+-------+------+---------------+---------+---------+-------+------+-------------+
|  1 | SIMPLE      | users | ALL  | NULL          | NULL    | NULL    | NULL  | 1000 | Using where |
+----+-------------+-------+------+---------------+---------+---------+-------+------+-------------+

-- type=ALL means full table scan - BAD
-- key=NULL means no index used

+----+-------------+-------+------+-----------------+-----------------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys   | key             | key_len | ref   | rows | Extra |
+----+-------------+-------+------+-----------------+-----------------+---------+-------+------+-------+
|  1 | SIMPLE      | users | ref  | idx_users_email | idx_users_email | 767     | const |    1 | NULL  |
+----+-------------+-------+------+-----------------+-----------------+---------+-------+------+-------+

-- type=ref means index lookup - GOOD
-- key shows index being used

Common Performance Issues

1. Missing Indexes

Problem:

-- Slow query - full table scan
SELECT * FROM users WHERE email = 'john@example.com';

-- EXPLAIN shows:
-- Seq Scan on users (cost=0.00..1500.00 rows=1 width=100)
--   Filter: (email = 'john@example.com')

Solution:

-- Add index on email column
CREATE INDEX idx_users_email ON users(email);

-- Now EXPLAIN shows:
-- Index Scan using idx_users_email on users (cost=0.29..8.30 rows=1 width=100)
--   Index Cond: (email = 'john@example.com')

-- On large tables this can be orders of magnitude faster

2. N+1 Query Problem

Problem:

// ORM code causing N+1 queries
const users = await User.findAll(); // 1 query

for (const user of users) {
  const orders = await Order.findAll({
    where: { userId: user.id }  // N queries (one per user)
  });
  console.log(`${user.name}: ${orders.length} orders`);
}

// Total: 1 + N queries for N users
// For 100 users = 101 queries!

Solution:

// Use eager loading - single query with JOIN
const users = await User.findAll({
  include: [{
    model: Order,
    attributes: ['id', 'total_amount']
  }]
});

for (const user of users) {
  console.log(`${user.name}: ${user.orders.length} orders`);
}

// Total: 1 query regardless of user count

SQL Equivalent:

-- Instead of multiple queries:
SELECT * FROM users;
SELECT * FROM orders WHERE user_id = 1;
SELECT * FROM orders WHERE user_id = 2;
-- ... (N more queries)

-- Use single JOIN query:
SELECT
  u.id,
  u.name,
  COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.name;

3. SELECT * Inefficiency

Problem:

-- Fetching all columns when only need few
SELECT * FROM products
WHERE category_id = 5;

-- Fetches: id, name, description (large text), image_url, specs (json),
--         price, stock, created_at, updated_at, etc.

Solution:

-- Only select needed columns
SELECT id, name, price, stock
FROM products
WHERE category_id = 5;

-- Benefits:
-- - Less data transferred
-- - Faster query execution
-- - Lower memory usage
-- - Can use covering indexes

4. Inefficient Pagination

Problem:

-- OFFSET becomes slow with large offsets
SELECT * FROM users
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;

-- Database must:
-- 1. Sort all rows
-- 2. Skip 10,000 rows
-- 3. Return next 20
-- Gets slower as offset increases

Solution:

-- Use cursor-based (keyset) pagination
-- (cursor = last row seen: created_at '2024-01-01 12:00:00', id 12345)
SELECT * FROM users
WHERE created_at < '2024-01-01 12:00:00'
   OR (created_at = '2024-01-01 12:00:00' AND id < 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Or with indexed column:
SELECT * FROM users
WHERE id < 10000
ORDER BY id DESC
LIMIT 20;

-- Benefits:
-- - Consistent performance regardless of page
-- - Uses index efficiently
-- - No need to skip rows
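
PostgreSQL also accepts a row-value comparison for the keyset condition, which reads more cleanly and can use a composite index on (created_at, id); a sketch reusing the example cursor values above:

SELECT * FROM users
WHERE (created_at, id) < ('2024-01-01 12:00:00', 12345)
ORDER BY created_at DESC, id DESC
LIMIT 20;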

5. Function on Indexed Column

Problem:

-- Function prevents index usage
SELECT * FROM users
WHERE LOWER(email) = 'john@example.com';

-- EXPLAIN shows Seq Scan (index not used)

Solution 1 - Store lowercase:

-- Add computed column
ALTER TABLE users ADD COLUMN email_lower VARCHAR(255)
  GENERATED ALWAYS AS (LOWER(email)) STORED;

CREATE INDEX idx_users_email_lower ON users(email_lower);

-- Query:
SELECT * FROM users
WHERE email_lower = 'john@example.com';

Solution 2 - Functional index (PostgreSQL):

-- Create index on function result
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Now original query uses index
SELECT * FROM users
WHERE LOWER(email) = 'john@example.com';
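
MySQL 8.0.13+ supports an equivalent functional index; note the extra parentheses required around the expression:

CREATE INDEX idx_users_email_lower ON users((LOWER(email)));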

Solution 3 - Case-insensitive collation:

-- PostgreSQL - use the citext extension type
CREATE EXTENSION IF NOT EXISTS citext;
ALTER TABLE users ALTER COLUMN email TYPE citext;

-- Query without LOWER:
SELECT * FROM users WHERE email = 'john@example.com';
-- Automatically case-insensitive

6. Inefficient JOINs

Problem:

-- Multiple JOINs without proper indexes
SELECT
  u.username,
  o.id as order_id,
  p.name as product_name
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN order_items oi ON o.id = oi.order_id
JOIN products p ON oi.product_id = p.id
WHERE u.email = 'john@example.com';

-- Slow if missing indexes on:
-- - users.email
-- - orders.user_id
-- - order_items.order_id
-- - order_items.product_id

Solution:

-- Add necessary indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);

-- Now query uses indexes for all JOINs
-- EXPLAIN will show "Index Scan" for each table

7. OR Conditions

Problem:

-- OR prevents efficient index usage
SELECT * FROM users
WHERE username = 'john' OR email = 'john@example.com';

-- May not use indexes optimally

Solution:

-- Use UNION for better index usage
SELECT * FROM users WHERE username = 'john'
UNION
SELECT * FROM users WHERE email = 'john@example.com';

-- Each subquery uses its own index
-- Deduplicates results automatically
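
If the two branches cannot return the same row, UNION ALL skips the deduplication step; a sketch that explicitly excludes first-branch matches from the second:

SELECT * FROM users WHERE username = 'john'
UNION ALL
SELECT * FROM users
WHERE email = 'john@example.com'
  AND (username IS NULL OR username <> 'john');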

8. NOT IN with Subquery

Problem:

-- Slow subquery execution
SELECT * FROM users
WHERE id NOT IN (
  SELECT user_id FROM banned_users
);

-- Can be very slow with large subquery results
-- Gotcha: if banned_users.user_id contains any NULL, NOT IN matches no rows at all

Solution:

-- Use LEFT JOIN with NULL check
SELECT u.*
FROM users u
LEFT JOIN banned_users bu ON u.id = bu.user_id
WHERE bu.user_id IS NULL;

-- Or use NOT EXISTS (often faster):
SELECT u.*
FROM users u
WHERE NOT EXISTS (
  SELECT 1 FROM banned_users bu
  WHERE bu.user_id = u.id
);

Index Optimization

When to Add Indexes

Add indexes for (see the sketch after this list):

  • Primary keys (automatic in most databases)
  • Foreign keys (critical for JOINs)
  • Columns in WHERE clauses
  • Columns in ORDER BY clauses
  • Columns in GROUP BY clauses
  • Columns in JOIN conditions
  • Columns with high cardinality (many unique values)
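
A minimal sketch tying several of these together, using the orders table from earlier examples:

-- Foreign key used in JOINs
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Column filtered in WHERE clauses
CREATE INDEX idx_orders_status ON orders(status);

-- Column used in ORDER BY ... LIMIT queries
CREATE INDEX idx_orders_created_at ON orders(created_at);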

Index Types

B-Tree Index (Default):

-- Best for: equality (=) and range (<, >, BETWEEN) queries
CREATE INDEX idx_users_created_at ON users(created_at);

-- Good for:
SELECT * FROM users WHERE created_at > '2024-01-01';
SELECT * FROM users WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';

Composite Index:

-- Index on multiple columns
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Used for queries filtering both columns
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';

-- Also used for queries on first column only
SELECT * FROM orders WHERE user_id = 123;

-- NOT used for queries on second column only
SELECT * FROM orders WHERE status = 'pending'; -- Won't use this index

-- Column order matters! Put equality-filtered columns first, range columns last

Partial Index (PostgreSQL):

-- Index only subset of rows
CREATE INDEX idx_active_users ON users(email)
WHERE active = true;

-- Smaller index, faster queries for active users
SELECT * FROM users WHERE email = 'john@example.com' AND active = true;

GIN Index (PostgreSQL - for arrays, JSONB, full-text):

-- For JSONB columns
CREATE INDEX idx_products_metadata ON products USING GIN(metadata);

-- Query JSONB data
SELECT * FROM products
WHERE metadata @> '{"brand": "Apple"}';

-- For array columns
CREATE INDEX idx_tags ON posts USING GIN(tags);

-- Query arrays
SELECT * FROM posts WHERE tags @> ARRAY['postgresql'];

Full-Text Search Index:

-- PostgreSQL
CREATE INDEX idx_products_search ON products
USING GIN(to_tsvector('english', name || ' ' || description));

-- Full-text search query
SELECT * FROM products
WHERE to_tsvector('english', name || ' ' || description)
  @@ to_tsquery('english', 'laptop & gaming');

Covering Index

Concept:

-- Covering index includes all columns needed by query
CREATE INDEX idx_users_email_username ON users(email, username);

-- This query can be answered entirely from index (no table access)
SELECT username FROM users WHERE email = 'john@example.com';

-- PostgreSQL: Index-Only Scan (needs an up-to-date visibility map, so VACUUM regularly)
-- MySQL: "Using index" in the EXPLAIN Extra column

With INCLUDE (PostgreSQL 11+):

-- Include non-indexed columns in index leaf nodes
CREATE INDEX idx_users_email ON users(email)
INCLUDE (username, created_at);

-- Query can use index without table access
SELECT username, created_at
FROM users
WHERE email = 'john@example.com';

Index Maintenance

Find Unused Indexes (PostgreSQL):

SELECT
  schemaname,
  relname AS table_name,
  indexrelname AS index_name,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;

-- Drop unused indexes to save space and improve write performance
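
-- To drop one without blocking writes (PostgreSQL; index name is illustrative):
DROP INDEX CONCURRENTLY idx_unused_example;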

Find Duplicate Indexes:

-- PostgreSQL query to find duplicate indexes
SELECT
  indrelid::regclass AS table_name,
  array_agg(indexrelid::regclass) AS indexes
FROM pg_index
GROUP BY indrelid, indkey
HAVING COUNT(*) > 1;

Rebuild Fragmented Indexes:

-- PostgreSQL
REINDEX INDEX idx_users_email;
REINDEX TABLE users;

-- MySQL
OPTIMIZE TABLE users;
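
-- PostgreSQL 12+ can rebuild without blocking writes:
REINDEX INDEX CONCURRENTLY idx_users_email;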

Query Rewriting Examples

Example 1: Aggregation Optimization

Before:

SELECT
  u.id,
  u.username,
  (SELECT COUNT(*) FROM orders WHERE user_id = u.id) as order_count,
  (SELECT SUM(total_amount) FROM orders WHERE user_id = u.id) as total_spent
FROM users u
WHERE u.active = true;

-- Two correlated subqueries are executed for every user row

After:

SELECT
  u.id,
  u.username,
  COUNT(o.id) as order_count,
  COALESCE(SUM(o.total_amount), 0) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.active = true
GROUP BY u.id, u.username;

-- Single JOIN query - typically much faster

Example 2: EXISTS vs IN

Before:

SELECT * FROM products
WHERE id IN (
  SELECT product_id FROM order_items
  WHERE created_at >= '2024-01-01'
);

-- Subquery returns all product_ids (potentially large result set)

After:

SELECT p.* FROM products p
WHERE EXISTS (
  SELECT 1 FROM order_items oi
  WHERE oi.product_id = p.id
    AND oi.created_at >= '2024-01-01'
);

-- EXISTS can stop at the first match (often more efficient; some planners rewrite IN the same way)

Example 3: Avoid Cartesian Products

Before:

-- Accidental cartesian product
SELECT *
FROM users u, orders o
WHERE u.active = true
  AND o.status = 'completed';

-- Returns every active user paired with every completed order!
-- Missing JOIN condition

After:

SELECT u.*, o.*
FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE u.active = true
  AND o.status = 'completed';

-- Proper JOIN condition

Example 4: Optimize DISTINCT

Before:

SELECT DISTINCT user_id
FROM orders
WHERE status = 'completed';

-- DISTINCT requires sorting/deduplication

After:

SELECT user_id
FROM orders
WHERE status = 'completed'
GROUP BY user_id;

-- GROUP BY is sometimes faster than DISTINCT (modern planners often produce identical plans)
-- Or, to keep one row per user (e.g. the most recent order), use DISTINCT ON (PostgreSQL):
SELECT DISTINCT ON (user_id) user_id, created_at
FROM orders
WHERE status = 'completed'
ORDER BY user_id, created_at DESC;

Monitoring Slow Queries

PostgreSQL - pg_stat_statements

-- Enable extension (also requires shared_preload_libraries = 'pg_stat_statements'
-- in postgresql.conf, followed by a server restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find slowest queries
SELECT
  substring(query, 1, 50) AS short_query,
  round(total_exec_time::numeric, 2) AS total_time,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_time,
  round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS percentage
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Find queries with most calls
SELECT
  substring(query, 1, 50) AS short_query,
  calls,
  round(mean_exec_time::numeric, 2) AS mean_time
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;

-- Reset statistics
SELECT pg_stat_statements_reset();

MySQL - Slow Query Log

-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1; -- Log queries taking > 1 second
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow-query.log';

-- Log queries not using indexes
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Analyze slow query log
-- Use mysqldumpslow tool:
-- mysqldumpslow -s t -t 10 /var/log/mysql/slow-query.log

Performance Schema (MySQL)

-- performance_schema can only be enabled at server startup (my.cnf):
--   [mysqld]
--   performance_schema = ON
-- Verify that it is enabled:
SHOW VARIABLES LIKE 'performance_schema';

-- Find slowest statements
SELECT
  DIGEST_TEXT,
  COUNT_STAR AS executions,
  ROUND(AVG_TIMER_WAIT / 1000000000, 2) AS avg_ms,
  ROUND(SUM_TIMER_WAIT / 1000000000, 2) AS total_ms
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

Best Practices

DO ✓

  • Use EXPLAIN before and after optimization
  • Add indexes on foreign keys - Critical for JOINs
  • Use covering indexes when possible
  • Paginate large result sets - Avoid loading all data
  • Monitor query performance - Use pg_stat_statements or slow query log
  • Test on production-like data - Performance differs with data volume
  • Use connection pooling - Reduce connection overhead
  • Cache frequently accessed data - Redis, Memcached
  • Archive old data - Keep active tables smaller
  • Regular VACUUM/ANALYZE (PostgreSQL) - Update statistics (see the sketch below)
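
A minimal sketch for the last item, using the users table from earlier examples:

-- Refresh planner statistics so row estimates stay accurate
ANALYZE users;

-- Reclaim dead tuples and refresh statistics in one pass
VACUUM (ANALYZE) users;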

DON'T ✗

  • Don't use SELECT * - Fetch only needed columns
  • Don't over-index - Each index slows down writes
  • Don't ignore EXPLAIN warnings - They indicate problems
  • Don't use functions on indexed columns - Prevents index usage
  • Don't fetch more data than needed - Use LIMIT
  • Don't use OFFSET for deep pagination - Use cursor-based instead
  • Don't ignore database logs - Monitor for errors
  • Don't optimize prematurely - Profile first, optimize bottlenecks
  • Don't forget about write performance - Indexes slow down INSERTs
  • Don't skip testing - Verify optimizations actually help

Query Optimization Checklist

- [ ] Run EXPLAIN/EXPLAIN ANALYZE on query
- [ ] Check if query uses indexes (no Seq Scan on large tables)
- [ ] Verify indexes exist on:
  - [ ] Foreign key columns
  - [ ] WHERE clause columns
  - [ ] JOIN condition columns
  - [ ] ORDER BY columns
- [ ] SELECT only needed columns (avoid SELECT *)
- [ ] Use appropriate JOIN type (INNER vs LEFT)
- [ ] Avoid N+1 queries (use JOINs or eager loading)
- [ ] Use pagination for large result sets
- [ ] Check for unused indexes (slow down writes)
- [ ] Consider query caching for frequent queries
- [ ] Test with production-like data volumes
- [ ] Monitor query performance over time

Notes

  • Always measure before and after optimization
  • Index creation can take time on large tables
  • Too many indexes slow down INSERT/UPDATE/DELETE
  • Keep database statistics up to date (ANALYZE)
  • Consider read replicas for read-heavy workloads
  • Use database-specific features when beneficial
  • Document optimization decisions for team
  • Regular performance audits prevent degradation